Data Format of Peak List Files
Revision as of 13:23, 15 December 2024
Peak list files combine multiple peak lists in a single file. These files are stored in a Matlab™-specific data format and contain the peak lists together with their metadata. In Matlab, a peak list file can be loaded by entering the following command at the command prompt:
>> load('ecoli-peaklist-oct16.pkf','-mat')
This command opens ecoli-peaklist-oct16.pkf, an example peak list file consisting of 16 individual peak lists from spectra of five different strains of E. coli. The file ecoli-peaklist-oct16.pkf can be downloaded here. If loading succeeds, a new Matlab variable C (a structure array) is available in the workspace. The structure of C is described next.
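A first look at C after loading could be sketched as follows. The fallback branch, which builds a two-element stand-in array with invented values, is an assumption added only so that the snippet also runs without the example file. The field names nam, gen, spe and str are the ones documented on this page.

```matlab
% Sketch: load a peak list file and print a short overview of C.
pkfFile = 'ecoli-peaklist-oct16.pkf';   % example file named above

if exist(pkfFile, 'file') == 2
    S = load(pkfFile, '-mat');          % returns a struct holding the variable C
    C = S.C;
else
    % Hypothetical stand-in data (NOT real measurements), used only so that
    % the sketch runs when the example file is not available.
    C = struct('nam', {'demo-001', 'demo-002'}, ...
               'gen', {'Escherichia', 'Escherichia'}, ...
               'spe', {'coli', 'coli'}, ...
               'str', {'strain A', 'strain B'});
end

nLists = numel(C);                      % number of peak lists in the file
fprintf('%d peak lists loaded\n', nLists);
for k = 1:nLists
    fprintf('%3d  %s  %s %s (%s)\n', k, C(k).nam, C(k).gen, C(k).spe, C(k).str);
end
```

Loading into the struct S instead of directly into the workspace keeps an existing variable named C from being overwritten silently.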
Fields of the structure array C:
| Fields | Description | Data type | |
|---|---|---|---|
| nam | spectra id | char array | File:Peaklist-format-C-struc.jpg Matlab screenshot - format of a peak list file (*.pkf) demonstrating the general structure of the structure array 'C'. In this example the metadata of peak list #1 are shown. |
| gen | genus information | char array | |
| spe | species info | char array | |
| str | strain info | char array | |
| typ | type | char array | |
| uid | taxonomy identification number for species as used by the NCBI (see [1]) | integer | |
| uie | taxonomy identification number for strains used by the NCBI (see [2]) | integer | |
| gti | cultivation conditions: growth time | char array | |
| tem | cultivation conditions: cultivation temperature | char array | |
| air | cultivation conditions: cultivation under aerobic or anaerobic conditions | char array | |
| med | cultivation conditions: cultivation medium | char array | |
| spo | spore formers (YES or NO) | char array | |
| con | sample concentration | char array | |
| trt | sample treatment | char array | |
| ext | extra information | char array | |
| las | laser parameters (power, diameter, frequency, etc.) | char array | |
| cal | calibration info | char array | |
| met | measurement method | char array | |
| cus | customer info | char array | |
| tim | date and time of measurement | char array | |
| pth | path to spectrum | char array | |
| pik | peak table, an array of dimension [4 x npeaks], where npeaks is the number of peaks | float32 | |
| cls | class assignment (valid values are 0,1,2,3 and 4) | float32 | |
| lms | MALDI-TOF or LC-MS spectrum? (valid values are 0 [MALDI] and 1 [LC-MS]) | float32 | |
| lst | formatted text containing the peak table | char array | |
| seq | sequence of preprocessing steps | char array | |
| smo | the number of smoothing points (Savitzky-Golay smoothing) | float32 | |
| bas | number of intervals used for baseline correction | float32 | |
| nrm | normalization parameter (Yes:1, No:0) | float32 | |
| clb | calibration parameters (see below for details) | float32 | |
| red | data reduction factor (spectral binning) | char array | |
| cut | cut in the spectral domain | char array | |
| mod | original data modified by cut or red (Yes:1, No:0) | float32 | |
| prm | parameters of peak detection | char array | |
| ccl | calibration information (see below) | structure array | |
| dbs | data base spectrum (Yes:1, No:0) | structure array | |
| avr | average spectrum (Yes:1, No:0) | structure array |
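
Because C is a structure array, metadata fields can be collected across all peak lists and used as filters. The following is a minimal sketch on a hypothetical stand-in for C: all values are invented, and only the fields str, lms and pik from the table above are used.

```matlab
% Sketch: select peak lists by metadata and count their peaks.
% Hypothetical stand-in for C with only the fields used here (invented values).
C = struct('nam', {'demo-001', 'demo-002', 'demo-003'}, ...
           'str', {'strain A', 'strain B', 'strain A'}, ...
           'lms', {single(0), single(0), single(1)}, ...
           'pik', {single(rand(4, 12)), single(rand(4, 8)), single(rand(4, 20))});

isMaldi  = [C.lms] == 0;                        % lms: 0 = MALDI, 1 = LC-MS
isStrain = strcmp({C.str}, 'strain A');         % char fields are compared with strcmp
nPeaks   = cellfun(@(p) size(p, 2), {C.pik});   % pik is [4 x npeaks]

sel = find(isMaldi & isStrain);                 % indices of matching peak lists
for k = sel
    fprintf('%s: %d peaks\n', C(k).nam, nPeaks(k));
end
```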
Format of peak tables (C.pik):
| Fields | Description |
|---|---|
| C.pik(1,:) | m/z positions of the peaks in the peak table |
| C.pik(2,:) | absolute intensities of these peaks |
| C.pik(3,:) | weighting factors (the sum of these factors equals 100) |
| C.pik(4,:) | for single spectra (i.e. neither database nor average spectra): baseline-corrected absolute intensities of the peaks; for average or database spectra: the relative peak frequency |
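
The row layout above can be illustrated with a small sketch that unpacks a peak table, checks the documented sum of the weighting factors, and lists the most intense peaks. The 4 x 5 table is hypothetical (invented values) and stands in for a real C(k).pik of a single spectrum.

```matlab
% Sketch: unpack one peak table and list its strongest peaks.
% Hypothetical 4 x 5 peak table (invented values) standing in for C(k).pik.
pik = single([4365.2  5096.8  6255.4  7274.0  9536.1;    % row 1: m/z
               820     1540    2310     990     640;     % row 2: absolute intensity
                13      24.5    36.5    15.5    10.5;    % row 3: weighting factor
               700     1400    2100     850     500]);   % row 4: baseline-corrected intensity

mz        = pik(1, :);
intensity = pik(2, :);
weight    = pik(3, :);

% The weighting factors of one peak table are documented to sum to 100.
weightSum = sum(double(weight));
if abs(weightSum - 100) > 1e-3
    warning('weighting factors sum to %.3f instead of 100', weightSum);
end

% Indices of the three most intense peaks, strongest first.
[~, order] = sort(intensity, 'descend');
top = order(1:min(3, numel(order)));
for i = top
    fprintf('m/z %8.1f   intensity %6.0f   weight %5.1f\n', mz(i), intensity(i), weight(i));
end
```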
Calibration Information (C.ccl):
| Fields | Description | Type | |
|---|---|---|---|
| cl1 | calibration constant 1 | float32 | File:Array-spec-ccl.jpg Matlab screenshot - format of structure array C.ccl containing the calibration info, such as calibration constants, delay time, number of spectra data points, etc. for spectrum #1. |
| cl2 | calibration constant 2 | float32 | |
| cl3 | calibration constant 3 | float32 | |
| del | delay time [ns] | float32 | |
| npt | number of data points | float32 | |
| res | time resolution [ns] | float32 | |
| ncl | calibration info required to store the spectrum in a Bruker-specific data format | char array | |
| ncr | calibration info required to store the spectrum in a Bruker-specific data format | char array | |
| bid | hardware id of the spectrum | char array | |
| org | manufacturer info | char array | |
| tfu | manufacturer info | char array | |
| tfu | software info, required for compatibility issues | char array | |
| spm | type of instrumentation | char array | |
| stp | type of measurement (should be 'TOF') | char array | |
| acq | path to the original spectrum | char array |
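
This page does not state how del, res and npt relate to each other. As a hedged illustration, the sketch below assumes the usual time-of-flight convention that data point i (counting from 0) was sampled at del + i*res. The stand-in record and its values are invented. Converting flight times to m/z would additionally require cl1, cl2 and cl3 and a calibration formula that is not given here.

```matlab
% Sketch: rebuild the time-of-flight axis from the calibration record.
% Hypothetical stand-in for C(k).ccl (invented values, only the fields used).
ccl = struct('del', single(20000), ...   % delay time [ns]
             'res', single(2), ...       % time resolution [ns]
             'npt', single(5));          % number of data points

% ASSUMPTION: data point i (counting from 0) was sampled at del + i*res.
n   = double(ccl.npt);
tof = double(ccl.del) + (0:n-1) * double(ccl.res);   % flight times [ns]

fprintf('first sample at %.0f ns, last sample at %.0f ns\n', tof(1), tof(end));
```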
Data Base Spectrum (C.dbs):
A database spectrum is usually created from many (>3) individual mass spectra. The structure array C.dbs contains information (metadata, peak tables) on the mass spectra used to produce the given database spectrum. Details of the structure of C.dbs are given in the table below.
| Fields | Description | Type | |
|---|---|---|---|
| mem | flag indicating whether the current spectrum is a database spectrum (1) or not (0) | string | File:Array-spec-dbs.jpg Matlab screenshot - format of structure array C.dbs. C(1,17).dbs(1,1) contains information on mass spectrum #1, which was used together with others to obtain database spectrum #17, such as the id, taxonomic information, peak table and the respective peak detection parameters. |
| ids | id of the individual mass spectrum used to create the data base spectrum | char array | |
| tax | taxonomic info of the source spectrum | char array | |
| pik | peak table of the source spectrum | float32 | |
| prm | parameters of peak detection | char array |
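
The source spectra behind a database spectrum can be listed by iterating over the elements of its dbs array. The following is a minimal sketch on a hypothetical single element of C: the variable name entry and all values are invented, and only the fields ids, tax and pik from the table above are used.

```matlab
% Sketch: list the source spectra behind one database spectrum.
% Hypothetical stand-in for a single element of C (invented values).
entry.nam = 'demo-db-017';
entry.dbs = struct('ids', {'demo-001', 'demo-002', 'demo-003'}, ...
                   'tax', {'Escherichia coli', 'Escherichia coli', 'Escherichia coli'}, ...
                   'pik', {single(rand(4, 10)), single(rand(4, 14)), single(rand(4, 9))});

nSrc = numel(entry.dbs);                 % number of contributing spectra
fprintf('%s was built from %d spectra\n', entry.nam, nSrc);
for j = 1:nSrc
    src = entry.dbs(j);
    fprintf('  %-10s %-18s %3d peaks\n', src.ids, src.tax, size(src.pik, 2));
end
srcIds = {entry.dbs.ids};                % ids of all source spectra as a cell array
```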
Average Spectrum (C.avr):
An average spectrum is usually created from many (>3) individual mass spectra. The structure array C.avr contains information (metadata, peak tables) on the mass spectra used to produce the given average spectrum. Details of the structure of C.avr are given in the table below.
| Fields | Description | Type | |
|---|---|---|---|
| mem | defines whether the current spectrum is an average spectrum (1) or not (0) | char array | File:Array-spec-avr.jpg Matlab screenshot - format of structure array C.avr. spec(1,18).avr(1,1) contains information on mass spectrum #1, which was used together with others to obtain average spectrum #18, such as the id, taxonomic information, peak table and the respective peak detection parameters. |
| ids | id of the individual mass spectrum used to create the average spectrum | char array | |
| tax | taxonomic info of the source spectrum | char array | |
| pik | peak table of the source spectrum | float32 | |
| prm | parameters of peak detection | char array |
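
For average and database spectra, row 4 of the peak table holds a relative peak frequency, but this page does not say how it is computed. The sketch below shows one plausible definition, not the software's actual algorithm: the fraction of source spectra that contain a peak within a tolerance of a target m/z. The stand-in avr array, the target m/z and the tolerance are all assumptions with invented values.

```matlab
% Sketch: how often does a peak near a target m/z occur in the source spectra?
% Hypothetical source peak tables (invented values); only row 1 (m/z) is used.
avr = struct('ids', {'demo-001', 'demo-002', 'demo-003', 'demo-004'}, ...
             'pik', {single([4365.1 6255.3; 1 1; 50 50; 1 1]), ...
                     single([4365.4 7274.2; 1 1; 50 50; 1 1]), ...
                     single([5096.9 6255.6; 1 1; 50 50; 1 1]), ...
                     single([4364.8 9536.0; 1 1; 50 50; 1 1])});

targetMz = 4365.0;   % m/z of interest (assumed)
tolMz    = 1.0;      % matching tolerance in m/z units (assumed)

% A source spectrum counts as a hit if any of its peaks lies within tolMz.
hit = arrayfun(@(s) any(abs(double(s.pik(1, :)) - targetMz) <= tolMz), avr);
relFreq = sum(hit) / numel(avr);         % fraction of source spectra with the peak

fprintf('peak at m/z %.1f found in %d of %d spectra (%.0f %%)\n', ...
        targetMz, sum(hit), numel(avr), 100 * relFreq);
```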