Format of Spectral Multifiles
Spectral multifiles combine multiple spectra in a single file. These files are stored in a Matlab™-specific data format and contain the spectral data as well as the associated metadata. A spectral multifile can be loaded in Matlab by entering the following command:
>> load('ecoli-filelist-oct16.muf','-mat')
This command opens ecoli-filelist-oct16.muf, an example multifile containing 16 individual MALDI-TOF mass spectra acquired from five different strains of E. coli. The file ecoli-filelist-oct16.muf can be downloaded here. If loading was successful, a new Matlab variable spec (a structure array) is available in the workspace. The structure of spec is described next.
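As a minimal sketch, the same file can also be loaded into an output variable, so that no existing workspace variables are overwritten. The guard around the call and the variable names fname and S are illustrative choices, not part of the file format.

```matlab
% Name of the example multifile mentioned in the text.
fname = 'ecoli-filelist-oct16.muf';

if exist(fname, 'file') == 2
    % '-mat' makes load treat the file as a MAT-file despite its extension.
    S = load(fname, '-mat');
    spec = S.spec;                       % structure array, one element per spectrum

    fprintf('%d spectra loaded\n', numel(spec));
    disp(fieldnames(spec));              % list the available fields
else
    warning('File %s was not found on the Matlab path.', fname);
end
```

Loading into S first makes it easy to check which variables the file actually contains before using them.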
Fields of the structure array spec:
Fields | Description | Data type | |
---|---|---|---|
org | original mass spectra [2 x n array], n: number of data points | float32 | |
pre | pre-processed spectra [2 x n array], n: number of data points | float32 | |
nam | spectra id | char array | |
gen | genus information | char array | |
spe | species info | char array | |
str | strain info | char array | |
typ | type | char array | |
uid | taxonomy identification number for species as used by the NCBI (see [1]), can be modified | char array | |
uie | unmodified taxonomy identification number for strains used by the NCBI (see [2]) | char array | |
gti | cultivation conditions: growth time | char array | |
tem | cultivation conditions: cultivation temperature | char array | |
air | cultivation conditions: cultivation under aerobic or anaerobic conditions | char array | |
med | cultivation conditions: cultivation medium | char array | |
spo | spore formers (Yes or No) | char array | |
con | sample concentration | char array | |
trt | sample treatment | char array | |
ext | extra information | char array | |
las | laser parameters (power, spot diameter, frequency, etc.) | char array | |
cal | calibration info | char array | |
met | measurement method | char array | |
cus | customer info | char array | |
tim | date and time of measurement | char array | |
pth | path to spectrum | char array | |
cls | class assignment (valid values are 0,1,2,3 and 4) | float32 | |
lst | formatted text containing the peak table info | char array | |
seq | sequence of pre-processing steps | char array | |
smo | the number of smoothing points (Savitzky-Golay smoothing) | char array | |
bas | number of intervals used for baseline correction | float32 | |
nrm | normalization parameter (Yes:1, No:0) | float32 | |
clb | calibration parameters (not used) | float32 | |
red | data reduction factor (spectral binning) | char array | |
cut | cut in the spectral domain, m/z range | char array | |
tmp | temporary info (not always present) | char array | |
mod | original data modified by cut spectra or reduce resolution (Yes:1, No:0) | float32 | |
lms | MALDI-TOF MS, or LC-MS1 data? (0: MALDI, 1: LC-MS1) | float32 | |
pik | peak table (see below), npeaks: number of peaks | 4 x npeaks or 6 x npeaks array | |
ccl | calibration information (see below) | structure array | |
avr | average spectrum (Yes:1, No:0) | structure array | |
dbs | data base spectrum (Yes:1, No:0) | structure array | |
prm | parameters of peak detection | char array | |
qt | quality test parameter | structure array |
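The following sketch shows how the metadata fields listed above might be read for each spectrum. The struct used here is a hypothetical stand-in with made-up values, so that the snippet runs without the example file. Only the field names are taken from the table.

```matlab
% Hypothetical stand-in for one element of spec (all values are made up).
demoSpec = struct('nam', 'demo-001', ...
                  'gen', 'Escherichia', ...
                  'spe', 'coli', ...
                  'str', 'K12', ...
                  'org', single([2000 2001 2002; 10 12 9]), ...  % 2 x n array
                  'cls', single(1));

% Print one summary line per spectrum.
for k = 1:numel(demoSpec)
    s = demoSpec(k);
    nPoints = size(s.org, 2);            % n: number of data points
    fprintf('%s: %s %s, strain %s, %d data points, class %d\n', ...
            s.nam, s.gen, s.spe, s.str, nPoints, s.cls);
end
```

With a loaded multifile, the same loop can be run over spec instead of demoSpec.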
Format of peak tables (spec.pik):
Fields | Description |
---|---|
spec.pik(1,:) | m/z positions of the peaks in the peak table |
spec.pik(2,:) | absolute intensities of these peaks |
spec.pik(3,:) | weighting factors (the sum of these factors equals 100) |
spec.pik(4,:) | for single spectra (i.e. neither database nor average spectra): baseline-corrected absolute intensities of the peaks; for average or database spectra: the relative peak frequency |
spec.pik(5,:) | FWHH of the given peak (requires QT) |
spec.pik(6,:) | resolving power of the given peak (requires QT) |
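As an illustration of the row layout described above, the sketch below ranks the peaks of a peak table by absolute intensity and checks whether rows 5 and 6 are present. The matrix pik is a hypothetical example with made-up values. With real data it would be taken from spec(k).pik.

```matlab
% Hypothetical 4 x npeaks peak table (values are made up).
pik = single([3000 4500 6200;    % row 1: m/z positions
               100  400  250;    % row 2: absolute intensities
                20   50   30;    % row 3: weighting factors (sum to 100)
                90  380  240]);  % row 4: baseline-corrected intensities

mz        = pik(1, :);
intensity = pik(2, :);
weights   = pik(3, :);

% Rows 5 and 6 exist only in 6 x npeaks tables.
hasQT = size(pik, 1) >= 6;

% Sort the peaks by absolute intensity, strongest first.
[~, order] = sort(intensity, 'descend');
for k = order
    fprintf('m/z %.1f  intensity %.1f  weight %.1f\n', ...
            mz(k), intensity(k), weights(k));
end
```

Checking the number of rows before accessing pik(5,:) or pik(6,:) avoids indexing errors on 4 x npeaks tables.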
Calibration Information (spec.ccl):
Fields | Description | Type | |
---|---|---|---|
cl1 | calibration constant 1 | float32 | |
cl2 | calibration constant 2 | float32 | |
cl3 | calibration constant 3 | float32 | |
del | delay time [ns] | float32 | |
npt | number of data points | float32 | |
res | time resolution [ns] | float32 | |
ncl | calibration info required to store the spectrum in a Bruker-specific data format | char array | |
ncr | calibration info required to store the spectrum in a Bruker-specific data format | char array | |
bid | hardware id of the spectrum ('Bruker ID') | char array | |
mid | MicrobeMS id of the spectrum | char array | |
org | manufacturer info | char array | |
tfu | 'ToF user' | char array | |
spm | not used | char array | |
stp | type of measurement (should be 'TOF') | char array | |
acq | further acquisition info | char array |
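The table does not state how these fields are combined. Purely as an illustration, the sketch below builds a time-of-flight axis under the assumption that del is the time of the first data point and res is the spacing between consecutive points. This interpretation is not confirmed by the description above, and the struct values are made up. The calibration constants cl1 to cl3 are deliberately left out, since their use is not described here.

```matlab
% Hypothetical calibration struct (values are made up).
ccl = struct('del', single(20000), ...   % delay time [ns]
             'res', single(2), ...       % time resolution [ns]
             'npt', single(5));          % number of data points

% Assumption: point i (counted from 0) is recorded at del + i * res.
idx  = 0:double(ccl.npt) - 1;
t_ns = double(ccl.del) + idx * double(ccl.res);

fprintf('time axis: %g ns to %g ns (%d points)\n', ...
        t_ns(1), t_ns(end), numel(t_ns));
```

The fields are converted to double before the arithmetic because they are stored as float32.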
Data Base Spectrum (spec.dbs):
A database spectrum is usually created from many (>3) individual mass spectra. The structure array spec.dbs contains information (metadata, peak tables) on the mass spectra used to produce the given database spectrum. Details of the structure of spec.dbs are given in the table below.
Fields | Description | Type | |
---|---|---|---|
mem | string defining if the current spectrum is a database spectrum (1) or not (0) | string | |
ids | IDs of the individual mass spectra that constitute the database spectrum | char array | |
pik | reserved for detailed peak information | float32 | |
sta | contains statistical info when creating the data base spectrum | char array | |
tax | taxonomic info of the source spectra | char array | |
prm | parameters of peak detection | char array |
Average Spectrum (spec.avr):
An average spectrum is usually created from many (>3) individual mass spectra. The structure array spec.avr contains information (metadata, peak tables) on the mass spectra used to produce the given average spectrum. Details of the structure of spec.avr are given in the table below.
Fields | Description | Type | |
---|---|---|---|
mem | string defining if the current spectrum is an average spectrum (1) or not (0) | char array | |
ids | id of the individual mass spectrum used to create the average spectrum | char array | |
pik | peak table of the source spectrum | float32 | |
sta | contains statistical info when creating the data base spectrum | char array |
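The mem fields of spec.dbs and spec.avr can be used to separate database and average spectra from single spectra. The sketch below uses a hypothetical two-element structure array with made-up values. The helper flagSet is an illustrative name, and it accepts both '1' and a numeric 1, since the exact storage type of mem is an assumption here.

```matlab
% Hypothetical structure array with two spectra (values are made up).
demoSpec = struct('nam', {'single-001', 'db-001'}, ...
                  'dbs', {struct('mem', '0'), struct('mem', '1')}, ...
                  'avr', {struct('mem', '0'), struct('mem', '0')});

% Illustrative helper: true if the flag is the string '1' or the number 1.
flagSet = @(m) isequal(m, '1') || isequal(m, 1);

isDb  = arrayfun(@(s) flagSet(s.dbs.mem), demoSpec);   % database spectra
isAvr = arrayfun(@(s) flagSet(s.avr.mem), demoSpec);   % average spectra

fprintf('%d database, %d average, %d single spectra\n', ...
        nnz(isDb), nnz(isAvr), nnz(~isDb & ~isAvr));
```

The logical vectors isDb and isAvr can then be used to index the structure array, for example demoSpec(isDb).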