Format of Spectral Multifiles

From MicrobeMS Wiki
Revision as of 18:00, 2 October 2024 by Laschp (talk | contribs)

Spectral multifiles combine multiple spectra in a single file. These files are stored in a Matlab™-specific data format and contain the spectral data as well as the associated metadata. Spectral multifiles can be loaded in Matlab by entering the following command:

>> load('ecoli-filelist-oct16.muf','-mat')

This command opens ecoli-filelist-oct16.muf, an example multifile containing 16 individual MALDI-TOF mass spectra acquired from five different strains of E. coli. The file ecoli-filelist-oct16.muf can be downloaded here. If loading was successful, a new Matlab variable spec (a structure array) is available in the workspace. The structure of spec is described below.
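As a minimal sketch of the loading step, the multifile can also be loaded into a container structure so that existing workspace variables are not overwritten. The file name is the example from the text; the existence check and the variable name s are illustrative additions, not part of MicrobeMS.

```matlab
% Sketch: load a multifile into a container struct and inspect it.
% The guard only prevents an error when the example file is absent.
fname = 'ecoli-filelist-oct16.muf';
if exist(fname, 'file') == 2
    s = load(fname, '-mat');        % struct with one field per stored variable
    spec = s.spec;                  % structure array described in the text
    fprintf('%d spectra loaded\n', numel(spec));
    disp(fieldnames(spec));         % list the available fields
else
    fprintf('File %s not found on the Matlab path\n', fname);
end
```

Calling load with an output argument returns the stored variables as fields of s instead of placing them directly in the workspace.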
 

Fields of the structure array spec:

Field | Description | Data type
org | original mass spectra [2 x n array], n: number of data points | float32
pre | pre-processed spectra [2 x n array], n: number of data points | float32
nam | spectrum id | char array
gen | genus information | char array
spe | species info | char array
str | strain info | char array
typ | type | char array
uid | taxonomy identification number for species as used by the NCBI (see [1]), can be modified | char array
uie | unmodified taxonomy identification number for strains as used by the NCBI (see [2]) | char array
gti | cultivation conditions: growth time | char array
tem | cultivation conditions: cultivation temperature | char array
air | cultivation conditions: cultivation under aerobic or anaerobic conditions | char array
med | cultivation conditions: cultivation medium | char array
spo | spore former (Yes or No) | char array
con | sample concentration | char array
trt | sample treatment | char array
ext | extra information | char array
las | laser parameters (power, spot diameter, frequency, etc.) | char array
cal | calibration info | char array
met | measurement method | char array
cus | customer info | char array
tim | date and time of measurement | char array
pth | path to spectrum | char array
cls | class assignment (valid values are 0, 1, 2, 3 and 4) | float32
lst | formatted text containing the peak table info | char array
seq | sequence of pre-processing steps | char array
smo | number of smoothing points (Savitzky-Golay smoothing) | char array
bas | number of intervals used for baseline correction | float32
nrm | normalization parameter (Yes:1, No:0) | float32
clb | calibration parameters (not used) | float32
red | data reduction factor (spectral binning) | char array
cut | cut in the spectral domain, m/z range | char array
tmp | temporary info (not always present) | char array
mod | original data modified by cutting spectra or reducing the resolution (Yes:1, No:0) | float32
lms | MALDI-TOF MS or LC-MS1 data? (0: MALDI, 1: LC-MS1) | float32
pik | peak table, an array of the dimension [4 x npeaks] or [6 x npeaks], npeaks: number of peaks | 4 x npeaks or 6 x npeaks array
ccl | calibration information (see below) | structure array
avr | average spectrum (Yes:1, No:0) | structure array
dbs | data base spectrum (Yes:1, No:0) | structure array
prm | parameters of peak detection | char array
qt | quality test parameter | structure array

Matlab screenshot - format of a spectral multifile (*.muf) demonstrating the general structure of the structure array 'spec'. In this example the metadata of spectrum #17 are shown. Spectrum #17 is a data base spectrum which has been created from 8 individual mass spectra (cf. spec(1,17).dbs).
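The following sketch illustrates how individual fields of spec might be accessed. To keep it self-contained it builds a small stand-in structure with invented values instead of loading a real multifile. It also assumes that row 1 of org holds the m/z values and row 2 the intensities; the field table only states that org is a [2 x n] array, so this row order should be verified against real data.

```matlab
% Hypothetical one-element stand-in for spec (all values invented)
spec = struct('org', single([2000 2001 2002; 10 12 11]), ...
              'nam', 'demo-spectrum', ...
              'gen', 'Escherichia', ...
              'spe', 'coli', ...
              'cls', single(0));

k   = 1;                                  % index of the spectrum of interest
mz  = double(spec(1,k).org(1,:));         % ASSUMPTION: row 1 = m/z values
int = double(spec(1,k).org(2,:));         % ASSUMPTION: row 2 = intensities

% Build a label from the char-array metadata fields
label = sprintf('%s %s (%s)', spec(1,k).gen, spec(1,k).spe, spec(1,k).nam);
fprintf('%s: %d data points\n', label, numel(mz));

% Uncomment to plot the spectrum:
% plot(mz, int); xlabel('m/z'); ylabel('intensity'); title(label);
```

The conversion to double is optional; it merely avoids single-precision arithmetic in later processing steps.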


Format of peak tables (spec.pik):

Field | Description
spec.pik(1,:) | m/z positions of the peaks in the peak table
spec.pik(2,:) | absolute intensities of these peaks
spec.pik(3,:) | weighting factors (the sum of these factors equals 100)
spec.pik(4,:) | for single spectra (i.e. no database or average spectra): baseline-corrected absolute intensities of the peaks; for average or database spectra: the relative peak frequency
spec.pik(5,:) | FWHH of the given peak (requires QT)
spec.pik(6,:) | resolving power of the given peak (requires QT)
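The row layout of the peak table can be sketched as follows. The 4 x 3 table below is a made-up example, not data from a real multifile; with a loaded file, pik would be taken from spec(1,k).pik instead.

```matlab
% Hypothetical peak table with three peaks (values invented):
% row 1 = m/z, row 2 = absolute intensity, row 3 = weighting factor,
% row 4 = baseline-corrected intensity (single-spectrum case)
pik = single([4365.2 5096.8 6255.4; ...
              1200   800    2000;   ...
              30     20     50;     ...
              1100   700    1900]);

mz      = pik(1,:);
absInt  = pik(2,:);
weights = pik(3,:);
hasQT   = size(pik, 1) == 6;           % rows 5 and 6 are present only with QT

% The weighting factors are documented to sum to 100
assert(abs(sum(weights) - 100) < 1e-3);

% m/z position of the peak with the largest weighting factor
[~, order] = sort(weights, 'descend');
top = mz(order(1));

if hasQT
    fwhh = pik(5,:);                   % FWHH per peak
    rp   = pik(6,:);                   % resolving power per peak
end
```

Checking the number of rows before reading rows 5 and 6 avoids an index error for peak tables that were created without the quality test.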
 


Calibration Information (spec.ccl):

Field | Description | Type
cl1 | calibration constant 1 | float32
cl2 | calibration constant 2 | float32
cl3 | calibration constant 3 | float32
del | delay time [ns] | float32
npt | number of data points | float32
res | time resolution [ns] | float32
ncl | calibration info required to store the spectrum in a Bruker-specific data format | char array
ncr | calibration info required to store the spectrum in a Bruker-specific data format | char array
bid | hardware id of the spectrum ('Bruker ID') | char array
mid | MicrobeMS id of the spectrum | char array
org | manufacturer info | char array
tfu | 'ToF user' | char array
spm | not used | char array
stp | type of measurement (should be 'TOF') | char array
acq | further acquisition info | char array

Matlab screenshot - format of structure array spec.ccl containing the calibration info, such as calibration constants, delay time, number of spectra data points, etc. for spectrum #1.
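As an illustration of how the fields del, npt and res relate to each other, the sketch below reconstructs a time-of-flight axis from them. It rests on the assumption that data point i is recorded at del + (i-1)*res nanoseconds, which this page does not state explicitly; the conversion to m/z via cl1, cl2 and cl3 is not described here and is therefore left out. The ccl values are invented.

```matlab
% Hypothetical stand-in for spec(1,k).ccl (values invented)
ccl = struct('del', single(20000), ...   % delay time [ns]
             'npt', single(5), ...       % number of data points
             'res', single(2));          % time resolution [ns]

% ASSUMPTION: equidistant sampling that starts at the delay time
npt = double(ccl.npt);
t   = double(ccl.del) + (0:npt-1) * double(ccl.res);   % flight times [ns]

fprintf('time axis: %g ns to %g ns, %d points\n', t(1), t(end), numel(t));
```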


Data Base Spectrum (spec.dbs):

A database spectrum is usually created from many (>3) individual mass spectra. The structure array spec.dbs contains information (metadata, peak tables) on the mass spectra used to produce the given database spectrum. Details of the structure of spec.dbs are given in the table below.

Field | Description | Type
mem | string defining if the current spectrum is a data base spectrum (1) or not (0) | string
ids | ids of the individual mass spectra that constitute the data base spectrum | char array
pik | reserved for detailed peak information | float32
sta | contains statistical info when creating the data base spectrum | char array
tax | taxonomic info of the source spectra | char array
prm | parameters of peak detection | char array

Matlab screenshot - format of structure array spec.dbs. spec(1,17).dbs(1,1) contains information on mass spectrum #1, which was used with others to obtain data base spectrum #17, such as the id, taxonomic information, peak tables and the respective peak detection parameters.
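A possible way to list the source spectra of a data base spectrum is sketched below. It uses an invented stand-in in which dbs holds one element per source spectrum, as suggested by the spec(1,17).dbs(1,1) example, and it assumes that mem is stored as the character '1' for data base spectra. Both points should be checked against a real multifile.

```matlab
% Hypothetical stand-in: one data base spectrum built from two source spectra
src  = struct('mem', {'1', '1'}, ...
              'ids', {'ec-001', 'ec-002'}, ...
              'tax', {'Escherichia coli', 'Escherichia coli'});
spec = struct('nam', 'db-demo', 'dbs', src);   % src is a 1 x 2 struct array

d = spec(1,1).dbs;

% ASSUMPTION: mem is the char '1' when the spectrum is a data base spectrum
isDb = ~isempty(d) && strcmp(d(1,1).mem, '1');

n = numel(d);                                  % number of source spectra
for j = 1:n
    fprintf('source %d: %s (%s)\n', j, d(1,j).ids, d(1,j).tax);
end
```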


Average Spectrum (spec.avr):

An average spectrum is usually created from many (>3) individual mass spectra. The structure array spec.avr contains information (metadata, peak tables) on the mass spectra used to produce the given average spectrum. Details of the structure of spec.avr are given in the table below.

Field | Description | Type
mem | string defining if the current spectrum is a data base spectrum (1) or not (0) | char array
ids | id of the individual mass spectrum used to create the average spectrum | char array
pik | peak table of the source spectrum | float32
sta | contains statistical info when creating the data base spectrum | char array

Matlab screenshot - format of structure array spec.avr. spec(1,18).avr(1,1) contains information on mass spectrum #1, which was used with others to obtain average spectrum #18, such as the id, taxonomic information, peak tables and the respective peak detection parameters.