Format of Spectral Multifiles

From MicrobeMS Wiki
Revision as of 10:32, 16 December 2024 by Laschp

Spectrum multifiles combine multiple spectra in a single file. These files are stored in a Matlab™-specific format and contain the spectral data as well as the respective metadata. Spectrum multifiles can be loaded in the Matlab environment by entering the following command at the Matlab command prompt:

>> load('muffname','-mat');

where muffname denotes the name of the spectrum multifile. For example, the command load('RKI-ring-trial-spectra.muf','-mat') opens the file RKI-ring-trial-spectra.muf, a MALDI-ToF mass spectrum multifile containing 24 individual MALDI-ToF mass spectra acquired within the so-called RKI ring trial study. The file RKI-ring-trial-spectra.muf can be downloaded here. If loading was successful, a new Matlab variable spec (a struct array) is available in the workspace. Details of the structure of spec are described next.
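
The loading step can also be written so that the file content is returned in a struct rather than placed directly in the workspace, which makes it easy to check that the variable spec is actually present. The following is a minimal sketch under stated assumptions: the file name demo-multifile.muf and its two-field content are hypothetical placeholders, written to a temporary folder only so that the example runs without real data.

```matlab
% Create a tiny synthetic multifile (hypothetical content; real *.muf
% files contain many more fields, see the tables below).
spec = struct('nam', {'demo-1', 'demo-2'}, ...
              'gen', {'Escherichia', 'Bacillus'});
mufname = fullfile(tempdir, 'demo-multifile.muf');
save(mufname, 'spec', '-mat');
clear spec

% Load into a struct instead of directly into the workspace.
S = load(mufname, '-mat');
assert(isfield(S, 'spec'), 'File does not contain a variable named spec');
spec = S.spec;

% spec is a struct array with one element per spectrum.
nspec = numel(spec);
fprintf('%d spectra loaded\n', nspec);
disp(fieldnames(spec));

% Remove the temporary demo file again.
delete(mufname);
```

With a real multifile, only the part from load onwards is needed, with mufname set to the name of the file.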
 

Fields of the structure array - spec


Figure: Screenshot showing the content of the structure array spec that is stored in so-called spectrum multifiles (*.muf). Fields of spec contain spectral data (original, i.e. unmodified, and pre-processed), spectrum metadata, as well as peak lists, calibration information, results of quality tests, and information collected during the creation of average or database spectra. In the example of the given screenshot, the content of one database spectrum is depicted.

Field | Description | Data type
org | original mass spectra [2 x n array], n: number of data points | float32
pre | pre-processed spectra [2 x n array], n: number of data points | float32
nam | spectra id | char array
gen | genus information | char array
spe | species info | char array
str | strain info | char array
typ | type | char array
uid | taxonomy identification number for species as used by the NCBI (see [1]), can be modified | char array
uie | unmodified taxonomy identification number for strains used by the NCBI (see [2]) | char array
gti | cultivation conditions: growth time | char array
tem | cultivation conditions: cultivation temperature | char array
air | cultivation conditions: cultivation under aerobic or anaerobic conditions | char array
med | cultivation conditions: cultivation medium | char array
spo | spore formers (Yes or No) | char array
con | sample concentration | char array
trt | sample treatment | char array
ext | extra information | char array
las | laser parameters (power, spot diameter, frequency, etc.) | char array
cal | calibration info | char array
met | measurement method | char array
cus | customer info | char array
tim | date and time of measurement | char array
pth | path to spectrum | char array
cls | class assignment (valid values are 0, 1, 2, 3 and 4) | float32
lst | formatted text containing the peak table info | char array
seq | sequence of pre-processing steps | char array
smo | the number of smoothing points (Savitzky-Golay smoothing) | char array
bas | number of intervals used for baseline correction | float32
nrm | normalization parameter (Yes:1, No:0) | float32
clb | calibration parameters (not used) | float32
red | data reduction factor (spectral binning) | char array
cut | cut in the spectral domain, m/z range | char array
tmp | temporary info (not always present) | char array
mod | original data modified by cutting the spectra or reducing the resolution (Yes:1, No:0) | float32
lms | MALDI-ToF MS, or LC-MS¹ data? (0: MALDI, 1: LC-MS¹) | float32
pik | peak table, an array of the dimension [4 x npeaks] or [6 x npeaks], where npeaks denotes the number of peaks | float32
ccl | calibration information (see below) | struct array
avr | average spectrum (Yes:1, No:0) | struct array
dbs | database spectrum (Yes:1, No:0) | struct array
prm | parameters of peak detection | char array
qt | quality test parameter | struct array
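
To illustrate how the fields listed above are accessed, the following sketch builds a small synthetic struct array with a few of these field names and prints one summary line per spectrum. All values (names, taxa, random data) are hypothetical placeholders. The data type float32 corresponds to the Matlab class single.

```matlab
% Synthetic two-element struct array using field names from the table above.
% All values are placeholders, not real measurement data.
spec = struct( ...
    'nam', {'demo-1', 'demo-2'}, ...
    'gen', {'Escherichia', 'Bacillus'}, ...
    'spe', {'coli', 'subtilis'}, ...
    'str', {'K12', '168'}, ...
    'cls', {single(0), single(2)}, ...
    'org', {single(rand(2, 500)), single(rand(2, 800))});

npts   = zeros(1, numel(spec));
labels = cell(1, numel(spec));
for i = 1:numel(spec)
    % n in the [2 x n] array is the number of data points
    npts(i) = size(spec(i).org, 2);
    % combine genus, species and strain into one label
    labels{i} = strtrim(sprintf('%s %s %s', ...
        spec(i).gen, spec(i).spe, spec(i).str));
    fprintf('%-8s %-28s class %d, %d points\n', ...
        spec(i).nam, labels{i}, spec(i).cls, npts(i));
end

% Select all spectra assigned to class 2.
inClass2 = spec([spec.cls] == 2);
```

The expression [spec.cls] collects the scalar class assignments of all spectra into one vector, so logical indexing can be used to pick subsets of the struct array.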

Peak table format - spec.pik


Field | Description
spec.pik(1,:) | m/z positions of the peaks in the peak table
spec.pik(2,:) | absolute intensities of these peaks
spec.pik(3,:) | weighting factors (the sum of these factors equals 100)
spec.pik(4,:) | single spectra (i.e. neither database nor average spectra): baseline-corrected absolute intensities of the peaks; average or database spectra: the relative peak frequency
spec.pik(5,:) | full width at half height (FWHH) of the given peak (requires QT)
spec.pik(6,:) | resolving power of the given peak (requires QT)
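
The row layout described above can be used as follows. This is a minimal sketch with a synthetic [4 x npeaks] peak table; the m/z values and intensities are invented, and deriving the weighting factors from the intensities is an assumption made only so that they sum to 100 as stated in the table.

```matlab
% Synthetic [4 x npeaks] peak table (hypothetical values).
mz  = single([3125.4 4365.2 5096.8 6255.1 9742.0]);   % row 1: m/z positions
int = single([ 820   1540    310   2210    960 ]);    % row 2: absolute intensities
wgt = 100 * int / sum(int);   % row 3: weights (illustrative choice, sums to 100)
pik = [mz; int; wgt; int];    % row 4: placeholder, reuses the intensities

% The weighting factors are expected to sum to 100.
assert(abs(sum(pik(3,:)) - 100) < 1e-3);

% Pick the three peaks with the largest weighting factors.
[~, order] = sort(pik(3,:), 'descend');
top   = order(1:3);
topMz = pik(1, top);
fprintf('m/z of top peak: %.1f\n', topMz);

% Rows 5 and 6 only exist if a quality test (QT) has been performed.
hasQT = size(pik, 1) >= 6;
if hasQT
    fwhh  = pik(5,:);   % full width at half height
    respw = pik(6,:);   % resolving power
end
```

Checking size(pik,1) before reading rows 5 and 6 avoids an indexing error for peak tables of the dimension [4 x npeaks].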
 


Calibration information - spec.ccl


Figure: Screenshot showing the content of the structure array spec.ccl containing the calibration info (calibration constants, delay time, number of spectrum data points, etc.).

Field | Description | Type
cl1 | calibration constant 1 | float32
cl2 | calibration constant 2 | float32
cl3 | calibration constant 3 | float32
del | delay time [ns] | float32
npt | number of data points | float32
res | time resolution [ns] | float32
ncl | calibration info required to store the spectrum in a Bruker-specific data format | char array
ncr | calibration info required to store the spectrum in a Bruker-specific data format | char array
bid | hardware id of the spectrum ('Bruker ID') | char array
mid | MicrobeMS id of the spectrum | char array
org | manufacturer info | char array
tfu | 'ToF user' | char array
spm | not used | char array
stp | type of measurement (should be 'TOF') | char array
acq | further acquisition info | char array


Database spectra - spec.dbs


A database spectrum is usually created from many (>3) individual mass spectra. As in regular experimental spectra, the spectral data and metadata of database spectra are stored in specific fields of the structure array spec. In database spectra, the field spec(i).dbs stores relevant data from the experimental source spectra from which the given database spectrum has been derived. These fields are left empty in experimental and average spectra. Details of the structure of spec.dbs are given in the table below.

Figure: Screenshot of the structure array spec.dbs. This screenshot shows information such as the spectrum id, taxonomic information, peak tables, respective peak detection parameters, etc. of mass spectrum #1 [spec(1).dbs(1,1)] that was used with others to obtain a database spectrum.

Field | Description | Type
mem | specifies whether the current spectrum is a database spectrum (1) or not (0) | char array
ids | id of the individual mass spectrum that contributed to the given database spectrum | char array
tax | contains taxonomical information (i.e. the genus, species, strain information) | char array
pik | peak table of the given source spectrum | float32
prm | parameters used for peak detection | char array


Average spectra - spec.avr


An average spectrum is usually created from many (>3) individual mass spectra. As in regular experimental spectra, the spectral data and metadata of average spectra are stored in specific fields of the structure array spec. In average spectra, the field spec(i).avr stores relevant data from the experimental source spectra from which the given average spectrum has been derived. These fields are empty in experimental and database spectra. Details of the structure of spec.avr are given in the table below.

Figure: Screenshot of the structure array spec.avr. This screenshot shows information such as the spectrum id, taxonomic information, peak tables, respective peak detection parameters, etc. of mass spectrum #1 [spec(1).avr(1,1)] that was used with others to obtain an average spectrum.

Field | Description | Type
mem | specifies whether the contributing spectrum is an average spectrum (1) or not (0) | char array
ids | id of the individual mass spectrum that contributed to the given average spectrum | char array
tax | contains taxonomical information (i.e. the genus, species, strain information) | char array
pik | peak table of the given source spectrum | float32
prm | parameters used for peak detection | char array
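
The fields spec(i).dbs and spec(i).avr can be used to find out from which source spectra a database or average spectrum has been derived. The following sketch uses a synthetic average spectrum with three hypothetical source spectra. It assumes that "empty" means that spec(i).dbs or spec(i).avr itself is an empty array; if a real multifile instead stores a struct with empty sub-fields, the test would have to be adapted.

```matlab
% Synthetic average spectrum with three source spectra (hypothetical ids,
% taxa and random peak tables).
src = struct( ...
    'ids', {'s-001', 's-002', 's-003'}, ...
    'tax', {'Bacillus subtilis 168', 'Bacillus subtilis 168', ...
            'Bacillus subtilis 168'}, ...
    'pik', {single(rand(4, 12)), single(rand(4, 9)), single(rand(4, 15))});
spec = struct('nam', 'demo-average', 'avr', src, 'dbs', []);

i = 1;
% Assumption: an unused dbs or avr field is an empty array.
if ~isempty(spec(i).dbs)
    kind = 'database';      members = spec(i).dbs;
elseif ~isempty(spec(i).avr)
    kind = 'average';       members = spec(i).avr;
else
    kind = 'experimental';  members = [];
end

% Number of peaks in the peak table of each source spectrum.
npeaks = arrayfun(@(m) size(m.pik, 2), members);

fprintf('%s: %s spectrum with %d source spectra\n', ...
    spec(i).nam, kind, numel(members));
for k = 1:numel(members)
    fprintf('  %-6s %-24s %d peaks\n', ...
        members(k).ids, members(k).tax, npeaks(k));
end
```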


Quality test results - spec.qt


The structure array spec.qt contains the results of a quality test (QT). Fields of this structure are empty if no QT has been performed. Details of the structure of spec.qt are given in the table below.

Figure: Screenshot of the structure array spec.qt that contains the results of a quality test (QT).

Field | Description | Type
noise | QT data of the noise test, contains fields abs, rnk, and obj | struct array
basln | QT data of the baseline test, contains fields abs, rnk, and obj | struct array
npiks | QT data of the number-of-peaks test, contains fields abs, rnk, and obj | struct array
respw | QT data of the resolving-power test, contains fields abs, rnk, and obj | struct array
rnk | overall rank that the given spectrum has achieved in a QT with a number of other spectra | float32
res | overall quality test score | float32
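
As an illustration of how the overall QT results might be used, the following sketch orders spectra by their overall rank and skips spectra for which no QT has been performed. The spectra, ranks and scores are invented, only the top-level fields rnk and res are filled, and sorting in ascending order assumes that rank 1 denotes the first place in the test.

```matlab
% Synthetic spectra with overall QT results (hypothetical values).
% demo-3 has empty QT fields, i.e. no quality test has been performed.
spec = struct( ...
    'nam', {'demo-1', 'demo-2', 'demo-3', 'demo-4'}, ...
    'qt',  {struct('rnk', single(2), 'res', single(71.5)), ...
            struct('rnk', single(1), 'res', single(88.0)), ...
            struct('rnk', [],        'res', []), ...
            struct('rnk', single(3), 'res', single(64.2))});

% Keep only spectra with a non-empty overall rank.
hasQT  = arrayfun(@(s) ~isempty(s.qt) && ~isempty(s.qt.rnk), spec);
tested = spec(hasQT);

% Sort by overall rank in ascending order (assumption: rank 1 comes first).
ranks = arrayfun(@(s) double(s.qt.rnk), tested);
[~, order] = sort(ranks);
ranked = tested(order);
rankedNames = {ranked.nam};

for k = 1:numel(ranked)
    fprintf('rank %d: %-8s score %.1f\n', ...
        ranked(k).qt.rnk, ranked(k).nam, ranked(k).qt.res);
end
```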