Data Format of Peak List Files

Latest revision as of 10:32, 11 April 2025

Peak list files combine multiple peak lists in a single file. They are stored in a Matlab™-specific data format and contain peak data together with the respective metadata. A peak list file can be loaded by entering the following command at the Matlab command prompt:

>> load('pkffname','-mat');

where pkffname is a placeholder for the name of the peak list multifile. For example, the command load('RKI-ring-trial-test-data.pkf','-mat') opens the file RKI-ring-trial-test-data.pkf, a MALDI-ToF mass peak list multifile containing 24 individual peak tables obtained from experimental MALDI-ToF mass spectra. These spectra were recorded within the so-called RKI ring trial study. The file RKI-ring-trial-test-data.pkf can be downloaded here.
If loading was successful, a new Matlab variable C (struct array) is available in the workspace. The structure of C is described next.
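
The loading step can also be written defensively. The following is a minimal sketch, assuming the example file from above sits in the current folder; it loads the file into a temporary struct S (a name chosen here for illustration) so that an existing workspace variable C is not overwritten unnoticed.

```matlab
% File name taken from the example above; adjust the path as needed.
pkffname = 'RKI-ring-trial-test-data.pkf';

if exist(pkffname, 'file') == 2
    % Load into a struct instead of directly into the workspace.
    S = load(pkffname, '-mat');
    if isfield(S, 'C')
        C = S.C;
        fprintf('Loaded %d peak lists from %s\n', numel(C), pkffname);
    else
        warning('File %s does not contain a variable named C.', pkffname);
    end
else
    warning('File %s not found.', pkffname);
end
```

If the file is missing or does not contain C, the sketch only issues a warning and leaves the workspace unchanged.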

Fields of the structure array - C

[Figure: Screenshot showing the contents of the structure array C that is stored in so-called spectrum multi files (*.muf). Fields of C contain spectral data (original, i.e. unmodified, and pre-processed), spectrum metadata, peak lists, calibration information, results of quality tests, and information collected during the creation of average or database spectra. The example screenshot shows the contents of a database spectrum.]

Fields	Description	Data type
nam	spectrum id	char array
gen	genus information	char array
spe	species info	char array
str	strain info	char array
typ	type	char array
uid	taxonomy identification number for species as used by the NCBI (see [1]), can be modified	char array
uie	unmodified taxonomy identification number for strains used by the NCBI (see [2])	char array
gti	cultivation conditions: growth time	char array
tem	cultivation conditions: cultivation temperature	char array
air	cultivation conditions: cultivation under aerobic or anaerobic conditions	char array
med	cultivation conditions: cultivation medium	char array
spo	spore formers (Yes or No)	char array
con	sample concentration	char array
trt	sample treatment	char array
ext	extra information	char array
las	laser parameters (power, spot diameter, frequency, etc.)	char array
cal	calibration info	char array
met	measurement method	char array
cus	customer info	char array
tim	date and time of measurement	char array
pth	path to spectrum	char array
cls	class assignment (valid values are 0, 1, 2, 3, and 4)	float32
lst	formatted text containing the peak table info	char array
seq	sequence of pre-processing steps	char array
smo	the number of smoothing points (Savitzky-Golay smoothing)	char array
bas	number of intervals used for baseline correction	float32
nrm	normalization parameter (Yes:1, No:0)	float32
clb	calibration parameters (not used)	float32
red	data reduction factor (spectral binning)	char array
cut	cut in the spectral domain, m/z range	char array
tmp	temporary info (not always present)	char array
mod	original data modified by cut spectra or reduce resolution (Yes:1, No:0)	float32
lms	MALDI-ToF MS, or LC-MS¹ data? (0: MALDI, 1: LC-MS¹)	float32
pik	peak table, an array of dimension [4 x npeaks] or [6 x npeaks], where npeaks denotes the number of peaks	float32
ccl	calibration information	struct array
avr	average spectrum	struct array
dbs	database spectrum	struct array
prm	parameters of peak detection	char array
qt	quality test parameter	struct array
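
As an illustration of how these fields are addressed, the following minimal sketch builds a small mock struct array with a few of the fields listed above and prints one summary line per entry. All values are invented for illustration; in practice C would come from a loaded peak list file.

```matlab
% Mock data standing in for a loaded peak list file (values are invented).
C = struct('nam', {'spec_001', 'spec_002'}, ...
           'gen', {'Bacillus', 'Escherichia'}, ...
           'spe', {'subtilis', 'coli'}, ...
           'cls', {single(1), single(2)});

% Build one summary line per entry: id, genus, species, class assignment.
labels = cell(1, numel(C));
for i = 1:numel(C)
    labels{i} = sprintf('%s: %s %s (class %d)', ...
                        C(i).nam, C(i).gen, C(i).spe, C(i).cls);
end
disp(char(labels));
```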

Peak table format - C.pik

Fields	Description
C.pik(1,:)	m/z positions of the peaks in the peak table
C.pik(2,:)	absolute intensities of these peaks
C.pik(3,:)	weighting factors (the sum of these factors equals 100)
C.pik(4,:)	for single spectra (i.e. neither database nor average spectra): baseline-corrected absolute intensities of the peaks; for average or database spectra: the relative peak frequency
C.pik(5,:)	full width at half height (FWHH) of the given peak (not always present, requires a quality test, QT)
C.pik(6,:)	resolving power of the given peak (not always present, requires a quality test, QT)
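
The row layout described above can be sketched as follows. The peak table used here is a mock [4 x npeaks] array with invented values; the variable names mz, inten, weights, and hasQT are chosen for illustration only.

```matlab
% Mock [4 x npeaks] peak table (values are invented for illustration).
pik = single([3000.5 4500.2 6200.8;   % row 1: m/z positions
              1200   800    2000;     % row 2: absolute intensities
              30     20     50;       % row 3: weighting factors (sum = 100)
              1100   750    1900]);   % row 4: baseline-corrected intensities

mz      = pik(1, :);
inten   = pik(2, :);
weights = pik(3, :);

% Rows 5 and 6 (FWHH, resolving power) are only present after a quality test.
hasQT = size(pik, 1) >= 6;

% Index of the most intense peak.
[~, iMax] = max(inten);
fprintf('%d peaks, strongest at m/z %.1f, QT rows present: %d\n', ...
        numel(mz), mz(iMax), hasQT);
```

Checking the number of rows before reading rows 5 and 6 avoids an indexing error on peak tables for which no quality test has been run.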

Calibration information - C.ccl

[Figure: Screenshot showing the contents of the structure array C.ccl containing the calibration info (calibration constants, delay time, number of spectrum data points, etc.)]

Fields	Description	Type
cl1	calibration constant 1	float32
cl2	calibration constant 2	float32
cl3	calibration constant 3	float32
del	delay time [ns]	float32
npt	number of data points	float32
res	time resolution [ns]	float32
ncl	calibration info required to store the spectrum in a Bruker-specific data format	char array
ncr	calibration info required to store the spectrum in a Bruker-specific data format	char array
bid	hardware id of the spectrum ('Bruker ID')	char array
mid	MicrobeMS id of the spectrum	char array
org	manufacturer info	char array
tfu	'ToF user'	char array
spm	not used	char array
stp	type of measurement (should be 'TOF')	char array
acq	further acquisition info	char array

Database spectra - C.dbs

A database spectrum is usually created from many (>3) individual mass spectra. As in regular experimental spectra, spectral data and metadata of database spectra are stored in specific fields of the structure array C. In database spectra, the field C(i).dbs stores relevant data from the experimental source spectra from which the given database spectrum has been derived. These fields are left empty in experimental and average spectra. Details of the structure of C.dbs are given in the table below.

[Figure: Screenshot of structure array C.dbs. This screenshot shows information like the spectrum id, taxonomic information, peak tables, respective peak detection parameters, etc. of mass spectrum #1 [C(1).dbs(1,1)] that was used with others to obtain a database spectrum.]

Fields	Description	Type
mem	specifies whether the current spectrum is a database spectrum (1) or not (0)	char array
ids	id of the individual mass spectrum that contributed to the given database spectrum	char array
tax	contains taxonomical information (i.e. the genus, species, strain information)	char array
pik	peak table of the given source spectrum	float32
prm	parameters used for peak detection	char array
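
Iterating over the source spectra of a database spectrum could look like the sketch below. It assumes that C(i).dbs is a struct array with one element per source spectrum, as suggested by the index C(1).dbs(1,1) in the screenshot caption; the mock entry and all of its values are invented.

```matlab
% Mock database spectrum entry with two source spectra (values invented).
src = struct('mem', {'1', '1'}, ...
             'ids', {'spec_001', 'spec_002'}, ...
             'tax', {'Bacillus subtilis', 'Bacillus subtilis'}, ...
             'pik', {zeros(4, 12, 'single'), zeros(4, 9, 'single')}, ...
             'prm', {'', ''});
C = struct('nam', 'db_entry_01', 'dbs', src);

% Number of source spectra and number of peaks in each source peak table.
nSrc   = numel(C(1).dbs);
nPeaks = arrayfun(@(s) size(s.pik, 2), C(1).dbs);

for k = 1:nSrc
    fprintf('%s (%s): %d peaks\n', ...
            C(1).dbs(k).ids, C(1).dbs(k).tax, nPeaks(k));
end
```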

Average spectra - C.avr

An average spectrum is usually created from many (>3) individual mass spectra. As in regular experimental spectra, spectral data and metadata of average spectra are stored in specific fields of the structure array C. In average spectra, the field C(i).avr stores relevant data from the experimental source spectra from which the given average spectrum has been derived. These fields are empty in experimental and database spectra. Details of the structure of C.avr are given in the table below.

[Figure: Screenshot of structure array C.avr. This screenshot shows information like the spectrum id, taxonomic information, peak tables, respective peak detection parameters, etc. of mass spectrum #1 [C(1).avr(1,1)] that was used with others to obtain an average spectrum.]

Fields	Description	Type
mem	specifies whether the contributing spectrum is an average spectrum (1) or not (0)	char array
ids	id of the individual mass spectrum that contributed to the given average spectrum	char array
tax	contains taxonomical information (i.e. the genus, species, strain information)	char array
pik	peak table of the given source spectrum	float32
prm	parameters used for peak detection	char array
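
Because C.avr and C.dbs are only filled for average and database spectra, respectively, they can be used to tell the three kinds of entries apart. The sketch below is an assumption-laden illustration: it treats a field as "empty" if it is either [] or a struct whose first ids field is empty, since the exact representation of empty fields is not specified here. The helper hasSources and the mock data are hypothetical.

```matlab
% Minimal mock source list and three mock entries (values invented).
src = struct('ids', {'spec_001', 'spec_002'});
C = struct('nam', {'exp_01', 'avg_01', 'db_01'}, ...
           'avr', {[], src, []}, ...
           'dbs', {[], [], src});

% Hypothetical helper: true if s holds at least one source spectrum id.
hasSources = @(s) isstruct(s) && ~isempty(s) && ~isempty(s(1).ids);

kind = cell(1, numel(C));
for i = 1:numel(C)
    if hasSources(C(i).dbs)
        kind{i} = 'database';
    elseif hasSources(C(i).avr)
        kind{i} = 'average';
    else
        kind{i} = 'experimental';
    end
    fprintf('%s: %s spectrum\n', C(i).nam, kind{i});
end
```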

Quality test results - C.qt

The structure array C.qt contains the results of a quality test (QT). Fields of this structure are empty if no QT has been performed. Details of the structure of C.qt are given in the table below.

[Figure: Screenshot of structure array C.qt that contains the results of a quality test (QT).]

Fields	Description	Type
noise	QT data of the noise test, contains fields abs, rnk, and obj	struct array
basln	QT data of the baseline test, contains fields abs, rnk, and obj	struct array
npiks	QT data of the number-of-peaks test, contains fields abs, rnk, and obj	struct array
respw	QT data of the resolving-power test, contains fields abs, rnk, and obj	struct array
rnk	overall rank that the given spectrum has achieved in a QT with a number of other spectra	float32
res	overall quality test score	float32
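
Reading the QT results might be sketched as follows. The mock structure mirrors the field names in the table above; all numbers are invented, and the check on an empty rnk field as an indicator that no QT was performed is an assumption based on the statement that fields are empty in that case.

```matlab
% Mock QT result (all values invented for illustration).
qt = struct('noise', struct('abs', 0.8, 'rnk', 2, 'obj', []), ...
            'basln', struct('abs', 0.9, 'rnk', 1, 'obj', []), ...
            'npiks', struct('abs', 45,  'rnk', 4, 'obj', []), ...
            'respw', struct('abs', 610, 'rnk', 3, 'obj', []), ...
            'rnk', single(3), 'res', single(0.72));

tests = {'noise', 'basln', 'npiks', 'respw'};

if ~isempty(qt.rnk)
    % Collect the rank of each individual test via dynamic field names.
    ranks = zeros(1, numel(tests));
    for k = 1:numel(tests)
        ranks(k) = qt.(tests{k}).rnk;
    end
    fprintf('Overall rank %d, score %.2f\n', qt.rnk, qt.res);
else
    disp('No quality test has been performed for this spectrum.');
end
```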
