MALDI Quality Tests

Screenshot of the spectral quality test user dialog box

In MicrobeMS, spectral quality tests can be used to systematically evaluate the quality of bacterial MALDI-ToF mass spectra against predefined quality criteria and, where appropriate, to exclude spectra from further analysis. For this purpose, MicrobeMS automatically executes four different quality tests (QT) on selected MALDI-ToF mass spectra. The following quality criteria of a mass spectrum are assessed: (a) mass spectral noise, (b) baseline shape, (c) number of peaks per spectrum, and (d) mean resolving power of the identified mass peaks.

Each test yields an individual score that ranges from 0 (poor quality, test failed) to 100 (excellent quality). From these individual scores, a weighted overall QT score is then calculated. All QT parameters are adjustable through the MicrobeMS software. The results of quality testing are presented in an HTML document in which spectral quality is encoded by a traffic light scheme. Furthermore, QT results can be stored in a Matlab data format, which is intended to facilitate subsequent statistical analysis of spectral quality parameters.

Introduction

How is the total quality score calculated? To obtain the overall QT score, the individual QT scores are combined into a weighted average using weighting factors that can be set in the software. Note that the weightings must always sum to 100% (i.e. 1); otherwise an error message is displayed.
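The weighted average can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not MicrobeMS code: the function name, the dictionary keys and the example scores are assumptions, and the weights are the default values quoted in the test descriptions on this page (using 0.15 for the baseline test, so that they sum to 1).

```python
import math

# Default weights as quoted on this page; the key names are illustrative only.
DEFAULT_WEIGHTS = {
    "noise": 0.30,
    "baseline": 0.15,
    "peaks": 0.40,
    "resolving_power": 0.15,
}


def overall_qt_score(scores, weights=DEFAULT_WEIGHTS):
    """Return the weighted average of the individual QT scores (each 0..100)."""
    total = sum(weights.values())
    # Mirror the rule that the weightings must sum to 100% (i.e. 1).
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"weightings must sum to 1, got {total}")
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical individual scores for one spectrum.
example_scores = {"noise": 80.0, "baseline": 100.0, "peaks": 50.0, "resolving_power": 60.0}
print(overall_qt_score(example_scores))
```

With these hypothetical scores the contributions are 24 (noise), 15 (baseline), 20 (peaks) and 9 (resolving power), giving an overall score of about 68.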

QT threshold 1 and 2 - what do these specifications mean? These specifications merely define the color coding of the QT results in the HTML report. Test results with a score greater than QT threshold 2 are shown in green; scores lower than QT threshold 1 are shown in red. All other QT results are shown in yellow, in accordance with a traffic light scheme. The predefined (default) color coding settings should not be interpreted in a strict sense. The current standard QT settings are merely suggestions, chosen on the basis of tests with MALDI-ToF MS data acquired at the Robert Koch Institute (RKI) over a long period of time. These tests analyzed MALDI data from a single MS instrument, an autofleX I mass spectrometer. Mass spectra collected by other institutions using other instruments may exhibit different spectrum characteristics. Furthermore, the current QT parameter settings were chosen to ensure that the scores cover the full dynamic range [0 100]. It is therefore quite possible that spectra acquired elsewhere that receive low (red) QT scores are still suitable for microbial identification analysis.
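The color assignment described above amounts to two comparisons. The following Python sketch illustrates it; the function name and the threshold values 40 and 70 are hypothetical and are not the MicrobeMS defaults.

```python
def traffic_light(score, threshold_1, threshold_2):
    """Map a QT score to a report color according to the traffic light scheme."""
    if score > threshold_2:
        return "green"
    if score < threshold_1:
        return "red"
    # Everything from threshold 1 up to and including threshold 2.
    return "yellow"


# Hypothetical thresholds, chosen only for illustration.
for score in (25.0, 55.0, 85.0):
    print(score, traffic_light(score, threshold_1=40.0, threshold_2=70.0))
```

Note that a score exactly equal to one of the thresholds falls into the yellow range, because green requires a score greater than threshold 2 and red a score lower than threshold 1.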

Parameter number of bins. Each QT produces a test-specific parameter as its result. For example, the number of peaks test yields a numeric value, the number of peaks. Such values then have to be converted into scores. With the current default settings (cf. screenshot), the highest score is assigned to spectra with 70 or more peaks, and a score of 0 is given if fewer than 14 peaks are found. For values in between, the range [14 70] is divided into intervals, on the basis of which the respective scores are determined. The parameter number of bins specifies how many intervals (bins) are used. Since n bins are separated by n-1 limits, 14 bin limit values are determined if number of bins equals 15. The bin limits are spaced on a logarithmic scale. For the number of peaks example, the following bin limits are obtained automatically: [14 16 19 22 25 28 32 36 40 45 50 56 63 70]. This means that a QT score of 0 is assigned if, for example, only 13 peaks are found in the mass spectrum under investigation. A spectrum with 17 peaks receives a score of 14.3, one with 28 peaks a score of 42.9, and one with 69 peaks a score of 92.9.
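The conversion from a test parameter to a score can be illustrated with the bin limits listed above. The Python sketch below is an interpretation that is consistent with the worked example (scores of 0, 14.3 and 42.9 for 13, 17 and 28 peaks), not the MicrobeMS implementation; the function name is an assumption, and the bin limits are copied from the text rather than recomputed, because the exact spacing and rounding rule is not documented here. It covers tests in which a larger value is better; for noise and baseline deviation the direction would have to be reversed.

```python
from bisect import bisect_right

# Bin limits for the number of peaks test, copied from the text above.
PEAK_BIN_LIMITS = [14, 16, 19, 22, 25, 28, 32, 36, 40, 45, 50, 56, 63, 70]


def binned_score(value, limits):
    """Score = share of bin limits reached by the value, scaled to 0..100."""
    reached = bisect_right(limits, value)  # number of limits <= value
    return 100.0 * reached / len(limits)


for peaks in (13, 17, 28, 70):
    print(peaks, round(binned_score(peaks, PEAK_BIN_LIMITS), 1))
```

With 14 limits, each limit reached adds 100/14 (about 7.1) to the score, which is why the scores in the example above are multiples of that step.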

Noise quality test

This first QT involves the following preprocessing steps:

Initial baseline correction by an asymmetric least squares (AsLS) algorithm
Restriction of the spectra to a defined m/z range, usually m/z 2000 to 13000
Normalization (modified 1-norm algorithm)
Offset correction using the high m/z region to determine the spectrum offset value

The spectral noise is then determined by a multi-step algorithm. It starts by removing very high and very low intensity values from the spectral vectors (the top 30% and the bottom 2% of the intensity data). The remaining data are then heavily smoothed by a Savitzky-Golay filter with 99 smoothing points. The smoothed data are subsequently subtracted from the intensity-adjusted spectrum vector. Finally, the standard deviation of the difference spectrum is calculated to obtain the value of the absolute noise.
With the default settings, noise values of 0.2 and 6 correspond to scores of 100 (excellent) and 0 (very poor), respectively, and the noise weighting factor is set to 30% (0.3).
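The noise estimate described above can be sketched in plain Python. This is a simplified illustration under several assumptions, not the MicrobeMS algorithm: the input is taken to be an already preprocessed intensity vector, "removing" values is interpreted as dropping the affected data points, and because the polynomial order of the Savitzky-Golay filter is not stated, the sketch uses a centered moving average, which is what a Savitzky-Golay filter of order 0 or 1 reduces to. Data points without a full 99-point window are skipped. All names are hypothetical.

```python
import random
import statistics


def estimate_noise(intensities, window=99, top_fraction=0.30, bottom_fraction=0.02):
    """Standard deviation of (trimmed spectrum - smoothed trimmed spectrum)."""
    ranked = sorted(intensities)
    n = len(ranked)
    n_bottom = round(bottom_fraction * n)  # points discarded as "very low"
    n_top = round(top_fraction * n)        # points discarded as "very high"
    low = ranked[n_bottom]
    high = ranked[n - n_top - 1]
    # Keep the original order of the remaining data points.
    kept = [y for y in intensities if low <= y <= high]
    if len(kept) < window:
        raise ValueError("too few data points left for the smoothing window")
    half = window // 2
    residuals = []
    # Centered moving average, evaluated only where a full window is available.
    for i in range(half, len(kept) - half):
        smoothed = sum(kept[i - half:i + half + 1]) / window
        residuals.append(kept[i] - smoothed)
    return statistics.pstdev(residuals)


# Synthetic example: a flat signal with Gaussian noise (seeded for repeatability).
rng = random.Random(0)
spectrum = [10.0 + rng.gauss(0.0, 1.0) for _ in range(5000)]
print(estimate_noise(spectrum))
```

Because the highest and lowest intensities are discarded before the standard deviation is taken, the result is smaller than the standard deviation of the raw noise; it is a relative measure for comparing spectra rather than an absolute noise level.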

Baseline quality test

Baseline deviation is calculated as the integral under the baseline curve. Prior to this, the spectrum is preprocessed, i.e. normalized (modified 1-norm algorithm) and offset corrected. This means that the normalization of the MALDI-ToF mass spectra ultimately takes peak intensity values into account. From the preprocessed spectra, a baseline curve is then obtained by asymmetric least squares baseline correction (AsLS, see ref. below).
With the default settings, baseline deviation values (i.e. areas under the baseline curve of intensity-normalized and offset-corrected spectra) of 0.15 and 40 correspond to baseline quality test scores of 100 and 0, respectively, and the baseline weighting factor is set to 15% (0.15). All of these values are adjustable. Since baselines can usually be corrected well by appropriate baseline correction routines such as AsLS, the weight of the baseline score could be reduced below this default value. Note that the sum of all weightings must always equal 100% (i.e. 1); otherwise an error message is displayed.

 P. H. C. Eilers and H. F. M. Boelens. Baseline Correction with Asymmetric Least Squares Smoothing.
 Leiden University Medical Centre Report 1(1), 2005, p. 5.

Test number of peaks

The third quality test, number of peaks, starts with the following processing steps:

Initial baseline correction by an asymmetric least squares (AsLS) algorithm
Determination of a function that serves as the peak threshold intensity function

In this QT, a peak is detected wherever the spectral intensity exceeds the threshold intensity function at the respective m/z position. The calculation of the threshold intensity function involves not only the AsLS baseline curve but also the spectrum noise. In addition, the threshold function is defined as a generalized logistic function that models the higher sensitivity of the MALDI-ToF MS technique at lower m/z values and the correspondingly lower spectrum intensities (peaks, noise) in the higher m/z range. Note that the parameter number of peaks is a relative value, intended mainly for comparing spectra in the context of the QT. It does not necessarily correspond to the actual number of peaks that can be extracted from a given mass spectrum, because it depends on many other parameters, such as the distance of the threshold function from the spectral curve (which is defined, among other things, by the noise parameter).
With the default settings, peak numbers of 14 and 70 correspond to peak number scores of 0 and 100, respectively, and the corresponding weighting factor is set to 40% (0.4). Since the peak number score is considered the most important parameter for calculating the overall QT score, the weight of this test could be increased beyond this value. Again, the sum of the weightings must always equal 100% (i.e. 1).

Quality test resolving power

The parameter resolving power, defined as the m/z position of a peak divided by its full width at half maximum (FWHM), is effectively a by-product of the procedure described in the previous section. The resolving power reported by the QT is the average over selected peaks found above the intensity threshold function; only the 10 most intense peaks are used. A high resolving power, corresponding to low FWHM values, is considered beneficial for peak detection. Large FWHM values result from broad peak features, which often occur when high laser power has been applied, and may indicate sub-optimal settings during acquisition of the mass spectral data.
With the default settings, mean resolving power values of 200 and 1200 correspond to test scores of 0 and 100, respectively, and the weighting factor is 15% (0.15).
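The definition above can be written out as a short Python sketch. It assumes that peak detection has already produced a list of peaks with m/z position, intensity and FWHM; the tuple layout, the function name and the example values are hypothetical and do not reflect MicrobeMS data structures.

```python
def mean_resolving_power(peaks, n_top=10):
    """Mean of m/z divided by FWHM over the n_top most intense peaks.

    peaks: iterable of (mz, intensity, fwhm) tuples.
    """
    strongest = sorted(peaks, key=lambda peak: peak[1], reverse=True)[:n_top]
    if not strongest:
        raise ValueError("no peaks available")
    return sum(mz / fwhm for mz, _, fwhm in strongest) / len(strongest)


# Hypothetical peak list: (m/z, intensity, FWHM).
example_peaks = [
    (4000.0, 10.0, 8.0),   # resolving power 500
    (6000.0, 5.0, 10.0),   # resolving power 600
    (8000.0, 1.0, 80.0),   # resolving power 100, a weak and broad peak
]
print(mean_resolving_power(example_peaks, n_top=2))
```

In this example only the two most intense peaks are averaged, so the weak, broad peak at m/z 8000 does not lower the result.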

Useful links

Example of an HTML-formatted quality test report: https://wiki.microbe-ms.com/uploads/report-quality-15-Dec-2024-11-17-18.html
Description of how to modify the default settings for quality testing by modifying the parameter file microbems.opt
Description of the format of quality test result files (MS Excel format, *.xls)
