MALDI Quality Tests

From MicrobeMS Wiki
Revision as of 14:17, 30 December 2024 by Laschp (talk | contribs) (→‎Introduction)
Screenshot of the spectral quality test user dialog box

In MicrobeMS, spectral quality tests are used to systematically evaluate the quality of bacterial MALDI-ToF mass spectra against predefined criteria and, where appropriate, to exclude spectra from further analysis. For this purpose, MicrobeMS automatically runs four quality tests (QT) on selected MALDI-ToF mass spectra. The tests assess the following criteria: (a) mass spectral noise, (b) baseline shape, (c) number of peaks per spectrum, and (d) mean resolving power of the identified mass peaks. Each test yields an individual score between 0 (poor quality, test failed) and 100 (excellent quality). A weighted overall QT score is then calculated from the four individual scores. All QT parameters are adjustable in the MicrobeMS software. The results of quality testing are presented in an HTML document in which spectral quality is encoded by a traffic light scheme. QT results can also be stored in a Matlab data format, which may be helpful for subsequent statistical analysis of spectrum quality parameters.

Introduction

How is the total quality score calculated? To obtain a general QT score, the individual QT scores are multiplied by predefined weighting factors, which can be set in the software, and summed. Note that the weightings must add up to 100% (or 1); otherwise an error message is displayed.
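
The weighting step can be sketched as follows. This is a minimal illustration in Python, not MicrobeMS code: the function name, the dictionary keys and the tolerance are assumptions, while the default weights are the ones quoted in the test descriptions on this page.

```python
# Illustrative sketch only; names and tolerance are assumptions, not MicrobeMS code.
DEFAULT_WEIGHTS = {
    "noise": 0.25,
    "baseline": 0.20,
    "peaks": 0.30,
    "resolving_power": 0.25,
}


def overall_qt_score(scores, weights=DEFAULT_WEIGHTS, tol=1e-9):
    """Return the weighted sum of the individual QT scores (each 0..100)."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same tests")
    if abs(sum(weights.values()) - 1.0) > tol:
        # Mirrors the documented behaviour: weights must add up to 100% (or 1).
        raise ValueError("weightings must sum to 1 (100%)")
    return sum(scores[name] * weights[name] for name in scores)


example_scores = {"noise": 80.0, "baseline": 100.0, "peaks": 44.4, "resolving_power": 60.0}
print(round(overall_qt_score(example_scores), 2))
```

Because the weights add up to 1, the overall score stays within the same 0 to 100 range as the individual scores.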
 

QT threshold 1 and 2 - what do these specifications mean? These two thresholds only define the color coding in the resulting HTML file. Test results with a score greater than QT threshold 2 are shown in green; scores lower than QT threshold 1 are shown in red. All other QT results are shown in yellow, in keeping with the traffic light scheme. The predefined (default) color coding should not be interpreted too strictly. The current standard QT settings are merely suggestions, chosen on the basis of MALDI-ToF MS data acquired at the Robert Koch Institute (RKI) over a long period of time. These tests used MALDI data from a single MS instrument, an autofleX I mass spectrometer. Mass spectra collected by other institutions on other instruments may exhibit different spectrum characteristics. Furthermore, the current QT parameter settings were chosen so that the scores cover the full dynamic range [0 100]. It is therefore quite possible that spectra with low (red) QT scores obtained elsewhere are still suitable for microbial identification.
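
The color rule described above can be written down in a few lines. The sketch below is an illustration in Python; the function name is an assumption, and the threshold values 40 and 70 in the example are hypothetical placeholders, since the default thresholds are not stated here.

```python
# Illustrative sketch only; the example thresholds (40, 70) are hypothetical.
def traffic_light(score, threshold_1, threshold_2):
    """Map a QT score to the color used in the HTML result file."""
    if threshold_1 > threshold_2:
        raise ValueError("QT threshold 1 must not exceed QT threshold 2")
    if score > threshold_2:
        return "green"
    if score < threshold_1:
        return "red"
    # Everything in between, including scores equal to a threshold.
    return "yellow"


for value in (25.0, 55.0, 90.0):
    print(value, traffic_light(value, threshold_1=40.0, threshold_2=70.0))
```

Scores exactly equal to one of the thresholds fall into the yellow band, because the description uses strict comparisons for green and red.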
 

Parameter number of bins. Each QT yields a test-specific parameter value. For example, the number of peaks test yields a numeric value, the number of peaks. Such values then have to be converted into scores. The highest score is assigned to spectra with 80 or more peaks (current default setting, see screenshot below); if fewer than 5 peaks are found, a score of 0 is given. For values between 5 and 80, the range [5 80] is divided into intervals, from which the respective scores are determined. The parameter number of bins specifies how many intervals are used. For example, if number of bins equals 10, 9 bin limits are determined, since 9 limits are needed to define 10 bins. The bin limits are spaced logarithmically. For the number of peaks example, the following bin limits are obtained automatically: [5 7 10 14 20 28 40 57 80]. A QT score of 0 is thus assigned if, for example, only 4 peaks are found in the mass spectrum under investigation. Spectra with 15, 43 and 70 peaks receive QT scores of 44.4, 77.8 and 88.9, respectively, i.e. each bin limit reached raises the score by 100/9 ≈ 11.1.
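
The binning can be reproduced with a short sketch. The Python code below is an illustration under two assumptions that are inferred from the example values rather than taken from the MicrobeMS source: the bin limits are log-spaced values rounded to integers, and the score is the fraction of bin limits that the value has reached, scaled to 100. The function names are hypothetical.

```python
# Illustrative sketch only; the scoring rule is inferred from the example values.
def bin_limits(lower, upper, n_bins):
    """Return n_bins - 1 logarithmically spaced, rounded bin limits."""
    n_limits = n_bins - 1
    if n_limits < 2 or lower <= 0 or upper <= lower:
        raise ValueError("need n_bins >= 3 and 0 < lower < upper")
    ratio = upper / lower
    return [round(lower * ratio ** (i / (n_limits - 1))) for i in range(n_limits)]


def binned_score(value, limits):
    """Score = share of bin limits that 'value' has reached, scaled to 0..100."""
    reached = sum(1 for limit in limits if value >= limit)
    return 100.0 * reached / len(limits)


limits = bin_limits(5, 80, n_bins=10)
print(limits)
for n_peaks in (4, 15, 43):
    print(n_peaks, round(binned_score(n_peaks, limits), 1))
```

Under these assumptions the sketch returns the bin limits listed above and scores of 0, 44.4 and 77.8 for 4, 15 and 43 peaks. For tests in which a low value is good, such as noise, the mapping would have to be inverted.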

Noise quality test

This first QT involves the following preprocessing steps:

  • Initial baseline correction using an asymmetric least squares (AsLS) algorithm
  • Cutting the spectra between two m/z values, usually between m/z 2000 and 13000
  • Normalization (2-norm)
  • Offset correction using the high m/z region to determine the spectrum offset value

Spectrum noise is then determined as the standard deviation of the preprocessed spectrum after the top 20% of the intensity values, which usually contain the peaks, have been removed.
Noise values of 20 and 80 are assigned scores of 100 (excellent) and 0 (very poor), respectively, and the noise weighting factor is set to 25% (default settings, see screenshot below).
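
The final noise estimate can be sketched as follows. This Python illustration covers only the last step (dropping the top 20% of intensities and taking the standard deviation); it assumes the sample standard deviation, does not include the AsLS, cutting, normalization and offset steps, and therefore does not reproduce the numeric scale of the MicrobeMS noise values. The function name is hypothetical.

```python
# Illustrative sketch only; preprocessing and MicrobeMS scaling are not reproduced.
import statistics


def noise_estimate(intensities, top_fraction=0.20):
    """Standard deviation of the intensities after removing the largest values."""
    if not 0.0 <= top_fraction < 1.0:
        raise ValueError("top_fraction must be in [0, 1)")
    ordered = sorted(intensities)
    n_removed = int(round(top_fraction * len(ordered)))
    kept = ordered[: len(ordered) - n_removed]
    if len(kept) < 2:
        raise ValueError("not enough data points left to estimate noise")
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return statistics.stdev(kept)


spectrum = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 100.0, 200.0]
print(noise_estimate(spectrum))
```

In the example, the two large values stand in for peaks; they are discarded before the standard deviation is taken, so they do not inflate the noise estimate.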

Baseline quality test

Baseline deviation is calculated as the integral under the baseline curve. Prior to this, the spectrum is preprocessed, i.e. normalized (2-norm) and offset corrected. The normalization therefore takes the peak intensity values of the MALDI-ToF mass spectrum into account. A baseline curve is then obtained from the preprocessed spectrum using asymmetric least squares (AsLS) baseline correction (see reference below).
Baseline deviation values, i.e. areas under the baseline curve of intensity-normalized and offset-corrected spectra, of 150 and 1500 are assigned baseline quality test scores of 100 and 0, respectively, and the baseline weighting factor is set to 20% (default parameters; all values are adjustable). Since baselines can usually be corrected well by appropriate routines such as AsLS, the weight of the baseline score could be reduced below the default value (0.2). Note that the weightings must add up to 100% (or 1); otherwise an error message is displayed.
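
The area under a baseline curve can be approximated with the trapezoidal rule, as in the Python sketch below. It is an illustration only: the baseline is taken as given (the AsLS step is not implemented), integration over the m/z axis rather than over data point indices is an assumption, and the function name is hypothetical.

```python
# Illustrative sketch only; the baseline itself (AsLS) is assumed to be given.
def baseline_deviation(mz, baseline):
    """Trapezoidal approximation of the area under the baseline curve."""
    if len(mz) != len(baseline) or len(mz) < 2:
        raise ValueError("mz and baseline need the same length (at least 2)")
    area = 0.0
    for i in range(1, len(mz)):
        width = mz[i] - mz[i - 1]
        area += 0.5 * (baseline[i] + baseline[i - 1]) * width
    return area


mz_axis = [2000.0, 3000.0, 4000.0]
baseline_curve = [0.2, 0.1, 0.0]
print(baseline_deviation(mz_axis, baseline_curve))
```

A flat baseline at zero gives an area of zero; an elevated or sloping baseline gives a larger area and hence a lower baseline score.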

 P. H. C. Eilers and H. F. M. Boelens. Baseline Correction with Asymmetric Least Squares Smoothing.
 Leiden University Medical Centre Report 1(1), 2005, p. 5.

Test number of peaks

The third quality test, number of peaks, starts with the following processing steps:

  • Initial baseline correction using an asymmetric least squares (AsLS) algorithm
  • Derivation of a function that serves as the so-called peak threshold intensity function

In this QT, a peak is counted wherever the spectral intensity exceeds the threshold intensity function at the corresponding m/z position. The threshold intensity function is calculated not only from the AsLS baseline curve but also from the spectrum noise. In addition, it is defined as a generalized logistic function that models the higher sensitivity of the MALDI-ToF MS technique at lower m/z values and the correspondingly lower spectrum intensities (peaks, noise) in the higher m/z range. Note that the parameter number of peaks is a relative value, intended mainly for comparing spectra in the context of the QT. It does not necessarily correspond to the actual number of peaks that can be extracted from a given mass spectrum, because it depends on many other parameters, such as the distance of the threshold function from the spectral curve (defined, among others, by the noise parameter).
Peak numbers of 5 and 80 (default settings) are assigned peak number scores of 0 and 100, respectively, and the corresponding weighting factor is set to 30% by default. Since the peak number score is considered the most important parameter for calculating the overall QT score, the weight of this test could be increased beyond this value. Again, the weightings must add up to 100% (or 1).
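
The idea of counting peaks against an m/z-dependent threshold can be sketched as follows. This Python illustration is not the MicrobeMS procedure: it uses a plain decreasing logistic curve as a stand-in for the generalized logistic threshold function, leaves out how baseline and noise enter the threshold, and counts local maxima above the threshold. All function names and parameter values are hypothetical.

```python
# Illustrative sketch only; threshold shape and parameters are hypothetical.
import math


def logistic_threshold(mz, upper, lower, midpoint, steepness):
    """Decreasing logistic curve: near 'upper' at low m/z, near 'lower' at high m/z."""
    return lower + (upper - lower) / (1.0 + math.exp(steepness * (mz - midpoint)))


def count_peaks(intensity, threshold):
    """Count local maxima whose intensity exceeds the threshold at that position."""
    if len(intensity) != len(threshold):
        raise ValueError("intensity and threshold must have the same length")
    count = 0
    for i in range(1, len(intensity) - 1):
        is_local_max = intensity[i - 1] < intensity[i] >= intensity[i + 1]
        if is_local_max and intensity[i] > threshold[i]:
            count += 1
    return count


mz_axis = [2000.0 + 1000.0 * i for i in range(7)]
threshold = [logistic_threshold(m, 4.0, 1.0, 5000.0, 0.001) for m in mz_axis]
intensity = [0.0, 5.0, 0.0, 1.0, 0.0, 3.0, 0.0]
print(count_peaks(intensity, threshold))
```

Because the threshold falls with increasing m/z, a weak signal that would be ignored at low m/z can still be counted as a peak in the high m/z range.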

Quality test resolving power

The resolving power, defined as the m/z position of a peak divided by its full width at half maximum (FWHM), is effectively a by-product of the procedure described in the previous section. The resolving power reported by the QT is the mean of the values obtained for all peaks found above the intensity threshold function. A high resolving power, corresponding to low FWHM values, is thought to be beneficial for peak detection. Large FWHM values result from broad peak features, which often occur when high laser power has been applied, and may indicate sub-optimal settings during acquisition of the mass spectral data.
Mean resolving power values of 200 and 800 are assigned test scores of 0 and 100, respectively, and the weighting factor equals 25% (all standard settings).
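
The definition above translates directly into a short calculation. The Python sketch below is an illustration; it assumes that the m/z position and FWHM of each detected peak are already available, and the function name and the example peak values are hypothetical.

```python
# Illustrative sketch only; peak positions and FWHM values are assumed inputs.
def mean_resolving_power(peaks):
    """Mean of m/z divided by FWHM over a list of (mz, fwhm) pairs."""
    if not peaks:
        raise ValueError("at least one peak is required")
    values = []
    for mz, fwhm in peaks:
        if fwhm <= 0:
            raise ValueError("FWHM must be positive")
        values.append(mz / fwhm)
    return sum(values) / len(values)


detected_peaks = [(4000.0, 8.0), (6000.0, 10.0)]  # resolving power 500 and 600
print(mean_resolving_power(detected_peaks))
```

With the default settings quoted above, a mean resolving power of 550 would lie between the limits of 200 and 800.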

Useful links