MALDI Quality Tests

From MicrobeMS Wiki

Introduction

In MicrobeMS, spectral quality tests can be used to systematically evaluate the quality of bacterial MALDI-ToF mass spectra against predefined quality criteria and, where necessary, to exclude spectra from further analysis. For this purpose, MicrobeMS automatically runs four different quality tests (QT) on the selected MALDI-ToF mass spectra. The following quality criteria of a mass spectrum are assessed: (a) mass spectral noise, (b) baseline shape, (c) number of peaks per spectrum, and (d) mean resolving power of the identified mass peaks. An individual score is determined for each of the four criteria; each score ranges from 0 (poor quality, test failed) to 100 (excellent quality). From these scores a weighted overall QT score is calculated. All QT parameters are adjustable through the software. The results of quality testing are presented in an HTML document in which spectral quality is encoded by a traffic light scheme. QT results can also be stored in a Matlab data format, which is intended to support statistical analysis of spectral quality parameters.

How exactly is the total quality score calculated? For this purpose, the individual QT scores are weighted and averaged using the weighting factors that can be set via the software. Note that the sum of weightings should always equal 100% (or 1), otherwise an error message will be provided.
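
The weighted averaging described above can be sketched in a few lines of Python. This is only an illustration, not the MicrobeMS implementation: the function name, the dictionary keys and the example scores are assumptions, while the weighting factors are the default values named in this article.

```python
def overall_qt_score(scores, weights, tol=1e-9):
    """Weighted average of the individual QT scores (each between 0 and 100)."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same tests")
    total_weight = sum(weights.values())
    # The weightings must sum to 1 (100%); otherwise report an error.
    if abs(total_weight - 1.0) > tol:
        raise ValueError(f"weights must sum to 1, got {total_weight}")
    return sum(scores[name] * weights[name] for name in scores)


# Default weighting factors as given in this article.
default_weights = {"noise": 0.25, "baseline": 0.20, "peaks": 0.30, "resolving_power": 0.25}
# Hypothetical individual scores of one spectrum (illustration only).
example_scores = {"noise": 80.0, "baseline": 100.0, "peaks": 44.4, "resolving_power": 60.0}

print(round(overall_qt_score(example_scores, default_weights), 2))  # prints 68.32
```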
 

QT threshold 1 and 2 - what do these specifications mean? These two values only define the color coding in the resulting HTML results file. Test results with a score greater than QT threshold 2 are shown in green, and scores lower than QT threshold 1 are shown in red. All other QT results are shown in yellow, in line with the traffic light scheme. Note that this color coding is quite strict: even spectra with low (red) QT scores were sometimes found to be quite suitable for identification.
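
The color assignment can be written down as a small Python function. This is a sketch of the rule as stated above; the function name and the threshold values 40 and 70 are hypothetical and are not MicrobeMS defaults.

```python
def qt_color(score, threshold_1, threshold_2):
    """Map a QT score to a traffic light color."""
    if score > threshold_2:
        return "green"
    if score < threshold_1:
        return "red"
    return "yellow"  # everything from threshold 1 up to and including threshold 2


# Hypothetical thresholds, chosen only for illustration.
for score in (25.0, 55.0, 90.0):
    print(score, qt_color(score, threshold_1=40.0, threshold_2=70.0))
```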
 

What does the parameter number of bins mean? Each QT produces a specific parameter as its result. For example, the number of peaks test returns the number of peaks found in a spectrum. This value then has to be converted into a score, with the highest score being assigned to spectra with 80 or more peaks (default setting, see screenshot below). If fewer than 5 peaks are found, a score of 0 is given. For values between 5 and 80, the range [5 80] is divided into intervals, and the score is assigned according to the interval into which the value falls. The number of bins specifies how finely the range is divided. For example, if number of bins is set to 10, 9 limit values are determined, which together define 10 intervals. The limits are spaced logarithmically; for the number of peaks test the following limit values are calculated automatically: [5 7 10 14 20 28 40 57 80]. To stay with this example, a spectrum with only 4 peaks receives a score of 0, a spectrum with 15 peaks a score of 44.4, and a spectrum with 70 peaks a score of 88.9.
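
The binning can be reproduced with a short Python sketch. It assumes that the limits are logarithmically spaced and rounded to integers, and that the 10 intervals are mapped to equally spaced scores from 0 to 100; this mapping is inferred from the worked example (15 peaks give 44.4) and is not taken from the MicrobeMS source. The function names are hypothetical. Under these assumptions, 70 peaks fall into the ninth interval, which gives 8/9 × 100 ≈ 88.9.

```python
from bisect import bisect_right


def log_limits(lower, upper, n_bins):
    """Return n_bins - 1 logarithmically spaced, rounded limit values."""
    n_limits = n_bins - 1
    ratio = upper / lower
    return [round(lower * ratio ** (k / (n_limits - 1))) for k in range(n_limits)]


def binned_score(value, limits):
    """Score by interval: 0 below the first limit, 100 at or above the last one."""
    index = bisect_right(limits, value)  # number of limits that are <= value
    return 100.0 * index / len(limits)


limits = log_limits(5, 80, n_bins=10)
print(limits)  # [5, 7, 10, 14, 20, 28, 40, 57, 80]
for n_peaks in (4, 15, 70, 80):
    print(n_peaks, round(binned_score(n_peaks, limits), 1))
```

For tests in which a lower value is better (for example noise), the same idea would apply with the score order reversed.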

Noise quality test

This first QT involves the following preprocessing steps:

  • Initial baseline correction by an asymmetric least squares (AsLS) algorithm
  • Cutting the spectra between two m/z values, usually between m/z 2000 and 13000
  • Normalization (2-norm)
  • Offset correction using the high m/z region to determine the spectrum offset

Spectrum noise is then determined as the standard deviation of the pre-processed spectra, with the top 20% of the intensity data, usually the peak data, removed.
Noise values between 20 and 80 are assigned to scores of 100 (great) and 0 (very poor), respectively, while the noise weighting factor is set to 25% (default settings, see screenshot below).
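
The noise estimate can be sketched as follows in Python. The sketch covers only the 2-norm normalization and the trimmed standard deviation; baseline correction, cutting and offset correction are omitted. The function names are hypothetical, the use of the sample standard deviation is an assumption, and any scaling that MicrobeMS applies to arrive at noise values in the range of 20 to 80 is not reproduced here.

```python
import math
import statistics


def normalize_2norm(intensities):
    """Scale the intensities so that their Euclidean (2-)norm equals 1."""
    norm = math.sqrt(sum(v * v for v in intensities))
    return [v / norm for v in intensities]


def trimmed_noise(intensities, top_fraction=0.2):
    """Standard deviation after removing the top 20% of the intensity values."""
    n_drop = int(len(intensities) * top_fraction)
    kept = sorted(intensities)[: len(intensities) - n_drop]
    return statistics.stdev(kept)  # sample standard deviation (assumption)


# Hypothetical toy spectrum: low-level noise plus two strong peaks.
spectrum = [1, 2, 3, 4, 1, 2, 3, 4, 100, 200]
print(trimmed_noise(normalize_2norm(spectrum)))
```

Removing the highest intensities before taking the standard deviation keeps the strong peaks from dominating the noise estimate.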

Screenshot of the spectral quality test user dialog box
    P. H. Eilers and H. F. M. Boelens. Baseline Correction with Asymmetric Least Squares Smoothing. Leiden University Medical Centre Report 1(1), 2005, p. 5.

Baseline quality test

Baseline deviation is calculated as the integral under the baseline curve. Prior to this, the spectrum is preprocessed, i.e. normalized (2-norm) and offset corrected. This procedure takes the peak intensities into account when normalizing the mass spectra.
Baseline deviation values between 150 and 1500 are assigned to baseline quality test scores of 100 and 0, respectively, while the baseline weighting factor is set to 20% (default settings). Since the baselines can usually be well corrected by means of appropriate corrections (AsLS), the weight of the baseline scores could also be further reduced. Note that the sum of weightings should always equal 100% (or 1), otherwise an error message will be given.
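
The integral under the baseline curve can be approximated numerically. The following Python sketch uses the trapezoidal rule, which is an assumption; the integration scheme used by MicrobeMS is not described here. The function name and the example values are hypothetical.

```python
def baseline_integral(mz, baseline):
    """Area under the baseline curve, approximated with the trapezoidal rule."""
    area = 0.0
    for i in range(1, len(mz)):
        area += 0.5 * (baseline[i] + baseline[i - 1]) * (mz[i] - mz[i - 1])
    return area


# Hypothetical baseline that falls linearly from 1.0 to 0.0.
mz_axis = [2000.0, 3000.0, 4000.0]
baseline_curve = [1.0, 0.5, 0.0]
print(baseline_integral(mz_axis, baseline_curve))  # prints 1000.0
```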

Test 'number of peaks'

The third quality test, 'number of peaks', starts with the following processing steps:

  • Initial baseline correction by an asymmetric least squares (AsLS) algorithm
  • Obtain a function to use as a so-called peak threshold function

In this QT, spectral intensity values that exceed the threshold function at specific m/z positions define peaks. Spectral noise and the AsLS baseline curve are included in the calculation of this threshold function. In addition, the threshold function has a sigmoidal shape to model the higher sensitivity of the MALDI-ToF MS technique at lower m/z values and the correspondingly lower spectrum intensities (peaks, noise) in the higher m/z range. It is important to note that the parameter number of peaks is considered a relative value, useful mainly for comparing spectra. The number of peaks given by the QT does not necessarily correspond to the actual number of peaks that can be extracted from the given spectrum. This parameter depends on many other parameters, such as the distance of the threshold function from the spectral curve (defined by the noise parameter).
Peak numbers of 5 and 80 are assigned to peak number score values of 0 and 100, respectively, while the corresponding weighting factor is 30% by default. Since the peak number score is considered the most important parameter for calculating the overall QT score, the weight of this specific test could be increased beyond this value. Again, the sum of weightings should always equal 100%, or 1.
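
The idea of counting peaks against an m/z-dependent threshold can be illustrated with the Python sketch below. It is not the MicrobeMS algorithm: the exact form of the threshold function and the way noise and baseline enter it are not given in this article, so the sigmoid and all of its parameters (levels, midpoint, width) are hypothetical, as are the function names and the toy data.

```python
import math


def sigmoid_threshold(mz, low_mz_level, high_mz_level, midpoint, width):
    """Threshold that falls smoothly from low_mz_level to high_mz_level."""
    step = 1.0 / (1.0 + math.exp((mz - midpoint) / width))
    return high_mz_level + (low_mz_level - high_mz_level) * step


def count_peaks(mz, intensities, threshold):
    """Count local maxima whose intensity exceeds threshold(m/z)."""
    count = 0
    for i in range(1, len(intensities) - 1):
        is_local_max = intensities[i - 1] < intensities[i] >= intensities[i + 1]
        if is_local_max and intensities[i] > threshold(mz[i]):
            count += 1
    return count


# Hypothetical toy spectrum with local maxima at m/z 2500, 3500 and 4500.
mz_axis = [2000, 2500, 3000, 3500, 4000, 4500, 5000]
signal = [0, 10, 0, 3, 0, 6, 0]


def example_threshold(mz):
    # Hypothetical parameters: high threshold at low m/z, low threshold at high m/z.
    return sigmoid_threshold(mz, low_mz_level=8.0, high_mz_level=2.0,
                             midpoint=3500.0, width=100.0)


print(count_peaks(mz_axis, signal, example_threshold))
```

Because the count depends directly on where the threshold lies, the result is best read as a relative measure for comparing spectra, as noted above.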

Quality test 'resolving power'

The resolving power parameter is effectively a by-product of the procedure described in the previous section. It is derived from the average of the FWHM (Full Width at Half Maximum) values of the peaks found above the threshold function. A high resolving power, corresponding to low FWHM values, is beneficial for peak detection. Large FWHM values are found in broad peak features, which are often caused by high laser power and may indicate sub-optimal settings during spectral data acquisition.
Resolving power values between 200 and 800 are assigned to test scores of 100 and 0, respectively, while the weighting factor equals 25% (default settings).
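
The FWHM of a single peak can be estimated by linear interpolation at half of the peak height, as in the Python sketch below. This is a generic approach and an assumption; the method used by MicrobeMS to determine the FWHM, and the way the averaged values are converted into the resolving power parameter, are not reproduced here. The function name and the toy data are hypothetical, and a positive peak intensity is assumed.

```python
def fwhm(mz, intensities, peak_index):
    """Full width at half maximum of one peak, or None if a flank is incomplete."""
    half = intensities[peak_index] / 2.0

    def crossing(step):
        # Walk away from the apex while the next point is still above half height.
        i = peak_index
        while 0 <= i + step < len(mz) and intensities[i + step] > half:
            i += step
        j = i + step
        if not 0 <= j < len(mz):
            return None  # the flank runs off the end of the data
        # Interpolate linearly between point i (above half) and point j (at or below half).
        frac = (intensities[i] - half) / (intensities[i] - intensities[j])
        return mz[i] + frac * (mz[j] - mz[i])

    left, right = crossing(-1), crossing(+1)
    if left is None or right is None:
        return None
    return right - left


# Hypothetical triangular peak with its apex at index 2.
print(fwhm([0.0, 1.0, 2.0, 3.0, 4.0], [0.0, 5.0, 10.0, 5.0, 0.0], peak_index=2))  # prints 2.0
```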

Useful links

  • Example of an HTML formatted quality test report: https://report-quality-29-Oct-2024-16-33-26.html
  • Description of how to modify the default settings for quality testing by modifying the parameter file microbems.opt peak detection
  • Description of the content of a Matlab-formatted QT result file