Wilcoxon Rank-Sum Tests

From MicrobeMS Wiki

Introduction

Wilcoxon rank-sum test (Wikipedia)

To be continued (2026)

Parameters of Wilcoxon rank-sum tests

  • m/z range: lower and upper bounds of the m/z region in which the series of Wilcoxon rank-sum tests is to be performed
  • α: significance level of the Wilcoxon rank-sum tests
  • dx (ppm): a parameter defining the relative width, and thus the number, of the spectrum segments. A spectrum segment centered at position x covers an m/z interval with an absolute width of x · dx / 10^6. The lower and upper bounds of the spectrum segment are x - x · dx / (2 · 10^6) (lower bound) and x + x · dx / (2 · 10^6) (upper bound), respectively. Consequently, a spectrum segment with dx = 1000 centered at x = 2000 Th is 2 Th wide, with boundaries located at 1999 and 2001 Th
  • use intensities: defines whether barcode spectra (checkbox unchecked) or peak weighting factors (checkbox checked) are used as test inputs
  • show histogram: shows a histogram of the test outputs (p-values, AUC, etc.) and also reports the mean, median and standard deviation of the test variables
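
The relation between dx and the segment boundaries can be checked with a short Python sketch. This is only an illustration of the formula given for dx above, not code taken from MicrobeMS; the function name segment_bounds is hypothetical.

```python
def segment_bounds(center_mz, dx_ppm):
    """Return (lower, upper) bounds in Th of a segment centered at center_mz.

    The absolute segment width is center_mz * dx_ppm / 1e6, so each bound
    lies half of that width away from the center.
    """
    half_width = center_mz * dx_ppm / 1e6 / 2.0
    return center_mz - half_width, center_mz + half_width


# Example from the parameter description: dx = 1000 ppm, center at 2000 Th.
lower, upper = segment_bounds(2000.0, 1000.0)
print(lower, upper)  # 1999.0 2001.0
```

Because the width is relative, segments at higher m/z values are wider in absolute terms: with the same dx, a segment centered at 10000 Th would be 10 Th wide.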

Performing Wilcoxon rank-sum tests

1. Load the mass spectral data files via the load spectra (Bruker data file format), import spectra from mzXML data, or the load MS multifile options of the File pulldown menu.
2. Wilcoxon rank-sum tests are carried out on labeled spectra, i.e. on spectra with a class assignment. To perform the test, label two groups of spectra as class 1 and class 2, respectively. Labeling, or class assignment, can be carried out by selecting the appropriate spectra and choosing class assignments → class X from the Edit pulldown menu.
3. The test routine always starts from the original MALDI-ToF mass spectra, i.e. spectral pre-processing and peak detection are carried out automatically using pre-defined parameters. Existing pre-processed spectra and pre-defined peak tables are ignored by the Wilcoxon rank-sum test function.
4. Define the test parameters: α (significance level), the m/z range, and dx, which is a relative value in ppm with a default of 1000. The parameter dx defines the width of the m/z segments into which the spectra are divided during the test. Peaks found in the same m/z segment are considered identical, while peaks in different segments are considered different peaks.
5. When finished, select peak frequency plots from the Analysis pulldown menu. Choose the option from selection or from class X, depending on whether the test should involve the selected spectra or the spectra with the appropriate class labels.
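
For readers who want to reproduce the idea outside MicrobeMS, the test applied to a single m/z segment can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the MicrobeMS implementation: it uses the normal approximation with tie correction and no continuity correction, it takes the AUC to be U / (n1 · n2), and the function name rank_sum_test and the input values are made up for illustration.

```python
import math


def rank_sum_test(class1, class2):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns (p_value, auc), where auc = U_class1 / (n1 * n2).
    """
    n1, n2 = len(class1), len(class2)
    if n1 == 0 or n2 == 0:
        raise ValueError("both classes need at least one value")
    n = n1 + n2
    # Pool both classes, remembering which class each value came from.
    pooled = sorted([(v, 0) for v in class1] + [(v, 1) for v in class2])
    rank_sum_1 = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                      # size of this group of tied values
        mid_rank = (i + 1 + j) / 2.0   # average of ranks i+1 .. j
        in_class1 = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_1 += mid_rank * in_class1
        tie_term += t ** 3 - t
        i = j
    u1 = rank_sum_1 - n1 * (n1 + 1) / 2.0
    auc = u1 / (n1 * n2)
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        # All pooled values are identical: no evidence of a difference.
        return 1.0, auc
    z = (u1 - n1 * n2 / 2.0) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return p_value, auc


# Made-up barcode values (1 = peak present, 0 = absent) for one m/z segment.
class1 = [1, 1, 1, 1, 1]
class2 = [0, 0, 0, 0, 0]
alpha = 0.05
p, auc = rank_sum_test(class1, class2)
print(f"p = {p:.4f}, AUC = {auc:.2f}, significant = {p < alpha}")
# prints: p = 0.0027, AUC = 1.00, significant = True
```

The same function accepts peak weighting factors instead of 0/1 barcode values. Tie correction matters most for barcode inputs, where many spectra share the same value.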

Output of Wilcoxon rank-sum test


Example of the output from a series of Wilcoxon rank-sum tests taken from the log file of MicrobeMS:

Command line output of the Wilcoxon rank-sum tests. In this example, the m/z segment centered around 2518.0283 Th shows the greatest potential for distinguishing between classes I and II.
Plot of p-values (log scaled) obtained by Wilcoxon rank-sum tests using peak data from individual m/z segments as inputs. The smaller the p-value, the higher the discriminative potential of biomarker peaks at the specific m/z position.
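
How such a series of p-values is read can be illustrated with a small Python sketch that sorts segments by p-value and converts them to a log scale. The segment centers and p-values below are made up for illustration and are not MicrobeMS output; the function name rank_segments is hypothetical.

```python
import math


def rank_segments(p_values, alpha):
    """Sort segments by ascending p-value.

    p_values maps a segment center (Th) to its p-value. Each result row is
    (center_mz, p, -log10(p), p < alpha).
    """
    ranked = []
    for center_mz, p in sorted(p_values.items(), key=lambda item: item[1]):
        # Guard against log10(0) for p-values that underflow to zero.
        neg_log_p = -math.log10(max(p, 1e-300))
        ranked.append((center_mz, p, neg_log_p, p < alpha))
    return ranked


# Made-up p-values for three hypothetical segment centers (in Th).
example = {3000.0: 0.20, 4500.0: 0.001, 6200.0: 0.04}
for center_mz, p, neg_log_p, significant in rank_segments(example, alpha=0.05):
    print(f"{center_mz:8.1f} Th  p = {p:.3g}  -log10(p) = {neg_log_p:.2f}  "
          f"significant = {significant}")
```

The first row of the sorted result is the segment with the smallest p-value, i.e. the one with the highest discriminative potential in this made-up example.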