Oxonium Browser uses a sugar oxonium ion database to define which masses to search for in MS2 spectra. Each entry specifies a diagnostic mass pair: the intact oxonium ion ([M+H−H₂O]⁺) and its water loss fragment ([M+H−2H₂O]⁺). Both masses must be present within the mass error tolerance for a positive detection.
Two databases are available:
The curated database is an Excel file (.xlsx) placed in the Input directory. The current version ships as OX_DB_CURATED_v03.xlsx.
| Column | Description | Example |
|---|---|---|
| Oxonium | Unique name for the sugar | HexNAc |
| ox_mass1 | Primary diagnostic mass (oxonium ion) | 204.0867 |
| ox_mass2 | Secondary diagnostic mass (water loss) | 186.0761 |
Additional columns (Example sugar, Name, Monoisotopic Mass, M+H+) may be present for reference but are not required by the software.
The curated database covers common, rare, and derivative monosaccharides compiled from CSDB, KEGG, and published glycan literature, including pentoses, hexoses, heptoses, ulosonic acids, deoxy sugars, amino sugars, acetylated and methylated derivatives, and others.
Each sugar is defined by two diagnostic masses rather than a single mass. Requiring both the intact ion and its water loss product significantly reduces false positive detections compared to single-mass matching.
For most sugars, the pair consists of:
- ox_mass1 = [M+H−H₂O]⁺ (one water loss from the protonated sugar)
- ox_mass2 = [M+H−2H₂O]⁺ (two water losses)

For some sugars (e.g. uronic acids), ox_mass2 may represent a carboxylic acid loss fragment instead.
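The pair requirement can be sketched as follows. This is a minimal illustration, not the tool's actual matching code: the function names, the peak-list representation, and the absolute tolerance of 0.005 Da are all assumptions (the tool may use a ppm tolerance instead).

```python
from bisect import bisect_left

def has_peak(sorted_mz, target, tol_da):
    """Return True if any peak lies within tol_da of target (sorted_mz ascending)."""
    i = bisect_left(sorted_mz, target - tol_da)
    return i < len(sorted_mz) and sorted_mz[i] <= target + tol_da

def detect_pair(peaks_mz, ox_mass1, ox_mass2, tol_da=0.005):
    """Positive only if BOTH diagnostic masses are found (tol_da is an assumed value)."""
    sorted_mz = sorted(peaks_mz)
    return has_peak(sorted_mz, ox_mass1, tol_da) and has_peak(sorted_mz, ox_mass2, tol_da)

# Hypothetical MS2 peak list containing both HexNAc diagnostic masses
spectrum = [126.0550, 138.0549, 186.0763, 204.0869, 366.1395]
print(detect_pair(spectrum, 204.0867, 186.0761))             # True
print(detect_pair([204.0869, 366.1395], 204.0867, 186.0761)) # False: water loss peak missing
```

A spectrum with only the 204.0867 peak is rejected, which is how the pair filter removes single-mass chance matches.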
To add a new sugar to the curated database:
1. Calculate ox_mass1 = [M+H]⁺ − H₂O = monoisotopic mass + 1.00728 − 18.01056
2. Calculate ox_mass2 = ox_mass1 − 18.01056
3. Add a new row to the Excel file with a unique name in the Oxonium column
4. Enter the two calculated masses in ox_mass1 and ox_mass2

The tool will automatically include all entries during the next analysis run. Consider testing with a known positive control sample first.
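As a worked check of the arithmetic, the short sketch below computes both masses from a monoisotopic mass. The function name is hypothetical; the constants are the ones given in the formulas above. HexNAc (monoisotopic mass 221.08994) reproduces the example values from the column table.

```python
PROTON = 1.00728   # mass added on protonation, as used in the formula above
WATER = 18.01056   # monoisotopic mass of H2O

def oxonium_pair(monoisotopic_mass):
    """Return (ox_mass1, ox_mass2) rounded to 4 decimals for a neutral sugar mass."""
    ox_mass1 = monoisotopic_mass + PROTON - WATER
    ox_mass2 = ox_mass1 - WATER
    return round(ox_mass1, 4), round(ox_mass2, 4)

print(oxonium_pair(221.08994))  # (204.0867, 186.0761)
```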
The database includes entries named Ox_test_1, Ox_test_2, etc. These are random masses serving as built-in negative controls for empirical false discovery assessment. Any detections of test masses represent random chance matches, providing a direct estimate of the false positive rate at the current threshold settings. For guidance on using test masses to optimize thresholds, see Detection Metrics.
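One simple way to turn test-mass detections into a number is sketched below. It assumes that test and real entries are roughly equal in number and that detections are counted at the same threshold settings. The function name and the ratio-based estimate are illustrative assumptions, not necessarily how the tool reports it.

```python
def empirical_fdr(n_test_hits, n_sugar_hits):
    """Estimate the fraction of sugar detections expected to be chance matches.

    Assumes an approximately equal number of test and real entries, so each
    test-mass hit stands for about one expected false sugar hit.
    """
    if n_sugar_hits == 0:
        return None  # undefined without any sugar detections
    return min(1.0, n_test_hits / n_sugar_hits)

print(empirical_fdr(2, 40))  # 0.05
```

If the estimate is higher than acceptable, tightening the thresholds and recounting both kinds of hits shows whether the test-mass detections drop faster than the sugar detections.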
Test masses are generated as random values within the sugar oxonium ion mass range (100–400 Da): a random integer base with a fractional part of 0–0.2 Da (or 0–0.25 Da above 250 Da). Each candidate is checked against all real sugar masses at 0.0075 Da tolerance to avoid accidental overlap. The water loss mass is computed as ox_mass1 − 18.011 Da. The number of test masses is approximately equal to the number of real sugar entries (1:1 ratio).
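The generation procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the code used to build the shipped databases: the function name, the seed, and the choice to check candidates against a flat list of real masses are hypothetical.

```python
import random

def generate_test_masses(real_masses, n, seed=0, lo=100, hi=400, tol=0.0075):
    """Return n (ox_mass1, ox_mass2) test pairs that avoid all real_masses."""
    rng = random.Random(seed)  # fixed seed so the test set is reproducible
    tests = []
    while len(tests) < n:
        base = rng.randint(lo, hi - 1)             # random integer base
        max_frac = 0.25 if base > 250 else 0.2     # wider fractional part above 250 Da
        ox_mass1 = base + rng.uniform(0.0, max_frac)
        # Reject candidates that overlap a real sugar mass within the tolerance
        if any(abs(ox_mass1 - m) <= tol for m in real_masses):
            continue
        tests.append((ox_mass1, ox_mass1 - 18.011))  # water loss partner
    return tests

real = [204.0867, 186.0761, 163.0601, 145.0495]
for i, (m1, m2) in enumerate(generate_test_masses(real, 4), start=1):
    print(f"Ox_test_{i}: {m1:.4f} / {m2:.4f}")
```

Passing a number of test masses equal to the number of real entries gives the 1:1 ratio mentioned above.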
The curated database (a small set of empirical sugar masses) and the Chemspace database (the "extensive" chemical sugar space) each have their own independent set of test masses, tagged separately as curated_test and chemspace_test. Test masses should not be removed from the provided database.
The Chemspace database contains 3,332 chemically plausible monosaccharide compositions systematically enumerated within a defined elemental space (C, H, O, N, S) and mass range (~83–382 Da). Each entry is represented by its molecular formula (e.g. C5H8O2, C8H15NO6) and follows the same diagnostic mass-pair structure as the curated database. The database also includes 3,332 fixed random test masses used as negative controls. These entries are included together with the empirical sugars in the provided OX_DB_COMBINED_v03.xlsx reference file. Empirical sugars are labelled as "curated" in the list column, whereas theoretical Chemspace entries are labelled as "extensive".
The pipeline automatically searches both the curated and Chemspace databases.
Switching to the combined view displays both curated matches and additional Chemspace matches. This automatically applies stricter default thresholds (counts ≥ 25, intensity ≥ 1.5%, presence ≥ 0.025%) to account for the larger search space. Switching back to curated view restores the original thresholds.
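The effect of the stricter combined-view defaults can be illustrated with a small filter. The field names, the record layout, and the example hit values are invented for illustration; only the three threshold values come from the text above.

```python
# Combined-view defaults quoted above; the key names are hypothetical
COMBINED_THRESHOLDS = {"counts": 25, "intensity_pct": 1.5, "presence_pct": 0.025}

def passes(hit, thresholds):
    """A hit is kept only if every metric meets or exceeds its minimum."""
    return all(hit[key] >= minimum for key, minimum in thresholds.items())

# Made-up example hits, not real results
hits = [
    {"name": "HexNAc", "counts": 310, "intensity_pct": 12.4, "presence_pct": 1.8},
    {"name": "C5H8O2", "counts": 12, "intensity_pct": 2.0, "presence_pct": 0.03},
]
kept = [h["name"] for h in hits if passes(h, COMBINED_THRESHOLDS)]
print(kept)  # ['HexNAc']
```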
Genuine sugar signals stand out even with >3,000 candidates. Key indicators:
Chemspace hits are identified by molecular formula only (e.g. C8H15N1O6). To determine the actual sugar identity, cross-reference the formula with sugar databases (CSDB, KEGG, GlyTouCan), check whether the mass matches known sugar derivatives for the organism, and validate with complementary experiments.
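When cross-referencing a formula against external databases, it can help to convert it to a monoisotopic mass first. The helper below is a minimal sketch, not part of the tool: it assumes the formula describes the neutral sugar and contains only C, H, N, O, and S, and it accepts both the C8H15NO6 and C8H15N1O6 spellings.

```python
import re

# Monoisotopic masses of the most abundant isotopes
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
                "O": 15.9949146, "S": 31.9720707}

def monoisotopic_mass(formula):
    """Sum element masses for a formula such as 'C8H15NO6' or 'C8H15N1O6'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total

print(round(monoisotopic_mass("C8H15NO6"), 4))  # 221.0899
```

The resulting mass can then be searched in CSDB, KEGG, or GlyTouCan alongside the formula itself.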
Recommended for:
| Property | Curated | Chemspace |
|---|---|---|
| Size | ~35 sugars + ~35 test masses | ~3,300 sugars + ~3,300 test masses |
| Names | Descriptive (Hex, HexNAc, dHex) | Molecular formulas (C6H12O6, C8H15NO6) |
| Source | Literature, CSDB, KEGG | Systematic generation of compositions |
| Customizable | Yes (edit Excel file) | No (hardcoded) |
| Test masses | In Excel file, user-editable | Hardcoded, fixed across all runs |
| Always active | Yes | Only with CHEMSPACE_SEARCH=True |