SugarBase: mapping glycomolecule precursors in microbes
Overview
SugarBase is designed to process LC–MS/MS data (in .mzXML format) to identify nucleotide-activated sugars (NT-sugars) based on diagnostic nucleotide fragments and a predefined chemical composition space. The workflow combines pseudo-MS1 precursor information with MS2 fragment evidence, filters candidate masses using theoretical chemical compositions, and assigns a confidence score to each detected hit.
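As a rough illustration of how candidate masses might be filtered against theoretical compositions, the sketch below computes a monoisotopic mass from an elemental composition and checks an observed m/z against it within a ppm tolerance. The function names, the 10 ppm tolerance, and the use of the deprotonated [M-H]- ion are assumptions made for this example; they are not SugarBase's documented settings.

```python
# Illustrative sketch only: names, tolerance and ion type are assumptions.

# Monoisotopic masses (u) of the most abundant isotopes, rounded to 6 decimals.
MONOISOTOPIC = {"C": 12.0, "H": 1.007825, "N": 14.003074,
                "O": 15.994915, "P": 30.973762}
PROTON = 1.007276  # mass of a proton (u)


def neutral_mass(composition):
    """Monoisotopic mass of a neutral molecule given as {element: count}."""
    return sum(MONOISOTOPIC[el] * n for el, n in composition.items())


def mz_deprotonated(composition):
    """m/z of the singly charged [M-H]- ion (assumed ion type)."""
    return neutral_mass(composition) - PROTON


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match(observed_mz, space, tol_ppm=10.0):
    """Names of all compositions in `space` within tol_ppm of observed_mz."""
    return [name for name, comp in space.items()
            if abs(ppm_error(observed_mz, mz_deprotonated(comp))) <= tol_ppm]


# A one-entry chemical space: UDP-hexose, C15H24N2O17P2.
space = {"UDP-hexose": {"C": 15, "H": 24, "N": 2, "O": 17, "P": 2}}
print(match(565.0477, space))  # ['UDP-hexose']
```

A real chemical space would hold many compositions, in which case sorting the theoretical m/z values and using a binary search would be a more efficient lookup than the linear scan shown here.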
The pipeline supports both a discovery mode (screening a broad chemical space of possible nucleotide-sugars) and a targeted mode (searching only predefined compositions from an input list). The desktop executable lets users explore the results interactively: for each nucleotide-sugar hit, it shows the extracted ion chromatograms together with the pseudo-MS1 and MS2 scans in which the candidate NT-sugar was detected. Results can also be exported as an Excel table.
Key analytical steps include MS1 preprocessing (deisotoping and filtering), MS2 fragment extraction, MS1–MS2 alignment, precursor matching against a theoretical chemical sugar space, detection of consistent signals across consecutive scans, and multi-parameter scoring to rank candidate nucleotide-sugar hits.
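One of the steps listed above, detecting signals that persist across consecutive scans, can be sketched as a simple run-length check. This is a minimal sketch, assuming each candidate comes with the indices of the acquisition cycles in which both the precursor and its nucleotide fragment were seen. The function names and the index convention are hypothetical. The default threshold of three follows the workflow's requirement of at least three consecutive scans.

```python
# Illustrative sketch only: function names and index convention are assumptions.

def longest_consecutive_run(scan_indices):
    """Length of the longest run of consecutive integers in scan_indices."""
    best = 0
    current = 0
    previous = None
    for idx in sorted(set(scan_indices)):
        if previous is not None and idx == previous + 1:
            current += 1   # run continues
        else:
            current = 1    # a new run starts here
        best = max(best, current)
        previous = idx
    return best


def is_consistent(scan_indices, min_consecutive=3):
    """True if the candidate appears in at least min_consecutive adjacent scans."""
    return longest_consecutive_run(scan_indices) >= min_consecutive


print(is_consistent([4, 5, 6, 9]))  # True: scans 4, 5 and 6 are adjacent
print(is_consistent([1, 3, 5]))     # False: no two scans are adjacent
```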
SugarBase can be used to discover the nucleotide-sugar profile of microbial extracts without prior knowledge of the sugars present.
Workflow
- Parameter initialization.
- Selection of the analysis mode (discovery or targeted).
- Generation of a theoretical chemical space of nucleotide-activated sugar compositions.
- Peak lists are extracted from .mzXML files.
- Odd/even scans are separated into pseudo-MS1 and fragmented MS2 datasets.
- MS1 data preprocessing
• Optional deisotoping
• Filter MS1 data by m/z window, retention time and minimum intensity thresholds.
- Fragment extraction and MS1-MS2 alignment
• Identify diagnostic nucleotide fragments in MS2 spectra.
• Align fragment signals with MS1 precursor peaks.
• Match precursor masses against the theoretical chemical space.
- CDP-sugar detection
• Detect CMP fragments to specifically identify CDP-activated sugars.
• Annotate candidates with a CDP-sugar flag.
- Candidate precursors are retained only if detected in ≥3 consecutive scans with the corresponding nucleotide fragment in MS2.
- Candidate scoring based on several criteria:
• Occurrence across consecutive scans
• Pearson correlation
• MS1 signal intensity
• CO₂ neutral loss
• Precursor (MS1) intensity/Fragment (MS2) intensity ratio (non-CMP-sugars)
• Precursor (MS2) intensity/Fragment (MS2) intensity ratio (CMP-sugars)
• Nitrogen rule consistency (CMP-sugars)
- Interactive dashboard enables exploration of results.
- Tabular output with nucleotide-sugar hits and their likelihood scores.
- Extracted ion chromatograms and pseudo-MS1 and MS2 spectra are generated.
- Results can be exported as an Excel table.
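Two of the workflow steps above, the odd/even scan split and the Pearson correlation used in scoring, can be sketched in a few lines. The assumptions are stated plainly: odd-numbered scans are treated as pseudo-MS1 and even-numbered scans as MS2 (the workflow does not say which is which), scans are represented as dictionaries with a hypothetical "num" key, and the correlation is taken between a precursor intensity trace and a fragment intensity trace, which is one plausible reading of the scoring criterion rather than a documented detail.

```python
# Illustrative sketch only: scan layout and what is correlated are assumptions.
import math


def split_scans(scans):
    """Split scans into (pseudo_ms1, ms2) by odd/even scan number.

    Assumes odd scan numbers are pseudo-MS1 and even ones are MS2.
    """
    pseudo_ms1 = [s for s in scans if s["num"] % 2 == 1]
    ms2 = [s for s in scans if s["num"] % 2 == 0]
    return pseudo_ms1, ms2


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat trace carries no correlation information
    return sxy / math.sqrt(sxx * syy)


scans = [{"num": n} for n in range(1, 7)]
pseudo_ms1, ms2 = split_scans(scans)
print([s["num"] for s in pseudo_ms1])  # [1, 3, 5]
print([s["num"] for s in ms2])         # [2, 4, 6]

# Hypothetical intensity traces for a precursor and its nucleotide fragment.
precursor_trace = [1.0e4, 5.0e4, 9.0e4, 4.0e4]
fragment_trace = [2.0e3, 1.1e4, 1.8e4, 7.0e3]
r = pearson(precursor_trace, fragment_trace)
```

A co-eluting precursor and fragment should give a correlation close to 1, while unrelated signals that merely share a retention-time window should score lower.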
Quick start
- Download the desktop executable:
https://sourceforge.net/projects/sugarbase-x/files/SugarBase_v08032026.zip/download
- Open the desktop executable.
- Download the test data:
https://sourceforge.net/projects/sugarbase-x/files/Test_data_SugarBase.zip/download
- Select your input folder. A maximum of three .mzXML files is allowed per analysis.
- Select your output folder.
- Enter a sample name.
- Select the analysis mode:
• 'Targeted': requires a Target.xlsx file listing the chemical formulas of the sugars you want to target, e.g. C6H12O6.
• 'Discovery': no Target.xlsx file is required.
- To run the analysis with default parameters, click "Run Analysis".
- A table will appear, listing the identified nucleotide-sugar hits with their total scores.
- The table can be exported as an Excel file using the "Export" function.
- Each nucleotide-sugar hit can be explored interactively by selecting it in the table.
- For a description of the input and output parameters, see the ReadMe file.
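For targeted mode, it may help to check the formulas destined for Target.xlsx before starting a run. The sketch below parses a formula string such as C6H12O6 into element counts and rejects malformed input. It is a hypothetical helper, not part of SugarBase. It handles only plain formulas without brackets, charges or isotope labels, and it makes no assumption about the column layout of Target.xlsx.

```python
# Illustrative sketch only: a hypothetical helper, not part of SugarBase.
import re
from collections import Counter

# One element symbol (capital letter, optional lowercase letter) plus optional count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula):
    """Parse a plain chemical formula such as 'C6H12O6' into {element: count}."""
    formula = formula.strip()
    counts = Counter()
    position = 0
    for token in _TOKEN.finditer(formula):
        if token.start() != position:
            # Characters were skipped, so the formula contains something unexpected.
            raise ValueError(f"unexpected character in formula: {formula!r}")
        counts[token.group(1)] += int(token.group(2) or 1)
        position = token.end()
    if position != len(formula) or not counts:
        raise ValueError(f"could not parse formula: {formula!r}")
    return dict(counts)


print(parse_formula("C6H12O6"))  # {'C': 6, 'H': 12, 'O': 6}
```

Running each target formula through such a check catches typing errors (for example a stray space or hyphen inside a formula) before they reach the analysis.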
Citation
Van Ede, J.M., Holst Sørensen, M.C., Sorokin, D.Y., van Loosdrecht, M.C.M. and Pabst, M. SugarBase: mapping glycomolecule precursors in microbes. (2026).
Jitske M. van Ede (J.M.vanEde@tudelft.nl)
Martin Pabst (M.Pabst@tudelft.nl)