The previous version of Calis-p was introduced in our 2018 PNAS paper, "Metaproteomics method to determine carbon sources and assimilation pathways of species in microbial communities". It was designed to estimate the natural isotope abundances of individual species in microbial communities from proteomics data, an approach termed "protein-based stable isotope fingerprinting" or [Direct Protein-SIF]. In addition to isotope fingerprinting, the current version (2.0) can also estimate isotope abundances in labeling experiments, for use in protein-based stable isotope probing (Protein-SIP) (Preprint).

Calis-p runs on Mac, Linux, and Unix. As input it takes the output of a protein identification search engine in the form of scored peptide-spectrum match (PSM) tables, together with the raw MS data in mzML format. From these two inputs, Calis-p estimates ¹³C/¹²C and δ¹³C values for all peptides that pass the filters, and reports a summary of average values and standard errors for each species in the dataset.
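
For orientation, δ¹³C is conventionally the per-mil deviation of a sample's ¹³C/¹²C ratio from that of a reference standard. The sketch below illustrates that conversion only. The function names and the VPDB reference ratio of 0.0111802 are assumptions made for this example; this page does not state which reference value Calis-p uses internally, so treat the numbers as illustrative.

```python
# Assumed 13C/12C ratio of the VPDB standard (illustrative; not taken from Calis-p).
R_VPDB = 0.0111802


def delta13c_permil(ratio, reference=R_VPDB):
    """Convert a 13C/12C ratio to a delta-13C value in per mil."""
    return (ratio / reference - 1.0) * 1000.0


def ratio_from_delta(delta_permil, reference=R_VPDB):
    """Convert a delta-13C value in per mil back to a 13C/12C ratio."""
    return reference * (delta_permil / 1000.0 + 1.0)


if __name__ == "__main__":
    # Round trip for a hypothetical sample at -25 per mil.
    ratio = ratio_from_delta(-25.0)
    print(f"13C/12C ratio: {ratio:.7f}")
    print(f"delta13C: {delta13c_permil(ratio):.2f} per mil")
```

A ratio equal to the reference gives δ¹³C = 0, and ratios below the reference give negative values.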

This software is no longer maintained. Please use the newer Python version of Calisp instead.

Installation

For noise filtering, Calis-p depends on another program, mcl. You can find more information on mcl here: http://micans.org/mcl/. From that page, navigate to "Licence and Software" for installation instructions and a download link. After you have successfully installed mcl, extract the downloaded zipped archive of Calis-p:

>unzip calis-p-2.0.zip

The jar file and [README] are in the "Calisp-2.0" folder.

How Calis-p works

Calis-p starts reading data from the input files that contain the peptides and spectra. A detailed explanation of how it deals with files and folders is provided here: [Input files].
How Calis-p handles peptides with sulfur and modifications, and how it assigns peptides to different species, is explained here: [Peptide processing]
How Calis-p extracts a peptide's MS1 spectra is explained here: [Parsing Spectra]
How Calis-p eliminates noisy spectra is explained here: [Eliminating noisy spectra]
How Calis-p estimates a peptide's isotopic composition is explained here: [Estimating isotopic composition]
How Calis-p calculates center statistics for species and proteins is explained here: [Center statistics]

Quick Start

Preparation of input files for Calis-p:

Calis-p requires at least two input files to compute isotope compositions for organisms and proteins. First, an mzML file containing the mass spectrometry data. Second, a file containing the peptide-spectrum match (PSM) data produced by a peptide identification algorithm such as SEQUEST HT in Proteome Discoverer. The PSM data can be provided either as a tab-delimited table with specific columns or in the open mzIdentML (.mzid) format, which many proteomic search engines can export.

Click here for instructions on how to prepare the [mzML files].
Click here for instructions on how to prepare the [PSM files].

Command line usage and options

Common options are provided in the [README] file. For instructions, you can also type:

>java -jar path/to/Calisp-2.0/Calisp-2.0.jar -h

To compute ¹³C/¹²C and δ¹³C values of species, proteins and peptides in files within folder "my_peptide_folder" and spectra in files within folder "my_spectrum_folder", and save results in "my_output_folder":

>java -Xms10g -jar path/to/Calisp-2.0/Calisp-2.0.jar -threads 10 -peptideFile my_peptide_folder -spectrumFile my_spectrum_folder -outputFile my_output_folder

In this example, Calis-p starts with a 10 GB Java heap (-Xms10g sets the initial heap size; the JVM option -Xmx sets the maximum) and uses 10 threads, which is sufficient to process around 10 mzML files in about 10 minutes.
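
If you launch Calis-p from a script, the invocation shown above can be assembled programmatically. The following is a minimal sketch, not part of Calis-p: the helper names build_calisp_command and missing_tools are hypothetical, only the flags from the example above are used, and the check assumes that both java and mcl should be reachable on the PATH (this page says Calis-p depends on mcl but does not say how it locates it).

```python
import shutil


def build_calisp_command(jar_path, peptide_dir, spectrum_dir, output_dir,
                         threads=10, initial_heap="10g"):
    """Return the argument list for the example invocation on this page."""
    return [
        "java", f"-Xms{initial_heap}",      # initial JVM heap size
        "-jar", str(jar_path),
        "-threads", str(threads),
        "-peptideFile", str(peptide_dir),
        "-spectrumFile", str(spectrum_dir),
        "-outputFile", str(output_dir),
    ]


def missing_tools(tools=("java", "mcl")):
    """Return the names of required executables that are not on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    command = build_calisp_command(
        "path/to/Calisp-2.0/Calisp-2.0.jar",
        "my_peptide_folder",
        "my_spectrum_folder",
        "my_output_folder",
    )
    absent = missing_tools()
    if absent:
        print("Not found on PATH:", ", ".join(absent))
    # Print the command instead of running it; to execute it, pass the
    # list to subprocess.run(command, check=True).
    print(" ".join(command))
```

Building the command as a list rather than a single string avoids shell quoting problems when folder names contain spaces.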

Calis-p output files

Calisp writes the estimates from its different models to a set of reports (the models are described in [Estimating isotopic composition]). By default, the reports are created in the folder "calisp-output". You can choose a different folder with the parameter --output. If the output folder already exists, the previous results are overwritten. Calisp creates the following tab-delimited files, which you can open in a spreadsheet program:

File	Screenshot of example file	Description
calisp-settings.csv	Settings	A list of all the user-parameters and files used for the computation.
filtering.metrics.csv	filtering.metrics	Summary of the number of isotope patterns that were kept or rejected during quality filtering.
default.delta.csv	default.delta	The estimates of the default model for organisms and files. These values are used for the [Direct Protein-SIF] method to obtain natural abundance ¹³C values.
default.proteins.csv	default.protein	The estimates of the default model for individual proteins. Usually the estimates are not accurate enough to determine SIF values for individual proteins and thus you mostly want to ignore these values.
default.amino.csv	default.amino	Estimate of the per-amino acid ¹³C content using the default model for organisms and files.
neutron_abundance.delta.csv	neutron.delta	The estimates of the neutron abundance model for organisms and files. This is the standard model to be used for Protein-SIP.
neutron_abundance.proteins.csv	neutron.proteins	The estimates of the neutron abundance model for individual proteins. This is the standard model to be used for Protein-SIP.
neutron_abundance.amino.csv	neutron.amino	Estimate of the per-amino acid label content using the neutron abundance model for organisms and files. This is the standard model to be used for Protein-SIP.
clumpy_label.delta.csv	clumpy.delta	The estimates of the clumpy label model for organisms and files. This model can be used for Protein-SIP under special circumstances; for details, see [Estimating isotopic composition].
clumpy_label.proteins.csv	clumpy.protein	The estimates of the clumpy label model for individual proteins.
clumpy_label.amino.csv	clumpy.amino	Estimate of the per-amino acid label content using the clumpy label model for organisms and files.
peptides.csv	peptides first half of columns / peptides 2nd half of columns	For each peptide, all information is provided including the aggregated intensity, the normalized spectrum and the estimates of all three models. Use this file if you would like to analyze your data, for example using R.
peptide-spectra.X.csv (one file for each mzML file X)	peptide-spectra	For each mzML file, every individual peptide spectrum that was extracted. These files are most likely useful only for debugging Calisp.
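
Although the report files carry a .csv extension, they are described above as tab-delimited, so a parser has to be told to split on tabs. The sketch below loads one report with Python's standard csv module. The helper name read_report is hypothetical, the folder name is taken from the command line example on this page, and no particular column names are assumed.

```python
import csv
from pathlib import Path


def read_report(path):
    """Read a tab-delimited report; return (column names, list of row dicts)."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        rows = list(reader)
        return reader.fieldnames or [], rows


if __name__ == "__main__":
    # Folder name from the example invocation; adjust to your own output folder.
    report = Path("my_output_folder") / "peptides.csv"
    if report.exists():
        columns, rows = read_report(report)
        print(f"{len(rows)} rows, {len(columns)} columns")
        print(columns)
    else:
        print("No report found at", report)
```

Inspecting the printed column names first is a reasonable way to find the fields you need before filtering or plotting.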

Project Members:

Manuel Kleiner (admin)
Marc Strous (admin)
Xiaoli Dong (admin)

Calis-p Wiki

Estimates delta13C of species in a microbiome from proteome data

Calis-p: CALgary approach to ISotopes in proteomics