This page describes the output files, which are written in BED, GFF3 and text format.
Each line in BED format corresponds to a peptide mapped to the genome.
For a general description please see http://genome.ucsc.edu/FAQ/FAQformat.html#format1
Two BED files are written as output, one for the annotation-filtered mapping and one for the alternative mapping. Each file has separate tracks for uniqueness (unique, non-unique, unmarked), so up to three tracks. Unmapped peptides are not exported to BED.
Modifications to the format are as follows:
As the format does not provide custom or additional columns, the name column (4) is used to present some peptide information (e.g. "3617a_AELGLPPLAEDSIQVVKSMR_z=3_sh=1_>").
Separated by underscores, it contains a query (with a modifier to make it unique), the peptide sequence, the charge, a shared value indicating the number of mappings of the peptide spectrum match to the genome and a strand orientation indicator.
The score column (5) contains the peptide spectrum match score, so it is not scaled between 0 and 1000 as is usually done.
For coloring purposes, each track uses the parameter itemRgb="On" and each peptide line provides an RGB color (column 9), which depends on the score and the color settings in the configuration.
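The layout of the name column described above can be taken apart again with a few lines of Python. This is a minimal sketch, not part of the tool itself: the function name `parse_bed_name` and the field names of the result are made up for illustration, and it assumes that none of the five parts contains an underscore.

```python
from typing import NamedTuple


class PeptideName(NamedTuple):
    # Hypothetical field names, chosen for this sketch only.
    query: str        # query with its uniqueness modifier, e.g. "3617a"
    sequence: str     # peptide sequence
    charge: int       # value after "z="
    shared: int       # value after "sh="
    strand_mark: str  # strand orientation indicator, kept verbatim


def parse_bed_name(name: str) -> PeptideName:
    """Split the BED name column on underscores into its five parts."""
    query, sequence, charge, shared, strand_mark = name.split("_")
    if not charge.startswith("z=") or not shared.startswith("sh="):
        raise ValueError(f"unexpected name field: {name!r}")
    return PeptideName(query, sequence, int(charge[2:]), int(shared[3:]), strand_mark)


parsed = parse_bed_name("3617a_AELGLPPLAEDSIQVVKSMR_z=3_sh=1_>")
print(parsed.sequence, parsed.charge, parsed.shared)
# AELGLPPLAEDSIQVVKSMR 3 1
```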
A short BED file example is provided here:
track name="darwin_top250_t2 (u,anno)" description="darwin_top250_t2 (unique, annotation mapped)" visibility=full itemRgb=On
chr7 129096330 129096390 3617a_AELGLPPLAEDSIQVVKSMR_z=3_sh=1_> 8.32 + 129096330 129096390 191,191,191 1 60, 0,
chr11 7614437 7618832 3617c_ESLILQVSVLTDQVEAQGEK_z=3_sh=1_> 9.79 + 7614437 7618832 191,191,191 2 18,42, 0,4353,
chr16 72094543 72094603 3618c_SPVGVQPILNEHTFCAGMSK_z=3_sh=1_> 28.79 + 72094543 72094603 255,159,0 1 60, 0,
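The second peptide in the example spans two blocks (blockCount 2). The following sketch shows how the block columns of a standard BED line (blockSizes in column 11, blockStarts in column 12, both relative to chromStart) translate into genomic intervals. The helper name `bed_blocks` is hypothetical, and the code assumes whitespace-separated columns without spaces inside any column.

```python
def bed_blocks(line: str) -> list[tuple[int, int]]:
    """Return the genomic (start, end) interval of every block of a 12-column BED line."""
    fields = line.split()
    chrom_start = int(fields[1])
    # The lists end with a trailing comma, so empty items are skipped.
    sizes = [int(s) for s in fields[10].split(",") if s]
    starts = [int(s) for s in fields[11].split(",") if s]
    return [(chrom_start + st, chrom_start + st + sz) for st, sz in zip(starts, sizes)]


line = ("chr11 7614437 7618832 3617c_ESLILQVSVLTDQVEAQGEK_z=3_sh=1_> 9.79 + "
        "7614437 7618832 191,191,191 2 18,42, 0,4353,")
print(bed_blocks(line))
# [(7614437, 7614455), (7618790, 7618832)]
```

The end of the last block equals chromEnd (7618832), as the BED format requires.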
In a GFF3 file a peptide is represented as a feature.
For a general description please see http://www.sequenceontology.org/gff3.shtml.
Two GFF3 files are written as output, one for the annotation-filtered mapping and one for the alternative mapping. Unmapped peptides are not exported to GFF3.
Modifications to the format are as follows:
The SOURCE is "ipig".
The values in the TYPE column are custom feature types. They start with "peptide", followed by a mark for the uniqueness (unique, non-unique, unmarked) and a classification of the score into one of three groups (depending on the user parameters threshold1 and threshold2), e.g. "peptide_unique_mid". The different feature type names might prompt some genome browsers to create a separate track per type. This could result in different colors for the uniqueness classes and the score ranges (e.g. in Geneious).
The SCORE is the peptide spectrum match score.
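ATTRIBUTES are: ID (the query with its modifier), Name (the peptide sequence), z (the charge), shared (the number of mappings to the genome), mods (the variable modifications) and modpos (the modification positions).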
A short GFF3 file example is provided here:
chr11 ipig peptide_unique_mid 5254203 5254238 32.16 - 0 ID=280e; Name=KHALANAVGAVV; z=3; shared=1; mods="Deamidated (NQ)"; modpos=0.000000100000.0;
chr3 ipig peptide_unique_low 58109099 58109134 17.08 + 0 ID=280f; Name=VVASGPGLEHGK; z=3; shared=1; mods=""; modpos=;
chr11 ipig peptide_unique_high 5246837 5246872 51.71 - 0 ID=281d; Name=KHALANAVGAVV; z=3; shared=1; mods="Deamidated (NQ)"; modpos=0.000000100000.0;
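A feature line like the ones above can be read back as follows. This is only a sketch under assumptions: the function name `parse_gff3_line` and the dictionary keys are invented for illustration, columns are taken to be whitespace-separated as printed here, and the TYPE value is assumed to consist of exactly three underscore-separated parts.

```python
import re


def parse_gff3_line(line: str) -> dict:
    """Parse one peptide feature line into a flat dictionary."""
    # The attribute column contains spaces, so only the first eight columns are split off.
    seqid, source, ftype, start, end, score, strand, phase, attrs = line.split(None, 8)
    # TYPE is "peptide_<uniqueness>_<score class>", e.g. "peptide_unique_mid".
    _, uniqueness, score_class = ftype.split("_")
    attributes = {}
    # A value is either quoted (and may then contain ';') or runs up to the next ';'.
    for key, value in re.findall(r'(\w+)=("[^"]*"|[^;]*)', attrs):
        attributes[key] = value.strip().strip('"')
    return {
        "seqid": seqid, "source": source, "start": int(start), "end": int(end),
        "score": float(score), "strand": strand,
        "uniqueness": uniqueness, "score_class": score_class,
        "attributes": attributes,
    }


rec = parse_gff3_line(
    'chr11 ipig peptide_unique_mid 5254203 5254238 32.16 - 0 '
    'ID=280e; Name=KHALANAVGAVV; z=3; shared=1; mods="Deamidated (NQ)"; modpos=0.000000100000.0;'
)
print(rec["uniqueness"], rec["score_class"], rec["attributes"]["mods"])
# unique mid Deamidated (NQ)
```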
Two text files are written as output for the mapped peptides, one for the annotation-filtered mapping and one for the alternative mapping. Unmapped peptides are exported to a separate text file.
The format of the text files is an extension of the format of the input text files for peptide spectrum matches as described at [Input Formats].
The text file with mapped peptides is extended with columns indicating the positions in the genome: chrom, strand, start_pos, stop_pos and shared. The start_pos and stop_pos columns contain comma-separated position lists, so that exon-spanning peptides can be represented. The shared column indicates the number of mappings of the peptide spectrum match to the genome.
A short text file example is provided here:
prot_acc prot_desc pep_query pep_isunique pep_exp_z pep_score pep_seq pep_var_mod pep_var_mod_pos chrom strand start_pos stop_pos shared
112821681 G protein-regulated inducer of neurite outgrowth 1 [Homo sapiens] 9 1 2 7.23 KALGSAR chr5 - 176024627, 176024648, 1
41281496 mediator complex subunit 24 isoform 1 [Homo sapiens] 9 1 2 6.47 LSCHGK chr17 - 38191589,38191974, 38191602,38191979, 1
154800453 tastin isoform 1 [Homo sapiens] 10 1 3 11.66 IGILQQLLR chr12 + 49723930, 49723957, 1
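The comma-separated lists in start_pos and stop_pos pair up position by position, one pair per exon part of the peptide. A small sketch of that pairing, with a hypothetical helper name `exon_intervals` and the assumption that both lists always have the same number of entries:

```python
def exon_intervals(start_pos: str, stop_pos: str) -> list[tuple[int, int]]:
    """Pair the comma-separated start and stop positions of one mapped peptide."""
    # Both lists end with a trailing comma, so empty items are skipped.
    starts = [int(p) for p in start_pos.split(",") if p.strip()]
    stops = [int(p) for p in stop_pos.split(",") if p.strip()]
    if len(starts) != len(stops):
        raise ValueError("start_pos and stop_pos differ in length")
    return list(zip(starts, stops))


# Values of the exon-spanning peptide LSCHGK from the example above.
print(exon_intervals("38191589,38191974,", "38191602,38191979,"))
# [(38191589, 38191602), (38191974, 38191979)]
```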
The text file with unmapped peptides is extended with only one column (maperror), which indicates why a certain peptide spectrum match could not be mapped and lists the references/links that were obtained.
A short text file example is provided here:
prot_acc prot_desc pep_query pep_isunique pep_exp_z pep_score pep_seq pep_var_mod pep_var_mod_pos maperror
169208528 PREDICTED: similar to FLJ36144 protein [Homo sapiens] 68 1 3 17.95 NIITNKELK Deamidated (NQ) 0.000010000.0 noProt
169658367 BAH domain and coiled-coil containing 1 [Homo sapiens] 66 1 3 11.28 GEAGSLQKGPK Deamidated (NQ) 0.00000010000.0 noGene_(prots:[169658367, Q9P281, NP_001073988, ENST00000436173])
4505541 USO1 homolog; vesicle docking protein [Homo sapiens] 211 1 2 16.59 ALKSLSK noMatch_(prots:[4505541, O60763, NP_003706, ENST00000264904],genes:[uc003hiw.2])
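The maperror values in the example follow a recognizable pattern: an error code, optionally followed by an underscore and bracketed reference lists. The sketch below splits such a value. It is based only on the three values shown here, so other shapes may exist, and the function name `parse_maperror` is an assumption of this example.

```python
import re


def parse_maperror(value: str) -> tuple[str, dict[str, list[str]]]:
    """Split a maperror value into its error code and its reference lists."""
    code, _, rest = value.partition("_")
    refs = {}
    # Each list looks like "name:[item, item, ...]".
    for key, items in re.findall(r"(\w+):\[([^\]]*)\]", rest):
        refs[key] = [item.strip() for item in items.split(",") if item.strip()]
    return code, refs


print(parse_maperror("noProt"))
# ('noProt', {})
print(parse_maperror(
    "noMatch_(prots:[4505541, O60763, NP_003706, ENST00000264904],genes:[uc003hiw.2])"
))
# ('noMatch', {'prots': ['4505541', 'O60763', 'NP_003706', 'ENST00000264904'], 'genes': ['uc003hiw.2']})
```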