INTEGRATE Wiki
Brought to you by:
jin-wash-u
INTEGRATE version 0.1c

Discover fusions by combining RNA-Seq and WGS data sets*

usage: Integrate <subcommand> [options] list of data sets

Integrate subcommands include:
    fusion: call fusions.
    mkbwt:  build BWTs for reference genome. This has to be run one time before running subcommand fusion.

*Note: Integrate can run with RNA only data sets.
INTEGRATE version 0.1c

Create directory:
    mkdir directory_to_bwts

Run subcommand mkbwt:
    Integrate mkbwt (options) reference.fasta

options:
    -mb  integer : Sequences in the reference fasta that are shorter than this value are not included in the evaluation of repetitive reads. default: 10000000
    -dir string  : Directory to store the BWTs. default: ./bwts
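
The two mkbwt steps above (create the directory, then build the BWTs) could be scripted roughly as follows. This is a minimal sketch, not part of INTEGRATE: the function names `build_mkbwt_command` and `run_mkbwt` are hypothetical, and it assumes the `Integrate` executable is on the PATH. Only the `-mb` and `-dir` options shown in the help text are used.

```python
import os
import subprocess


def build_mkbwt_command(reference_fasta, bwt_dir="./bwts", min_length=None):
    """Return the argument list for `Integrate mkbwt` (hypothetical helper).

    Options come before reference.fasta, as in the usage line above.
    """
    cmd = ["Integrate", "mkbwt", "-dir", bwt_dir]
    if min_length is not None:
        # -mb: shorter sequences are skipped when evaluating repetitive reads
        cmd += ["-mb", str(min_length)]
    cmd.append(reference_fasta)
    return cmd


def run_mkbwt(reference_fasta, bwt_dir="./bwts", min_length=None):
    """Create the BWT directory, then run mkbwt once for this reference."""
    os.makedirs(bwt_dir, exist_ok=True)  # equivalent of: mkdir directory_to_bwts
    # Assumes `Integrate` is installed and on the PATH.
    subprocess.run(
        build_mkbwt_command(reference_fasta, bwt_dir, min_length), check=True
    )
```

Calling `run_mkbwt("reference.fasta", "directory_to_bwts")` would then run `Integrate mkbwt -dir directory_to_bwts reference.fasta`. Keeping the command construction separate from the execution makes it easy to print or log the command before running it.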
INTEGRATE version 0.1c

Make sure mkbwt has been run:
    Integrate fusion (options) reference.fasta annotation.txt directory_to_bwt accepted_hits.bam unmapped.bam (dna.tumor.bam dna.normal.bam)

options:
    -cfn      integer : Cutoff of spanning RNA-Seq reads for fusions with non-canonical exonic boundaries. default: 3
    -rt       float   : Normal dna / tumor dna ratio. If the ratio is less than this value, then dna reads from the normal dna data set supporting a fusion candidate are ignored. default: 0.0
    -minIntra integer : If only having RNA reads, a chimera with two adjacent genes in order is annotated as intra_chromosomal rather than read_through if the distance between the two genes is longer than this value. default: 400000
    -minW     float   : Minimum weight for the encompassing rna reads on an edge. default: 2.0
    -mb       integer : See subcommand "mkbwt". This value can be larger than the one used by mkbwt. default: 10000000
    -reads    string  : File to store all the reads. default: reads.txt
    -sum      string  : File to store summary. default: summary.tsv
    -ex       string  : File to store exons for fusions with canonical exonic boundaries. default: exons.tsv
    -bk       string  : File to store breakpoints. default: breakpoints.tsv

This version of Integrate works in the following situations:
    (1) having rna tumor, dna tumor, dna normal
    (2) having rna tumor, dna tumor
    (3) having rna tumor

Integrate will only use sequences in reference.fasta. Chr names with and without "chr" are regarded as the same, e.g. chr1 = 1. The rna and dna bams can be from alignments mapped to different reference files, with the sequences in a different order and their names with or without "chr". However, the versions should be the same, e.g. hg19 (and the same as in the annotation). The tumor and normal dna bams should be mapped to the same reference file.

For rna tumor:
    accepted_hits.bam is a bam file containing the mapped rna reads.
    unmapped.bam is a bam file containing the unmapped rna reads.
    If they have been merged into one bam, just use merged.bam twice in the command line.

For dna bams:
    If soft-clips are provided, Integrate tries to search for rearrangement breakpoints; otherwise, only paired reads may be included in the analysis.

If having rna normal only, or having both rna and dna normal data sets, these data sets can be run to find non-somatic events, e.g.
    Integrate fusion -normal (options) reference.fasta annotation.txt directory_to_bwt accepted_hits.normal.bam unmapped.normal.bam (dna.normal.bam)
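
The supported input combinations for the fusion subcommand can be summarised in a small command builder. This is a sketch under assumptions rather than anything shipped with INTEGRATE: `build_fusion_command` and its parameter names are hypothetical, the `options` dictionary simply passes through flags from the help text (for example `-cfn` or `-sum`), and the check that rejects a normal dna bam without a tumor dna bam is one reading of the three listed situations.

```python
def build_fusion_command(reference_fasta, annotation, bwt_dir,
                         rna_mapped_bam, rna_unmapped_bam=None,
                         dna_tumor_bam=None, dna_normal_bam=None,
                         normal_mode=False, options=None):
    """Return the argument list for `Integrate fusion` (hypothetical helper)."""
    if rna_unmapped_bam is None:
        # Mapped and unmapped reads merged into one bam: pass that bam twice.
        rna_unmapped_bam = rna_mapped_bam
    if normal_mode and dna_tumor_bam is not None:
        raise ValueError("-normal runs take rna normal bams and optionally dna.normal.bam")
    if not normal_mode and dna_normal_bam is not None and dna_tumor_bam is None:
        # Assumption: tumor runs are limited to the three situations listed above.
        raise ValueError("dna.normal.bam requires dna.tumor.bam in a tumor run")

    cmd = ["Integrate", "fusion"]
    if normal_mode:
        cmd.append("-normal")
    for flag, value in (options or {}).items():
        cmd += [flag, str(value)]  # e.g. {"-cfn": 3, "-sum": "summary.tsv"}
    cmd += [reference_fasta, annotation, bwt_dir, rna_mapped_bam, rna_unmapped_bam]
    # Optional dna bams follow the rna bams, tumor first, then normal.
    cmd += [bam for bam in (dna_tumor_bam, dna_normal_bam) if bam is not None]
    return cmd
```

For an RNA-only run with a merged bam, `build_fusion_command("reference.fasta", "annotation.txt", "directory_to_bwt", "merged.bam")` yields a command that lists `merged.bam` twice. The resulting list can be handed to `subprocess.run(cmd, check=True)`, assuming `Integrate` is on the PATH and mkbwt has already been run for the same reference.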