This part of the pipeline performs quality trimming of reads, mapping, fusion calling, read counting, and insert size estimation.
Output is stored in the following folder structure:
This part of the pipeline filters fusion events using the built-in filters of the callers, a custom-generated blacklist, and three metrics: Promiscuity Score (PS), Fusion Transcript Score (FTS), and Robustness Score (RS). A description of these metrics can be found here.
See Output files.
Set the parameters in FP_run.sh before executing:
# Required:
threads=<integer> # Number of threads for running the detection pipeline
outputfolder=<string> # Path to output folder
genomebuild=<string> # ["hg19", "hg38"]
sample_name=<string> # Name of sample
fastq_folder=<string> # Path to folder containing the fastq files
strandness=<integer> # [0 -> unstranded, 1 -> stranded, 2 -> reversely stranded]
ref=<string> # Path to genome reference file in Fasta format (GENCODE)
anno=<string> # Path to gene annotation file in GTF format (GENCODE)
starindex=<string> # Path to folder containing the STAR index
fcdata=<string> # Path to folder containing the genomic database as required by FusionCatcher
# Steps to perform (0 skips the corresponding step)
FusionCatcher=1 # Fusion calling by FusionCatcher
FastP=1 # Read trimming before STAR mapping
STAR=1 # Mapping by STAR (if FastP=0, mapping is performed on untrimmed reads)
Arriba=1 # Fusion calling by Arriba (Preceding mapping required)
FeatureCounts=1 # Read counting (Preceding mapping required)
Picard=1 # Insert size estimation (Preceding mapping required)
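The constraints noted in the comments above (allowed values for genomebuild and strandness, and the steps that depend on mapping) can be checked before a long run is started. The following is a minimal sketch and not part of FP_run.sh: the function name check_run_config is hypothetical, and it assumes the parameters are available as shell variables under the names listed above. Because it is not clear from the listing whether mapping output from an earlier run may be reused, a mapping-dependent step enabled together with STAR=0 only produces a warning.

```shell
# Hypothetical pre-flight check; not part of FP_run.sh.
# Runs in a subshell so that helper variables do not leak into the caller.
# Exit status: 0 if the checked values are valid, 1 otherwise.
check_run_config() (
    rc=0

    # threads: digits only, no leading zero, not empty
    case "${threads:-}" in
        ''|*[!0-9]*|0*) echo "threads must be a positive integer" >&2; rc=1 ;;
    esac

    # genomebuild: only the two documented builds
    case "${genomebuild:-}" in
        hg19|hg38) ;;
        *) echo "genomebuild must be hg19 or hg38" >&2; rc=1 ;;
    esac

    # strandness: 0 (unstranded), 1 (stranded), 2 (reversely stranded)
    case "${strandness:-}" in
        0|1|2) ;;
        *) echo "strandness must be 0, 1 or 2" >&2; rc=1 ;;
    esac

    # Arriba, FeatureCounts and Picard are documented as requiring mapping.
    # Assumption: with STAR=0 they may still work on existing mapping output,
    # so this is reported as a warning and does not change the exit status.
    for step in Arriba FeatureCounts Picard; do
        eval "enabled=\${$step:-0}"
        if [ "$enabled" = 1 ] && [ "${STAR:-0}" != 1 ]; then
            echo "warning: $step=1 but STAR is not 1; $step needs mapping output" >&2
        fi
    done

    exit "$rc"
)
```

One possible use is to set the variables as in the listing above and call `check_run_config || exit 1` before launching the detection pipeline.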
This will only work if you have performed Part I on at least two samples.
Set the parameters in FP_filter.sh before executing:
# Required:
anno=<string> # Path to gene annotation file in GTF format (GENCODE) as used in Part I
outputfolder=<string> # Path to output folder generated by the detection pipeline in Part I
clintable=<string> # Path to clinical information table in Excel format
# Optional:
debug_flag=0 # Set to 1 for saving R workspace
internal_BL=1 # Whether to use the internal blacklist of fusion genes
user_BL=<string> # Path to your own fusion blacklist (xlsx file, fusion labels in the first column)
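A similar check can be sketched for the filtering parameters. This is again only an illustration and not part of FP_filter.sh: the function name check_filter_config is hypothetical, the variables are assumed to be set in the shell under the names listed above, and internal_BL is assumed to accept only 0 or 1. Unset optional parameters fall back to the defaults shown in the listing.

```shell
# Hypothetical pre-flight check; not part of FP_filter.sh.
# Exit status: 0 if all checked parameters look usable, 1 otherwise.
check_filter_config() (
    rc=0

    # Required paths must exist
    [ -f "${anno:-}" ] || { echo "anno: file not found" >&2; rc=1; }
    [ -d "${outputfolder:-}" ] || { echo "outputfolder: directory not found" >&2; rc=1; }
    [ -f "${clintable:-}" ] || { echo "clintable: file not found" >&2; rc=1; }

    # Optional switches, using the listed defaults when unset
    case "${debug_flag:-0}" in
        0|1) ;;
        *) echo "debug_flag must be 0 or 1" >&2; rc=1 ;;
    esac
    case "${internal_BL:-1}" in
        0|1) ;;
        *) echo "internal_BL must be 0 or 1" >&2; rc=1 ;;
    esac

    # Optional user blacklist: if given, it must be an existing .xlsx file
    if [ -n "${user_BL:-}" ]; then
        case "$user_BL" in
            *.xlsx) [ -f "$user_BL" ] || { echo "user_BL: file not found" >&2; rc=1; } ;;
            *) echo "user_BL must be an .xlsx file" >&2; rc=1 ;;
        esac
    fi

    exit "$rc"
)
```

The check only looks at paths and flag values. It does not verify that Part I was run on at least two samples, since that depends on the contents of the output folder.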