LoFreq comes with a variety of subcommands, all of which are accessed by calling lofreq itself. Typing lofreq on its own
prints a list of the available commands.
The two most important commands are
lofreq call
or lofreq call-parallel
: for simply calling variants, and
lofreq somatic
: for calling somatic variants in paired normal/tumor samples
Calling lofreq with one of these subcommands and without further arguments will display usage information for that command.
A few things to note:
-l regions.bed
.
Assuming your BAM file (aligned_reads.bam) contains reads aligned against the sequence(s) in ref.fa and you want to predict variants and save them to vars.vcf, you would use the following:
lofreq call -f ref.fa aligned_reads.bam -o vars.vcf
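The same call can be spread over several threads with call-parallel. The following is a minimal sketch rather than part of the original documentation: the helper name call_variants, the DRY_RUN switch and the default of 8 threads are assumptions made for illustration, and --pp-threads is assumed here to be the thread option of call-parallel (run lofreq call-parallel without arguments to confirm it for your version). By default the helper only prints the command it would run.

```shell
# Sketch: build a `lofreq call-parallel` command for the files in the example above.
# DRY_RUN=1 (the default here) prints the command; DRY_RUN=0 executes it.
call_variants() {
    ref=$1; bam=$2; out=$3; threads=${4:-8}
    # Reuse the positional parameters to hold the command line word by word.
    set -- lofreq call-parallel --pp-threads "$threads" -f "$ref" -o "$out" "$bam"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

call_variants ref.fa aligned_reads.bam vars.vcf 8
```

Setting DRY_RUN=0 in the environment runs the assembled command instead of printing it.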
If you are dealing with human samples (or large genomes in general) we recommend the use of -S
(source quality) in combination with -V dbsnp.vcf
to get rid of some mapping problems (source quality is automatically used in the somatic SNV calling subcommand).
Assuming your normal reads are mapped in normal.bam and your tumor reads in tumor.bam, both against hg19 (hg19.fa), you would use the following to call somatic SNVs using 8 threads and store the results in files with the prefix somatic:
lofreq somatic --threads 8 -n normal.bam -t tumor.bam -f hg19.fa -o somatic [-d dbsnp.vcf.gz]
The use of dbsnp is optional but recommended. It will remove possibly undetected germline variants from the final output.
If you are working with exome data, don't forget to provide the somatic command with the corresponding BED file (-l region.bed).
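The optional pieces of the somatic call (the BED file for exome data and the dbSNP file) can be added only when they are available. The sketch below assembles the command from the options shown above; the helper name somatic_call and the THREADS and DRY_RUN variables are hypothetical conveniences, not part of LoFreq. By default it prints the command instead of running it.

```shell
# Sketch: assemble a `lofreq somatic` command, adding -l and -d only when given.
# Arguments: normal BAM, tumor BAM, reference, output prefix, [BED file], [dbSNP VCF].
# DRY_RUN=1 (the default here) prints the command; DRY_RUN=0 executes it.
somatic_call() {
    normal=$1; tumor=$2; ref=$3; prefix=$4; bed=${5:-}; dbsnp=${6:-}
    set -- lofreq somatic --threads "${THREADS:-8}" \
        -n "$normal" -t "$tumor" -f "$ref" -o "$prefix"
    if [ -n "$bed" ]; then
        set -- "$@" -l "$bed"      # restrict calling to the exome regions
    fi
    if [ -n "$dbsnp" ]; then
        set -- "$@" -d "$dbsnp"    # remove known germline variants
    fi
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

somatic_call normal.bam tumor.bam hg19.fa somatic region.bed dbsnp.vcf.gz
```

Leaving out the last two arguments yields the plain whole-genome command without -l and -d.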
call
: Call variants
The main command for calling variants.
call-parallel
: Parallel calling of variants
A wrapper around the call command that executes several instances of lofreq by working on multiple regions.
somatic
: Call somatic variants in matched tumor/normal pairs
Runs the somatic SNV calling pipeline.
You will rarely need to run any of the following commands. Most of them are used automatically by lofreq itself.
filter
: Filter variants in (LoFreq) VCF file
This will rarely be needed as LoFreq calls this command automatically with default parameters after predicting variants.
uniq
: Test whether variants predicted really can't be called in another sample
Variants are sometimes predicted in only one sample but not in a second, related one because of coverage issues, borderline SNV p-values, etc. This command tells you whether an SNV really could not have been called in the other sample. You will rarely need to use this command, as it is built into the somatic SNV calling pipeline. Also note that it is not designed for very high coverage samples (>>1000X).
plpsummary
: Print pileup summary per position
Mainly useful for debugging.
vcfset
: VCF set operations
Set operations on VCF files, such as intersection and complement (similar to bedtools intersect and bedtools subtract, but base-aware by default).
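To illustrate the difference between a base-aware and a position-only intersection, here is a small awk sketch. It is not how lofreq vcfset is implemented and does not use its options; it assumes that "base-aware" means two records match only if REF and ALT agree in addition to chromosome and position, and the function name vcf_intersect and its "pos" mode are hypothetical.

```shell
# Sketch: print the records of the second VCF that also occur in the first.
# $1: first VCF, $2: second VCF, $3: "pos" to compare positions only
# (default: also compare the REF and ALT columns, i.e. base-aware).
vcf_intersect() {
    awk -F'\t' -v mode="${3:-bases}" '
        /^#/ { next }    # skip header lines in both files
        { key = (mode == "pos") ? ($1 FS $2) : ($1 FS $2 FS $4 FS $5) }
        FILENAME == ARGV[1] { seen[key] = 1; next }    # first file: remember keys
        (key in seen)                                  # second file: print matches
    ' "$1" "$2"
}
```

With a first file containing A>G at chr1:100 and a second file containing A>T at the same position, the default mode reports no match for that site, while the "pos" mode does.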
version
: Print version info
The following are optionally installed tools that you might find useful. They require PyVCF, SciPy, NumPy and Matplotlib to be installed.
vcfplot
: Plot VCF statistics
Summarizes properties of the variants listed in a VCF file. Part of the optional LoFreq Python tools.
cluster
: Cluster variants in VCF file
Clusters variants into minimal haplotype groups based on their frequency confidence. This gives a lower-bound estimate of the number of haplotypes. Note that this is designed for viral samples.
Convenience clones of samtools subcommands.
index
: Create index for BAM file
A clone of samtools index.
idxstats
: Print stats for indexed BAM file
A clone of samtools idxstats.