LoFreq takes a read mapping as input (see the end of this document for notes on short-read mapping). It's a good idea to be as stringent as possible with your mapping and to recalibrate base-call qualities as well. A simple LoFreq call would look like this:
lofreq_snpcaller.py -f ref.fa -b mapping.bam -o raw-snv-output-file
In almost all cases you will want to post-process the predicted SNV calls by applying some filtering criteria, for which you should use lofreq_filter.py (see below).
Please note:
- VCF output is available (--format vcf); however, downstream scripts in LoFreq do not work with this format at the moment. We are currently migrating to vcf as default.
- LoFreq reports consensus variants (type:consensus-var) and low-frequency variants (type:low-freq-var). Consensus variants are majority/consensus changes with respect to the reference and do not have quality values assigned (LoFreq is not meant to be a genotyping program).
- lofreq_snpcaller.py -h will print the full help.
Some of the more important options are described in the following:
- Bonferroni factor (--bonf). A conservative setting would be genome-size multiplied by three. You can use the helper script lofreq_bonf.py to compute this value automatically.
- [...] -l.
- [...] (-E) by default. You can influence this with the --baq option.
- [...] -Q. The default is 3, which is in accordance with Illumina guidelines.
- [...] (--lofreq-nq-on --lofreq-q-off).
Use lofreq_filter.py to filter SNV predictions produced by LoFreq. The three most common filter options would be
- a Holm-Bonferroni corrected strand-bias filter (--strandbias-holmbonf),
- a minimum coverage (--min-cov) and
- a minimum SNV quality in phred scale (--snp-phred; unnecessary if you used automatic Bonferroni above).
An example call looks like this:
lofreq_filter.py --strandbias-holmbonf --min-cov 10 --snp-phred 60 \
    -i raw-snv-file -o filtered-snv-file
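As a rough illustration of how the Bonferroni factor and the phred cutoff used above relate numerically, the following Python sketch computes the conservative factor (genome size multiplied by three) and converts between phred-scaled qualities and p-values using the standard definition phred = -10 * log10(p). The function names, the significance level of 0.05 and the idea of expressing a corrected threshold as a phred cutoff are assumptions made for illustration; this is not LoFreq's own code.

```python
import math


def bonferroni_factor(genome_size, tests_per_position=3):
    """Conservative Bonferroni factor: genome size multiplied by three."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return genome_size * tests_per_position


def phred_to_pvalue(phred):
    """Convert a phred-scaled quality to a p-value."""
    return 10 ** (-phred / 10.0)


def pvalue_to_phred(pvalue):
    """Convert a p-value to a phred-scaled quality."""
    if not 0.0 < pvalue <= 1.0:
        raise ValueError("pvalue must be in (0, 1]")
    return -10.0 * math.log10(pvalue)


def corrected_phred_cutoff(genome_size, alpha=0.05):
    """Phred equivalent of alpha divided by the Bonferroni factor.

    alpha=0.05 is an assumed significance level, not a LoFreq default.
    """
    return pvalue_to_phred(alpha / bonferroni_factor(genome_size))


if __name__ == "__main__":
    size = 10000  # hypothetical 10 kb genome
    print("Bonferroni factor:", bonferroni_factor(size))
    print("p-value for phred 60:", phred_to_pvalue(60))
    print("corrected phred cutoff: %.1f" % corrected_phred_cutoff(size))
```

For a hypothetical 10 kb genome the factor is 30000, and a phred cutoff of 60 corresponds to a p-value of 1e-6.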
SNVs called in only one sample (e.g. cancer) but not in a paired sample (e.g. blood) can either be biologically interesting or simply be due to low coverage in the other sample. You can use lofreq_uniq.py to find out whether a call made in only one sample can be explained simply by low coverage in the other (e.g. blood). lofreq_uniq.py takes as minimal input a file listing SNVs predicted in only one sample (see also lofreq_diff.py) and the other sample's BAM file. LoFreq comes with a script that automatically calls SNVs, filters them and finally derives unique SNVs (lofreq_uniq_pipeline.py). An example call looks like this:
lofreq_uniq_pipeline.py --bam1 first.bam --bam2 second.bam \
    --ref ref-fasta --bed regions-bedfile -o output-dir
This pipeline requires a bed-file describing the regions of interest to calculate a Bonferroni factor automatically. You can derive a template for such a file using lofreq_regionbed.py. Output files can be found in output-dir.
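The question of whether a sample-specific call could simply be due to low coverage in the other sample can be pictured with a back-of-the-envelope model. The Python sketch below is not the test implemented in lofreq_uniq.py; it assumes, for illustration only, that reads are independent, that sequencing errors can be ignored, and that the variant would be present in the second sample at the same frequency as in the first. The function names and the 0.05 cutoff are hypothetical.

```python
def prob_no_variant_read(freq, coverage):
    """Chance of seeing zero variant reads among `coverage` reads
    if the variant is truly present at allele frequency `freq`."""
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must be between 0 and 1")
    if coverage < 0:
        raise ValueError("coverage must not be negative")
    return (1.0 - freq) ** coverage


def explained_by_low_coverage(freq, coverage, cutoff=0.05):
    """True if missing the variant by chance is still plausible,
    i.e. the call cannot be considered unique to the first sample."""
    return prob_no_variant_read(freq, coverage) >= cutoff


if __name__ == "__main__":
    # A 5% variant seen in sample 1, checked against two coverages in sample 2.
    for cov in (10, 200):
        p = prob_no_variant_read(0.05, cov)
        print("coverage %d: P(no variant read) = %.2g, low coverage explains it: %s"
              % (cov, p, explained_by_low_coverage(0.05, cov)))
```

Under these assumptions, a 5% variant is easily missed at 10x coverage but very unlikely to be missed at 200x, so only in the second case would its absence point to a sample-specific SNV.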