Hi Catherine, most variant callers produce rather arbitrary variant scores. LoFreq's variant quality scores are "proper" error probabilities converted into Phred scores. The error probabilities are computed using a Poisson-binomial distribution, which takes multiple quality scores (mapping quality, alignment quality, base quality) into account. If you look up the definition of Phred scores you will see that Q20 corresponds to an error probability of 0.01, Q30 to 0.001 etc. 49314 is simply the...
Dear Eugenia, thanks for your patience while waiting for a reply. Source quality was a rather experimental attempt to add one more error source to LoFreq's core: it tries to account for contamination/mismappings etc. by looking at the number of mismatches in a read (think of it as a variation of mapping quality). An accumulation of mismatches in a particular read leads to a penalty. However, you will want to ignore known variants during the mismatch counting, and for this you can, for example, use...
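To make the Phred relationship mentioned above concrete, here is a minimal sketch of the standard conversion, Q = -10 * log10(p). The function names are made up for illustration and are not part of LoFreq.

```python
import math


def phred_to_prob(q: float) -> float:
    """Convert a Phred-scaled quality into an error probability."""
    return 10 ** (-q / 10)


def prob_to_phred(p: float) -> float:
    """Convert an error probability (0 < p <= 1) into a Phred-scaled quality."""
    return -10 * math.log10(p)


# The two examples from the post: Q20 and Q30
for q in (20, 30):
    print(q, phred_to_prob(q))
```

Very large qualities translate into probabilities too small to write out conveniently, which is why the Phred scale is used in the first place.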
That's only indirectly possible. You can run it with all filters off on the region of interest: lofreq call -r sq:start-end --no-default-filter -a 1 Andreas On 23 May 2018 at 11:53, Camilo cvillaman@users.sourceforge.net wrote: Thanks for the answers, they are very helpful. I have a final question, though. Is there a way to check why a possible variant is not being called by LoFreq? Source quality and ignore VCF in single tumor sample. https://sourceforge.net/p/lofreq/discussion/general/thread/cdeddc89/?limit=25#e4b0/2978/3285/2592/75a0...
Oh I see. In general, using source quality will give you more conservative calls. There is a chance that it will undercall in mutational hotspots. Variants in the "ignore vcf" file are just used to tune the source quality computation. Normally reads with lots of variants get a low source quality, however, variants listed in the aforementioned file are ignored for this. These variants are not used to mask final calls! Hope this answers the question, Andreas On 22 May 2018 at 23:14, Camilo cvillaman@users.sourceforge.net...
Hi Camilo, -S won't mask variants. It just affects the somatic variant quality score. In fact, adding dbSNP here should have increased the quality of this call. What happens if you run it without the extra -S? Also, there is no (lowercase) '-s' option. Was that a typo? Best, Andreas On 18 May 2018 at 23:13, Camilo cvillaman@users.sourceforge.net wrote: Hello, I'm using LoFreq to call variants on some human tumor samples. We had analyzed those samples beforehand, so I had an idea about which variants...
Hi Steve, for duplicate marking (if needed) you can use any tool of your choice, e.g. sambamba. For realignment you can use LoFreq's own realigner, lofreq viterbi (requires resorting afterwards). For base quality calibration you can still use GATK or alternatively Lacer https://www.biorxiv.org/content/early/2017/04/25/130732. You should get decent results even without recalibration. Best, Andreas On 16 May 2018 at 22:04, Steve stevekm@users.sourceforge.net wrote: In the documentation for LoFreq,...
Hi Steve, yes, these variants are not filtered, even though if you just look at the pvalue/quality, they should be. The reason is that strand-bias is a messy beast and we use some hacks: No one really knows why it happens (AFAIK). In viral amplicon data (for which LoFreq was originally designed) we often saw cases where, simply due to the ultra high coverage, you'd get very high p-values even though nothing seemed wrong with these variants if you were to evaluate them by eye (plenty of coverage for...
Hi Steve, sure. The basics are explained in the NAR paper (Wilm et al., 2012): We compute a Poisson-binomial distribution taking error probabilities at each pileup site into consideration and derive a p-value from that. Error probabilities were originally just converted base qualities (because that's what they are). In later LoFreq versions we merged base alignment, mapping and base quality into one error probability per base. The logic goes like this: either the read is misaligned (mapping quality) or...
Hi Steve, the strand-bias p-value is turned into a Phred quality, whose upper bound depends on the precision of the float. In practice it can get much higher than 1900. The fact that you see Phred values <60 in other programs is simply because they are mostly arbitrarily capped there. Andreas On 4 May 2018 at 03:50, Steve stevekm@users.sourceforge.net wrote: I have another question about the SB score values from the .vcf output. It is my understanding that these values are Phred quality scores, which...
Hi Steve, not sure why the actual quality filtering is not mentioned there. Let me look into this. Anyway, the main filtering step works on the variant qualities (which are converted p-values) and it's by default based on Bonferroni correction and a significance threshold of 0.01. Best, Andreas On 4 May 2018 at 07:12, Steve stevekm@users.sourceforge.net wrote: The FAQ page for LoFreq says Do I need to filter LoFreq predictions? You usually don't. Predicted variants are already filtered using...
Hi Francisco, LoFreq doesn't have an AF filter. The default filter is based on variant quality only. Furthermore, it doesn't actually report genotypes. Taken together this makes it likely that your collaborator post-processed the vcf file somehow. Hope this helps, Andreas On 24 March 2018 at 13:45, Francisco De La Vega ribozyme@users.sourceforge.net wrote: I have received a VCF from LoFreq from a collaborator that used it for calling SNVs from a cfDNA targeted sequencing assay at a high depth of coverage....
Hi Nils, in short: the BAM file was created with a different reference. The checkref subcommand checks whether the reference fasta given on the command line matches the one given in the BAM header. In your case the BAM header contains a sequence named "1", which is not part of the fasta file. Hope this helps, Andreas On 13 November 2017 at 06:05, Nils Engel nils321@users.sf.net wrote: Hi, I have a problem using lofreq with human sequencing data and hg19 or GRCh38 reference sequences (downloaded...
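As a rough illustration of the Poisson-binomial idea described above, the following sketch computes the probability of seeing at least k errors in a pileup column when each base has its own error probability. This is a plain dynamic-programming version written for clarity, not LoFreq's actual implementation; the function name and the example numbers are hypothetical, and how the per-base probabilities are merged from the different qualities is left out because the post is cut off at that point.

```python
def poisson_binomial_tail(error_probs, k):
    """Return P(X >= k), where X counts independent errors with per-base probabilities."""
    # dist[j] holds the probability of exactly j errors among the bases processed so far
    dist = [1.0]
    for p in error_probs:
        nxt = [0.0] * (len(dist) + 1)
        for j, mass in enumerate(dist):
            nxt[j] += mass * (1.0 - p)  # this base was read correctly
            nxt[j + 1] += mass * p      # this base is an error
        dist = nxt
    return sum(dist[k:])


# Hypothetical column: 1000 bases, each with error probability 0.001 (Q30),
# of which 10 differ from the reference.
p_value = poisson_binomial_tail([0.001] * 1000, 10)
print(p_value)
```

A small p-value means the observed non-reference bases are unlikely to be explained by sequencing errors alone. The quadratic loop is fine for a sketch but would be slow at very high coverage.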
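To see why float precision bounds the Phred value, here is a small sketch assuming the p-value is stored as an IEEE 754 double (the thread does not say which float type LoFreq uses internally, so treat that as an assumption).

```python
import math
import sys

# Smallest positive normal double; anything much smaller cannot be represented
smallest_normal = sys.float_info.min
max_phred = -10 * math.log10(smallest_normal)
print(max_phred)
```

Under that assumption the ceiling is on the order of 3000, so values well above 1900 are entirely possible.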
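The arithmetic behind such a filter can be sketched as follows: a Bonferroni correction divides the significance threshold by the number of tests, and the result is converted to a minimum Phred-scaled quality. The 0.01 threshold comes from the post; the number of tests below is a hypothetical value, and how LoFreq actually determines it is not covered here.

```python
import math


def bonferroni_min_phred(alpha: float, num_tests: int) -> float:
    """Minimum Phred-scaled quality needed to pass a Bonferroni-corrected threshold."""
    corrected_alpha = alpha / num_tests
    return -10 * math.log10(corrected_alpha)


# alpha = 0.01 is from the post; 30_000 tests is a made-up example value
print(bonferroni_min_phred(0.01, 30_000))
```

The more positions or alleles are tested, the higher the quality a variant needs in order to survive the filter.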
Hello, DP4 only lists the reference and variant base counts. There are usually other bases present as well, which are taken into account for computing AF. Hoping this explains the discrepancy, Andreas On 26 October 2017 at 04:58, siva siva80@users.sf.net wrote: Hi I have several variants (especially those with almost hom-alt allele) that have different allele fraction estimates from DP4 and the AF= tag. for example DP=4088;AF=0.872798;SB=171;DP4=9,33,3329,685 Here from DP4, the AF can be estimated...
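A quick way to put the two estimates side by side is to parse the INFO string quoted in the question. This sketch only computes the DP4-based fraction and compares the DP4 total with DP; it does not attempt to reproduce how LoFreq derives AF. The helper name is hypothetical.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO string such as 'DP=10;AF=0.5' into a dict of strings."""
    return dict(field.split("=", 1) for field in info.split(";") if "=" in field)


info = parse_info("DP=4088;AF=0.872798;SB=171;DP4=9,33,3329,685")
ref_fwd, ref_rev, alt_fwd, alt_rev = map(int, info["DP4"].split(","))

dp4_total = ref_fwd + ref_rev + alt_fwd + alt_rev
dp4_af = (alt_fwd + alt_rev) / dp4_total

print("DP4 total:", dp4_total, "DP:", int(info["DP"]))
print("AF from DP4:", dp4_af, "reported AF:", float(info["AF"]))
```

The DP4 counts add up to fewer bases than DP, which fits the point that DP4 does not cover everything in the pileup.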
Hello, strand-bias is defined as in samtools: reference and alternate base counts on forward and reverse strand are used as input for Fisher's exact test. This tries to quantify how far the reference and alternate counts on forward and reverse strand differ, i.e. you'll get high (Phred-scaled) p-values if you have lots of reference bases on one and lots of alternate bases on the other strand. It does not test, however, whether both reference and alternate bases are mainly on the same strand. I hope this explanation...
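The test described above can be sketched with the standard library only. This is a textbook two-sided Fisher's exact test on the 2x2 table of reference/alternate counts by strand, followed by a Phred conversion; the two-sided convention and the function names are assumptions, and samtools or LoFreq may differ in details. The counts in the usage lines are hypothetical.

```python
import math


def fisher_exact_two_sided(ref_fwd, ref_rev, alt_fwd, alt_rev):
    """Two-sided Fisher's exact test p-value for a 2x2 strand table."""
    row_ref = ref_fwd + ref_rev
    row_alt = alt_fwd + alt_rev
    col_fwd = ref_fwd + alt_fwd
    denom = math.comb(row_ref + row_alt, col_fwd)

    def table_prob(a):
        # Hypergeometric probability of `a` forward reference bases with fixed margins
        return math.comb(row_ref, a) * math.comb(row_alt, col_fwd - a) / denom

    observed = table_prob(ref_fwd)
    lo = max(0, col_fwd - row_alt)
    hi = min(row_ref, col_fwd)
    # Sum all tables that are at most as likely as the observed one
    p = sum(pr for pr in map(table_prob, range(lo, hi + 1))
            if pr <= observed * (1 + 1e-7))
    return min(1.0, p)


def phred(p):
    """Phred-scale a p-value; a p-value of exactly 0 maps to infinity."""
    return float("inf") if p == 0.0 else -10 * math.log10(p)


# Hypothetical counts: reference bases balanced, alternate bases mostly forward
print(phred(fisher_exact_two_sided(50, 50, 40, 5)))
# Hypothetical counts: both alleles mostly on the forward strand
print(phred(fisher_exact_two_sided(90, 10, 45, 5)))
```

The second call illustrates the caveat in the post: when both alleles sit mainly on the same strand in similar proportions, the table is balanced and the test does not flag it.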
Hi Erik, hard to tell from this output. Might be because of strand bias. Could you...
Hi Jessica, these are SNVs that show significant strand bias (sb) and are therefore...
Hi Erik, when you switch off default filtering in the call subcommand[s] LoFreq will...
Sorry, I know what's happening: the filters will only affect the actual SNV calling...
Hi gmy, that is indeed a bit strange. Which exact LoFreq version are you using? Would...
Hi gmy, I would strongly encourage you to stick to default parameters in LoFreq,...
Hi Chris, LoFreq results are already filtered relatively stringently (1% p-value threshold...
Hey Chris, if you get a final vcf file, then there is no need to rerun LoFreq. Whether...
Hey Chris, yes, the file somatic_final_minus-dbsnp.snvs.vcf.gz is not there, because...
Hi Chris, When you call somatic SNVs then you only need to look at the file that...
Oh ok. That looks like an extension of the bed format. LoFreq (and samtools) expect...
Hi Chris, this looks like an unhandled error triggered in the bed reading function....
Home
Moved website and blog to github: http://csb5.github.io/lofreq/
Release LoFreq 2.1
Hi Jessica, the strand-bias test checks whether the proportion of bases on forward...
LoFreq as Docker container
Alpha testers for release 2.1 needed
LoFreq-Star-Best-Practices
Performance issues when using bed-file with many regions
Hi Joon There are at least two things going on here. One of the errors seems to come...
Hi Brian, thanks for pointing this out! Parsing of the "--cons-as-ref" option was...
Hi Brian, this is very likely caused by an error in the argument list, i.e. wrong...
LoFreq-Star-Usage
LoFreq-Star-Installation
Release of final 2.0.0
You're right: parallel calls on single chromosomes were not implemented in RC-1....
Hi Brian, please don't implement parallelization manually. This will mess up the...
Jessica, the latest version of LoFreq actually removes the need for any thresholding...
Hi Jessica, if you keep low quality bases, then LoFreq will make fewer predictions,...
Hi Robert, to elaborate on the indel calling a bit: indel calling with LoFreq is...
Hi Robert, we've implemented indel calling in LoFreq, but are a bit reluctant to...
Hi Elena, I just noticed a few more tiny errors in the usage information and fixed...
Hi Elena, filtering is a separate downstream step. You will have to run 'lofreq filter'...
LoFreq-Star-FAQ
Small correction: I incorrectly said "LoFreq will only report SNVs with a p-value...
Hi Jessica, sorry for the late reply. I was on leave over Easter without internet...
LoFreq-Star
usage-version-0.5.0
Hi Jessica, the reference base ('R') is directly taken from the reference fasta file...
Hi Jessica, the difference in coverage should not be there in newer versions of LoFreq,...
Hi Jessica, SNV quality and base quality are two different things. Base quality tells...