User Activity

  • Posted a comment on discussion General Discussion on LoFreq

    Hi Catherine, most variant callers produce rather arbitrary variant scores. LoFreq's variant quality scores are "proper" error probabilities converted into Phred scores. The error probabilities are computed using a Poisson binomial distribution, which takes multiple quality scores (mapping quality, alignment quality, base quality) into account. If you look up the definition of Phred scores you will see that Q20 corresponds to an error probability of 0.01, Q30 to 0.001 etc. 49314 is simply the...

  • Posted a comment on discussion General Discussion on LoFreq

    Dear Eugenia, thanks for your patience while waiting for a reply. Source quality was a rather experimental attempt to add one more error source to LoFreq's core: it tries to account for contamination, mismappings etc. by looking at the number of mismatches in a read (think of it as a variation of mapping quality). An accumulation of mismatches in a particular read leads to a penalty. However, you will want to ignore known variants during the mismatch counting, and for this you can for example use...

  • Posted a comment on discussion General Discussion on LoFreq

    That's only indirectly possible. You can run it with all filters off on the region of interest: lofreq call -r sq:start-end --no-default-filter -a 1 Andreas On 23 May 2018 at 11:53, Camilo cvillaman@users.sourceforge.net wrote: Thanks for the answers, they are very helpful. I have a final question, though. Is there a way to check why a possible variant is not being called by LoFreq? Source quality and ignore VCF in single tumor sample. https://sourceforge.net/p/lofreq/discussion/general/thread/cdeddc89/?limit=25#e4b0/2978/3285/2592/75a0...

  • Posted a comment on discussion General Discussion on LoFreq

    Oh I see. In general, using source quality will give you more conservative calls. There is a chance that it will undercall in mutational hotspots. Variants in the "ignore vcf" file are only used to tune the source quality computation. Normally, reads with lots of variants get a low source quality; however, variants listed in the aforementioned file are ignored for this. These variants are not used to mask final calls! Hope this answers the question, Andreas On 22 May 2018 at 23:14, Camilo cvillaman@users.sourceforge.net...

  • Posted a comment on discussion General Discussion on LoFreq

    Hi Camilo, -S won't mask variants. It just affects the somatic variant quality score. In fact, adding dbSNP here should have increased the quality of this call. What happens if you run it without the extra -S? Also, there is no (lowercase) '-s' option. Was that a typo? Best, Andreas On 18 May 2018 at 23:13, Camilo cvillaman@users.sourceforge.net wrote: Hello, I'm using LoFreq to call variants on some human tumor samples. We had analyzed those samples beforehand, so I had an idea about which variants...

  • Posted a comment on discussion General Discussion on LoFreq

    Hi Steve, for duplicate marking (if needed) you can use any tool of your choice, e.g. sambamba. For realignment you can use LoFreq's own realigner, lofreq viterbi (requires re-sorting afterwards). For base quality calibration you can still use GATK or alternatively Lacer https://www.biorxiv.org/content/early/2017/04/25/130732. You should get decent results even without recalibration. Best, Andreas On 16 May 2018 at 22:04, Steve stevekm@users.sourceforge.net wrote: In the documentation for LoFreq,...

  • Posted a comment on discussion General Discussion on LoFreq

    Hi Steve, yes, these variants are not filtered, even though they should be if you just look at the p-value/quality. The reason is that strand bias is a messy beast and we use some hacks: no one really knows why it happens (AFAIK). In viral amplicon data (for which LoFreq was originally designed) we often saw cases where, simply due to the ultra-high coverage, you'd get very high p-values even though nothing seemed wrong with these variants if you were to evaluate them by eye (plenty of coverage for...

  • Posted a comment on discussion General Discussion on LoFreq

    Hi Steve, sure. The basics are explained in the NAR paper (Wilm, 2012): we compute a Poisson-binomial distribution, taking the error probabilities at each pileup site into consideration, and derive a p-value from that. Error probabilities were originally just converted base qualities (because that's what they are). In later LoFreq versions we merged base alignment, mapping and base quality into one error probability per base. The logic goes like this: either the read is misaligned (mapping quality) or...
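
The two ideas that recur in the comments above, Phred scaling and a p-value derived from a Poisson-binomial distribution, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not LoFreq's actual implementation: the function names are hypothetical, the tail probability is computed with a naive O(n^2) dynamic program, and each base is given a single error probability taken from its base quality alone (the merging of mapping, alignment and base quality mentioned above is not reproduced).

```python
import math
from typing import Sequence


def phred_to_prob(q: float) -> float:
    """Convert a Phred score to an error probability (Q20 -> 0.01)."""
    return 10.0 ** (-q / 10.0)


def prob_to_phred(p: float) -> float:
    """Convert an error probability in (0, 1] to a Phred score."""
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -10.0 * math.log10(p)


def poisson_binomial_tail(error_probs: Sequence[float], k: int) -> float:
    """Return P(X >= k), where X counts errors among independent bases.

    Each base i is wrong with its own probability error_probs[i], so X
    follows a Poisson-binomial distribution.
    """
    n = len(error_probs)
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    # pmf[j] = probability of exactly j errors among the bases seen so far
    pmf = [1.0] + [0.0] * n
    for i, p in enumerate(error_probs, start=1):
        # Update from high j to low j so pmf[j - 1] is still the old value
        for j in range(i, 0, -1):
            pmf[j] = pmf[j] * (1.0 - p) + pmf[j - 1] * p
        pmf[0] *= 1.0 - p
    return sum(pmf[k:])


# Hypothetical pileup column: base qualities of all reads covering one site,
# of which `variant_count` reads show the non-reference base.
base_quals = [30, 30, 20, 35, 25, 30, 40, 30]
variant_count = 3

error_probs = [phred_to_prob(q) for q in base_quals]
p_value = poisson_binomial_tail(error_probs, variant_count)
print(f"p-value: {p_value:.3e}, Phred-scaled: {prob_to_phred(p_value):.1f}")
```

The p-value here is the probability of seeing at least `variant_count` non-reference bases purely from sequencing errors; a small p-value, or equivalently a high Phred-scaled score, argues for a real variant.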


Personal Data

Username:
onde
Joined:
2004-03-01 15:23:46
Location:
Singapore / +08
Gender:
Male
Web Site:
  1. http://www.andreas-wilm.com/

Projects

This is a list of open source software projects that Andreas Wilm is associated with:

  • LoFreq: Fast and sensitive variant-calling from sequencing data
