Default filtering criteria

Steve
2018-05-03
2018-05-06
  • Steve

    Steve - 2018-05-03

    The FAQ page for LoFreq says

    Do I need to filter LoFreq predictions?

    You usually don't. Predicted variants are already filtered using default
    parameters (which include coverage, strand-bias, snv-quality etc).

    However, I do not see any details about what these default filtering
    parameters are. Is there a description anywhere? When I try to run lofreq filter --verbose, the only output I get is:

    Setting default SB filtering method to FDR
    Setting default minimum coverage to 10
    

    What other criteria are being used to filter variants?

     
    • Andreas Wilm

      Andreas Wilm - 2018-05-04

      Hi Steve,

      I'm not sure why the actual quality filtering is not mentioned there. Let me
      look into this. Anyway, the main filtering step works on the variant
      qualities (which are converted p-values), and by default it is based on
      Bonferroni correction with a significance threshold of 0.01.
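
To make the default above concrete, here is a minimal sketch of how a Bonferroni-corrected significance threshold maps onto a Phred-scaled variant quality. It assumes the usual Phred convention (quality = -10 * log10(p)). The names `phred_threshold` and `passes_quality` and the `num_tests` argument are made up for illustration; how LoFreq itself counts the number of tests is not stated in this thread.

```python
import math


def phred_threshold(alpha: float = 0.01, num_tests: int = 1) -> float:
    """Smallest Phred-scaled quality that survives Bonferroni correction.

    Bonferroni divides the significance level by the number of tests;
    the corrected p-value is then converted to the Phred scale.
    """
    corrected_alpha = alpha / num_tests
    return -10.0 * math.log10(corrected_alpha)


def passes_quality(qual: float, alpha: float = 0.01, num_tests: int = 1) -> bool:
    """True if a variant with Phred quality `qual` is kept.

    `num_tests` is a hypothetical input here, not a LoFreq parameter name.
    """
    p_value = 10 ** (-qual / 10.0)
    return p_value < alpha / num_tests


if __name__ == "__main__":
    # With 1000 tests, alpha = 0.01 becomes 1e-5 after correction.
    print(phred_threshold(0.01, 1000))
    print(passes_quality(60, num_tests=1000))
    print(passes_quality(40, num_tests=1000))
```

The more tests are corrected for, the higher the quality a variant needs in order to be kept.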

      Best,
      Andreas


       
  • Steve

    Steve - 2018-05-04

    Thanks Andreas.

    a significance threshold of 0.01

    I was looking in the source code and saw here:
    https://github.com/CSB5/lofreq/blob/master/src/lofreq/lofreq_filter.c#L1093

    if (! no_defaults) {
         if (cfg.sb_filter.mtc_type==MTC_NONE && ! cfg.sb_filter.thresh) {
              LOG_VERBOSE("%s\n", "Setting default SB filtering method to FDR");
              cfg.sb_filter.mtc_type = MTC_FDR;
              cfg.sb_filter.alpha = 0.001;
         }
         /* rest of the defaults block not shown */
    }
    

    Does this mean that the default Strand Bias filter is at a p-value of 0.001? (cfg.sb_filter.alpha = 0.001)

    As per my other post (https://sourceforge.net/p/lofreq/discussion/general/thread/ee151ab0/), I am getting a lot of variants with SB values of >500, so does this mean that strand bias is not being filtered by default?

     
    • Andreas Wilm

      Andreas Wilm - 2018-05-06

      Hi Steve,

      yes, these variants are not filtered, even though they should be if you
      just look at the p-value/quality. The reason is that strand bias is a messy
      beast and we use some hacks.

      No one really knows why it happens (AFAIK). In viral amplicon data (for
      which LoFreq was originally designed) we often saw cases where, simply due
      to the ultra-high coverage, you'd get very high strand-bias scores (i.e.
      very small p-values) even though nothing seems wrong with these variants
      if you evaluate them by eye (plenty of coverage for ref and alt on both
      forward and reverse strands; skewed, obviously, otherwise you wouldn't get
      a high score). So we introduced a compound filter. See the section on
      strand bias when you run lofreq filter --help:

      Note, variants are only filtered if their SB pvalue is below the threshold
      AND 85% of variant bases are on one strand (toggled with
      --sb-no-compound).
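
The compound rule quoted above can be sketched roughly as follows. This is an illustration, not LoFreq's actual code. It assumes SB is a Phred-scaled p-value, uses a fixed raw p-value cut-off in place of the FDR correction that the real filter applies by default, and treats "85%" as "at least 85%". The function and argument names are hypothetical.

```python
def sb_pvalue_from_phred(sb_phred: float) -> float:
    """Convert a Phred-scaled strand-bias score back to a p-value."""
    return 10 ** (-sb_phred / 10.0)


def filtered_by_sb(
    sb_phred: float,
    alt_fwd: int,
    alt_rev: int,
    p_threshold: float = 0.001,    # stand-in for the FDR-corrected threshold
    strand_fraction: float = 0.85,
    compound: bool = True,         # False mimics --sb-no-compound
) -> bool:
    """True if the variant would be removed by the strand-bias filter."""
    significant = sb_pvalue_from_phred(sb_phred) < p_threshold
    if not compound:
        return significant
    alt_total = alt_fwd + alt_rev
    if alt_total == 0:
        return False
    # Fraction of variant-supporting bases on the dominant strand.
    dominant = max(alt_fwd, alt_rev) / alt_total
    return significant and dominant >= strand_fraction


if __name__ == "__main__":
    # SB = 500 is hugely significant, but a 60/40 split is kept by the compound rule.
    print(filtered_by_sb(500, alt_fwd=6000, alt_rev=4000))
    # The same score with 95% of variant bases on one strand is removed.
    print(filtered_by_sb(500, alt_fwd=9500, alt_rev=500))
```

This is why variants with SB values above 500 can survive the default filter: at very high coverage the p-value is tiny, but the variant bases are still spread across both strands.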

      This is under-documented and not ideal, but we need to define defaults. I
      guess a newer LoFreq version would use presets to define a set of calling
      and filtering parameters based on input type. You might want to experiment
      with running lofreq call --no-default-filter and then trying different
      parameters in a subsequent lofreq filter run.
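
The two-step workflow suggested above could be scripted along these lines. This is only a sketch: it builds the command lines without running them. `--no-default-filter` and `--sb-no-compound` come from this thread; the `-f`, `-o` and `-i` options are written from memory of the LoFreq help and should be checked against `lofreq call --help` and `lofreq filter --help`. File names and the `build_commands` helper are made up.

```python
import shlex
from typing import Sequence


def build_commands(
    ref: str,
    bam: str,
    raw_vcf: str,
    filtered_vcf: str,
    extra_filter_args: Sequence[str] = (),
):
    """Return (call_cmd, filter_cmd) as argument lists.

    Step 1 calls variants with the built-in default filter switched off.
    Step 2 filters the raw calls with whatever options you want to try.
    """
    call_cmd = ["lofreq", "call", "--no-default-filter",
                "-f", ref, "-o", raw_vcf, bam]
    filter_cmd = ["lofreq", "filter", "-i", raw_vcf,
                  "-o", filtered_vcf, *extra_filter_args]
    return call_cmd, filter_cmd


if __name__ == "__main__":
    call_cmd, filter_cmd = build_commands(
        "ref.fa", "sample.bam", "raw.vcf", "filtered.vcf",
        extra_filter_args=["--sb-no-compound"],
    )
    # Print the commands; pass the lists to subprocess.run(..., check=True)
    # to actually execute them.
    print(shlex.join(call_cmd))
    print(shlex.join(filter_cmd))
```

Keeping the unfiltered VCF from step 1 means different filter settings can be compared without re-running the (much slower) calling step.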

      Andreas


       
