LoFreq2: Indels and filtering

2014-06-06
2014-06-09
  • Robert Eveleigh

    Robert Eveleigh - 2014-06-06

    Hi there,

    I just came across your Lofreq paper and I'm very interested in trying this out on MiSeq data we generate here. When testing this out using Lofreq_star_2.00-rc-1 I've noticed 2 things.

    1. In the output vcf generated I did not notice any indels. I'm curious if LoFreq2 calls indels. In the paper and earlier versions it does not, but on your blog you mention that LoFreq2 now deals with indels. Could you please verify the version I'm using deals with indels appropriately.

    2. When running lofreq call with the --no-default-filter option the resulting vcf is still being filtered by --snvqual-thresh. Is this a bug?

    Command:
    lofreq call -f hg1k_v37.fasta -l intervals.bed -q 15 -Q 20 -m 15 -C 10 --no-default-filter --out lofreq.target.vcf sorted.leftAligned.bam

    <snip>
    Executing lofreq filter --only-passed -i /tmp/lofreq2-call-dyn-bonf.2Ut3e3 -o lofreq.target.vcf --no-defaults --snvqual-thresh 74

    Thanks!

    Cheers,
    Rob Eveleigh

     
    • Andreas Wilm

      Andreas Wilm - 2014-06-09

      Hi Robert,

      we've implemented indel calling in LoFreq, but are a bit reluctant to
      open the feature to the public before we've extensively benchmarked it
      for the paper. We will make it part of the package very soon though. I
      can notify you by email if you'd like.

      RE filtering: no need to worry, this is working as intended.
      Normally, when you run 'lofreq call' it executes the filter
      subcommand at the end, which by default removes SNVs with high
      strand bias or in low-coverage regions. You will want to get rid
      of those, i.e. I would not recommend using --no-default-filter
      (for 'lofreq call', and likewise --no-defaults for 'lofreq
      filter'). If you want full control over the SNV quality, see
      further below for how to do this. First let me explain why you
      still see the filter command running in your verbose output: when
      you pass --no-default-filter to the call subcommand, the filter
      command is run nevertheless (with the defaults switched off, as
      you can see from the verbose output) because it has to apply an
      automatically computed Bonferroni factor (applied as an SNV
      quality threshold). This in turn can be controlled with the -b
      option of the call subcommand. For example, if you wanted to make
      sure to get all SNVs (except the ones with strand bias and at low
      coverage) with a Phred quality of at least 60, you can do one of
      two things:
      1. 'lofreq call -b 1 ...' followed by an explicit 'lofreq filter -B 60 ...'
      or
      2. 'lofreq call -b 1 -s 0.000001 ...' without explicit filtering
      The latter will use a Bonferroni factor of 1 with a p-value threshold
      of 1e-6 (== Phred quality 60).
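
The relationship between a p-value threshold and a Phred-scaled quality can be checked with a few lines of Python. This is only a sketch of the arithmetic, not LoFreq code: the function names are made up for illustration, and dividing the significance level by the Bonferroni factor is the textbook correction, assumed here to match what the call subcommand does.

```python
import math


def pvalue_to_phred(p: float) -> float:
    """Phred-scaled quality for a p-value: Q = -10 * log10(p)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p-value must be in (0, 1]")
    return -10.0 * math.log10(p)


def phred_to_pvalue(q: float) -> float:
    """Inverse conversion: p = 10 ** (-Q / 10)."""
    return 10.0 ** (-q / 10.0)


def bonferroni_phred_threshold(sig: float, bonf: float) -> float:
    """Phred threshold implied by a significance level and a Bonferroni factor.

    Assumption (hypothetical helper): the corrected p-value threshold is
    sig / bonf, i.e. a standard Bonferroni correction.
    """
    if bonf <= 0:
        raise ValueError("Bonferroni factor must be positive")
    return pvalue_to_phred(sig / bonf)


# A p-value threshold of 1e-6 with a Bonferroni factor of 1
print(round(bonferroni_phred_threshold(1e-6, 1)))  # 60

# The threshold of 74 seen in the verbose output, expressed as a p-value
print(f"{phred_to_pvalue(74):.2e}")  # 3.98e-08
```

Read the other way round, the '--snvqual-thresh 74' in the verbose output corresponds to a corrected p-value threshold of roughly 4e-8.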

      I hope that answers your question. Please let me know if you need
      further clarification!

      Best,
      Andreas

       
      • Andreas Wilm

        Andreas Wilm - 2014-06-09

        Hi Robert,

        to elaborate on the indel calling a bit: indel calling with LoFreq is
        currently a multi-stage process which is not necessarily user-friendly
        and requires you to know a bit about your data. Creating a simple
        workflow with all functions implemented in one binary will take
        another couple of weeks.

        If, however, your BAM file was recalibrated with GATK 2, i.e. it
        contains indel qualities (tags BI and BD in your BAM file), and was
        also indel-realigned (e.g. with GATK), then the process is rather
        simple: you would need to run two binaries on your data, which will
        give you a VCF file with indel calls only. I'm happy to share those
        binaries with you if you're interested. Just keep in mind this is
        "beta".
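
One rough way to check whether a file carries the BI and BD tags is to look at the optional fields of a few alignment records in SAM text form (for example, as printed by a SAM/BAM viewer). The sketch below is not part of LoFreq; the function name and the example record are made up for illustration, and it assumes tab-separated SAM lines with the 11 mandatory columns followed by TAG:TYPE:VALUE fields.

```python
def has_indel_qualities(sam_line: str) -> bool:
    """Return True if a SAM record carries both BI and BD optional tags."""
    fields = sam_line.rstrip("\n").split("\t")
    # Columns 1-11 are mandatory in SAM; optional TAG:TYPE:VALUE fields follow.
    tags = {field.split(":", 1)[0] for field in fields[11:]}
    return {"BI", "BD"} <= tags


# Synthetic record for illustration only (not real data).
example = "\t".join([
    "read1", "0", "1", "100", "60", "4M", "*", "0", "0",
    "ACGT", "IIII", "BD:Z:KKKK", "BI:Z:LLLL",
])
print(has_indel_qualities(example))  # True
```

Checking only the first few records is usually enough, since recalibration adds the tags to every read it processes.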

        Let me know if you're interested via PM: wilma@gis.a-star.edu.sg

        Andreas


         
