
Alternatives to GATK for upstream processing?

Steve
2018-05-16
2018-05-17
  • Steve

    Steve - 2018-05-16

    In the documentation for LoFreq, it is suggested:

    For Illumina data, we suggest that you preprocess your BAM files by following GATK’s best practice protocol, i.e. that you mark duplicates (not for very high coverage data though), realign indels and recalibrate base qualities with GATK (BQSR). The latter will also add indel qualities, which is needed for indel calling (alternatively use lofreq indelqual).

    However, GATK has upgraded to version 4, and has dropped many of these tools since they've been integrated directly with the variant callers, I believe.

    https://github.com/broadinstitute/gatk/issues/3084

    https://github.com/broadinstitute/gatk/issues/3104

    It is not clear when these tools will be ported over to GATK 4. This means that any variant calling pipelines that use both LoFreq and GATK callers are unable to upgrade to GATK 4, and must maintain reliance on GATK 3. This is a serious issue, because GATK 4 is free for commercial use, but GATK 3 is not.

    Is there an alternative to the GATK 3 tools for use in upstream processing & preparation of files for use with LoFreq?

     
    • Andreas Wilm

      Andreas Wilm - 2018-05-17

      Hi Steve,

      For duplicate marking (if needed) you can use any tool of your choice,
      e.g. sambamba. For realignment you can use LoFreq's own realigner,
      lofreq viterbi (the output needs to be re-sorted afterwards). For base
      quality recalibration you can still use GATK or, alternatively, Lacer
      (https://www.biorxiv.org/content/early/2017/04/25/130732). You should get
      decent results even without recalibration.

      Best,
      Andreas
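
The steps suggested in this reply could be chained roughly as follows. This is a minimal sketch, not a tested pipeline: the file names (`sample.bam`, `ref.fa`) and the helper names are hypothetical, the recalibration step is skipped (the reply notes results should be decent without it), and `lofreq indelqual --dindel` is used to add the indel qualities that BQSR would otherwise provide. The flags shown are the commonly documented ones; check each tool's `--help` for your installed version before relying on them.

```python
import shlex
import subprocess


def build_preprocessing_commands(bam, ref, prefix):
    """Return argv lists for a GATK-free LoFreq preprocessing run.

    All output file names are derived from `prefix` and are illustrative only.
    """
    dedup = f"{prefix}.dedup.bam"
    realigned = f"{prefix}.realigned.bam"
    sorted_bam = f"{prefix}.sorted.bam"
    indelqual = f"{prefix}.indelqual.bam"
    vcf = f"{prefix}.vcf"
    return [
        # 1. Mark duplicates (optional; skip for very high coverage data).
        ["sambamba", "markdup", bam, dedup],
        # 2. Realign with LoFreq's own realigner.
        ["lofreq", "viterbi", "-f", ref, "-o", realigned, dedup],
        # 3. Viterbi output must be re-sorted by coordinate.
        ["samtools", "sort", "-o", sorted_bam, realigned],
        # 4. Add indel qualities (needed for indel calling without BQSR).
        ["lofreq", "indelqual", "--dindel", "-f", ref, "-o", indelqual, sorted_bam],
        ["samtools", "index", indelqual],
        # 5. Call variants, including indels.
        ["lofreq", "call", "--call-indels", "-f", ref, "-o", vcf, indelqual],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_pipeline(build_preprocessing_commands("sample.bam", "ref.fa", "sample"))
```

With `dry_run=True` (the default) the commands are only printed, which makes it easy to review them before anything is executed.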

