In the documentation for LoFreq, it is suggested:
"For Illumina data, we suggest that you preprocess your BAM files by following GATK’s best practice protocol, i.e. that you mark duplicates (not for very high coverage data though), realign indels and recalibrate base qualities with GATK (BQSR). The latter will also add indel qualities, which is needed for indel calling (alternatively use lofreq indelqual)."
However, GATK has since been upgraded to version 4 and has dropped many of these tools; I believe their functionality has been integrated directly into the variant callers.
It is not clear when these tools will be ported over to GATK 4. This means that any variant calling pipelines that use both LoFreq and GATK callers are unable to upgrade to GATK 4, and must maintain reliance on GATK 3. This is a serious issue, because GATK 4 is free for commercial use, but GATK 3 is not.
Is there an alternative to the GATK 3 tools for use in upstream processing & preparation of files for use with LoFreq?
Related GATK issues:
https://github.com/broadinstitute/gatk/issues/3084
https://github.com/broadinstitute/gatk/issues/3104
Hi Steve,
For duplicate marking (if needed) you can use any tool of your choice, e.g. sambamba. For realignment you can use LoFreq's own realigner, lofreq viterbi (requires re-sorting afterwards). For base quality recalibration you can still use GATK or, alternatively, Lacer (https://www.biorxiv.org/content/early/2017/04/25/130732). You should get decent results even without recalibration.
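
As a rough illustration, the GATK-free steps above could be chained as in the Python sketch below. It only assembles the command lines and prints them by default. The file names, the helper names (build_preprocessing_commands, run_all), the step order, and the choice of lofreq indelqual --dindel for adding indel qualities are assumptions made for this example, not a recommended or official pipeline. Base quality recalibration is left out, since results should be decent without it.

```python
import shlex
import subprocess


def build_preprocessing_commands(in_bam, ref_fasta, prefix, mark_duplicates=True):
    """Return the preprocessing steps as a list of argv lists (hypothetical helper).

    Assumes in_bam is coordinate-sorted and ref_fasta is the reference it was aligned to.
    """
    commands = []
    current = in_bam

    # Optional duplicate marking; skip for very high coverage data.
    if mark_duplicates:
        dedup = f"{prefix}.markdup.bam"
        commands.append(["sambamba", "markdup", current, dedup])
        current = dedup

    # Realignment with LoFreq's own realigner.
    realigned = f"{prefix}.viterbi.bam"
    commands.append(["lofreq", "viterbi", "-f", ref_fasta, "-o", realigned, current])

    # lofreq viterbi output has to be sorted again.
    sorted_bam = f"{prefix}.viterbi.sorted.bam"
    commands.append(["samtools", "sort", "-o", sorted_bam, realigned])

    # Add indel qualities, which indel calling needs when BQSR is not run.
    final_bam = f"{prefix}.indelqual.bam"
    commands.append(
        ["lofreq", "indelqual", "--dindel", "-f", ref_fasta, "-o", final_bam, sorted_bam]
    )

    commands.append(["samtools", "index", final_bam])
    return commands


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_all(build_preprocessing_commands("sample.bam", "ref.fa", "sample"))
```

Calling run_all(..., dry_run=False) would execute the steps, provided sambamba, samtools and lofreq are on the PATH.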
Best,
Andreas
On 16 May 2018 at 22:04, Steve stevekm@users.sourceforge.net wrote:
--
Andreas Wilm
andreas.wilm@gmail.com | mail@andreas-wilm.com | 0x7C68FBCC