From: Mark A. D. <dep...@br...> - 2011-04-26 16:41:41
Yes, we require the @SQ lines to be sorted in karyotypic order, at least for the autosomal and sex chromosomes, according to either the UCSC or HGRC style. We do this for human data only, because we provide many supplementary data sets (dbsnp, OMNI 2.5M and HapMap sites and genotypes, refseq, BAMs), all in these two orders. It's an unfortunate constraint, but it is a trade-off we made to ensure that people could more easily use the GATK supplementary files.

Best,

On Apr 26, 2011, at 10:57 AM, Bob Handsaker wrote:

> Most GATK tools also require the reference sequence fasta file and
> require that the chromosome sort order in the bam files match the
> chromosome order in the reference fasta file.
> -Bob
>
> On 4/26/11 10:52 AM, Alec Wysoker wrote:
>> Hi Ryan,
>>
>> Coordinate sort order is based on the order in which the @SQ lines
>> appear in the header of the BAM file. Coordinate sort order should be
>> consistent between samtools, GATK and Picard, with the caveats that
>> ordering of reads with the same coordinate is arbitrary, and ordering of
>> unmapped reads that also do not have a coordinate is arbitrary. I'm
>> surprised that you say that GATK insists on a particular order. I would
>> think it would just require that they be in coordinate order as defined
>> in the SAM spec.
>>
>> -Alec
>>
>> On 4/26/11 10:33 AM, Ryan Golhar wrote:
>>> Hi - I've noticed that when sorting BAM files with samtools, the
>>> chromosomes are sorted lexicographically. GATK insists on the
>>> chromosomes being sorted numerically. I'm using Picard tools right now
>>> to make the conversion.
>>>
>>> I thought, at first, the sorting was based on the order of the
>>> chromosomes in my fasta file when I indexed the genome, but that doesn't
>>> seem to matter. Is there a way to have samtools sort the chromosomes
>>> numerically or match the order in the .fai file? This could help
>>> eliminate a step that currently takes some time to run, perhaps as an
>>> option to the sort command?
>>>
>>> _______________________________________________
>>> Samtools-help mailing list
>>> Sam...@li...
>>> https://lists.sourceforge.net/lists/listinfo/samtools-help

Mark A. DePristo, Ph.D.
Manager, Medical and Population Genetics Analysis
Broad Institute of MIT and Harvard
dep...@br...
ma...@de...
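
For readers who want to see the difference between the lexicographic and karyotypic orders discussed in this thread, here is a minimal Python sketch. It is not taken from samtools, Picard, or GATK; the function names `karyotypic_key`, `karyotypic_sort`, and `sq_names` are hypothetical, and the placement of contigs other than 1-22, X, and Y (after the ranked chromosomes, alphabetically) is an assumption, since the thread only covers the autosomal and sex chromosomes.

```python
import re

# Assumption: only autosomes 1-22 and the sex chromosomes X and Y are ranked.
# The thread does not say where other contigs (chrM, unplaced, ...) belong,
# so this sketch simply puts them last, in alphabetical order.
_SEX_RANK = {"X": 23, "Y": 24}


def karyotypic_key(name):
    """Sort key for a contig name with or without a "chr" prefix."""
    bare = name[3:] if name.lower().startswith("chr") else name
    if re.fullmatch(r"[0-9]+", bare) and 1 <= int(bare) <= 22:
        return (0, int(bare), "")
    if bare.upper() in _SEX_RANK:
        return (0, _SEX_RANK[bare.upper()], "")
    return (1, 0, name)


def karyotypic_sort(names):
    """Return the contig names in karyotypic order."""
    return sorted(names, key=karyotypic_key)


def sq_names(header_text):
    """Return the SN values of the @SQ lines, in the order they appear."""
    names = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


if __name__ == "__main__":
    # A header whose @SQ lines are in lexicographic order.
    header = (
        "@HD\tVN:1.0\tSO:coordinate\n"
        "@SQ\tSN:chr1\tLN:100\n"
        "@SQ\tSN:chr10\tLN:50\n"
        "@SQ\tSN:chr2\tLN:80\n"
    )
    current = sq_names(header)
    wanted = karyotypic_sort(current)
    print("header order:    ", current)
    print("karyotypic order:", wanted)
    print("already karyotypic:", current == wanted)
```

Comparing the two lists only shows whether a header would need reordering. Actually rewriting a BAM file to a new contig order is still a job for a dedicated tool, such as the Picard step Ryan mentions.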