From: Vickie S <is...@li...> - 2014-09-20 00:17:28
|
Hi,

I am trying to sort a BAM file with the sort command:

samtools sort -n aln.bam aln.qsort
[W::sam_hdr_read] bgzf_check_EOF: Value too large for defined data type
[bam_sort_core] truncated file. Continue anyway.

I am not sure whether "Continue anyway" means it continues to sort or just aborts. I checked the file sizes:

$ du -hs *
113G aln.bam
8.4M aln.bam.bai
28K aln.qsort.bam

So it does not seem that the file was sorted. I checked one previous thread about this bug but could not find a solution. Any comments? Any suggestions for another tool?

Thanks |
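One way to tell whether aln.bam is really truncated, independent of the warning, is to look at its last 28 bytes. The SAM/BAM specification defines a fixed 28-byte empty BGZF block as the end-of-file marker. The sketch below is only an illustration: the function name has_bgzf_eof and the variable BGZF_EOF_HEX are made up here, and it assumes tail, od, and tr are available.

```shell
# Hex of the 28-byte empty BGZF block that the SAM/BAM specification
# uses as the end-of-file marker (spaces removed for comparison).
BGZF_EOF_HEX=$(printf '%s' '1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00 1b 00 03 00 00 00 00 00 00 00 00 00' | tr -d ' ')

# has_bgzf_eof FILE (hypothetical helper, not part of samtools)
# Succeeds only if the last 28 bytes of FILE equal the EOF marker.
has_bgzf_eof() {
    last=$(tail -c 28 "$1" | od -An -v -tx1 | tr -d ' \n')
    [ "$last" = "$BGZF_EOF_HEX" ]
}
```

For example, `has_bgzf_eof aln.bam && echo "EOF marker present"` would suggest the input is complete and that the "truncated file" message comes from the check itself rather than from the data. It says nothing about why the output is only 28K.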
From: Vickie S <is...@li...> - 2014-09-20 03:05:09
|
Thanks, Bob, for the info.

Slightly off topic from the samtools bug: the reason I wanted to sort is to convert a BAM with paired-end reads to FASTQ. Would converting the BAM to FASTQ without sorting the reads produce misleading FASTQ output?

Thanks, Colin, for your suggestion of novosort. |
From: Bob H. <rsh...@bx...> - 2014-09-20 03:07:49
|
As I recall, there's some condition that the decompression library incorrectly decides means there's something wrong with the input file, and it returns a value that the calling routine can't distinguish from an end of file. The calling routine then reports that the file is truncated but continues to try to read the rest of the file, and might be successful.

I believe I reported this a few months back but I really got no response from the samtools folks. I think/guess the basic problem is that the decompression library doesn't originate with this project, so there's resistance to making any changes to it. The change I suggested at the time *looked* simple enough, but it would involve a change to how that library reports errors.

All that may have nothing to do with why your file isn't being sorted though.

Bob H |
From: Colin H. <co...@no...> - 2014-09-20 03:51:52
|
Hi Vickie,

If the BAM file is in coordinate order, then converting to FASTQ will put the reads in the read1 and read2 files in slightly different orders, which will be a problem for aligning.

You can usually convert to FASTQ from the unsorted BAM produced by the aligner or from a name-sorted BAM.

Best, Colin |
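Whether a pair of FASTQ files came out in matching order can be checked after conversion. The sketch below compares read names record by record. It assumes plain four-line FASTQ records, and that mates share a name once an optional /1 or /2 suffix and any comment after the first whitespace are removed. The function name fastq_pairs_in_sync is made up for this illustration.

```shell
# fastq_pairs_in_sync R1 R2 (hypothetical helper)
# Succeeds only if both files hold the same read names in the same order.
fastq_pairs_in_sync() {
    awk -v r2="$2" '
        # Reduce a header line to the bare read name.
        function clean(h) {
            sub(/^@/, "", h)          # drop the leading "@"
            sub(/[ \t].*$/, "", h)    # drop any comment field
            sub(/\/[12]$/, "", h)     # drop a trailing /1 or /2
            return h
        }
        BEGIN { bad = 0 }
        NR % 4 == 1 {
            # Read the matching header from R2; fail if R2 ends early.
            if ((getline other < r2) <= 0) { bad = 1; exit }
            if (clean($0) != clean(other)) { bad = 1; exit }
            # Skip the sequence, "+", and quality lines of the R2 record.
            for (i = 0; i < 3; i++) getline skipped < r2
        }
        END {
            # Fail if R2 still has records after R1 is exhausted.
            if (!bad && (getline other < r2) > 0) bad = 1
            exit (bad ? 1 : 0)
        }
    ' "$1"
}
```

A call such as `fastq_pairs_in_sync aln_1.fastq aln_2.fastq || echo "mates out of order"` (file names are placeholders) would flag the situation described above before the files are handed to an aligner.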
From: Nils H. <nil...@gm...> - 2014-09-20 13:07:35
|
You can also try SamToFastq.jar within Picard Tools to create FASTQ from a SAM or BAM.

N

Thumb typed for added typos |
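As a rough illustration of the Picard route, the helper below only prints a SamToFastq command line so it can be inspected before running. INPUT, FASTQ, and SECOND_END_FASTQ are SamToFastq options. The function name, the `_1.fastq`/`_2.fastq` naming scheme, and the assumption that SamToFastq.jar sits in the current directory are choices made for this sketch only.

```shell
# picard_sam_to_fastq_cmd BAM PREFIX (hypothetical helper)
# Prints, without running it, a Picard SamToFastq command that would write
# PREFIX_1.fastq and PREFIX_2.fastq from a paired-end BAM.
picard_sam_to_fastq_cmd() {
    printf 'java -jar SamToFastq.jar INPUT=%s FASTQ=%s_1.fastq SECOND_END_FASTQ=%s_2.fastq\n' \
        "$1" "$2" "$2"
}
```

For instance, `picard_sam_to_fastq_cmd aln.bam aln` prints the command for the file discussed in this thread; adjust the jar path to wherever Picard is installed.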