From: Alec W. <al...@br...> - 2011-08-29 18:27:50
Hi Nathalie,

As I look at it, I have a feeling that ReorderSam does not handle the RNEXT field properly if the input is in SAM format rather than BAM format. Could you try converting from SAM to BAM before running ReorderSam, and let me know if that fixes your problem? You can convert to BAM like this:

    java -jar SamFormatConverter.jar I=yourfile.sam O=yourfile.bam

The program knows to convert to BAM format based on the extension of the output file.

-Alec

On 8/29/11 2:06 PM, Nathalie Lailler wrote:
> Alec,
> I checked and I didn't give BWA the same reads as 1st and 2nd end.
> The example I gave you was to show the fact that ReorderSam.jar added
> an extra "chr" for each read.
> It is true that there was a value between 3S91M and 85391038 ('=').
>
> Does ReorderSam do anything else than sorting according to the 3rd
> field of the initial SAM file?
> Why does it add stuff in each row?
>
> Thanks, and sorry for bugging you..
>
> Nathalie
>
> ------------------------------------------------------------------------
> *From:* Alec Wysoker [al...@br...]
> *Sent:* Monday, August 29, 2011 12:58 PM
> *To:* Nathalie Lailler
> *Subject:* Re: [Samtools-help] MarkDuplicates
>
> Hi Nathalie,
>
> Are you sure that first record is correct? It looks like it is
> missing the RNEXT value (should be between 3S91M and 85391038). Also,
> it seems strange to me that the PNEXT value is the same as the POS
> value (i.e. both ends have the same alignment start). This can be the
> case if one end is unmapped, but your FLAG field indicates that both
> ends are mapped:
>
> 113 :
>     read paired
>     read reverse strand
>     mate reverse strand
>     first in pair
>
> Also, do you expect that both ends would align to the same strand? I
> wonder if somehow you have provided the same reads to BWA as both the
> first and second ends.
>
> -Alec
>
> On 8/29/11 12:51 PM, Nathalie Lailler wrote:
>> Alec,
>> The problem seems to come from the ReorderSam.jar (I want to sort by
>> karyotypic order).
>>
>> It transforms a line looking like
>>
>> MyRead 113 chr1 85391038 150 13S91M 85391038 0
>> GGCGAGAGAGTAGGA... etc
>>
>> Into something like
>>
>> MyRead 113 chr1 85391038 150 13S91M chr12 85391038 0
>> GGCGAGAGAGTAGGA... etc
>>
>>
>> It adds "chrxxx" in the line and then MarkDuplicates goes crazy...
>>
>> Do you know where this comes from?
>>
>> Thanks
>>
>> nathalie
>> ------------------------------------------------------------------------
>> *From:* Alec Wysoker [al...@br...]
>> *Sent:* Monday, August 29, 2011 10:15 AM
>> *To:* Nathalie Lailler
>> *Cc:* sam...@li...
>> *Subject:* Re: [Samtools-help] MarkDuplicates
>>
>> Hi Nathalie,
>>
>> Typically this error means that you have more than one read with the
>> same name marked as the same end, e.g. 2 reads marked as 'first of
>> pair' named 'Myread'. Try running ValidateSamFile on your BAM. It
>> may tell you what you need to fix.
>>
>> -Alec
>>
>> On 8/26/11 4:28 PM, Nathalie Lailler wrote:
>>>
>>> Hello,
>>>
>>> I am trying to use the GATK pipeline.
>>>
>>> I aligned different sequences with BWA, paired-end.
>>>
>>> I have 4 individuals, and 3 libraries for each individual.
>>>
>>> After I aligned each library individually, I added a proper @RG line
>>> in the header of the SAM, then sorted in karyotypic order.
>>>
>>> Then I merged all the libraries for each individual.
>>>
>>> I end up with 4 BAM files.
>>>
>>> I am trying to run the Picard MarkDuplicates on those files but I
>>> keep getting an error…after 4 or 5 hours of processing:
>>>
>>> Value was put into PairInfoMap more than once . 10:Myread
>>>
>>> Since I did a paired-end alignment, the 2 ends have the same name
>>> …HWI-XXX:1:1:66:344:3333#0 for instance
>>>
>>> What can I do to be able to run the MarkDuplicates tool?
>>>
>>> Thanks a lot
>>>
>>> N
>>>
>>> _______________________________________________
>>> Samtools-help mailing list
>>> Sam...@li...
>>> https://lists.sourceforge.net/lists/listinfo/samtools-help
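
Note on the FLAG and RNEXT points discussed in this thread: below is a minimal Python sketch, not part of Picard or samtools, that decodes a SAM FLAG value into the bit meanings given in the SAM specification and checks that an alignment line carries the 11 mandatory tab-separated fields. The helper names (decode_flag, check_record) are hypothetical, and the SEQ and QUAL strings in the example records are made up, because the records quoted in the thread are truncated.

```python
# SAM FLAG bits and their meanings, per the SAM specification.
FLAG_BITS = [
    (0x1, "read paired"),
    (0x2, "read mapped in proper pair"),
    (0x4, "read unmapped"),
    (0x8, "mate unmapped"),
    (0x10, "read reverse strand"),
    (0x20, "mate reverse strand"),
    (0x40, "first in pair"),
    (0x80, "second in pair"),
    (0x100, "not primary alignment"),
    (0x200, "read fails quality checks"),
    (0x400, "PCR or optical duplicate"),
    (0x800, "supplementary alignment"),
]

# The 11 mandatory SAM alignment fields, in order.
MANDATORY = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
             "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS if flag & bit]


def check_record(line):
    """Return a list of problems found in one tab-separated SAM alignment line.

    This is only a structural check (field count, integer fields);
    it is far less thorough than Picard's ValidateSamFile.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < len(MANDATORY):
        return [f"expected at least {len(MANDATORY)} fields, found {len(fields)}"]
    record = dict(zip(MANDATORY, fields))
    problems = []
    for name in ("FLAG", "POS", "MAPQ", "PNEXT", "TLEN"):
        try:
            int(record[name])
        except ValueError:
            problems.append(f"{name} is not an integer: {record[name]!r}")
    return problems


if __name__ == "__main__":
    # FLAG value from the thread.
    print(decode_flag(113))

    # Hypothetical records modelled on the thread; SEQ and QUAL are invented.
    missing_rnext = "\t".join(["MyRead", "113", "chr1", "85391038", "150",
                               "13S91M", "85391038", "0", "GGCG", "IIII"])
    with_rnext = "\t".join(["MyRead", "113", "chr1", "85391038", "150",
                            "13S91M", "=", "85391038", "0", "GGCG", "IIII"])
    print(check_record(missing_rnext))
    print(check_record(with_rnext))
```

Decoding 113 this way gives the same four properties Alec lists (read paired, read reverse strand, mate reverse strand, first in pair), and a record with no RNEXT column is reported as having too few fields.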