Re: [svtoolkit-help] "Invalid sequence position"
From: John B. <jo...@we...> - 2013-06-05 15:56:39
Hi Bob,

I've had a bit of feedback from Gerton Lunter (the author of the mapper I'm using, "stampy"), regarding the "Invalid sequence position" error:

"What's going on here, is that these two reads are BOTH mapped in the reverse direction. So it's unclear what's going on, and in particular it's unclear what the fragment was originally.

"The trouble is that the SAM specification does not say what the ISIZE field should be in this situation. Stampy calculates the fragment size as the distance between the two read starts, where "read start" is the mapping position for FW reads, and the mapping position + read length for BW reads (when there are no indels), but clearly this is not what GATK / GenomeSTRiP expects.

"Again, picard's ValidateSamFile and GATK's CountBases -S STRICT accept this isize fine.

"Since it's not specified in the BAM specification what the ISIZE field should be in this case, I think the most practical solution is to accept anything. If Bob agrees perhaps he can change this in his code. If not we can filter the reads away (although they may be informative for some structural variation, so that wouldn't be my preferred solution.) Or I could change Stampy to produce what GenomeSTRiP expects; I'm somewhat reluctant to do that, since I don't know which tools are relying on Stampy's ISIZE field in this case (quite likely none, but who knows.)"

Hopefully this can be resolved somehow. I did introduce a filter some time ago which removed some reads which were aberrant for some other (similar) reason, but that was down to a bug in the mapper (stampy).

Cheers
John

On Thu, May 23, 2013 at 3:21 PM, John Broxholme <jo...@we...> wrote:
> Pre-processing of one (of 270+) deep BAM files has failed with:
>
> ...
> INFO 12:54:38,366 ComputeGCProfiles - Processing input file
> org.broadinstitute.sv.dataset.SAMFileLocation@986cff74 ...
> Exception in thread "main" java.lang.RuntimeException: Invalid sequence
> position: 17:81195230
> ...
>
> Where would this have come from? The pipeline has been the same to
> prepare all 270+ of the (25x deep) BAMs, and this is the only failure. Any
> suggestions on what might be wrong and how to fix it will be most welcome!
>
> Thanks
> John
>
> --
> John Broxholme
> Wellcome Trust Centre for Human Genetics
> Roosevelt Drive, Oxford, OX3 7BN, UK
>

--
John Broxholme
Wellcome Trust Centre for Human Genetics
Roosevelt Drive, Oxford, OX3 7BN, UK
Tel: (+44 1865) 287611 FAX: 287697
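
For anyone following the thread, here is a rough sketch of the fragment-size rule as Gerton describes it above, plus a flag check for the "both mates reverse" case. This is only my reading of his description, not Stampy's or GenomeSTRiP's actual code. The function names are made up for illustration, positions are treated as plain integers, indels are ignored, and the result is returned unsigned because the description does not say which sign convention is used. The flag bits (0x1 paired, 0x10 read reverse, 0x20 mate reverse) are the standard SAM FLAG bits.

```python
# Hypothetical helper names; a sketch of the rule quoted above, not Stampy's code.

PAIRED = 0x1         # SAM FLAG: read is paired
READ_REVERSE = 0x10  # SAM FLAG: read mapped to the reverse strand
MATE_REVERSE = 0x20  # SAM FLAG: mate mapped to the reverse strand


def read_start(pos, read_len, is_reverse):
    """'Read start' as described: the mapping position for forward (FW)
    reads, mapping position + read length for reverse (BW) reads.
    Assumes no indels."""
    return pos + read_len if is_reverse else pos


def described_isize(pos1, len1, rev1, pos2, len2, rev2):
    """Distance between the two read starts (unsigned; the sign
    convention is an open assumption here)."""
    return abs(read_start(pos2, len2, rev2) - read_start(pos1, len1, rev1))


def both_mates_reverse(flag):
    """True for a paired read whose own strand and mate strand are both reverse."""
    return bool(flag & PAIRED) and bool(flag & READ_REVERSE) and bool(flag & MATE_REVERSE)


if __name__ == "__main__":
    # Ordinary FW/BW pair: starts are 100 and 300 + 50 = 350.
    print(described_isize(100, 50, False, 300, 50, True))
    # Both mates reverse: starts are 100 + 50 = 150 and 300 + 50 = 350.
    print(described_isize(100, 50, True, 300, 50, True))
    # FLAG 113 = 0x1 + 0x10 + 0x20 + 0x40 (paired, both reverse, first in pair).
    print(both_mates_reverse(113))
```

A check like both_mates_reverse could be used to count or set aside the affected pairs if filtering turns out to be the only option, though as noted above those reads may be informative for some structural variation.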