From: Peter C. <p.j...@go...> - 2015-06-01 11:06:25
Hello all,

Some of you spotted this preprint going up on bioRxiv on Friday:

SAM/BAM format v1.5 extensions for de novo assemblies
Peter J. A. Cock, James K. Bonfield, Bastien Chevreux, Heng Li
bioRxiv DOI: 10.1101/020024
http://dx.doi.org/10.1101/020024
http://biorxiv.org/content/early/2015/05/29/020024

The current version is a terse three pages (trying to meet an "application note" page limit), but nevertheless should clarify the intended usage of these parts of the SAM/BAM specification.

This manuscript has been in progress since 2012, in parallel with the associated file format change discussions, all held openly here on the samtools-devel mailing list, and the "samtools depad" work on GitHub.

My apologies to my co-authors; the long delays are my fault. After getting useful comments from an internal pre-submission review (thank you to colleagues at the James Hutton Institute), I should have posted the current preprint back in February. Better late than never.

Also, thank you to Nick Loman for a discussion in 2011 which was one of the motivations for making this effort in the first place:

http://nickloman.github.io/high-throughput%20sequencing/2011/09/19/sambam-its-time-for-a-single-standard-for-assembly-output/

Also relevant are some of my blog posts from late 2011, with screenshots illustrating SAM/BAM files with a padded or un-padded reference:

http://blastedbio.blogspot.co.uk/2011/09/sambam-with-gapped-reference.html
http://blastedbio.blogspot.co.uk/2011/10/sambam-without-gapped-reference.html

If there are queries about the file format itself, please raise them here on samtools-devel. I'm happy to receive comments about the manuscript itself by email directly.

Thank you,

Peter

--
Peter Cock, Bioinformatician at the James Hutton Institute
http://www.hutton.ac.uk/staff/peter-cock