The main reason for having an @RG header in a SAM file is so that (a subset of) the alignment records can be associated with it via their RG:Z: tagged fields. When --sam-RG options are used, bowtie should also add an RG:Z: field to every alignment record it prints, associating each record with the specified @RG header. That way the right thing happens when, for example, several bowtie output SAM files are merged.
Incidentally, this is what bwa samse/sampe -r does: it adds both an @RG header and RG:Z: fields to all output records. (But thanks for using a more convenient command line syntax that doesn't need large amounts of quoting!)
At first glance, it appears that it suffices to print out the extra field in sam.cpp's appendAligned() and reportUnOrMax(). If it is of use, I have attached a suggested patch for your consideration.
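To illustrate the idea independently of the attached patch, here is a minimal standalone sketch of how the header line and the per-record field relate. It is not bowtie's code: the helper names readGroupId, readGroupHeader and appendReadGroup are hypothetical, and it assumes that the texts passed via --sam-RG are collected in a vector and that one of them has the form ID:<value>.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of bowtie): find the read group ID among the
// texts given via --sam-RG, e.g. {"ID:run1", "SM:sampleA"} yields "run1".
// Returns an empty string if there is no "ID:" entry.
std::string readGroupId(const std::vector<std::string>& rgFields) {
    for (const std::string& f : rgFields) {
        if (f.compare(0, 3, "ID:") == 0) return f.substr(3);
    }
    return "";
}

// Hypothetical helper: build the @RG header line from the same texts,
// joining them with tabs as the SAM format requires.
std::string readGroupHeader(const std::vector<std::string>& rgFields) {
    std::string line = "@RG";
    for (const std::string& f : rgFields) {
        line += '\t';
        line += f;
    }
    return line;
}

// Hypothetical helper: append the RG:Z:<id> optional field to one SAM record
// (given without its trailing newline). With no ID, the record is unchanged.
std::string appendReadGroup(const std::string& samRecord,
                            const std::string& rgId) {
    if (rgId.empty()) return samRecord;
    return samRecord + "\tRG:Z:" + rgId;
}
```

With --sam-RG ID:run1 --sam-RG SM:sampleA, this sketch would give the header line "@RG", "ID:run1", "SM:sampleA" joined by tabs, and every record, aligned or not, would end with a tab followed by RG:Z:run1. Because the value after RG:Z: matches the header's ID, the association survives when several SAM files are merged.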
Thanks,
John Marshall
<jm18@sanger.ac.uk>
Suggested patch
Also here is a small suggestion for the --sam-RG usage display (should be lab:value, not lab=value):
diff -urNp bowtie-0.12.7.orig/ebwt_search.cpp bowtie-0.12.7/ebwt_search.cpp
--- bowtie-0.12.7.orig/ebwt_search.cpp 2010-09-04 04:19:23.000000000 +0100
+++ bowtie-0.12.7/ebwt_search.cpp 2011-03-08 16:33:35.397080000 +0000
@@ -528,7 +530,7 @@ static void printUsage(ostream& out) {
<< " --mapq <int> default mapping quality (MAPQ) to print for SAM alignments" << endl
<< " --sam-nohead supppress header lines (starting with @) for SAM output" << endl
<< " --sam-nosq supppress @SQ header lines for SAM output" << endl
- << " --sam-RG <text> add <text> (usually \"lab=value\") to @RG line of SAM header" << endl
+ << " --sam-RG <text> add <text> (usually \"lab:value\") to @RG line of SAM header" << endl
<< "Performance:" << endl
<< " -o/--offrate <int> override offrate of index; must be >= index's offrate" << endl
#ifdef BOWTIE_PTHREADS
Minor usage display patch