I've written a Python script (attached) that converts the bowtie2 alignment summary into a format that is easier for programs to parse. In the process, I noticed a few oddities in the paired-end summary, and I think it could be made more consistent with the unpaired summary. Here's the current output for a paired-end run:
21931181 reads; of these:
  21931181 (100.00%) were paired; of these:
    2726470 (12.43%) aligned concordantly 0 times
    18982613 (86.56%) aligned concordantly exactly 1 time
    222098 (1.01%) aligned concordantly >1 times
    ----
    2726470 pairs aligned concordantly 0 times; of these:
      561791 (20.61%) aligned discordantly 1 time
    ----
    2164679 pairs aligned 0 times concordantly or discordantly; of these:
      4329358 mates make up the pairs; of these:
        3119334 (72.05%) aligned 0 times
        1131600 (26.14%) aligned exactly 1 time
        78424 (1.81%) aligned >1 times
and for single-end:
12076644 reads; of these:
  12076644 (100.00%) were unpaired; of these:
    11933674 (98.82%) aligned 0 times
    101935 (0.84%) aligned exactly 1 time
    41035 (0.34%) aligned >1 times
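For illustration, here is a minimal sketch of how a summary like the ones above could be turned into CSV rows. This is not the attached script; the names `parse_summary` and `write_csv` and the column layout are made up for this example, and it assumes every data line starts with an integer count, optionally followed by a percentage in parentheses.

```python
import csv
import re
import sys

# A count, an optional "(NN.NN%)", then a label; a trailing "; of these:" is dropped.
# Example: "2726470 (12.43%) aligned concordantly 0 times"
LINE_RE = re.compile(
    r"^\s*(\d+)\s+(?:\((\d+\.\d+)%\)\s+)?(.*?)(?:; of these:)?\s*$"
)


def parse_summary(text):
    """Return a list of (count, percent_or_None, label) tuples."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            # Skips "----" separators and any line that does not start
            # with an integer count followed by whitespace.
            continue
        count, percent, label = match.groups()
        rows.append((int(count), float(percent) if percent else None, label))
    return rows


def write_csv(rows, out=sys.stdout):
    """Write the parsed rows as CSV with a header line."""
    writer = csv.writer(out)
    writer.writerow(["count", "percent", "label"])
    for count, percent, label in rows:
        writer.writerow([count, "" if percent is None else percent, label])
```

Note that a flat parse like this discards the nesting, so labels such as "aligned 0 times" are only meaningful together with their position in the output. That is part of why a more regular layout would help.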
I suggest the following for a paired-end summary, as a first attempt at an improved output:
43862362 reads; of these:
  39533004 (90.13%) aligned as 19766502 pairs; of these:
    18982613 (96.03%) aligned concordantly exactly 1 time
    222098 (1.12%) aligned concordantly >1 times
    561791 (2.84%) aligned discordantly 1 time
  4329358 (9.87%) could not be aligned as pairs; of these:
    3119334 (72.05%) aligned 0 times
    1131600 (26.14%) aligned exactly 1 time
    78424 (1.81%) aligned >1 times
92.89% overall alignment rate
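The read-based counts in this proposal follow directly from the pair-based counts in the current output. As a rough sketch of that arithmetic (the function name `proposed_counts` and the dictionary keys are invented for this example, and nothing here is bowtie2 code):

```python
def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


def proposed_counts(total_pairs, conc_once, conc_multi, disc_once,
                    mates_unaligned, mates_once, mates_multi):
    """Derive the read-based numbers from the current pair-based counts."""
    total_reads = 2 * total_pairs                 # two mates per pair
    aligned_pairs = conc_once + conc_multi + disc_once
    reads_in_pairs = 2 * aligned_pairs
    reads_not_in_pairs = total_reads - reads_in_pairs
    # The leftover reads must be exactly the mates reported individually.
    assert reads_not_in_pairs == mates_unaligned + mates_once + mates_multi
    return {
        "total_reads": total_reads,
        "aligned_pairs": aligned_pairs,
        "reads_in_pairs": reads_in_pairs,
        "reads_in_pairs_pct": pct(reads_in_pairs, total_reads),
        "reads_not_in_pairs": reads_not_in_pairs,
        "reads_not_in_pairs_pct": pct(reads_not_in_pairs, total_reads),
        # Every read except the mates that aligned 0 times counts as aligned.
        "overall_pct": pct(total_reads - mates_unaligned, total_reads),
    }


# Counts taken from the current paired-end summary shown earlier.
counts = proposed_counts(
    total_pairs=21931181,
    conc_once=18982613, conc_multi=222098, disc_once=561791,
    mates_unaligned=3119334, mates_once=1131600, mates_multi=78424,
)
for key, value in counts.items():
    print(key, value)
```

The internal assertion is the useful part: it confirms that the reads not aligned as pairs are the same 4329358 mates that the current output lists under "mates make up the pairs".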
Main changes:
* Top-level counts are reads (or mates) rather than pairs.
* Removed the repeated count (pairs aligned concordantly 0 times was reported twice).
* Added percentages to all rows except the top level.
This way, the percentages within each group of sub-counts add up to 100% (give or take rounding error). Some additional improvements could be made, depending on preference:
* The concordant and discordant pairs could be split into two sub-trees.
* Use read-based numbering (instead of pair-based numbering) in the 'aligned as X pairs' sub-tree.
* If paired and unpaired reads are likely to appear in a single run, a higher-level paired/unpaired split will be needed.
Parser to convert bowtie2 summary output into CSV format