[Samtools-help] mpileup by sample or by file

Hi,

I am doing a raw pileup of reads to take a look at the underlying 
alleles and I noticed that while samtools mpileup, piped through 
bcftools, collates the alleles on a per-sample basis, the "old style" 
mpileup (without -g/-u) does not:

samtools mpileup  -f ref.fa -r S00039:6188031-6188031 -b bampaths

[mpileup] 20 samples in 24 input files

<mpileup>  Set max per-file depth to 400

S00039  6188031 a       8       ,.,,.,,.        ;<<<=<<=        3       GGG     :#;     7       .g,G.,g :9;:;;9 5       ,$..gg  8;;95   0       *       *       4       .,,.    789:    4       Gg.,    9798    1       G       8       1       G       9       5       ...,,   :;<:9   4       ,.,,    :<:;    5       GGggG   <9989   2       gg      98      9       gGGGgGG.G       99;<:;;<;       9       ggg.Gg,GG       :88<;9;;;       7       gG.gGG. 9;;:;<<  11      ..GgG,..gg^].   :;<9:;;<:5:  5       G.,g,   ;:::8   2       ,G      :;      2       .g      78      5       ggGGg   99;;9   5       ...,,   :;;89   5       ,,...   9:;:;   6       G,GGg,  :9;;9:

So after the three position columns there are 24*3 output columns, 3 per 
file and NOT 3 per sample, even though mpileup obviously groks that there 
are only 20 samples in those 24 files.

Am I missing something basic here? If not I would call this a bug as the 
man page states that "Alignment records are grouped by sample 
identifiers in @RG header lines."
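
To double-check which sample each input file belongs to, the SM tags can be read from the @RG lines of each header (for example from the text printed by `samtools view -H file.bam`). The following is only a minimal sketch: the function name `samples_from_header` is made up for illustration, and it assumes the header is passed in as tab-separated text.

```python
def samples_from_header(header_text):
    """Return the distinct SM values found on @RG lines, in order of appearance.

    header_text is assumed to be the plain-text SAM header, e.g. the
    output of `samtools view -H file.bam` (hypothetical usage).
    """
    samples = []
    for line in header_text.splitlines():
        if not line.startswith("@RG"):
            continue  # only read-group lines carry the SM tag
        for field in line.split("\t")[1:]:
            if field.startswith("SM:"):
                sample = field[3:]
                if sample not in samples:
                    samples.append(sample)
    return samples
```

A file whose read groups all share one SM value returns a one-element list; several files returning the same value are the ones that would be expected to be grouped together.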

Any ideas for other ways of examining the underlying alleles, per 
sample? The quick fix here would be to merge the BAM files, but that's 
not as elegant as I'd hope, since I want to keep the data organized.
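
Another possible workaround, short of merging the files, is to post-process the per-file pileup and concatenate the columns that belong to the same sample. The sketch below is an illustration only, not something samtools provides: `collate_by_sample` and `file_samples` are hypothetical names, `file_samples` must list one sample name per input file in the same order as the files in `bampaths`, and each file is assumed to hold a single sample.

```python
def collate_by_sample(pileup_line, file_samples):
    """Merge the per-file (depth, bases, quals) triplets of one pileup line by sample.

    file_samples: hypothetical list with one sample name per input file,
    in the same order the files were given to mpileup.
    Returns (chrom, pos, ref, {sample: (depth, bases, quals)}).
    """
    fields = pileup_line.split()
    chrom, pos, ref = fields[:3]
    per_file = fields[3:]
    if len(per_file) != 3 * len(file_samples):
        raise ValueError("expected 3 columns per input file")

    merged = {}
    for i, sample in enumerate(file_samples):
        depth, bases, quals = per_file[3 * i:3 * i + 3]
        entry = merged.setdefault(sample, [0, "", ""])
        if int(depth) == 0:
            continue  # '*' placeholders of an uncovered file carry no data
        entry[0] += int(depth)
        entry[1] += bases   # each file's base string is self-contained
        entry[2] += quals
    return chrom, int(pos), ref, {s: tuple(v) for s, v in merged.items()}
```

For the line above this would give 20 entries instead of 24 triplets, provided the sample list really matches the file order.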

cheers
Pall

-- 

----------------------------------------------------------
Pall Isolfur Olason, PhD             pal...@eb...
Bioinformatics researcher / SNIC-UPPMAX application expert
Evolutionsbiologiskt Centrum           Uppsala Universitet
Norbyv. 18D                              tlf: 070-949 8104
75236 Uppsala                            fax: 018-471 6310
----------------------------------------------------------