I just ran a test with the (only) two example BED lines you reported in your mail.
It works as expected, producing a results file of about 350 lines and 6.5K.
Obviously the results change depending on the BAM I have at hand,
but they seem reasonable.
This is my command line:
$ samtools mpileup -l test.bed -f /db/samtools/hg19/hg19.fa TEST.bam > test.txt
test.bed contains the two lines; I didn't check with the full knowngene.bed.
/db/samtools/hg19/hg19.fa is the faidx-indexed genome.
TEST.bam contains ~20M human mapped reads.
Samtools Version: 0.1.18.
It may be that 82G is what you can expect from an mpileup over the whole
set of known genes, since the result file format is quite verbose and
Try looking inside your result file and check whether you also have calls
in intergenic regions (you should see results only in intra-genic regions).
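
If it helps, here is a rough sketch of how that check could be scripted with
plain awk. It counts the pileup lines whose position falls outside every
interval in the BED file. The function name count_outside is just something I
made up for illustration, and I am assuming the usual conventions: BED
intervals are 0-based and half-open, pileup positions (column 2) are 1-based.
It does a simple linear scan over the intervals, so it is fine for a small
test but will be slow on the full knowngene.bed against a huge pileup; try it
on the first part of the file (e.g. via head) first.

```shell
# Count pileup positions that lie outside every interval of a BED file.
# Usage: count_outside BED PILEUP
# Assumption: BED is 0-based, half-open; pileup positions are 1-based,
# so a position p is inside an interval when start < p <= end.
count_outside() {
    awk '
        # First file (BED): store every interval, grouped by chromosome.
        NR == FNR {
            n[$1]++
            start[$1, n[$1]] = $2 + 0
            end[$1, n[$1]]   = $3 + 0
            next
        }
        # Second file (pileup): test column 2 against the stored intervals.
        {
            pos = $2 + 0
            inside = 0
            for (i = 1; i <= n[$1]; i++) {
                if (pos > start[$1, i] && pos <= end[$1, i]) {
                    inside = 1
                    break
                }
            }
            if (!inside) outside++
        }
        # Print 0 rather than an empty string when nothing was outside.
        END { print outside + 0 }
    ' "$1" "$2"
}
```

For example, "count_outside knowngene.bed test.pileup.txt" should print 0 if
-l restricted the output as intended; a large number would suggest the BED
file was not applied.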
> Message: 1
> Date: Tue, 6 Nov 2012 11:30:57 -0800
> From: Chunjiang He <camelbbs@...>
> Subject: Re: [Samtools-help] questions about pileup
> To: Peter Johansson <trojkan@...>
> Cc: samtools-help@...
> While I use
> samtools mpileup -l knowngene.bed -f ~/bowtie2index/hg19.fa
> brain_fetus1.bam > test.pileup.txt
> I still got the counts on the whole genome. The result test.pileup.txt has
> I think that is because of knowngene.bed doesn't work.
> Knowngene.bed is like this:
> chr1 11873 14409 uc001aaa.3 0 + 11873 11873 0 3 354,109,1189, 0,739,1347,
> chr1 11873 14409 uc010nxr.1 0 + 11873 11873 0 3 354,52,1189, 0,772,1347,
> Is there any solution? I don't find the instructions of this in samtool
> On Mon, Nov 5, 2012 at 9:14 PM, Peter Johansson <trojkan@...> wrote:
>> On 11/06/2012 03:16 PM, Chunjiang He wrote:
>>> Thanks Peter,
>>> I see there is another option says -r STR region in which pileup is
>>> generated [null]
>>> What is the difference to -l
>> -r takes a string while -l takes a file. -r only takes one region, if I
>> remember correctly.