From: mailing l. <mar...@gm...> - 2012-04-13 11:45:10
Sorry I came across as short. I'm just frustrated. I think we will
upgrade someday; I'm sure it is better. I guess I'll use an older
version for the time being.

By the way, would you be able to point me to the source code that does
the pileup in version 0.1.9? My main reason for wanting to upgrade was
that pileup started running really slowly for me; maybe if I see how it
works I can get some ideas for speeding it up.

Keep up the good work. I didn't mean to be harsh on samtools.

-Greg

On Wed, Apr 11, 2012 at 2:12 PM, Heng Li <lh...@sa...> wrote:
>
> On Apr 11, 2012, at 7:37 AM, mailing list wrote:
>
>> Thanks. It seems weird they'd abandon pileup and leave those people
>> using the 10 column format high and dry.
>
> I know it is a pain to see some features removed, but you cannot
> expect every feature from the very first release to exist forever,
> especially when the old features are inflexible, limited and
> substandard. You need to move forward and adopt better things. The
> pileup to mpileup transition is overall beneficial.
>
>> Maybe they don't talk to users much?
>
> Well, it would be difficult for the few samtools developers to send
> 8600+ emails to samtools-help/devel in three years. We talk to users,
> though I agree we could do better.
>
> Heng
>
>> I guess I just won't upgrade to the latest version.
>>
>> -Greg
>>
>> On Tue, Apr 10, 2012 at 3:56 PM, Joseph Fass <jos...@gm...> wrote:
>>> Yup - those have to do with genotype / variant calling from the sequence
>>> evidence ... and mpileup only provides genotype / variant calls in VCF
>>> format. The only way you'll get access to genotype calls in pileup format
>>> (i.e. "10-column" pileups) is by using an older samtools ... maybe 0.1.12?
>>> Don't quote me on that ...
>>>
>>> ~Joe
>>>
>>> On Tue, Apr 10, 2012 at 11:56 AM, mailing list <mar...@gm...> wrote:
>>>>
>>>> Thanks. Maybe I'm making some progress.
>>>>
>>>> So here are some example lines from my old output from samtools pileup*:
>>>>
>>>> scaffold_1 47225 T T 30 0 44 1 ^M. I
>>>> scaffold_1 47226 A A 30 0 44 1 . I
>>>> scaffold_1 47227 A A 30 0 44 1 . I
>>>>
>>>> and here are some lines from samtools mpileup:
>>>>
>>>> scaffold_1 47225 T 1 ^M. I
>>>> scaffold_1 47226 A 1 . I
>>>> scaffold_1 47227 A 1 . I
>>>>
>>>> I appear to be missing a few columns. (I believe I'm missing:
>>>> genotype, consensus quality, SNP quality, RMS mapping quality).
>>>>
>>>> Any ideas how I can get them back?
>>>>
>>>> -Greg
>>>>
>>>> *Here is what these columns mean in my old output (from
>>>> http://sourceforge.net/apps/mediawiki/samtools/index.php?title=SAM_FAQ#I_do_not_understand_the_columns_in_the_pileup_output.):
>>>>
>>>> reference sequence name
>>>> reference coordinate
>>>> reference base, or `*' for an indel line
>>>> genotype where heterozygotes are encoded in the IUB code
>>>> Phred-scaled likelihood that the genotype is wrong, which is also called `consensus quality'.
>>>> Phred-scaled likelihood that the genotype is identical to the reference, which is also called `SNP quality'.
>>>> root mean square (RMS) mapping quality
>>>> # reads covering the position
>>>> read bases at a SNP line (check the manual page for more information); the 1st indel allele otherwise
>>>> base quality at a SNP line; the 2nd indel allele otherwise
>>>> indel line only: # reads directly supporting the 1st indel allele
>>>> indel line only: # reads directly supporting the 2nd indel allele
>>>> indel line only: # reads supporting a third indel allele
>>>>
>>>> On Tue, Apr 10, 2012 at 2:44 PM, Joseph Fass <jos...@gm...> wrote:
>>>>> Hi Greg,
>>>>> Remove the bcftools command you're piping into ... bcftools deals only with
>>>>> VCF / BCF format, so once you've removed the -u or -g from the mpileup
>>>>> command, what's coming out of mpileup can't be handled by bcftools.
>>>>> ~Joe
>>>>>
>>>>> On Tue, Apr 10, 2012 at 11:39 AM, mailing list <mar...@gm...> wrote:
>>>>>>
>>>>>> Hmm, ok I tried running this:
>>>>>>
>>>>>> samtools mpileup -f ref.fa aligned_reads.bam | bcftools view -bvc - > out.pileup
>>>>>>
>>>>>> But I'm getting:
>>>>>>
>>>>>> [fai_load] build FASTA index.
>>>>>> [mpileup] 1 samples in 1 input files
>>>>>> <mpileup> Set max per-file depth to 8000
>>>>>> [bcf_sync] incorrect number of fields (0 != 5) at 0:0
>>>>>> [afs] 0:0.000
>>>>>>
>>>>>> Any ideas what I might be doing wrong?
>>>>>>
>>>>>> -Greg
>>>>>>
>>>>>> On Tue, Apr 10, 2012 at 2:29 PM, Joseph Fass <jos...@gm...> wrote:
>>>>>>> Run the mpileup command without the '-u' option (and without '-g') to
>>>>>>> disable the variant calling; this will give the old pileup format
>>>>>>> without variant calls (i.e. the 6(?) column format, without 4th-7th columns
>>>>>>> that give new consensus genotype(s) and SNP/consensus/RMS mapping
>>>>>>> qualities). The only way samtools does variant calling now is in VCF format
>>>>>>> (BCF is the binary version of VCF, hence 'bcftools').
>>>>>>> HTH,
>>>>>>> ~Joe
>>>>>>>
>>>>>>> On Tue, Apr 10, 2012 at 11:22 AM, mailing list <mar...@gm...> wrote:
>>>>>>>>
>>>>>>>> I'm currently running a command like this:
>>>>>>>>
>>>>>>>> samtools pileup -f ref.fa aligned_reads.bam -c > out.pileup
>>>>>>>>
>>>>>>>> I want to switch to using mpileup but it's not clear to me how to get
>>>>>>>> identical (or as close as possible) output.
>>>>>>>>
>>>>>>>> I tried this:
>>>>>>>>
>>>>>>>> samtools mpileup -uf ref.fa aligned_reads.bam | bcftools view -bvcg - > out.pileup
>>>>>>>>
>>>>>>>> But it's making a much smaller file and it's in a different format. I
>>>>>>>> don't know how to get to the pileup format.
>>>>>>>>
>>>>>>>> I'm really stuck.
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Greg
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Samtools-help mailing list
>>>>>>>> Sam...@li...
>>>>>>>> https://lists.sourceforge.net/lists/listinfo/samtools-help
>>>>>>>
>>>>>>> --
>>>>>>> Joseph Fass
>>>>>>> Lead Data Analyst
>>>>>>> UC Davis Bioinformatics Core
>>>>>>> joseph.fass -at- gmail.com (professional)
>>>>>>> 970.227.5928 (c) || 530.752.2698 (w)
>
> --
> The Wellcome Trust Sanger Institute is operated by Genome Research
> Limited, a charity registered in England with number 1021457 and a
> company registered in England with number 2742969, whose registered
> office is 215 Euston Road, London, NW1 2BE.
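
For anyone comparing the two formats discussed in this thread, here is a minimal Python sketch that labels the columns of a SNP line from either the old 10-column "pileup -c" output or the 6-column single-sample mpileup output, and reports which of the old columns are absent. The column order follows the SAM FAQ list quoted in the thread; the short column names and the function names (parse_pileup_line, missing_columns) are made up for illustration and are not part of samtools. It assumes whitespace-separated fields with no empty columns and does not handle the longer indel lines of the old format.

```python
from typing import Dict, List

# Hypothetical short names for the old 10-column SNP line, in the order
# given by the SAM FAQ list quoted in the thread.
PILEUP10: List[str] = [
    "chrom", "pos", "ref", "genotype", "consensus_qual",
    "snp_qual", "rms_mapq", "depth", "bases", "quals",
]

# Hypothetical short names for single-sample mpileup output run without
# -u/-g, matching the 6-column example lines in the thread.
MPILEUP6: List[str] = ["chrom", "pos", "ref", "depth", "bases", "quals"]


def parse_pileup_line(line: str) -> Dict[str, str]:
    """Map one SNP line to named columns, choosing the layout by field count."""
    # Assumption: fields are whitespace-separated and none is empty.
    fields = line.split()
    if len(fields) == len(PILEUP10):
        names = PILEUP10
    elif len(fields) == len(MPILEUP6):
        names = MPILEUP6
    else:
        # Indel lines of the old format carry extra columns; not handled here.
        raise ValueError(f"unexpected number of columns: {len(fields)}")
    return dict(zip(names, fields))


def missing_columns(record: Dict[str, str]) -> List[str]:
    """List the old 10-column names that are absent from a parsed record."""
    return [name for name in PILEUP10 if name not in record]


if __name__ == "__main__":
    old = parse_pileup_line("scaffold_1 47225 T T 30 0 44 1 ^M. I")
    new = parse_pileup_line("scaffold_1 47225 T 1 ^M. I")
    print(old["genotype"], old["rms_mapq"])
    print(missing_columns(new))
```

Run on the example lines from the thread, the second call lists the same four columns Greg identified as missing (genotype, consensus quality, SNP quality, RMS mapping quality). The sketch only relabels what is already in the file; it cannot recompute the genotype or quality columns, which per the replies are only available from an older samtools or as VCF/BCF calls.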