From: Thomas W. B. <tb...@um...> - 2014-05-27 22:41:30
The .fasta header line in the example below appears to have a space
character in between '>' and 'EBV-MIR-BART13'. Most aligners report
the string of characters following '>' up to and not including the
first whitespace character as the query sequence identifier. For this
example, that will be the empty string.

Don't be embarrassed. I got caught by this too, many, many years ago,
and had to have it explained to me. It seems perfectly reasonable that
one could add a space for readability, but that's not what's expected.

- tom blackwell -

On Tue, 27 May 2014, Lana Schaffer wrote:

> Tom,
> Sorry I am talking about the header in the fasta file, ie EBV-MIR-BART13 as
> In the example below:
>> EBV-MIR-BART13
> TGTAACTTGCCAGGGACGGCTGA
> Lana
>
> -----Original Message-----
> From: Thomas W. Blackwell [mailto:tb...@um...]
> Sent: Tuesday, May 27, 2014 10:58 AM
> To: Lana Schaffer
> Cc: Sam...@li...
> Subject: Re: [Samtools-help] align to fasta DB with names
>
>
> The usual hack is 'samtools view file.bam | cut -f 1 | sort | uniq -c'.
> If on Windows, you're on your own.
>
> - tom blackwell -
>
> On Tue, 27 May 2014, Lana Schaffer wrote:
>
>> Hi,
>> I am aligning to fasta DB of sequences and would like to count The
>> number of reads to each fasta entry by header names.
>> How do I designate to bowtie to store the names in the SAM File and
>> then use samtool to count them?
>>
>> Lana Schaffer
>> The Scripps Research Institute
>> Biostatistics, Informatics
>> DNA Array Core Facility
>> 858-784-2263
>>
>
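
To make the header rule above concrete, here is a minimal Python sketch. It assumes the convention described in this message (identifier = characters after '>' up to the first whitespace); individual aligners may differ in detail. The helper names `fasta_id` and `fix_header` are hypothetical, chosen only for illustration.

```python
def fasta_id(header_line):
    """Return the identifier as described above: the characters after '>'
    up to, but not including, the first whitespace character."""
    if not header_line.startswith(">"):
        raise ValueError("not a FASTA header line")
    body = header_line[1:].rstrip("\r\n")
    # Stop at the first whitespace character; a leading space gives "".
    for i, ch in enumerate(body):
        if ch.isspace():
            return body[:i]
    return body


def fix_header(header_line):
    """Remove spaces or tabs directly after '>' so the name that follows
    becomes the identifier. (Hypothetical helper, not part of any tool.)"""
    if not header_line.startswith(">"):
        raise ValueError("not a FASTA header line")
    return ">" + header_line[1:].lstrip(" \t")


if __name__ == "__main__":
    bad = "> EBV-MIR-BART13\n"
    print(repr(fasta_id(bad)))              # identifier with the stray space
    print(repr(fasta_id(fix_header(bad))))  # identifier after the fix
```

If the headers in the .fasta file are corrected this way, the aligner index would presumably need to be rebuilt from the corrected file before the names show up in the SAM output. Note also that in SAM records the reference sequence name is the third tab-separated column (RNAME); the first column is the read name.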