From: Christoph H. <chr...@gm...> - 2012-08-09 14:23:37
|
Hi Brian, thanks for your reply! The fragmentDepth utility does basically what I was interested in, thanks! I am a little confused with its output, though. If I run it in -scaffold mode like: fragmentDepth -scaffold < *.posmap.frgscf.sorted In the fragmentDepth output I get the following as an example: uid start end mode mean median 7180006953248 0 33010 40294 42.589286 2 7180006953249 0 31936 1518 42.845247 1 7180006953250 0 26539 62454 41.643727 41 A few questions there: What exactly is the mode (40294,1518,62454) column? According to *.posmap.scflen scaffold 7180006953248 is 33204 long - why does it calculate the coverage only until position 33010? Also, I am not sure how to understand the median value. To reach a value of 1 or 2 as in the first two scaffolds in the example about half of the positions need to have a coverage of 0-1 or 0-2, right? can that be correct, or am I misunderstanding something here? Thanks for your help! cheers, Christoph On 08/09/2012 05:03 AM, Walenz, Brian wrote: > Hi, Christoph- > > [Sorry, wrote this 16 hours ago and forgot to send] > > Check out the 'fragmentDepth' utility. It computes coverage, and outputs in > three different ways: coverage of each scaffold, a histogram of coverage (as > at the end of *.qc), and a fasta-like output of the actual depth of coverage > at each base in the scaffold. > > I can't think of a reason it would fail on contigs, but I haven't tried it. > > The posmap files should be capturing most of the important stuff from the > (agreed: very difficult to use) asm file. If you can't get what you're > looking for out of the posmap files, we need to add to them. > > b > > > > On 8/8/12 6:20 AM, "Christoph Hahn" <chr...@gm...> wrote: > >> Hello CA developers and experts, >> >> I have just finished my first big 454+illumina hybrid assembly using CA7 >> and I am about to assess the result now in comparison to purely illumina >> based assemblies. >> >> One question there: What is the easiest way to get coverage information >> for the scaffolds, contigs, unitigs in the *.scf.fasta, *.ctg.fasta, >> etc. files? I figured, that it is possible to calculate it manually >> using the information in the *.posmap.frgscf and *.posmap.scflen files >> (in case of scaffolds). I guess, the information is also in the *.asm >> file, but I am having problems reading/parsing the file. >> Is there an easy way you can think about? >> The reason, why I want to do this is that I want to bin the >> scaffolds/contigs based on coverage, GC-content and length. >> >> Any ideas are highly appreciated, thanks! >> >> Much obliged, >> Christoph >> >> University of Oslo, Norway >> >> >> ------------------------------------------------------------------------------ >> Live Security Virtual Conference >> Exclusive live event will cover all the ways today's security and >> threat landscape has changed and how IT managers can respond. Discussions >> will include endpoint security, mobile security and the latest in malware >> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ >> _______________________________________________ >> wgs-assembler-users mailing list >> wgs...@li... >> https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users |