analyse_newbler

Files:
	readme.txt	2011-11-16	1.3 kB
	analyseace.pl	2011-11-16	14.4 kB

Version 1.0, July 2008

Reports a number of statistics, including per-contig coverage,
for all contigs in an ace file produced by the Newbler assembler
from 454 Life Sciences (454Contigs.ace).

The script also works on other ace files (tested in a few cases),
but with at least one caveat:
the reported number of reads will most likely be TOO LOW.

Note that the symbol '*' stands for a gap in the alignment.
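
To illustrate how the '*' gap symbol can affect per-contig numbers, here is a minimal sketch in Python (it is not the Perl code of analyseace.pl). It assumes that gaps are removed before length and GC% are computed; the function name ungapped_stats is hypothetical, and the script's own handling may differ.

```python
def ungapped_stats(consensus):
    """Return (sequence, length, GC%) of a consensus after removing '*' gaps.

    Assumption: '*' columns do not count towards length or GC%.
    """
    seq = consensus.replace("*", "").upper()
    length = len(seq)
    gc = sum(1 for base in seq if base in "GC")
    # Guard against an empty sequence to avoid dividing by zero.
    gc_percent = 100.0 * gc / length if length else 0.0
    return seq, length, gc_percent
```

For example, ungapped_stats("AC*GT") returns the sequence "ACGT" with length 4 and a GC% of 50.0.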

output:
fasta file:
	>contig name, length, coverage, GC%, number of reads, number of bases
	contig consensus sequence
stats.tsv:
	table with the above information for each contig
cov.tsv:
	coverage frequency distribution
metrics.tsv:
	assembly metrics
len.tsv:
	length frequency distribution
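
As a rough illustration of what a frequency distribution such as cov.tsv or len.tsv could contain, the following Python sketch counts values per bin and formats the result as tab-separated rows. The bin size, the two-column layout, and the function names are assumptions made for this example; the actual columns written by analyseace.pl are not documented here.

```python
from collections import Counter


def frequency_distribution(values, bin_size=1):
    """Count how many values fall into each bin of width bin_size.

    Each value is assigned to the lower bound of its bin.
    Returns a list of (bin_start, count) tuples sorted by bin_start.
    """
    counts = Counter(int(v // bin_size) * bin_size for v in values)
    return sorted(counts.items())


def to_tsv(rows):
    """Join rows into tab-separated lines (hypothetical layout)."""
    return "\n".join("\t".join(str(x) for x in row) for row in rows)
```

With contig lengths 135, 140, 250 and 999 and a bin size of 100, the bins starting at 100, 200 and 900 hold 2, 1 and 1 contigs respectively.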

EXAMPLE lines of an ace file:
CO contig13921 135 2 3 U --> contig name, first line of each contig

CO <contig name> <# of bases> <# of reads in contig> <# of base segments in contig> <U or C>
	this is followed by the consensus sequence, including *s

BQ --> indicates start of quality data, or end of consensus sequence

QA 45 179 45 179 --> last two numbers are the start and stop bases of the read;
	the region between these has been used to build the consensus
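
The CO and QA lines described above can be read with a few lines of code. The sketch below is in Python and is not taken from analyseace.pl. It assumes that coverage is the sum of the aligned read regions (last two QA numbers) divided by the number of bases on the CO line; the script may define coverage differently. The function name and dictionary keys are hypothetical.

```python
def summarise_ace(lines):
    """Collect per-contig read counts and aligned bases from ace lines."""
    contigs = {}
    current = None
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "CO" and len(fields) >= 6:
            # CO <contig name> <# of bases> <# of reads> <# of segments> <U or C>
            current = fields[1]
            contigs[current] = {
                "bases": int(fields[2]),
                "reads_declared": int(fields[3]),
                "reads_seen": 0,
                "aligned_bases": 0,
            }
        elif fields[0] == "QA" and len(fields) >= 5 and current is not None:
            # The last two numbers are the start and stop of the aligned region.
            start, stop = int(fields[3]), int(fields[4])
            if start >= 1 and stop >= start:
                contigs[current]["reads_seen"] += 1
                contigs[current]["aligned_bases"] += stop - start + 1
    for info in contigs.values():
        # Assumed definition: aligned read bases per consensus base.
        bases = info["bases"]
        info["coverage"] = info["aligned_bases"] / bases if bases else 0.0
    return contigs
```

Called with open("454Contigs.ace") as the lines argument, this returns one dictionary per contig. Comparing reads_declared with reads_seen is one way to notice the too-low read counts mentioned for non-Newbler ace files.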

License: Creative Commons Attribution-NonCommercial 3.0 Unported (CC BY-NC 3.0, http://creativecommons.org/licenses/by-nc/3.0/)

Release notes:
Current version:
version 1.0, July 2008, released November 2011
	first release

Source: readme.txt, updated 2011-11-16