#2 stats summary miscounting genes

Milestone: 1.0
Status: open
Owner: nobody
Labels: None
Updated: 2014-04-23
Created: 2014-04-23
Private: No

The stats summary report miscounts genes when N-gaps split a gene into several fragments: each fragment is counted as a separate gene, which inflates the gene count. A fix can be found in JCVI's development directory for VIGOR.

Change the beginning of stats_report from

sub stats_report {
    my ( $OUT, $tab_delimited, $genome, $genelist ) = @_;
    my $mutations   = 0;
    my $genes       = 0;
    my $cdsbases    = 0;
    my $refbases    = 0;
    my $pepbases    = 0;
    my $coveredaa   = 0;
    my $identicalaa = 0;
    my $similaraa   = 0;
    for my $gene (@$genelist) {

to (adding a %genes hash that records which gene symbols have already been counted)

sub stats_report {
    my ( $OUT, $tab_delimited, $genome, $genelist ) = @_;
    my $mutations   = 0;
    my $genes       = 0;
    my $cdsbases    = 0;
    my $refbases    = 0;
    my $pepbases    = 0;
    my $coveredaa   = 0;
    my $identicalaa = 0;
    my $similaraa   = 0;
    my %genes;
    for my $gene (@$genelist) {

and change

        $genes++;

to

        my $genesym = gene_name_to_symbol( $$gene{gene_name} );
        if ( ! exists $genes{$genesym} ) {
            $genes++;
            $genes{$genesym} = 1;
        }
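
The effect of the change can be sketched in isolation as follows. This is only an illustration under stated assumptions: VIGOR's real gene_name_to_symbol is not shown in this ticket, so gene_name_to_symbol_sketch below is a hypothetical stand-in that assumes fragment names differ only by a numeric suffix such as ".1" or ".2". The count_genes helper and the sample data are likewise invented for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for VIGOR's gene_name_to_symbol(); the real
# function is not shown in this ticket. ASSUMPTION: fragments of one gene
# are named like "HA.1", "HA.2", so stripping the suffix gives the symbol.
sub gene_name_to_symbol_sketch {
    my ($gene_name) = @_;
    ( my $symbol = $gene_name ) =~ s/\.\d+$//;
    return $symbol;
}

# Count distinct gene symbols instead of individual fragments.
# (count_genes is an illustrative helper, not part of VIGOR.)
sub count_genes {
    my ($genelist) = @_;
    my %genes;        # symbols already counted
    my $genes = 0;    # number of distinct genes
    for my $gene (@$genelist) {
        my $genesym = gene_name_to_symbol_sketch( $gene->{gene_name} );
        if ( !exists $genes{$genesym} ) {
            $genes++;
            $genes{$genesym} = 1;
        }
    }
    return $genes;
}

# Invented sample: one gene split into two fragments, plus one intact gene.
my @genelist = (
    { gene_name => 'HA.1' },
    { gene_name => 'HA.2' },
    { gene_name => 'NA' },
);
print count_genes( \@genelist ), "\n";    # prints 2
```

With the original unconditional $genes++, the same three entries would be counted as three genes.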
