From: Allen D. <all...@uc...> - 2005-04-29 17:54:39
You have a typo: "transcat" should be "transact".

-allen

On Fri, 29 Apr 2005, Scott Cain wrote:

> Update of /cvsroot/gmod/schema/chado/load/bin
> In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv20281
>
> Modified Files:
> bulk_load_gff3.PLS
> Log Message:
> added --notransact and --nosequence command line options
>
>
> Index: bulk_load_gff3.PLS
> ===================================================================
> RCS file: /cvsroot/gmod/schema/chado/load/bin/bulk_load_gff3.PLS,v
> retrieving revision 1.21
> retrieving revision 1.22
> diff -C2 -d -r1.21 -r1.22
> *** bulk_load_gff3.PLS 28 Apr 2005 21:44:22 -0000 1.21
> --- bulk_load_gff3.PLS 29 Apr 2005 14:51:32 -0000 1.22
> ***************
> *** 63,66 ****
> --- 63,68 ----
> --analysis The GFF data is from computational analysis
> --noload Create bulk load files, but don't actually load them.
> + --nosequence Don't load sequence even if it is in the file
> + --notranscat Don't use a single transaction to load the database
> --validate Validate SOFA terms before attempting insert (can cause
> script startup to be slow, 0 (false) by default)
> ***************
> *** 75,78 ****
> --- 77,101 ----
> =over
>
> + =item Transactions
> +
> + This application will, by default, try to load all of the data at
> + once as a single transcation. This is safer from the database's
> + point of view, since if anything bad happens during the load, the
> + transaction will be rolled back and the database will be untouched.
> + The problem occurs if there are many (say, greater than a 2-300,000)
> + rows in the GFF file. When that is the case, doing the load as
> + a single transcation can result in the machine running out of memory
> + and killing processes. If --notranscat is provided on the commandline,
> + each table will be loaded as a separate transaction.
> +
> + =item Sequence
> +
> + By default, if there is sequence in the GFF file, it will be loaded
> + into the residues column in the feature table row that corresponds
> + to that feature. By supplying the --nosequence option, the sequence
> + will be skipped. You might want to do this if you have very large
> + sequences, which can be difficult to load. In this context, "very large"
> + means more than 200MB.
> +
> =item The ORGANISM table
>
> ***************
> *** 107,118 ****
> =item The Gap GFF reserved tag not supported
>
> ! Just flat out not supported--if you would like to see support, contact
> the authors
>
> =item Any custom (ie, lowercase-first) tag is supported
>
> ! Custom tags are supported, provided they already have an entry in the cvterm table.
> ! For example, if you have a custom tab, 'orf_classification', you need
> ! an entry in the dbxref and cvterm tables something like this:
>
>
> --- 130,141 ----
> =item The Gap GFF reserved tag not supported
>
> ! Just flat out not supported yet--if you would like to see support, contact
> the authors
>
> =item Any custom (ie, lowercase-first) tag is supported
>
> ! Custom tags are supported, provided they already have an entry in the
> ! cvterm table. For example, if you have a custom tab, 'orf_classification',
> ! you need an entry in the dbxref and cvterm tables something like this:
>
>
> ***************
> *** 158,162 ****
> =cut
>
> ! my ($ORGANISM, $GFFFILE, $DBNAME, $DBUSER, $DBPASS, $DBHOST, $DBPORT, $ANALYSIS, $ANALYSIS_GROUP, $GLOBAL_ANALYSIS, $NOLOAD, $VALIDATE);
>
> if (eval {require Bio::GMOD::Config;
> --- 181,185 ----
> =cut
>
> ! my ($ORGANISM, $GFFFILE, $DBNAME, $DBUSER, $DBPASS, $DBHOST, $DBPORT, $ANALYSIS, $ANALYSIS_GROUP, $GLOBAL_ANALYSIS, $NOLOAD, $VALIDATE, $NOTRANSACT, $NOSEQUENCE);
>
> if (eval {require Bio::GMOD::Config;
> ***************
> *** 188,191 ****
> --- 211,216 ----
> 'noload' => \$NOLOAD,
> 'validate' => \$VALIDATE,
> + 'notransact' => \$NOTRANSACT,
> + 'nosequence' => \$NOSEQUENCE,
> ) or ( system( 'pod2text', $0 ), exit -1 );;
>
> ***************
> *** 197,200 ****
> --- 222,227 ----
> $DBPORT ||='5432';
> $VALIDATE ||=0;
> + $NOTRANSACT ||=0;
> + $NOSEQUENCE ||=0;
>
> $GLOBAL_ANALYSIS=0;
> ***************
> *** 289,293 ****
> ########################
> my $db = DBI->connect("dbi:Pg:dbname=$DBNAME;port=$DBPORT;host=$DBHOST",
> ! $DBUSER,$DBPASS, {AutoCommit => 0});
>
> my $sth = $db->prepare("select nextval('$sequences{feature}')");
> --- 316,320 ----
> ########################
> my $db = DBI->connect("dbi:Pg:dbname=$DBNAME;port=$DBPORT;host=$DBHOST",
> ! $DBUSER,$DBPASS, {AutoCommit => $NOTRANSACT});
>
> my $sth = $db->prepare("select nextval('$sequences{feature}')");
> ***************
> *** 936,946 ****
>
> #deal with sequence
> ! open SEQ, ">$files{sequence}" or die;
> ! while (my $seq = $gffio->next_seq) {
> ! my $string = $seq->seq();
> ! my $name = $seq->display_id();
> ! print SEQ "UPDATE feature set residues='$string' WHERE uniquename='$name';\n";
> }
> - close SEQ;
>
> if(!$NOLOAD){
> --- 963,975 ----
>
> #deal with sequence
> ! unless ($NOSEQUENCE) {
> ! open SEQ, ">$files{sequence}" or die;
> ! while (my $seq = $gffio->next_seq) {
> ! my $string = $seq->seq();
> ! my $name = $seq->display_id();
> ! print SEQ "UPDATE feature set residues='$string' WHERE uniquename='$name';\n";
> ! }
> ! close SEQ;
> }
>
> if(!$NOLOAD){
> ***************
> *** 953,967 ****
> }
>
> ! $db->commit || die "commit failed: ".$db->errstr();
> $db->{AutoCommit}=1;
>
> #load sequence
> ! warn "Loading sequences (if any) ...\n";
> ! open SEQ, $files{sequence} or die;
> ! while (<SEQ>) {
> ! chomp;
> ! $db->do($_);
> }
> - close SEQ;
>
> warn "Optimizing database (this may take a while) ...\n";
> --- 982,998 ----
> }
>
> ! ($db->commit || die "commit failed: ".$db->errstr()) unless $NOTRANSACT;
> $db->{AutoCommit}=1;
>
> #load sequence
> ! unless ($NOSEQUENCE) {
> ! warn "Loading sequences (if any) ...\n";
> ! open SEQ, $files{sequence} or die;
> ! while (<SEQ>) {
> ! chomp;
> ! $db->do($_);
> ! }
> ! close SEQ;
> }
>
> warn "Optimizing database (this may take a while) ...\n";
>
> _______________________________________________
> Gmod-schema-cmts mailing list
> Gmo...@li...
> https://lists.sourceforge.net/lists/listinfo/gmod-schema-cmts
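
For anyone wondering why the spelling matters: the POD advertises --notranscat, but the GetOptions call registers 'notransact', so someone copying the flag from the docs would probably hit the "or ( system( 'pod2text', $0 ), exit -1 )" branch. Below is a minimal sketch of that mismatch. It is not taken from bulk_load_gff3.PLS: parse_options is a hypothetical helper that handles only the two new options, and it assumes a Getopt::Long recent enough to provide GetOptionsFromArray.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper (not part of the real script): parses only the two
# options added in this commit and reports whether parsing succeeded.
sub parse_options {
    my @args = @_;
    my ($notransact, $nosequence) = (0, 0);

    # Getopt::Long reports unknown options via warn(); silence that here
    # so the demo output stays readable.
    local $SIG{__WARN__} = sub { };

    my $ok = GetOptionsFromArray(
        \@args,
        'notransact' => \$notransact,   # spelling used in the code
        'nosequence' => \$nosequence,
    );
    return ($ok ? 1 : 0, $notransact, $nosequence);
}

# Compare the spelling from the code with the spelling from the POD.
for my $flag ('--notransact', '--notranscat') {
    my ($ok, $notransact) = parse_options($flag);
    printf "%-13s parsed=%d notransact=%d\n", $flag, $ok, $notransact;
}
```

The first spelling should parse and set the flag; the second should be rejected as an unknown option, since "notranscat" is not a prefix of "notransact" and so abbreviation matching does not rescue it.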