The Perl Entrez Gene Parser project provides Perl parsers for NCBI's Entrez Gene data, based on regular expressions, Parse::RecDescent, Parse::Yapp and Perl-byacc. Some of them can parse the human genome annotations in minutes. Documentation and user guides are provided.
Updated the parser so it can parse the NCBI 4/5/2005 download. Minor changes and fixes to some code.
Added parser/indexer for NCBI's ASN.1-formatted sequence files (like GenBank records). Updated test, example scripts and documentation. Minor fix on parse_entrez_gene_example.pl. Added code to deal with CCDS xref and HUGO symbol (under gene properties! unlike before) in parse_entrez_gene_example.pl. Updated parser & indexer file handle code to work with Perl version 5.005_03 (the previous code, since 1.07, only worked with 5.6 or higher). Commented out the count_records call in testindex.t to allow successful tests on 5.005_03-compatible BioPerl versions.
Features added over V1.08 include: Added parser/indexer for NCBI's ASN.1-formatted sequence files (like GenBank records). Updated test, example scripts and documentation. Minor fix on parse_entrez_gene_example.pl. Added code to deal with CCDS xref and HUGO symbol (under gene properties! unlike before) in parse_entrez_gene_example.pl. Fixed incompatibility with Perl 5.005_03 in code introduced in V1.07 after the indexer was implemented.
Added a fast indexer (it indexes the human file in 21 seconds on one 2.4 GHz Xeon CPU). Added test scripts. Minor bug fixes. Added new convenience methods (rawdata() and fh()). File handles are now accepted too (by new() and fh(); new() now also accepts '-file', '-fh' and 'fh' in addition to 'file'). Updated documentation.
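To illustrate the note above about new() accepting '-file', '-fh', 'fh' and 'file', here is a minimal sketch of how a constructor might normalize those option names. The class name Sketch::Parser and its internals are hypothetical and are not taken from the package; only the option names and the fh() method name come from the release note. rawdata() is not modelled because the note does not say what it returns.

```perl
use strict;
use warnings;

# Hypothetical stand-in class for illustration only;
# this is NOT the package's real implementation.
package Sketch::Parser;

sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;

    # Normalize option names so that '-file' and 'file' are treated
    # alike, and likewise '-fh' and 'fh' (assumed behaviour).
    my %opt;
    for my $key (keys %args) {
        (my $name = lc $key) =~ s/^-//;
        $opt{$name} = $args{$key};
    }

    if (defined $opt{fh}) {
        # Caller supplied an already-open file handle.
        $self->fh($opt{fh});
    }
    elsif (defined $opt{file}) {
        # Caller supplied a file name; open it for reading.
        open my $fh, '<', $opt{file}
            or die "Cannot open $opt{file}: $!";
        $self->fh($fh);
    }
    return $self;
}

# Combined getter/setter for the input file handle.
sub fh {
    my ($self, $fh) = @_;
    $self->{fh} = $fh if defined $fh;
    return $self->{fh};
}

package main;

# Use an in-memory file handle (Perl 5.8 or later) so the sketch
# runs without any download file on disk.
my $text = "Entrezgene ::= {\n}\n";
open my $in, '<', \$text or die "Cannot open in-memory handle: $!";

my $parser = Sketch::Parser->new(-fh => $in);
my $first  = readline($parser->fh);
print $first;
```

Stripping the leading dash before dispatching keeps the two spellings of each option on a single code path, which is one simple way to support both naming styles.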
Version 1.05 is released in response to the new NCBI Entrez Gene download file. NCBI introduced some changes to the file format that broke earlier versions of my parsers. Version 1.05 works with both the new NCBI format and the previous one, so it is recommended that you download this version. Note, however, that this version is about 30% slower than version 1.04 due to the changes. IMPORTANT NOTE: I only fixed my regex-based parser for the new NCBI Entrez Gene file, which means that from this version on, the other parsers in the package will no longer work on the latest NCBI Entrez Gene downloads. Version 1.05 will also be the last version of my package that contains ALL FOUR parsers I created. The regex-based parser will, however, be actively maintained in future versions.
Copyright © 2009 Geeknet, Inc. All rights reserved.