Name | Modified | Size | Downloads / Week |
---|---|---|---|
README-data.txt | 2017-09-15 | 1.5 kB | |
gnat_sql_data-1.5.tar.gz | 2017-09-15 | 595.6 MB | |
gnat_GoMeSH_dictionaries-1.3.tar.gz | 2015-11-11 | 22.6 MB | |
gnat_human_dictionaries-1.3.tar.gz | 2015-08-27 | 22.9 MB | |
gnat_dictionaries-1.1.tar.gz | 2011-07-21 | 125.1 MB | |
Totals: 5 Items | | 766.2 MB | 0 |
This folder contains important data sources for GNAT. You should always obtain the latest data releases, even if only partial, to replace older data. All data are forward and backward compatible at this point, but this is not guaranteed for future releases.

As of August 2017, for example, the binary release 1.3 works best with

- gnat_human_dictionaries-1.3.tar.gz (latest human dictionaries)
- gnat_dictionaries-1.1.tar.gz (human, mouse, rat, fruit fly, etc. dictionaries)
- gnat_sql_data-1.5.tar.gz (database)

However, any other combination will also work.

Dictionaries are used by GNAT for the initial recognition of gene names in text. There is one dictionary (split into binary files) for each species.

The SQL data must be loaded into a relational database system. GNAT needs them to disambiguate the recognized gene names and assign NCBI Entrez Gene IDs. The SQL data represent a profile for each gene, describing its (and its products') function, tissue specificity, synonyms, disease association, and so on.

Dictionaries always go into the dictionaries/ folder, with subfolders organized by each species' NCBI Taxonomy ID (9606 for human, 10090 for mouse, 10116 for rat, etc.).

Access to the SQL database can be configured in the isgn_properties.xml file: database username (ideally, a read-only user), password, DB host, port, etc.

Please also see the general README and installation instructions provided with the code releases and/or the individual data packages.
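
To check that the dictionaries ended up where GNAT expects them, a small script along the following lines may help. It is only a sketch and not part of GNAT. It assumes nothing beyond the layout described above (a dictionaries/ folder with one subfolder per NCBI Taxonomy ID), and the function name `installed_species` is made up for this example.

```python
from pathlib import Path


def installed_species(dictionaries_dir):
    """Return the taxonomy IDs that have a non-empty subfolder.

    Assumption: each species dictionary lives in a subfolder of
    dictionaries_dir whose name is the numeric NCBI Taxonomy ID.
    The names and contents of the files inside are not inspected.
    """
    root = Path(dictionaries_dir)
    if not root.is_dir():
        return []
    found = []
    for sub in sorted(root.iterdir()):
        # Keep only numerically named folders that contain at least one entry.
        if sub.is_dir() and sub.name.isdigit() and any(sub.iterdir()):
            found.append(sub.name)
    return found


if __name__ == "__main__":
    # Hypothetical location; adjust to your GNAT installation.
    for tax_id in installed_species("dictionaries"):
        print("dictionary folder found for taxonomy ID", tax_id)
```

If the human dictionaries were unpacked correctly, the output should include 9606. An empty result usually means the archive was extracted into the wrong directory.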