Name                                   Modified     Size
README-data.txt                        2017-09-15   1.5 kB
gnat_sql_data-1.5.tar.gz               2017-09-15   595.6 MB
gnat_GoMeSH_dictionaries-1.3.tar.gz    2015-11-11   22.6 MB
gnat_human_dictionaries-1.3.tar.gz     2015-08-27   22.9 MB
gnat_dictionaries-1.1.tar.gz           2011-07-21   125.1 MB
Totals: 5 items                                     766.2 MB

This folder contains important data sources for GNAT.

Always obtain the latest data releases, even partial ones, to replace older
data. All data releases are currently forward and backward compatible, but
this is not guaranteed for future releases. As of August 2017, for example,
the binary release 1.3 works best with
- gnat_human_dictionaries-1.3.tar.gz (latest human dictionaries)
- gnat_dictionaries-1.1.tar.gz (human, mouse, rat, fruit fly, etc. dictionaries)
- gnat_sql_data-1.5.tar.gz (database)
However, any other combination will also work.


Dictionaries are used by GNAT for the initial recognition of gene names in
text. There is one dictionary (split into binary files) for each species.

SQL data must be loaded into a relational database system. GNAT needs it to
disambiguate the recognized gene names and assign NCBI Entrez Gene IDs. The SQL
data represent a profile for each gene, describing its (and its products')
function, tissue specificity, synonyms, disease associations, and so on.


Dictionaries always go into the dictionaries/ folder, with one subfolder per
species, named by its NCBI Taxonomy ID (9606 for human, 10090 for mouse, 10116
for rat, etc.).
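
The folder layout described above can be checked with a short script. The
following is only a sketch, assuming that each species subfolder is named by
the bare numeric Taxonomy ID and nothing else; the helper names
(dictionary_dir, installed_taxa) are made up for illustration and are not part
of GNAT. The IDs listed are the standard NCBI ones.

```python
from pathlib import Path

# Standard NCBI Taxonomy IDs for a few common species.
TAXON_IDS = {"human": 9606, "mouse": 10090, "rat": 10116, "fruit fly": 7227}


def dictionary_dir(gnat_root, taxon_id):
    """Return the expected dictionary folder for one species (assumed layout)."""
    return Path(gnat_root) / "dictionaries" / str(taxon_id)


def installed_taxa(gnat_root):
    """List the Taxonomy IDs that have a subfolder under dictionaries/."""
    base = Path(gnat_root) / "dictionaries"
    if not base.is_dir():
        return []
    # Only purely numeric folder names are treated as Taxonomy IDs.
    return sorted(int(p.name) for p in base.iterdir()
                  if p.is_dir() and p.name.isdigit())
```

For example, installed_taxa("/path/to/gnat") returns the sorted IDs found, and
comparing that list with TAXON_IDS shows which species dictionaries are still
missing.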

Access to the SQL database can be configured in the isgn_properties.xml file:
database username (ideally, a read-only user), password, DB host, port, etc.
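
Before starting GNAT, it may help to verify that the database settings are
filled in. The sketch below assumes that isgn_properties.xml uses the Java-style
XML properties layout (entry elements with a key attribute); this README does
not confirm that. The key names dbUser, dbPass, dbHost, and dbPort are
placeholders, not the actual GNAT keys, and need to be replaced with the names
found in your copy of the file.

```python
import xml.etree.ElementTree as ET

# Placeholder key names (assumption): replace with the keys that actually
# appear in your isgn_properties.xml.
REQUIRED_KEYS = ("dbUser", "dbPass", "dbHost", "dbPort")


def read_properties(xml_text):
    """Parse <entry key="...">value</entry> elements into a dict."""
    root = ET.fromstring(xml_text)
    return {e.get("key"): (e.text or "").strip() for e in root.iter("entry")}


def missing_keys(props, required=REQUIRED_KEYS):
    """Return the required keys that are absent or have an empty value."""
    return [k for k in required if not props.get(k)]
```

A typical use would be to read the file contents, pass them to
read_properties, and report whatever missing_keys returns, without ever
printing the password value itself.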


Please also see the general README and installation instructions provided with
the code releases and/or the individual data packages.

Source: README-data.txt, updated 2017-09-15