Taxamatch

"Taxamatch" is an algorithm for fuzzy matching of scientific names of taxa (genera alone, or genus+species binomials) in taxonomic databases. It combines character substitution (similar to Soundex) to catch phonetic errors with a customised edit distance (ED) approach to catch non-phonetic errors, which can account for up to 50% of all errors in real-world queries. Because ED-based queries are typically slow against large data sets, Taxamatch includes a range of optimisations that greatly reduce the number of names tested at query time without reducing recall of the likely intended, correctly spelled target names, cutting overall query time by a factor of 100 to 1000.
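
To make the two-stage idea concrete, here is a minimal Python sketch. It is not the Taxamatch implementation: the substitution rules in `phonetic_key` are invented for illustration, the distance is a plain optimal string alignment distance rather than Taxamatch's customised ED, and the length pre-filter merely stands in for the real optimisations. All function names are hypothetical.

```python
def phonetic_key(name: str) -> str:
    """Toy phonetic normalisation (illustrative rules, NOT Taxamatch's own)."""
    s = name.strip().lower()
    for old, new in (("ae", "e"), ("oe", "e"), ("ph", "f"), ("y", "i"), ("k", "c")):
        s = s.replace(old, new)
    # Collapse runs of the same letter, e.g. "ll" -> "l".
    out = []
    for ch in s:
        if not out or out[-1] != ch:
            out.append(ch)
    return "".join(out)


def edit_distance(a: str, b: str) -> int:
    """Optimal string alignment distance: insert, delete, substitute,
    and transpose adjacent characters, each costing 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def fuzzy_match(query: str, names: list[str], max_ed: int = 2):
    """Return (name, match_type, distance) tuples, closest first."""
    q, q_key = query.lower(), phonetic_key(query)
    hits = []
    for name in names:
        if phonetic_key(name) == q_key:
            hits.append((name, "phonetic", edit_distance(q, name.lower())))
            continue
        # Cheap pre-filter: the length difference is a lower bound on the
        # distance, so names that differ too much in length are skipped.
        if abs(len(name) - len(query)) > max_ed:
            continue
        ed = edit_distance(q, name.lower())
        if ed <= max_ed:
            hits.append((name, "edit", ed))
    return sorted(hits, key=lambda h: h[2])
```

With a small list such as `["Homo", "Panthera", "Phacops", "Quercus"]`, the query `"Facops"` reaches `Phacops` through the phonetic key, while `"Pantera"` reaches `Panthera` only through the edit-distance stage.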

For a more complete discussion of the algorithm, refer to this journal article published in 2014: "Taxamatch, an Algorithm for Near (‘Fuzzy’) Matching of Scientific Names in Taxonomic Databases". https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0107510
