"Taxamatch" is an algorithm for fuzzy matching of scientific names of taxa (genera alone, or binomials of genus plus species) in taxonomic databases. It combines character substitution (similar to Soundex) to catch phonetic errors with a customised edit distance (ED) approach to catch non-phonetic ones, which can account for up to 50% of all errors in real-world queries. Because ED-based queries are typically slow against large data sets, Taxamatch includes a range of optimisations that heavily reduce the number of names tested at query time without reducing recall of the likely intended, correctly spelled target names, speeding up overall query time by a factor of between 100 and 1000.
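The two-stage idea described above can be illustrated with a minimal Python sketch. This is not the Taxamatch implementation: the function names (`phonetic_key`, `edit_distance`, `fuzzy_match`), the handful of substitution rules, the `max_ed` threshold, and the length-based pre-filter are all assumptions chosen for illustration, and plain Levenshtein distance stands in for the customised ED that Taxamatch uses.

```python
def phonetic_key(name: str) -> str:
    """Reduce a name to a crude phonetic key.

    The substitution rules are hypothetical examples,
    not the rule set used by Taxamatch.
    """
    s = name.lower()
    for old, new in (("ae", "e"), ("oe", "e"), ("ph", "f"), ("y", "i"), ("k", "c")):
        s = s.replace(old, new)
    # Collapse runs of the same letter ("ll" -> "l").
    out = []
    for ch in s:
        if not out or out[-1] != ch:
            out.append(ch)
    return "".join(out)


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def fuzzy_match(query: str, names: list[str], max_ed: int = 2) -> list[tuple[str, str]]:
    """Return (name, match_type) pairs for names close to the query."""
    q = query.lower()
    q_key = phonetic_key(query)
    hits = []
    for name in names:
        n = name.lower()
        # Stage 1: phonetic match via the normalised key.
        if phonetic_key(name) == q_key:
            hits.append((name, "phonetic"))
            continue
        # Cheap pre-filter: a length difference above max_ed
        # already guarantees an edit distance above max_ed.
        if abs(len(n) - len(q)) > max_ed:
            continue
        # Stage 2: edit distance on the surviving candidates only.
        if edit_distance(q, n) <= max_ed:
            hits.append((name, "edit-distance"))
    return hits


names = ["Physalis", "Quercus", "Homo"]
print(fuzzy_match("Fisalis", names))
print(fuzzy_match("Qurecus", names))
```

In this sketch the misspelling "Fisalis" is caught by the phonetic stage, while the transposed "Qurecus" falls through to the edit distance stage. The length pre-filter is only one simple example of how the candidate set can be narrowed before the expensive comparison runs.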

For a more complete discussion of the algorithm, see the journal article published in 2014: "Taxamatch, an Algorithm for Near (‘Fuzzy’) Matching of Scientific Names in Taxonomic Databases", https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0107510



Additional Project Details

Registered: 2020-12-21