Concordia is named after the Roman goddess of agreement. It is a concordance searcher: a tool for translators who need their translations to "agree" with one standard.

Concordia is a C++ library for fast text lookup in large corpora. It uses an in-memory index, which takes up approximately 600 MB of RAM for a corpus of 2 million sentences. The index is based on a suffix array, supplemented by auxiliary data structures.
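To make the suffix array idea concrete, here is a minimal sketch of substring lookup over a sorted list of suffix offsets. It is not Concordia's implementation or API: the names `SuffixIndex`, `build_index` and `find_occurrences` are hypothetical, the index works on characters of a single string, and the naive sort-based construction is only suitable for small demo inputs.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical illustration type; not part of the Concordia API.
struct SuffixIndex {
    std::string text;                   // indexed text, kept in memory
    std::vector<std::size_t> suffixes;  // suffix start offsets in lexicographic order
};

// Build the suffix array by sorting suffix start offsets.
// Comparison-based construction is slow on large inputs; it is used here for clarity only.
SuffixIndex build_index(std::string text) {
    SuffixIndex index{std::move(text), {}};
    index.suffixes.resize(index.text.size());
    std::iota(index.suffixes.begin(), index.suffixes.end(), std::size_t{0});
    std::string_view view(index.text);
    std::sort(index.suffixes.begin(), index.suffixes.end(),
              [view](std::size_t a, std::size_t b) {
                  return view.substr(a) < view.substr(b);
              });
    return index;
}

// Return all offsets at which `pattern` occurs, in ascending order.
// Two binary searches delimit the block of suffixes that start with `pattern`.
std::vector<std::size_t> find_occurrences(const SuffixIndex& index,
                                          std::string_view pattern) {
    std::string_view view(index.text);
    // Only the first pattern.size() characters of each suffix are compared.
    auto prefix = [&](std::size_t pos) { return view.substr(pos, pattern.size()); };
    auto first = std::lower_bound(
        index.suffixes.begin(), index.suffixes.end(), pattern,
        [&](std::size_t pos, std::string_view p) { return prefix(pos) < p; });
    auto last = std::upper_bound(
        first, index.suffixes.end(), pattern,
        [&](std::string_view p, std::size_t pos) { return p < prefix(pos); });
    std::vector<std::size_t> hits(first, last);
    std::sort(hits.begin(), hits.end());
    return hits;
}
```

For the text "banana", `find_occurrences(build_index("banana"), "ana")` yields the offsets 1 and 3. Each query costs two binary searches over the sorted suffixes, which is why this family of indexes suits high query rates.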

The results are striking: Concordia performs simple substring lookup at a rate of 5000 queries per second on a personal computer, a speed that, to the authors' knowledge, no other search library achieves.

Moreover, Concordia can perform its own "concordia search": for a given input sentence, it retrieves all substring matches that cover parts of that sentence.
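One way to picture this kind of search is a naive scan that reports, for each corpus sentence, the runs of consecutive input words that appear verbatim in it. The sketch below is an assumption about the general idea only. It does not reproduce Concordia's algorithm, scoring or API, and `FragmentMatch`, `split_words`, `common_run`, `fragment_search` and the `min_words` threshold are all hypothetical names introduced for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type: a run of input words found verbatim in a corpus sentence.
struct FragmentMatch {
    std::size_t sentence_id;  // index of the corpus sentence
    std::size_t input_start;  // position of the first matched word in the input
    std::size_t length;       // number of consecutive matched words
};

// Split on whitespace; a real system would use a proper tokenizer.
std::vector<std::string> split_words(const std::string& s) {
    std::istringstream in(s);
    std::vector<std::string> words;
    for (std::string w; in >> w;) words.push_back(w);
    return words;
}

// Number of equal words when comparing a[i...] with b[j...] in step.
std::size_t common_run(const std::vector<std::string>& a, std::size_t i,
                       const std::vector<std::string>& b, std::size_t j) {
    std::size_t n = 0;
    while (i + n < a.size() && j + n < b.size() && a[i + n] == b[j + n]) ++n;
    return n;
}

// Naive scan (assumed behavior): for every corpus sentence and input position,
// keep the longest shared word run of at least `min_words` words.
std::vector<FragmentMatch> fragment_search(const std::vector<std::string>& corpus,
                                           const std::string& input,
                                           std::size_t min_words = 2) {
    const std::vector<std::string> query = split_words(input);
    std::vector<FragmentMatch> matches;
    for (std::size_t id = 0; id < corpus.size(); ++id) {
        const std::vector<std::string> sentence = split_words(corpus[id]);
        for (std::size_t i = 0; i < query.size(); ++i) {
            std::size_t best = 0;
            for (std::size_t j = 0; j < sentence.size(); ++j)
                best = std::max(best, common_run(query, i, sentence, j));
            if (best > 0 && best >= min_words) {
                matches.push_back({id, i, best});
                i += best - 1;  // skip input words already covered by this fragment
            }
        }
    }
    return matches;
}
```

With the corpus sentences "the cat sat on the mat" and "a dog sat on the sofa", the input "my cat sat on the sofa" produces two fragments: "cat sat on the" from the first sentence and "sat on the sofa" from the second. Together they cover every input word except "my". A suffix array index replaces the inner scan with fast lookups, which is what makes this practical on large corpora.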

This project now contains a fully functional Concordia search library. In the near future, it will be extended with concordia-server: a lightweight, robust web server providing corpus search functionality.

License

GNU Library or Lesser General Public License version 3.0 (LGPLv3)

Additional Project Details

Operating Systems

BSD, Linux

Languages

English

Intended Audience

Developers, Science/Research

User Interface

Console/Terminal

Programming Language

C, C++

Related Categories

C++ Information Analysis Software, C++ Linguistics Software, C++ Libraries, C Information Analysis Software, C Linguistics Software, C Libraries

Registered

2015-04-24