Deploy in 115+ regions with the modern database for every enterprise.
MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
Start Free
Train ML Models With SQL You Already Know
BigQuery automates data prep, analysis, and predictions with built-in AI assistance.
Build and deploy ML models using familiar SQL. Automate data prep with built-in Gemini. Query 1 TB and store 10 GB free monthly.
An extremely fast implementation of Aho Corasick algorithm
AhoCorasickDoubleArrayTrie is a Java implementation of the Aho–Corasick multi-pattern matching algorithm that is optimized using a Double-Array Trie data structure. It is designed for fast keyword scanning across large texts, where you want to search for many patterns simultaneously and efficiently. The core idea is to build an automaton from a dictionary of patterns, then stream through input text to emit matches with minimal overhead. By using a double-array trie representation, the...
The DjVu complete solution,with OCR Technology(Arabic ,English).
DjVu++ is a user-friendly program that used to manipulate DjVu file formats
such as eBooks with a penalty of editing features. The program introduce a free
replacement for the property PDF format with similar resolution and smaller
file size
DjVu++ also support OCR to handle text in scanned books and images.
The program shows good performance for English. In addition to the Arabic
language to lead free and commercial software in this area.
The main features of DjVu++ program are:
o...
This is a Java-based project for complex event extraction from text and co-reference resolution. Currently the code can read BioNLP shared task format (http://2011.bionlp-st.org/) and i2b2 Natural Language Processing for Clinical Data shared task format (https://www.i2b2.org/NLP/DataSets/Main.php). Event extraction includes finding events and the parameters for an event in a text.
reputron is a knowledge extraction engine platform that covers all aspect of text mining, relevance, indexing and querying on a corpus of text documents.