Showing 29 open source projects for "version 2"

  • 1

    WordCount

    Count frequency of single, 2-word and 3-word clusters in a text

    The program can read a text file and count the occurrences of single words and clusters of 2 and 3 words. The resulting list will be sorted in descending order (highest frequency on top).
    Downloads: 8 This Week
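The counting described for WordCount can be illustrated with a minimal Python sketch. This is not the project's code: the function name `count_clusters` and the naive regex tokenization are assumptions made for illustration only.

```python
import re
from collections import Counter


def count_clusters(text, sizes=(1, 2, 3)):
    """Count word clusters (n-grams) of the given sizes in a text.

    Hypothetical helper, not taken from the WordCount project.
    """
    # Naive tokenization (an assumption): lowercase runs of word characters
    # and apostrophes.
    words = re.findall(r"[\w']+", text.lower())
    counts = {}
    for n in sizes:
        clusters = (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
        # most_common() returns pairs sorted by descending frequency.
        counts[n] = Counter(clusters).most_common()
    return counts


sample = "the cat sat on the mat and the cat slept"
result = count_clusters(sample)
```

Here `result[1]` lists single words, `result[2]` two-word clusters, and `result[3]` three-word clusters, each with the most frequent entry first.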
  • 2

    Tokenized Text Aligner

    Aligns tokens in two versions of a text with differing tokenization.

    This tool performs token-by-token alignment of two versions of a text with differing tokenization by interpreting the results of a file diff (https://docs.python.org/3/library/difflib.html). It is intended for use in the preparation of annotated linguistic corpora, where differences in tokenization may arise (i) following corrections or modifications to the source text or (ii) through the creation of different layers of annotation (part-of-speech, treebank) requiring different tokenization....
    Downloads: 0 This Week
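The diff-based alignment idea can be sketched with Python's `difflib.SequenceMatcher`. This is only an illustration under assumptions, not the tool's actual implementation: the function name `align_tokens` and the grouping of non-matching spans are hypothetical.

```python
from difflib import SequenceMatcher


def align_tokens(a, b):
    """Align two token lists; return a list of (a_tokens, b_tokens) groups.

    Hypothetical helper, not taken from the Tokenized Text Aligner project.
    """
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    aligned = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Identical tokens align one to one.
            aligned.extend(([x], [y]) for x, y in zip(a[i1:i2], b[j1:j2]))
        else:
            # Replaced, inserted, or deleted spans are kept together as a group.
            aligned.append((a[i1:i2], b[j1:j2]))
    return aligned


v1 = ["I", "do", "n't", "know", "."]
v2 = ["I", "don't", "know", "."]
pairs = align_tokens(v1, v2)
```

In this example the two tokens `do` and `n't` from the first version end up grouped against the single token `don't` from the second, while the unchanged tokens pair off individually.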
  • 3

    Arabic Corpus

    Text categorization, Arabic language processing, language modeling

    The Arabic Corpus, compiled by Dr. Mourad Abbas (http://sites.google.com/site/mouradabbas9/corpora). The Khaleej-2004 corpus contains 5690 documents divided into 4 topics (categories). The Watan-2004 corpus contains 20291 documents organized into 6 topics (categories). Researchers who use these two corpora should cite the two main references: (1) For the Watan-2004 corpus: M. Abbas, K. Smaili, D. Berkani, (2011) Evaluation of Topic Identification Methods on...
    Downloads: 9 This Week
  • 4

    concordia

    Powerful search library, best suited for computer-aided translation

    Concordia - Roman goddess of agreement. Concordance searcher - a tool for translators who need their translations to "agree" with one standard. Concordia is a C++ library for fast text lookup in large corpora. It uses a RAM-stored index, which takes up approximately 600MB of memory for a corpus of 2 million sentences. It is based on the idea of a suffix array, enhanced by other auxiliary data structures. The effects are stunning - Concordia is able to do simple substring...
    Downloads: 0 This Week
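The suffix-array idea behind this kind of lookup can be sketched in a few lines of Python. This is a toy illustration only: it is not Concordia's C++ API, it omits the auxiliary data structures the description mentions, and the names `build_suffix_array` and `find_all` are hypothetical.

```python
def build_suffix_array(text):
    """Return suffix start offsets sorted by the suffix they begin.

    Naive construction for illustration; real libraries use faster algorithms.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_all(text, sa, pattern):
    """Return sorted start offsets of pattern in text, using binary search."""
    m = len(pattern)
    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


text = "banana"
sa = build_suffix_array(text)
hits = find_all(text, sa, "ana")
```

Because all suffixes that start with the pattern sit next to each other in the sorted array, each query needs only two binary searches rather than a scan of the whole text.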
  • 5
    Welsh Natural Language Toolkit
    ...The modules are written in Java and ‘wrapped’ for execution under the General Architecture for Text Engineering (GATE) framework. The project also includes CYMRIE, a Welsh adaptation of the GATE ANNIE Named Entity Recognition (NER) application, covering entities such as persons, organisations, locations, and date and time expressions. Version 2.x: the CYMRIE pipeline is accessible via an API, a standalone GUI, and a CLI. The CymrIE pipeline has also been adapted for Twitter.
    Downloads: 0 This Week
  • 6

    PADIC

    A multilingual Parallel Arabic DIalectal Corpus

    PADIC (Parallel Arabic DIalectal Corpus) is a multi-dialectal corpus built in the framework of the National Research Project "TORJMAN", led by the Scientific and Technical Research Center for the Development of Arabic Language and funded by the Algerian Ministry of Higher Education and Scientific Research. PADIC comprises 6 dialects (two Algerian dialects, from the cities of Algiers and Annaba, plus Palestinian, Syrian, Tunisian, and Moroccan) and MSA. Mourad Abbas Computational Linguistics Department,...
    Downloads: 0 This Week
  • 7

    Virastyar

    Virastyar is a spell checker for low-resource languages

    Virastyar is a free and open-source (FOSS) spell checker. It stands upon the shoulders of many free/libre/open-source (FLOSS) libraries developed for processing low-resource languages, especially Persian and RTL languages. Publications: Kashefi, O., Nasri, M., & Kanani, K. (2010). Towards Automatic Persian Spell Checking. SCICT. Kashefi, O., Sharifi, M., & Minaie, B. (2013). A novel string distance metric for ranking Persian respelling suggestions. Natural Language Engineering,...
    Downloads: 272 This Week
  • 8

    Drug Extraction

    Drug name extraction

    Drug name recognition and normalisation/grounding to DrugBank ids and standard names. The package provides 2 taggers: 1. DrugTagger - CRF-based with a DrugBank presence feature (see feature set for details). 2. DrugnameGazetteer - gazetteer/dictionary-based, with the dictionary created from the DrugBank.ca database. Both taggers include grounding/normalisation to DrugBank ids and standard names. Feature set: Word, Word-1, Word+1, Word-1_Word, Word_Word+1, DrugBankPresence, POS. The DrugBankPresence feature indicates the presence of the drug name in DrugBank. ...
    Downloads: 1 This Week
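The listed feature set can be pictured as one feature dictionary per token. The sketch below is an assumption-laden illustration, not the package's code: the function `token_features`, the boundary markers `<S>` and `</S>`, the lowercase dictionary lookup, and the tiny `drug_names` set standing in for the DrugBank dictionary are all hypothetical.

```python
def token_features(tokens, pos_tags, i, drug_names):
    """Build a feature dict for token i, mirroring the listed feature names.

    Hypothetical helper; the real package may compute these differently.
    """
    word = tokens[i]
    # Boundary markers are an assumption for the first and last tokens.
    prev_word = tokens[i - 1] if i > 0 else "<S>"
    next_word = tokens[i + 1] if i < len(tokens) - 1 else "</S>"
    return {
        "Word": word,
        "Word-1": prev_word,
        "Word+1": next_word,
        "Word-1_Word": f"{prev_word}_{word}",
        "Word_Word+1": f"{word}_{next_word}",
        # True when the token appears in the drug-name dictionary.
        "DrugBankPresence": word.lower() in drug_names,
        "POS": pos_tags[i],
    }


# Hypothetical stand-in for a dictionary built from DrugBank.
drug_names = {"aspirin", "ibuprofen"}
tokens = ["Take", "aspirin", "daily"]
pos_tags = ["VB", "NN", "RB"]  # assumed to come from an external POS tagger
features = [token_features(tokens, pos_tags, i, drug_names)
            for i in range(len(tokens))]
```

A sequence of such dictionaries is the usual input shape for CRF toolkits, though which toolkit the package uses is not stated in the listing.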
  • 9
    Downloads: 0 This Week
  • 10
    Board Game Language
    Board Game Language (BGL, pronounced "bagel") is a natural language syntax programming language for first-time programmers. It uses board games as a metaphor for programming concepts, with the goal of teaching users the foundations of programming.
    Downloads: 0 This Week
  • 11
    CoSyne Integrated Prototype
    Multilingual Content Synchronization with Wikis: CoSyne is a Research and Technological Development project co-funded by the European Union. Details: http://cosyne.eu
    Downloads: 0 This Week
  • 12

    BuckTagger

    User-assisted tool for Arabic stem entry to Buckwalter Morpho Analyzer

    Using rules written in a Drools decision table, BuckTagger determines the correct Buckwalter Tag based on morphological properties of the input, either automatically extracted or given by the user. At the moment, BuckTagger is not complete; it can only handle input that is uninflected; in lexical form (i.e., no clitics or affixes); and a perfect or imperfect verb. Preferably, the first and penultimate letters are diacritized/vocalized. The interface is in Arabic. See the README for...
    Downloads: 0 This Week
  • 13

    TextComparer

    Small Java program to compare two texts

    Small Java program to compare two texts, originally designed to find quotations in a Byzantine anthology. It can quite likely be used to detect plagiarism between two texts as well. A graphical interface allows easy navigation between corresponding parts of the two texts; it uses the http://software.jessies.org/salma-hayek/ Java TextArea for this. The program and the algorithm it uses probably leave much room for improvement.
    Downloads: 0 This Week
  • 14
    The Gramlab project aims to provide companies with free, open-source software tools that can be used by developers who are not specialists in language processing. Note: the GLabCorpus Manager tool requires a SolR server to be installed. To download it and for more information, please go to the Files section.
    Downloads: 0 This Week
  • 15
    Redundancy due to cut-paste operations in text creates bias in machine learning for NLP. This module takes a directory and produces a subset of the files in that directory (in a list) with an upper bound on similarity between two files.
    Downloads: 0 This Week
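One way to picture a similarity-bounded subset is a greedy filter. The sketch below is an assumption, not the module's method: the listing does not say which similarity measure or selection strategy it uses, so `difflib`'s ratio, the 0.8 bound, and the function names are all stand-ins chosen for illustration.

```python
from difflib import SequenceMatcher
from pathlib import Path


def similarity(a, b):
    """Similarity in [0, 1]; difflib's ratio is a stand-in measure."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def select_subset(texts, max_similarity=0.8):
    """Greedily keep names whose text stays within the bound of all kept ones.

    texts maps a file name to its content. Hypothetical helper.
    """
    kept = []
    for name in sorted(texts):
        if all(similarity(texts[name], texts[k]) <= max_similarity for k in kept):
            kept.append(name)
    return kept


def select_from_directory(directory, max_similarity=0.8):
    """Read every regular file in a directory and return the selected names."""
    texts = {p.name: p.read_text(encoding="utf-8", errors="replace")
             for p in Path(directory).iterdir() if p.is_file()}
    return select_subset(texts, max_similarity)


docs = {
    "a.txt": "the quick brown fox",
    "b.txt": "the quick brown fox!",
    "c.txt": "zzzz yyyy",
}
chosen = select_subset(docs, max_similarity=0.8)
```

A greedy pass depends on file order and does not guarantee the largest possible subset, but every pair of files it returns respects the similarity bound.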
  • 16
    Smulsa 2001 is a multilingual two-way translator, transliterator, and dictionary. Its dependencies are gambas2-ide & gambas2-gb-db-sqlite. The application needs its database to run properly.
    Downloads: 0 This Week
  • 17
    jWords is a Java port of WORDS, a free Latin-to-English dictionary program written in Ada by William Whitaker. In addition, the dictionary will be translated into German.
    Downloads: 0 This Week
  • 18
    espeak-tswana is a branch of the espeak project implementing Setswana (a Southern African Bantu language).
    Downloads: 0 This Week
  • 19
    Affisix
    Affisix is a program for automatic recognition of prefixes. It takes a large number of words and, according to the user's settings, tries to determine which segments of these words are prefixes.
    Downloads: 0 This Week
  • 20
    Genie
    Genie is a highly sophisticated cognitive child-machine. Genie at its core is an artificial intelligence project, focusing on creating a new form of life.
    Downloads: 0 This Week
  • 21
    Python API to the German wordnet GermaNet. In its current state this can only be seen as a quick-start aid for accessing GermaNet. To be honest, it can't really be called an API. To use it, you will need access to a licensed copy of GermaNet (version > 5).
    Downloads: 0 This Week
  • 22
    Language-area converter (Sprachraumkonverter), a continuation of V3C, Plan C
    Downloads: 0 This Week
  • 23
    A collection of metasyntaxes such as EBNF for .NET, including a definition file parser and an expression tree.
    Downloads: 0 This Week
  • 24
    Nekoshka is a cross-platform open-source shell for Japanese dictionaries like edict and yarxi. It supports radical lookup, handwriting and direct keyboard input.
    Downloads: 0 This Week
  • 25
    SEPa! is a Java grammar-related class library.
    Downloads: 0 This Week