Showing 108 open source projects for "text processing"

View related business solutions
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build, govern, and optimize agents and models with Gemini Enterprise Agent Platform.
    Start Free
  • 1
    lottie vectors

    lottie vectors

    Create, display and process 2D vectors in a 3D window.

    Lottie Vectors is an application for Matlab that alows you to do some pretty neat things -with vectors. More exactly -displaying them in ways that hopefully will allow you to explore and better understand your vector data. The basic idea is simple, take a vector defined in one of a few different types of data formats and map it on the screen. Add another vector and you start to form a 'route'. Each route or position vector can be accompanied with a 'force' vector. This can be used to...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 2
    IceNLP is an open source Natural Language Processing (NLP) toolkit for analyzing and processing Icelandic text. The toolkit is implemented in Java.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    High-Throughput Tabular Data Processor
    ...Citation: Madanecki P, Bałut M, Buckley PG, Ochocka JR, Bartoszewski R, Crossman DK, et al. (2018) High-Throughput Tabular Data Processor – Platform independent graphical tool for processing large data sets. PLoS ONE 13(2): e0192858. https://doi.org/10.1371/journal.pone.0192858
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Capsim(r) C Text Mode Kernel(TMK),DSP and communication blocks, topologies, libraries and tools for the development of high performance block diagram digital signal processing and communications systems,built in interpreter for scripting.SystemC support.
    Downloads: 1 This Week
    Last Update:
    See Project
  • $300 in Free Credit Towards Top Cloud Services Icon
    $300 in Free Credit Towards Top Cloud Services

    Build VMs, containers, AI, databases, storage—all in one place.

    Start your project in minutes. After credits run out, 20+ products include free monthly usage. Only pay when you're ready to scale.
    Get Started
  • 5

    Musaheb

    An Arabic collocation extraction tool

    “Musaheb”, an Arabic collocation extraction tool that has been designed and implemented to overcome the limitations of existing collocation extraction tools. “Musaheb” is able to extract n-gram collocations up to 5-gram, in addition to extracting the collocates of the nodes (the word-types we are looking for its collocates) within a window size of zero to 15 words. Moreover, it provides eight collocation statistics to calculate the strength of the collocation, and permits the input of...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6

    PSIworx

    Data processing for the PSI fluorometer

    PSIworxR (R) and PSIworx (MATLAB) are a collection of functions and scripts to analyze data from the PSI SuperHead Fast Fluorometer series (www.psi.cz). These fluorometers are used in limnology and oceanography research communities. The program retrieves parameters from single turnover induction and relaxation. Results are returned as text files and PDF figures.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Welsh Natural Language Toolkit
    The project supports the Welsh Language Technology domain with a set of NLP tools that drive innovation and advance the development of sophisticated textual analysis solutions. The WNLT project delivers four core NLP modules; a) Word Segmentation for separating text into words b) Sentence Boundary Disambiguation for finding sentence boundaries c) Part of Speech Tagger for determining the part of speech of each word d) Morphological Analyser for identifying the root form (lemma) of words....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8

    Discriminative Language Editor

    Discriminative language editor based on ontologies

    Text editor in Java that is able to detect discriminative expressions while the user is typing. When the internal ontology-based analyzer detects a potential discriminative expression the user is advised by underscoring the related words in the text. A descriptive message about the issue is also shown to the user when the cursor is placed over the potential discriminative expression.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9

    BioC

    We describe a simple XML format to share text documents and annotation

    A minimalist approach to share text documents and data annotations. Allows a large number of different annotations to be represented. Project files contain: - simple code to hold/read/write data and perform sample processing. - BioC-formatted corpora - BioC tools that work with BioC corpora BioC goals - simplicity - interoperability - broad use - reuse There should be little investment required to learn to use a format or a software module to process that format. ...
    Leader badge
    Downloads: 16 This Week
    Last Update:
    See Project
  • Streamline Azure Security with Palo Alto Networks VM-Series Icon
    Streamline Azure Security with Palo Alto Networks VM-Series

    Centrally manage physical and virtualized firewalls with Panorama

    Improve your security posture and reduce incident response time. Use the VM-Series to natively analyze Azure traffic and dynamically drive policy updates based on workload changes.
    Learn more
  • 10
    Welsh Natural Language Toolkit

    Welsh Natural Language Toolkit

    WNLT is a suite of open source natural language modules for the Welsh

    The project supports the Welsh Language Technology domain with a set of NLP tools that drive innovation and advance the development of sophisticated textual analysis solutions. The WNLT project delivers four core NLP modules; a) Word Segmentation for separating text into words b) Sentence Boundary Disambiguation for finding sentence boundaries c) Part of Speech Tagger for determining the part of speech of each word d) Morphological Analyser for identifying the root form (lemma) of words....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    ATTENTION! Morfologik is now at GitHub: https://github.com/morfologik/
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    MARF is a general cross-platform framework with a collection of algorithms for audio (voice, speech, and sound) and natural language text analysis and recognition along with sample applications (identification, NLP, etc.) of its use, implemented in Java.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    The Java Data Mining Package (JDMP) is a library that provides methods for analyzing data with the help of machine learning algorithms (e.g. clustering, classification, graphical models, neural networks, Bayesian networks, text processing, optimization).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    JInsect
    The JINSECT toolkit is a Java-based toolkit and library that supports and demonstrates the use of n-gram graphs within Natural Language Processing applications, ranging from summarization and summary evaluation to text classification and indexing.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15

    iMir

    Integrated pipeline for HT miRNA-Seq data analysis

    Processing of smallRNA-Seq data to gather biologically relevant information requires application of multiple statistical and bioinformatics tools from different sources, each focusing on a specific step of the analysis pipeline. The analytical workflow can be challenging for the continuous interventions by the operator, a critical factor when large numbers of datasets need to be analyzed at once.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    ArabicDiacritizer

    ArabicDiacritizer

    An automatic restoration of Arabic diacritic marks

    This is a software of Arabic diacritical marks restoration. It is based mainly on deep architectures using deep neural network. The algorithm generates diacritized text with determined end case. The algorithm is described in detail in: Ilyes Rebai, and Yassine BenAyed 'Text-to-speech synthesis system with Arabic diacritic recognition system', Computer Speech & Language, 2015. We appreciate it very much if you can cite our related work. ************** Installation...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    This project is devoted to the development of natural language processing tools and resources for the Lingala language, which is spoken by tens of millions of people in central Africa.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18

    Khawas

    An Arabic Corpora Processing Tool

    The new version is available at https://sourceforge.net/projects/ghawwasv4/
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    SetFon focus is an interface web based for Praat resources (www.praat.org) wich focus speech sound annalysis; it is a gerent program for acoustic analysis PHP/Mysql based. Developed with the framework SIMP.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    The BioNLP UIMA Component Repository provides UIMA wrappers for novel and well-known 3rd-party NLP tools used in biomedical text prosessing, such as tokenizers, parsers, named entity taggers, and tools for evaluation.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    FALCON - Text Search Java Project

    FALCON - Text Search Java Project

    JSON based text search Java Project

    ----------------- - What is it? - ----------------- The "Falcon Search" is a JAVA API and tool to search inside the documents. It was originally started to search the content in pdf files under the project "HAWK Search". Searching with this tool is query-based not word-based as in most of the document search tools OR document readers. It also takes care of jumbling of words within query and spelling mistakes. Commonly used techniques in this project are Natural Language...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22

    Bermuda Text-to-Speech

    This project includes basic NLP and DSP techniques for Text-to-Speech

    See TTS demo at: http://rslp.racai.ro/index.php?page=tts This is an entirely written in JAVA project which includes a set of tools and methods designed to enable Multilingual Text-to-Speech (TTS) synthesis. We currently support English and Romanian but we will soon train more models and make them available for download. If you want to read more about our other NLP and TTS tools check out http://nlptools.racai.ro.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    HAWK - PDF Text Search Java Project

    HAWK - PDF Text Search Java Project

    No more support for this project - TAKE A LOOK AT FALCONSEARCH

    No more support for this project - TAKE A LOOK AT FALCONSEARCH "https://sourceforge.net/projects/falcontextsearch/"
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Knowtator is a general-purpose text annotation tool that is integrated with the Protégé knowledge representation system. Knowtator facilitates the manual creation of training and evaluation corpora for a variety of biomedical language processing tasks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    gannu

    gannu

    Java API and tools for performing NLP and other AI tasks

    Java API and tools for performing a wide range of AI tasks such as: word sense disambiguation (released), optimization (5 Evolutionary Algorithms Implemented ETA February 2014), opinion mining (ETA November 2014) and text wikification (ETA July 2014). Gannu includes some graphical interfaces for scientific purposes. When using Gannu please cite: *Jiménez, F. V., Gelbukh, A. F. & Sidorov, G. (2013). Simple Window Selection Strategies for the Simplified Lesk Algorithm for Word Sense...
    Downloads: 0 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB