Showing 368 open source projects for "text processing"

  • 1

    Rootvole

    a text parsing library that matches text with concepts.

    For general processing of voice queries we developed a text parsing library named 'Rootvole' that can be used to match text with semantic concepts. The algorithm was implemented in Java and can be described as a form of a parsing expression grammar, where we generate the expressions to be detected beforehand by regular expressions and store them in a vocabulary.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2

    Speechalyzer

    Process large speech data with respect to transcription, labeling and annotation

    ...It is implemented as a client-server framework in Java and interfaces with software for speech recognition, synthesis, speech classification and quality evaluation. Its main application is processing training data for speech recognition and classification models and running benchmark tests on speech-to-text, text-to-speech and speech classification software systems.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3

    Welsh Natural Language Toolkit

    WNLT is a suite of open source natural language modules for the Welsh

    The project supports the Welsh Language Technology domain with a set of NLP tools that drive innovation and advance the development of sophisticated textual analysis solutions. The WNLT project delivers four core NLP modules: a) Word Segmentation for separating text into words; b) Sentence Boundary Disambiguation for finding sentence boundaries; c) Part of Speech Tagger for determining the part of speech of each word; d) Morphological Analyser for identifying the root form (lemma) of words....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4

    Osman Arabic Text Readability

    Open Source tool for Arabic text readability

    We present OSMAN (Open Source Metric for Measuring Arabic Narratives), a novel open source Arabic readability metric and tool. The open source Java tool allows users to calculate readability for Arabic text (with and without diacritics). The tool provides methods to split the text into words and sentences and to count syllables, Faseeh letters, and hard and complex words, in addition to adding diacritics (vocalising text). This makes the tool useful for researchers and educators working with Arabic text....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 5

    WebDjVuTextEd

    Edit the OCR text layer of DjVu documents in a web browser

    WebDjVuTextEd lets you edit the text layer of OCR'ed DjVu documents in a web browser. You can modify the structure (paragraphs, lines, words...), create, delete, and edit text nodes, adjust their container boxes with the mouse, and run a spellchecker. The program does not read DjVu files directly; it requires exported XML text data and images. When used without a web server, you can open and save local files, but cannot take advantage of auto-save and spell checking. Note that current SVN...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6

    Virastyar

    Virastyar is a spell checker for low-resource languages

    Virastyar is a free and open-source (FOSS) spell checker. It stands upon the shoulders of many free/libre/open-source (FLOSS) libraries developed for processing low-resource languages, especially Persian and RTL languages. Publications: Kashefi, O., Nasri, M., & Kanani, K. (2010). Towards Automatic Persian Spell Checking. SCICT. Kashefi, O., Sharifi, M., & Minaie, B. (2013). A novel string distance metric for ranking Persian respelling suggestions. Natural Language Engineering,...
    Downloads: 51 This Week
    Last Update:
    See Project
  • 7
    MARF is a general cross-platform framework with a collection of algorithms for audio (voice, speech, and sound) and natural language text analysis and recognition along with sample applications (identification, NLP, etc.) of its use, implemented in Java.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8

    JCLALtext

    Text processing module for JCLAL

    JCLALtext is a class library designed to extend the JCLAL framework to text tasks. JCLALtext is free, open source, and developed in the Java programming language. It is distributed under the GNU license. Researchers can use the class library by adding it to their project.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    The Java Data Mining Package (JDMP) is a library that provides methods for analyzing data with the help of machine learning algorithms (e.g. clustering, classification, graphical models, neural networks, Bayesian networks, text processing, optimization).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 10
    ...The names of the portals are as follows: 1. http://dosyalar.hurriyet.com.tr/rss. 2. http://www.posta.com.tr/rss. 3. http://www.iha.com.tr/rss.html. 4. http://www.haberturk.com/rss. 5. http://www.radikal.com.tr/rss/. 6. http://www.zaman.com.tr/rss_rssMainPage.action?sectionId=341. This data set was created to support text mining operations on Turkish and to make experimental results reproducible. The TTC-3600 data set comes in 4 different forms in terms of pre-processing: 1. Original: No pre-processing step is applied. 2. FPS-5: The first five characters of each term are selected as the stem and stop-word elimination is performed. 3. FPS-7: The first seven characters of each term are selected as the stem and stop-word elimination is performed. 4. ...
    Downloads: 0 This Week
    Last Update:
    See Project
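    The FPS-5 and FPS-7 forms described in the entry above amount to fixed-prefix stemming combined with stop-word removal. The following is a minimal Python sketch of that idea, not the data set's actual pre-processing code. The function name `fixed_prefix_stem`, the tiny stop-word set, the sample sentence, and the choice to remove stop words before truncating are all assumptions made for illustration.

```python
# Hypothetical stop-word list for illustration only; the list actually used
# for TTC-3600 is not given in the listing.
STOP_WORDS = {"ve", "bir", "bu"}


def fixed_prefix_stem(tokens, prefix_len=5, stop_words=STOP_WORDS):
    """Drop stop words, then keep the first `prefix_len` characters of each term.

    Assumption: stop words are removed before truncation; the listing does
    not state the order of the two steps.
    """
    return [tok[:prefix_len] for tok in tokens if tok not in stop_words]


# Sample input (already lowercased and whitespace-tokenized for simplicity).
tokens = "bu kitaplar ve defterler masada".split()

fps5 = fixed_prefix_stem(tokens, prefix_len=5)  # FPS-5 style stems
fps7 = fixed_prefix_stem(tokens, prefix_len=7)  # FPS-7 style stems
print(fps5)
print(fps7)
```

    Terms shorter than the prefix length are kept whole, since slicing past the end of a string simply returns the full string.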
  • 11

    Vision2u

    free image processing software

    Vision2u offers free image processing software for personal use and research. Basic image processing tasks can be carried out through simple operation of the software. Any webcam owner can perform simple measuring, counting, or monitoring tasks without high capital outlay.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    JInsect
    The JINSECT toolkit is a Java-based toolkit and library that supports and demonstrates the use of n-gram graphs within Natural Language Processing applications, ranging from summarization and summary evaluation to text classification and indexing.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13

    VADER

    Lexicon and rule-based sentiment analysis tool

    VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool designed for analyzing the sentiment of text, particularly in social media and short text formats. It is optimized for quick and accurate analysis of positive, negative, and neutral sentiments.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14

    ArabicDiacritizer

    An automatic restoration of Arabic diacritic marks

    This software restores Arabic diacritical marks. It is based mainly on deep architectures using deep neural networks. The algorithm generates diacritized text with determined end case. The algorithm is described in detail in: Ilyes Rebai, and Yassine BenAyed 'Text-to-speech synthesis system with Arabic diacritic recognition system', Computer Speech & Language, 2015. We would appreciate it very much if you cite our related work. ************** Installation...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    This project is devoted to the development of natural language processing tools and resources for the Lingala language, which is spoken by tens of millions of people in central Africa.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Machine translation engine based on a dependency grammar and XML interchange format. The Spanish-Basque (es-eu) translation is ready.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17

    Persica-A new Persian corpus for NLP

    This project presents a new corpus for NEWS text analysis in Persian

    The lack of a multi-application text corpus, despite the surge in text data, is a serious bottleneck in text mining and natural language processing, especially for the Persian language. This project presents a new corpus for NEWS article analysis in Persian called Persica. NEWS analysis includes NEWS classification, topic discovery and classification, category classification and many more procedures.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18

    Speech Sentiment Analysis

    Voice to Text Sentiment Analysis

    Voice-to-text sentiment analysis converts the audio signal to text in order to calculate the appropriate sentiment polarity of the sentence. The code currently works on one sentence at a time. Sentiment scoring is done on the spot using a speaker. The speech-to-text system currently used is the MS Windows speech-to-text converter. However, significant modifications can be made for audio recognition by a refined signal processing system. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    The BioNLP UIMA Component Repository provides UIMA wrappers for novel and well-known 3rd-party NLP tools used in biomedical text processing, such as tokenizers, parsers, named entity taggers, and tools for evaluation.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20

    Darkbot

    The IRC's Talking Robot

    [ Please read https://sourceforge.net/p/darkbot/news/2014/01/darkbots-revitalization/ ] Darkbot is a portable IRC chat robot written in the C language that can be taught responses to user inquiries, and even have conversations with them. Darkbot was originally created by Jason Hamilton as an aid for help channels on Internet Relay Chat.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 21

    FALCON - Text Search Java Project

    JSON based text search Java Project

    What is it? "Falcon Search" is a Java API and tool for searching inside documents. It was originally started to search the content of PDF files under the project "HAWK Search". Searching with this tool is query-based, not word-based as in most document search tools or document readers. It also handles jumbled words within a query and spelling mistakes. Commonly used techniques in this project are Natural Language...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22

    Bermuda Text-to-Speech

    This project includes basic NLP and DSP techniques for Text-to-Speech

    See the TTS demo at: http://rslp.racai.ro/index.php?page=tts This project is written entirely in Java and includes a set of tools and methods designed to enable multilingual Text-to-Speech (TTS) synthesis. We currently support English and Romanian, but we will soon train more models and make them available for download. If you want to read more about our other NLP and TTS tools, check out http://nlptools.racai.ro.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    A Python module that provides algorithms for advanced search: basically all you need to build a search engine.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24

    Consilium Sentence Suggestions Tools

    Consilium – User Defined sentence Suggestion Tool.

    Many tools on the market provide spelling or grammar correction while you write documents, but very few provide sentence completion based on previously entered text. Those that do only suggest completions for sentences that everyone uses often or regularly in the same manner. In reality, however, writing style varies from person to person. Our aim is to provide a sentence suggestion tool which...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25

    HAWK - PDF Text Search Java Project

    No more support for this project - TAKE A LOOK AT FALCONSEARCH

    No more support for this project - TAKE A LOOK AT FALCONSEARCH "https://sourceforge.net/projects/falcontextsearch/"
    Downloads: 0 This Week
    Last Update:
    See Project