Search Results for "machine learning for text categorization java"

Showing 38 open source projects for "machine learning for text categorization java"

View related business solutions
  • Gen AI apps are built with MongoDB Atlas Icon
    Gen AI apps are built with MongoDB Atlas

    Build gen AI apps with an all-in-one modern database: MongoDB Atlas

    MongoDB Atlas provides built-in vector search and a flexible document model so developers can build, scale, and run gen AI apps without stitching together multiple databases. From LLM integration to semantic search, Atlas simplifies your AI architecture—and it’s free to get started.
    Start Free
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 1
    GROBID

    GROBID

    A machine learning software for extracting information

    GROBID is a machine learning library for extracting, parsing, and re-structuring raw documents such as PDF into structured XML/TEI encoded documents with a particular focus on technical and scientific publications. First developments started in 2008 as a hobby. In 2011 the tool has been made available in open source. Work on GROBID has been steady as a side project since the beginning and is expected to continue as such. Header extraction and parsing from article in PDF format. The...
    Downloads: 16 This Week
    Last Update:
    See Project
  • 2
    txtai

    txtai

    Build AI-powered semantic search applications

    txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications. Traditional search systems use keywords to find data. Semantic search applications have an understanding of natural language and identify results that have the same meaning, not necessarily the same keywords. Backed by state-of-the-art machine learning models, data is transformed into vector representations for search (also known as embeddings). Innovation is happening at a rapid...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    tika-python

    tika-python

    Python binding to the Apache Tika™ REST services

    A Python port of the Apache Tika library that makes Tika available using the Tika REST Server. This makes Apache Tika available as a Python library, installable via Setuptools, Pip and easy to install. To use this library, you need to have Java 7+ installed on your system as tika-python starts up the Tika REST server in the background. To get this working in a disconnected environment, download a tika server file (both tika-server.jar and tika-server.jar.md5, which can be found here) and set...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    Apache OpenNLP

    Apache OpenNLP

    Apache OpenNLP

    Apache OpenNLP is a machine learning-based NLP library that provides tools for text-processing tasks such as tokenization, sentence segmentation, and named entity recognition.
    Downloads: 1 This Week
    Last Update:
    See Project
  • Axe Credit Portal - ACP- is axefinance’s future-proof AI-driven solution to digitalize the loan process from KYC to servicing, available as a locally hosted or cloud-based software. Icon
    Axe Credit Portal - ACP- is axefinance’s future-proof AI-driven solution to digitalize the loan process from KYC to servicing, available as a locally hosted or cloud-based software.

    Banks, lending institutions

    Founded in 2004, axefinance is a global market-leading software provider focused on credit risk automation for lenders looking to provide an efficient, competitive, and seamless omnichannel financing journey for all client segments (FI, Retail, Commercial, and Corporate.)
    Learn More
  • 5
    Elasticsearch

    Elasticsearch

    A Distributed RESTful Search Engine

    Elasticsearch is a distributed, RESTful search and analytics engine that lets you store, search and analyze with ease at scale. It lets you perform and combine many types of searches; it scales seamlessly, and offers answers incredibly fast with search results you can rank based on a variety of factors. Elasticsearch can be used for a wide variety of use cases, from maps and metrics to site search and workplace search, and with all data types.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 6
    RE/flex lexical analyzer generator

    RE/flex lexical analyzer generator

    The regex-centric, fast lexical analyzer generator for C++

    A C++ high-performance regex library and Flex-compatible lexical analyzer generator with full Unicode support, new indentation anchors, lazy quantifiers, and many other modern features. Accepts Flex lexer specification syntax and is compatible with Bison/Yacc parsers. Generates reusable source code that is easy to understand. Supports fast scanning of UTF-8/16/32 files, strings, and streams. The reflex scanner generator generates clean C++ lexer class code that is thread-safe. Generates...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 7
    Hacker Scripts

    Hacker Scripts

    Based on a true story

    Hacker Scripts is a cheeky collection of small automation scripts and language ports collected under the tagline “Based on a true story.” The repository gathers playful utilities (originally shell and Ruby scripts) that automate short, real-world tasks — for example, sending a quick “late at work” text when SSH sessions are active, firing off an automated “I’m sick / working from home” email on certain mornings, or even talking to a networked coffee machine to start brewing at precisely the...
    Downloads: 33 This Week
    Last Update:
    See Project
  • 8
    Lingua

    Lingua

    The most accurate natural language detection library for Java

    Its task is simple: It tells you which language some provided textual data is written in. This is very useful as a preprocessing step for linguistic data in natural language processing applications such as text classification and spell checking. Other use cases, for instance, might include routing e-mails to the right geographically located customer service department, based on the e-mails' languages.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    Java Tablesaw

    Java Tablesaw

    Java dataframe and visualization library

    Tablesaw is a dataframe and visualization library that supports loading, cleaning, transforming, filtering, and summarizing data. If you work with data in Java, it may save you time and effort. Tablesaw also supports descriptive statistics and can be used to prepare data for working with machine learning libraries like Smile, Tribuo, H20.ai, DL4J. Import data from RDBMS, Excel, CSV, TSV, JSON, HTML, or Fixed Width text files, whether they are local or remote (http, S3, etc.) Tablesaw supports data visualization by providing a wrapper for the Plot.ly JavaScript plotting library. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • AI-powered conversation intelligence software Icon
    AI-powered conversation intelligence software

    Unlock call analytics that provide actionable insights with our call tracking software, empowering you to identify what's working and what's not.

    Every customer interaction is vital to your business success and revenue growth. With Jiminny’s AI-powered conversation intelligence software, we take recording, capturing, and meticulous analysis of call recordings to the next level. Unlock call analytics that provide actionable insights with our call tracking software, empowering you to identify what's working and what's not. Seamlessly support your biggest objectives across the entire business landscape with our innovative call tracking system.
    Learn More
  • 10
    libpostal

    libpostal

    A C library for parsing/normalizing street addresses around the world

    A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data. libpostal is a C library for parsing/normalizing street addresses around the world using statistical NLP and open data. The goal of this project is to understand location-based strings in every language, everywhere. Addresses and the locations they represent are essential for any application dealing with maps (place search, transportation, on-demand/delivery services,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Microsoft Bot Framework SDK

    Microsoft Bot Framework SDK

    Tool for building conversation applications

    Bot Framework provides the most comprehensive experience for building conversation applications. With the Bot Framework SDK, developers can build bots that converse free-form or with guided interactions including using simple text or rich cards that contain text, images, and action buttons. Developers can model and build sophisticated conversation using their favorite programming languages including C#, JS, Python and Java or using Bot Framework Composer, an open-source, visual authoring...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12

    Arabic Corpus

    Text categorization, arabic language processing, language modeling

    The Arabic Corpus {compiled by Dr. Mourad Abbas ( http://sites.google.com/site/mouradabbas9/corpora ) The corpus Khaleej-2004 contains 5690 documents. It is divided to 4 topics (categories). The corpus Watan-2004 contains 20291 documents organized in 6 topics (categories). Researchers who use these two corpora would mention the two main references: (1) For Watan-2004 corpus ---------------------- M. Abbas, K. Smaili, D. Berkani, (2011) Evaluation of Topic Identification Methods on...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 13
    DSTK - Data Science TooKit 3

    DSTK - Data Science TooKit 3

    Data and Text Mining Software for Everyone

    DSTK - Data Science Toolkit 3 is a set of data and text mining softwares, following the CRISP DM model. DSTK offers data understanding using statistical and text analysis, data preparation using normalization and text processing, modeling and evaluation for machine learning and algorithms. It is based on the old version DSTK at https://sourceforge.net/projects/dstk2/ DSTK Engine is like R. DSTK ScriptWriter offers GUI to write DSTK script. DSTK Studio offers SPSS Statistics like GUI...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    GNAT

    GNAT

    GNAT recognizes gene names in text and maps them to NCBI Entrez Gene

    GNAT is a BioNLP/text mining tool to recognize and identify gene/protein names in natural language text. It will detect mentions of genes in text, such as PubMed/Medline abstracts, and disambiguate them to remove false positives and map them to the correct entry in the NCBI Entrez Gene database by gene ID. March 2017: We started to upload GNAT output on Medline. See files/results/medline/.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 15
    DSTK - DataScience ToolKit

    DSTK - DataScience ToolKit

    DSTK - DataScience ToolKit for All of Us

    DSTK - DataScience ToolKit is an opensource free software for statistical analysis, data visualization, text analysis, and predictive analytics. Newer version and smaller file size can be found at: https://sourceforge.net/projects/dstk3/ It is designed to be straight forward and easy to use, and familar to SPSS user. While JASP offers more statistical features, DSTK tends to be a broad solution workbench, including text analysis and predictive analytics features. Of course you may specify...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16

    JCLTP

    A Java Class Library for Text Processing

    JCLTP is a class library designed for processing text. JCLTP is free, open source and developed with the Java programming language. JCLTP is distributed under the GNU license. It incorporates several technologies that enable process information while applying AI techniques, in order to build predictive models for text classification. Through a flexible structure of interfaces and classes, the opportunity to extend, adapt and add functionality JCLTP is provided. Thus, analysis of new types...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17

    sgmweka

    Weka wrapper for the SGM toolkit for text classification and modeling.

    Weka wrapper for the SGM toolkit for text classification and modeling. Provides Sparse Generative Models for scalable and accurate text classification and modeling for use in high-speed and large-scale text mining. Has lower time complexity of classification than comparable software due to inference based on sparse model representation and use of an inverted index. The provided .zip file is in the Weka package format, giving access to text classification. Other functions are usable through...
    Leader badge
    Downloads: 17 This Week
    Last Update:
    See Project
  • 18

    FENNIX

    Fast EXperimentation with Neural Networks

    FENNIX is a simulator of artificial neural networks written in Java. It allows you to easily describe a complete simulation by using a simple text script language or by adding nodes to a tree of tasks by using the graphical used interface. Moreover, FENNIX is composed of pluggable tools that can be easily modified in order to add new functionalities to the simulator.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    The Java Data Mining Package (JDMP) is a library that provides methods for analyzing data with the help of machine learning algorithms (e.g. clustering, classification, graphical models, neural networks, Bayesian networks, text processing, optimization).
    Downloads: 2 This Week
    Last Update:
    See Project
  • 20

    JCLALtext

    Text processing module for JCLAL

    JCLALtext is a class library designed to extend the framework JCLAL text tasks. JCLALtext is free, open source and developed with the Java programming language. JCLALtext is distributed under the GNU license. The researcher can use the class library by adding it to your project.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Intelligent Keyword Miner

    Intelligent Keyword Miner

    Intelligent SEO keyword miner and predicing tool

    THIS IS A NETBEANS 8.02 PROJECT ENGLISH ONLY This program was made to help me with the patent research. It simply generates the search keywords, based on your upvotes or a downvotes of the input parameters. It can accept a text or URL (text takes a prescedence over the URL). If you input URL, it goes to a page, and learns its text from HTML format. This program is intelligent as it predicts what you may want to search next, based on your personal trends. After searching the...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22

    VarImpact

    Extracting effects of mutations on molecular properties from text.

    Genetic variants alter cellular behavior in a variety of ways, changing biochemical properties of DNA, mRNA, and proteins. Many large-scale sequencing projects are under way to detect human variation in health and disease. Although broad disease associations can be discovered by GWAS studies, the low-level impact of mutations is hardly available in structured form. The results of thousands of small-scale experiments, on the other hand, are present in the literature and discuss observations...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23

    Bermuda Text-to-Speech

    This project includes basic NLP and DSP techniques for Text-to-Speech

    See TTS demo at: http://rslp.racai.ro/index.php?page=tts This is an entirely written in JAVA project which includes a set of tools and methods designed to enable Multilingual Text-to-Speech (TTS) synthesis. We currently support English and Romanian but we will soon train more models and make them available for download. If you want to read more about our other NLP and TTS tools check out http://nlptools.racai.ro.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Consilium Sentence Suggestions Tools

    Consilium Sentence Suggestions Tools

    Consilium – User Defined sentence Suggestion Tool.

    There are many tools available in market which will provide spell correction or grammer correction while making documents, but very few tools are available which are providing sentence completion according to previously entered text. But this all are providing sentence complition suggestion for sentences which are oftenly or regularly used by all people in same manner. But in reality style of writing changes person to person. While our aim is to provide a sentence suggestion tool which...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Java application for training and deploying text processing applications such as part-of-speech taggers, based on a re-implementation of Brill's algorithm in Java.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next