The NITE XML Toolkit supports the creation, analysis, and browsing of annotated multimodal, text, or spoken language corpora, and represents both timing and rich linguistic structure. It contains libraries for developers and some end user tools.
D.U.C.K (Determine segmentation of Unknown words by using Context Knowledge) is an NLP tool that aims to find the correct segmentation for unknown words in written Hebrew. Statistics from different scopes will be used to determine the segmentation.
Web-as-corpus tools in Java.
* Simple Crawler (and also integration with Nutch and Heritrix)
* HTML cleaner to remove boilerplate code
* Language recognition
* Corpus builder
This project is a compilation of tools/libraries to help with tasks related to Text Analytics mainly in Java. These tools range from simple wrappers to sophisticated mining tasks that can improve the productivity of researchers and engineers.
JUNG provides a common and extendible language for the modeling, analysis, and visualization of data that can be represented as a graph or network.
New version now available on GitHub: https://github.com/jrtom/jung/releases/tag/jung-2.1
OpenDMAP (Open Source Direct Memory Access Parser) is a natural language processing (text mining) application: a semantic parser for information extraction.
ANNJ (Another Neural Network for Java) is a neural network framework for the Java programming language. It is still in an early development stage, currently supporting only feed-forward networks, but will soon be able to handle many other types.
TBLTools is a set of GATE processing resources that implements the Fast Transformation Based Learning Algorithm. You can train it to learn rules for NLP tasks such as Named Entity Recognition and Shallow parsing.
LT4eL (Language Technology for e-Learning) develops a framework of multilingual language technology tools and semantic web techniques for improving the retrieval and the metadata annotation of learning material.
SYRAH aims to discover and represent concepts expressed in natural languages. Keywords: NLP, lemma, lemma dictionary, Italian, network, semantics, clustering.
This is a suite of several software agents that provides a complete lexical base architecture as proposed in Didier Schwab's PhD thesis. It will be used for automatic translation, information retrieval and other natural language processing tasks.
The program provides a Java interface (to a C++ Lemmatizer via XML-RPC) for lemmatizing Russian, English, and German text (a lemma is the canonical form of a lexeme in Natural Language Processing). RussianPOSTagger can work as a module of GATE.
The Java Text Categorizing Library (JTCL) is a pure Java implementation of libTextCat, which in turn is "a library that was primarily developed for language guessing, a task on which it is known to perform with near-perfect accuracy."
JWebPro: A Java tool that can interact with Google search and then process the returned Web documents in a couple of ways. The outputs can serve as inputs for NLP, IR, information extraction, Web mining, and online social network extraction/analysis applications.
Java Expert Rule Based Inference Language. Jerbil is an open source rule processing engine written in Java. Currently Jerbil supports a full set of processing functions with text-based and XML interfaces; a Java interface is planned.
JVnSegmenter is a Java-based, open-source Vietnamese word segmentation tool. The segmentation model was trained on about 8,000 sentences using Conditional Random Fields (FlexCRFs). This tool would be useful for the Vietnamese NLP community.
K-automaton is a new parsing (syntactic analysis) machine isomorphous to language. Implemented in Java, it can generate Java code from grammars described in EBNF.
Bitnets instantiates and operates on graphs and subgraphs of large complex networks, such as kinship networks. Bitnets consists mainly of a Java library, a number of usage examples and an interactive interpreted language interface.
The Text Annotation Environment (tae) can be used to annotate natural language text manually or automatically (UIMA Annotator) with meta information (tokens, part-of-speech, named entities, ...). Tae is based on Eclipse and IBM's UIMA.
JRete is a rule engine written in Java. Advantages over other expert system shells and artificial intelligence (AI) APIs: rules are coded in the Java language, data may be computed across a network with multiple JRete instances, automatic data persistence to a database, event-fire direc
The UEMLFacilitator project aims to develop and evaluate a prototype GUI for defining and managing the Unified Enterprise Modelling Language version 2 (UEML2).
JudoScript is a general-purpose, Java scripting, multi-domain scripting tool/language. It combines the powers of declarative scripting for many modern tasks and general object/procedural programming. It is simple, intuitive, practical and powerful.
It is a universal language translator written in Java. All languages are translated to a single intermediate language (interlingua), from which any native language is generated. The wordbooks are XML. It uses the context of a text, rules, and a grammar.