The Java Text Categorizing Library (JTCL) is a pure Java implementation of libTextCat, which in turn is "a library that was primarily developed for language guessing, a task on which it is known to perform with near-perfect accuracy."
The aim of MIEX (Metadata and Information Extractor from small XML documents) is to create a wrapper for the Stanford Parser, to extract and store metadata (syntactic structures, relationships among words...) from simple XML documents.
hypKNOWsys aims to develop a Java-based workbench for knowledge discovery and knowledge management. Currently, hypKNOWsys has released two intermediate tools: DIAsDEM Workbench (text mining for semantic tagging) and WUMprep (Web mining pre-processing).
JWebPro: A Java tool that can interact with Google search and then process the returned Web documents in a couple of ways. The outputs can serve as inputs for NLP, IR, information extraction, Web mining, and online social network extraction/analysis applications.
Trauma registry suite: a data collection application and server scripts to build a trauma data warehouse and perform web-based analysis reporting. Cross-platform compatible with Windows, Apple, Unix, and Linux.
Java Expert Rule Based Inference Language. Jerbil is an open source rule processing engine written in Java. Currently Jerbil supports a full set of processing functions with text-based and XML interfaces; a Java interface is planned.
BluetunA is an application for Bluetooth-enabled mobile phones that allows you to connect to other BluetunA users in range and share music recommendations. Mobile music, metadata sharing, Bluetooth applications, proximity-based interactions, social awareness.
MultiJADS is a domain-independent multiagent shell for active design documents. It uses multiagent technology to support activities in concurrent and distributed design systems and is based on the Active Design Documents (ADD) approach.
A user-friendly open-source toolkit written in Java that lets you visualize and analyze the behaviour of users in the ActiveWorlds family of 3D virtual worlds by mapping them over 2D space.
geolocate is a front-end Java program that works with Google Maps to provide dynamic maps to users. Combined with the flexibility of XML and the power of JavaScript, it lets users see various relationships on their map and draw conclusions.
JVnSegmenter is an open-source, Java-based Vietnamese word segmentation tool. The segmentation model was trained on about 8,000 sentences using Conditional Random Fields (FlexCRFs). This tool should be useful for the Vietnamese NLP community.
This project is simulation software for robot AI. It aims to compare the efficiency of robot intelligence on movement tasks between fixed checkpoints in a logical world.
Bitnets instantiates and operates on graphs and subgraphs of large complex networks, such as kinship networks. Bitnets consists mainly of a Java library, a number of usage examples, and an interactive interpreted language interface.
Crawl-By-Example runs a crawl that classifies the processed pages by subject and finds the best pages according to examples provided by the operator. Crawl-By-Example is a plugin for the Heritrix crawler and was developed as part of the GSoC06 program.
W.H.A.T. is an analytic tool for Wikipedia with two main functionalities: an article network and extensive statistics. It contains a visualization of the article networks and a powerful interface to analyze the behavior of authors.
JTextPro: A Java-based text processing tool that includes sentence boundary detection (using a maximum entropy classifier), word tokenization (following Penn conventions), part-of-speech tagging (using CRFTagger), and phrase chunking (using CRFChunker).
RunCC is a new kind of parser generator that generates parsers and lexers at runtime. Source generation is optional. It features the absence of any cryptography. Although intended for small languages, it comes with Java and XML example parsers.
Azureus plug-in that maps the IP addresses of peers to the country and city they belong to and visualizes that data on a world map or in statistics. This product includes GeoLite data created by MaxMind, available from http://www.maxmind.com/.
BabyTALK aims to add another brick in the wall of natural language learning. The baby needs to structure a corpus of texts when its tutor points to and talks about a particular part of the corpus. The baby must also describe any selected part of the corpus.
The Kinship Algebra Modeller is a suite of Java applications that assist in developing an algebra to describe a given kinship terminology, and that support models and simulations of social processes based on relating people using this algebra.
A complete survey administration and data collection system. A fully featured replacement for Quancept, supporting CAPI, Web, CATI, PDA and Paper survey modes. Sonar is the reference implementation of JCaiF for CAPI and Web survey interviewing. Try it!
The Cornell Tree-Ring Analysis System. A program for several aspects of dendrochronology: measuring, indexing, crossdating, graphing, summing masters, and even drawing maps of locations of sites.