Web-as-corpus tools in Java.
* Simple Crawler (and also integration with Nutch and Heritrix)
* HTML cleaner to remove boilerplate
* Language recognition
* Corpus builder
This project is a compilation of tools/libraries to help with tasks related to Text Analytics mainly in Java. These tools range from simple wrappers to sophisticated mining tasks that can improve the productivity of researchers and engineers.
Projeny (Probabilistic Networks Generator in Java) is a graphical (Java SWT) front-end to BNT (Bayes Net Toolbox for Matlab). Projeny requires BNT, JMatLink and a Matlab back-end. There is no installable release package, but the source code is available on SVN; please check it out from SVN to use Projeny. Projeny was started with BNJ as the base.
This project aims to develop a system that trains LDA models in a distributed environment. I studied a Hadoop-based solution and found that Hadoop is not a good fit for distributed LDA training. In this project I implement a socket-based platform.
Java package to study a clustering model described in the paper "Novel Clustering Algorithm Based Upon Games on Evolving Network" by Q. Li, Z. Chen, Y. He and J-P. Jiang (arXiv: http://arxiv.org/pdf/0812.5064v1), along with generalizations and similar issues.
KEManager models Knowledge Based Systems following a methodological approach (Knowledge Engineering) for academic purposes of CommonKADS and IDEAL. It supports the viability test, some knowledge acquisition techniques and conceptual modeling.
Scan, the Semantic Content ANnotator, is a semantic pipeline that helps connect information extraction tools to semantic databases. It is UIMA-based and allows easy plugin writing for information extraction, ontology control, and storage in RDF repositories.
This Java project creates a testing environment application that analyzes an image's low-level features and suggests tags to classify it, using an ontology search based on the tags of similar images.
TBLTools is a set of GATE processing resources that implements the Fast Transformation Based Learning Algorithm. You can train it to learn rules for NLP tasks such as Named Entity Recognition and Shallow parsing.
T-Rex (Trainable Relation Extraction) is a highly configurable machine learning-based Information Extraction from Text framework, which includes tools for document classification, entity extraction and relation extraction.
This is a suite of several software agents that provide a complete lexical base architecture as proposed in Didier Schwab's PhD thesis. It will be used for automatic translation, information retrieval and other natural language processing tasks.
A Trial Workbench for Facilitating Best Practices from Prospective to Acute Care in Respiratory Medicine. We aim to provide a set of information management toolboxes that facilitates decision support applications in medical information systems.
OntoExtractor is a way of building ontologies that proceeds in a bottom-up fashion, defining concepts as clusters of concrete XML objects. From a set of XML documents the application generates a taxonomy. OntoExtractor has been developed so far by the Kn
The aim of MIEX (Metadata and Information Extractor from small XML documents) is to create a wrapper for the Stanford Parser, to extract and store metadata (syntactic structures, relationships among words...) from simple XML documents.
JWebPro: A Java tool that can interact with Google search and then process the returned Web documents in a couple of ways. The outputs can serve as inputs for NLP, IR, information extraction, Web mining, and online social network extraction/analysis applications.
Java Expert Rule Based Inference Language. Jerbil is an open source rule processing engine written in Java. Currently Jerbil supports a full set of processing functions with text-based and XML interfaces; a Java interface is planned.
Open Source Semantic Web Search Engine Software: If two machines anywhere on the web can agree on the same definition of a digital service or digital good, then machine-to-machine transactions can use this lingua franca to transact on the user's behalf.
The Citizen Privacy Service is an asynchronous component using artificial intelligence capabilities including DL decidability and first order logic provenance that provide policy decision and policy enforcement points based on the US Privacy Act of 1974.
JVnSegmenter is a Java-based, open-source Vietnamese word segmentation tool. The segmentation model was trained on about 8,000 sentences using Conditional Random Fields (FlexCRFs). This tool should be useful for the Vietnamese NLP community.
Evidence-based Guideline and Decision Support System. Provides patient-specific point-of-care reminders to help physicians provide high-quality care. Input/output is in the form of HL7 CDA Level 2 documents. Knowledge is encoded using Arden Syntax.
Qualiweb aims to provide semantic web metrics for modeling a website's visitors' needs according to a given taxonomy or document classification. The web metrics provided by Qualiweb indicate how successful each of the website's topics has been.
JTextPro: A Java-based text processing tool that includes sentence boundary detection (using a maximum entropy classifier), word tokenization (following Penn conventions), part-of-speech tagging (using CRFTagger), and phrase chunking (using CRFChunker).
The Text Annotation Environment (tae) can be used to annotate natural language text manually or automatically (UIMA Annotator) with meta information (tokens, part-of-speech, named entities, ...). Tae is based on Eclipse and IBM's UIMA.
CRFChunker: Conditional Random Fields Phrase Chunker (Phrase Chunking Tool) for English. The model was trained on sections 01..24 of the WSJ corpus, using section 00 as the development test set (F1-score of 95.77). Chunking speed: 700 sentences/s.