With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.
You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
Try free now
Cloud tools for web scraping and data extraction
Deploy pre-built tools that crawl websites, extract structured data, and feed your applications. Reliable web data without maintaining scrapers.
Automate web data collection with cloud tools that handle anti-bot measures, browser rendering, and data transformation out of the box. Extract content from any website, push to vector databases for RAG workflows, or pipe directly into your apps via API. Schedule runs, set up webhooks, and connect to your existing stack. Free tier available, then scale as you need to.
jWords is a port of WORDS (by William Whitaker, a free latin-to-english dictionary program written in Ada), to Java. Besides the dictionary will be translated to the German language.
ELIA(Eyegaze Language Integration Analysis) supports the analysis of eye-tracking data for studies in language processing. ELIA eases early analysis of data to enable iterative development of experiments in response to spoken language.
CORPSE (CORPus SEarch) is a powerful search engine written in Java. The aim is to provide an efficient implementation of a word level inverted index search with various cool functions that can be used on very large corpora.
Java program to create a (potentially multilingual) glossary of the unique words in any given Lojban text.
Note that the Sourceforge page for this was superceded by the Bitbucket repository: https://bitbucket.org/pretoriusjf/vlastezba/overview
Any further updates will be made there.
Next-Gen Encryption for Post-Quantum Security | CLEAR by Quantum Knight
Lock Down Any Resource, Anywhere, Anytime
CLEAR by Quantum Knight is a FIPS-140-3 validated encryption SDK engineered for enterprises requiring top-tier security. Offering robust post-quantum cryptography, CLEAR secures files, streaming media, databases, and networks with ease across over 30 modern platforms. Its compact design, smaller than a single smartphone image, ensures maximum efficiency and low energy consumption.
A linguistic tool to aid in the study of Linguistics/Phonology, specifically distinctive features of possible language sounds. Comprised of both a Visual C++ .NET version as well as a Java based web applet version. The C++ version has all but been ab
Editor for formal grammars. Attempts to be universal – customizable for any grammatical formalism and any syntax. Provides features such as syntax checking and highlighting, transformations (refactoring) and advanced rule editor.
OO Pinyin Guide is a Java extension for OpenOffice 3 or higher. It enables the user to add pinyin transliteration over Chinese characters inside a text document. This tool can be useful for people learning or teaching Chinese.
Premier is a global leader in financial construction ERP software.
Rated #1 Construction Accounting Software by Forbes Advisor in 2022 & 2023. Our modern SAAS solution is designed to meet the needs of General Contractors, Developers/Owners, Homebuilders & Specialty Contractors.
Java implementation of semirings as used in Joshua Goodman's thesis. Allows parsers to easily return different kinds of values by simply changing the semiring.
Standardizing the existing RelEx2Frame Engine of the RelEx semantic dependency relationship extractor and adding an statistical learning AI for automatic extension of the rule base
A lyrical analysis and classification tool focused specifically on rhyming style in rap lyrics. Functions include phonetic transcription, rhyme visualization, and rapper classification.
Connecting Historical Authorities with Links, Contexts and Entities. CHALICE is a historic placename gazetteer for the UK, published as Linked Data and linked to other widely-used sources of placename reference information on the semantic web.
stocleka is a project divided into a UI and a library for cleaning user stories and converting them to arff files (used for Weka). it may be mainly used for research and scientific purposes.
Wikipedia Concept Association Map (WCAM) is new approach for textual knowledge representation and understanding. All concepts and associations are stored in a graph database for better performance and easy distribution.
Reconcile is an opensource research platform for coreference resolution. It combines a large number of opensource NLP components and provides extension points for researchers to plug in additional features and techniques.
Java Suffix array library for phrase discovery. Inspired initially by the classic paper of Yamamoto & Church, with newer ideas from Abouelhoda et al and Kim et al. Adapted for large alphabet so that words can be tokenized as alphabet characters.
It's a utility application for updating and integrating translation memories, created by the Autshumato ITE, over a network. Licensed under the TMate OpenSource License and free to download and be used by anyone.
This project extends the ASV Toolbox from the Wortschatz-project at the University of Leipzig.
It annotates terms extracted by the "TE" (Terminolgy Extraction) and "Namerec" modules with semantic resources.
leXkit: a client-server dictionary edition environment, that makes editing easier for the lexicographer, who hasn’t to be aware of technical issues. Entry meta-information is used to provide advanced functionality, such as context-dependent tasks.
Sanchay is a collection of tools and APIs for language researchers. It has some implementations of NLP algorithms, some flexible APIs, several user friendly annotation interfaces and Sanchay Query Language for language resources.