Showing 11 open source projects for "java text mining preprocessing"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Compliant and Reliable File Transfers Backed by Top Security Certifications Icon
    Compliant and Reliable File Transfers Backed by Top Security Certifications

    Cerberus FTP Server delivers SOC 2 Type II certified security and FIPS 140-2 validated encryption.

    Stop relying on non-certified, legacy file transfer tools that creak under the weight of modern security demands. Get full audit trails, advanced access controls and more supported by an award-winning team of experts. Start your free 25-day trial today.
    Start Free Trial
  • 1
    ant4docbook

    ant4docbook

    ANT4DOCBOOK is an ANT task for DOCBOOK

    ANT4DOCBOOK is an ANT task for DOCBOOK, a semantic markup language for technical documentation.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 2
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well proved XML and text processing techologies in order to easely extract useful data from arbitrary web pages.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    DocWire SDK

    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a standout C++20AI driven data processing tool, has received award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, empowering efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    Lingua

    Lingua

    The most accurate natural language detection library for Java

    Its task is simple: It tells you which language some provided textual data is written in. This is very useful as a preprocessing step for linguistic data in natural language processing applications such as text classification and spell checking. Other use cases, for instance, might include routing e-mails to the right geographically located customer service department, based on the e-mails' languages.
    Downloads: 0 This Week
    Last Update:
    See Project
  • $300 Free Credits for Your Google Cloud Projects Icon
    $300 Free Credits for Your Google Cloud Projects

    Start building on Google Cloud with $300 in free credits. No commitment, no credit card required until you're ready to scale.

    Launch your next project with $300 in free Google Cloud credits—no strings attached. Test, build, and deploy without risk. Use your credits across the entire Google Cloud platform to find what works best for your needs. After your credits are used, continue with always-free tier services. Only pay when you're ready to scale. Sign up in minutes and start exploring.
    Start Free Trial
  • 5
    libpostal

    libpostal

    A C library for parsing/normalizing street addresses around the world

    A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data. libpostal is a C library for parsing/normalizing street addresses around the world using statistical NLP and open data. The goal of this project is to understand location-based strings in every language, everywhere. Addresses and the locations they represent are essential for any application dealing with maps (place search, transportation, on-demand/delivery services,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6

    JSentiWordNet

    A wrapper for the famous SentiWordNet, a resource for opinion mining

    This project aims to provide a wrapper around the SentiWrodnet, a lexical resource for opinion mining. As defined by the authors : SentiWordNet assigns to each synset of WordNet three sentiment scores: positivity, negativity, objectivity. You can find additional information about the creation of SentiWordnet here : http://nmis.isti.cnr.it/sebastiani/Publications/LREC06.pdf sentiWordnet (avilable here : https://drive.google.com/open?id=0B0ChLbwT19XcOVZFdm5wNXA5ODg) is a text file with a...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    ipmems

    ipmems

    Real-time data acquisition and visualization software

    Cross-platform data acquisition and visualization software with an embedded HTTP-server, binary protocol parsing library, protocol emulation server, remote secure administration server, embedded Groovy scripting facilities and HMI (SCADA) visualization module.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    This library offers an API to useful tools that can be utilized for any text mining project. More details will be mentioned in the project's future documentation
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    Contextor
    Contextor is a light-weight simple-to-use Java based library to help developers and researchers working with the general concept of a resource; as examples, resources can be text resources, web resources, images and videos.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Build Securely on AWS with Proven Frameworks Icon
    Build Securely on AWS with Proven Frameworks

    Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

    Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.
    Download Now
  • 10
    This project intends to create an indexing search engine, for knowledge management. The primary object is to apply an information retrieval core. And implement a knowledge data discovery theory such as data mining algorithm, text mining.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    reputron is a knowledge extraction engine platform that covers all aspect of text mining, relevance, indexing and querying on a corpus of text documents.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
Auth0 Logo