Showing 175 open source projects for "extraction"

  • 1
    SourceDoc is a system for automatic creation, extraction, and verification of embedded documentation. Designed for C code, it features both a C parser and a preprocessor. The default output format is HTML, but other formats can be plugged in through a public Java interface.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2

    Ticket Cluster

    Java application that groups related text documents (text mining)

    Ticket Cluster is a Java application that groups related text documents (text extracted from a helpdesk) into clusters, providing an overview of the document set. This is done without preconceptions about keywords: the software analyzes the text and identifies the structure that arises naturally. The extraction phase depends on the helpdesk's data; in the current implementation, a PHP script extracts all text from numbered tickets (Facil HelpDesk) to a folder. Once the folder with the text has been created, run Ticket Cluster, select the folder, and click Process. When processing is done, you can view the results as a dendrogram or a tree. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3

    BioContext

    Software for extraction of biomedical information from literature

    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    BlogTEX is an ad hoc blog post extraction algorithm written in Java for the TREC Blog08 dataset. It includes an optimized sentence model for clearly identifying sentence boundaries in each blog post. Its output can be customized through its config file.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 5
    This is a Java-based project for complex event extraction from text and co-reference resolution. Currently the code can read the BioNLP shared task format (http://2011.bionlp-st.org/) and the i2b2 Natural Language Processing for Clinical Data shared task format (https://www.i2b2.org/NLP/DataSets/Main.php). Event extraction includes finding events in a text and the parameters of each event.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Distributed phrase-based machine translation training tool based on Hadoop.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    HanNanum - Korean POS Tagger
    ...A plug-in component-based architecture is adapted to the new Java version for flexible use. You can find the work flow for morphological analysis, POS tagging, noun extraction, etc. Contact: kschoi@kaist.ac.kr hjjeong@world.kaist.ac.kr
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    This project aims to implement the following text mining techniques in Java: text language detection, keyword and keyphrase extraction, text classification, text clustering, single- and multi-document summarization, and plagiarism detection.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    Lioness (Languages Interop Framework)
    Framework for making Windows applications that are one .exe file in AutoHotKey_L, C++, C#, VB.NET, Java, Groovy, Common Lisp, Nemerle, Ruby, Python, PHP, Lua, Tcl, Perl, Jint, S#, WSH VBScript, HTML/JavaScript/CSS, COM, PowerShell without compiling. For .NET 4.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 10
    This project provides a toolkit and framework, based on PDFBox, for analyzing PDF documents and performing custom conversion tasks. It is published under the Apache licence. A GUI is also included and is published under the GPL licence.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    SEMANTIXS is a semantic information extraction system that can extract, represent and visualize domain-specific information from free text in the form of complex (and simple) relationships. See http://www.cs.iastate.edu/~semantix/ for more info.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    TextMarker
    TextMarker is now developed and hosted at Apache UIMA (http://uima.apache.org/textmarker.html). TextMarker is a UIMA-based tool for information extraction and more. The full-featured editor for the rule language and the build process for UIMA descriptors are complemented by components for visualization, explanation, testing and rule learning.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    HTML Parser
    HTML Parser is a Java library used to parse HTML in either a linear or nested fashion. Primarily used for transformation or extraction, it features filters, visitors, custom tags and easy-to-use JavaBeans. It is a fast, robust and well-tested package.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 14
    An information extraction library implementing modern algorithms for the extraction of named entities from text.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    FX Player is a Web-based streaming server with a Flash iTunes-like interface. It shares your MP3 library and allows access to your tracks through the Internet. Coded in Java, FX Player runs on most platforms, including Mac OS X, Windows, Linux and Unix.
    Leader badge
    Downloads: 23 This Week
    Last Update:
    See Project
  • 16
    Command-line tool written in Java that automatically unpacks (password-protected) RAR archives or multi-part RARs once all of their constituent files are complete. It is designed to run unrar jobs on Linux-based NAS devices when downloading from Rapidshare & co.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Provides a set of tools for processing text, such as text extraction and classification. Planned classification implementations include Bayesian and statistical (N-gram).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    openPDF
    openPDF is based on several open source software products, including iText, JPedal, and CryptoApplet. It allows users to view and modify PDF documents and forms, and supports barcode generation, data extraction and signature validation.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Palo ETL Server is a Java-based tool for extraction, transformation and loading (ETL) of mass data into the Palo OLAP Server. Palo ETL Server is one part of the Palo Suite.
    Leader badge
    Downloads: 12 This Week
    Last Update:
    See Project
  • 20
    suffix arrays for phrase extraction
    Java suffix array library for phrase discovery. Inspired initially by the classic paper of Yamamoto & Church, with newer ideas from Abouelhoda et al. and Kim et al. Adapted for large alphabets so that words can be tokenized as alphabet characters.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    With the "xix" library, GATE functionality is available in XQuery (via an MXQuery extension). OpenCalais invocation is supported, too. -- Source code at http://sgv-jenkins-01.ethz.ch/job/xixlib/ws/ -- See "Show project details" for instructions.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    This code is part of a JavaScript extraction and analysis engine under development between 06/2009 and 09/2009, by Paul Seymer and Angelos Stavrou at the CSIS at GMU.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    JBiblex
    Cross-platform explorer of ZIP archives with FB2 books.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Ingres Migration Tool Set
    The 'Ingres Migration Tool Set' is a collection of tools and libraries developed to support you in migrating your database schemas to the 'Ingres Database' open source DBMS.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    semantic term annotation and description
    This project extends the ASV Toolbox from the Wortschatz project at the University of Leipzig. It annotates terms extracted by the "TE" (Terminology Extraction) and "Namerec" modules with semantic resources.
    Downloads: 0 This Week
    Last Update:
    See Project