Showing 29 open source projects for "document analysis"

View related business solutions
  • $300 Free Credits for Your Google Cloud Projects Icon
    $300 Free Credits for Your Google Cloud Projects

    Start building on Google Cloud with $300 in free credits. No commitment, no credit card required until you're ready to scale.

    Launch your next project with $300 in free Google Cloud credits—no strings attached. Test, build, and deploy without risk. Use your credits across the entire Google Cloud platform to find what works best for your needs. After your credits are used, continue with always-free tier services. Only pay when you're ready to scale. Sign up in minutes and start exploring.
    Start Free Trial
  • Ship Agents Faster Icon
    Ship Agents Faster

    Transform your applications and workflows into powerful agentic systems at global scale.

    Gemini Enterprise Agent Platform lets you rapidly build, scale, govern and optimize production-ready agents grounded in your organization's data. The platform enables developers to build custom or pre-built agents for virtually any use case. New customers get $300 in free credits.
    Get Started Free
  • 1
    elasticsearc-php

    elasticsearc-php

    PHP low-level client for Elasticsearch

    Introducing Elasticsearch DSL library to provide objective query builder for Elasticsearch bundle and elasticsearch-php client. You can easily build any Elasticsearch query and transform it to an array. This agnostic package is a lightweight wrapper on top of the Elasticsearch PHP client. Its main goal is to allow for easier structuring of queries and indices in your application. It does not want to hide or replace the functionality of the Elasticsearch PHP client. Feature complete, object...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    Lattice

    Lattice

    Free and Open-Source Qualitative Data Analysis Software

    Lattice is a free and open-source Computer-Assisted Qualitative Data Analysis Software (CAQDAS). Native, lightweight, and offline-first, it supports document imports in plain text from various formats. Features include a six-level code hierarchy, five memo types, attribute-typed documents, precision retrieval, and an analysis workspace with Code Frequency, Co-occurrence, Crosstab, Coding Coverage, and Word Cloud.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 3
    DynaQ

    DynaQ

    Innovative text document search. http://dynaq.opendfki.de for details.

    The goal of DynaQ is to develop an inquiry system to explore the personal information space, supporting you with the searching paradigm 'orienteering'. DynaQ is a (desktop)search engine with enhanced functionality for file, email and blog search. Look at our GitLab homepage for sourcecode and documentation: http://dynaq.opendfki.de
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    AI learning

    AI learning

    AiLearning, data analysis plus machine learning practice

    We actively respond to the Research Open Source Initiative (DOCX) . Open source today is not just open source, but datasets, models, tutorials, and experimental records. We are also exploring other categories of open source solutions and protocols. I hope you will understand this initiative, combine this initiative with your own interests, and do what you can. Everyone's tiny contributions, together, are the entire open source ecosystem. We are iBooker, a large open-source community,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Auth0 B2B Essentials: SSO, MFA, and RBAC Built In Icon
    Auth0 B2B Essentials: SSO, MFA, and RBAC Built In

    Unlimited organizations, 3 enterprise SSO connections, role-based access control, and pro MFA included. Dev and prod tenants out of the box.

    Auth0's B2B Essentials plan gives you everything you need to ship secure multi-tenant apps. Unlimited orgs, enterprise SSO, RBAC, audit log streaming, and higher auth and API limits included. Add on M2M tokens, enterprise MFA, or additional SSO connections as you scale.
    Sign Up Free
  • 5
    Gamera is a framework for the creation of structured document analysis applications by domain experts. It combines a programming library with GUI tools for the training and interactive development of recognition systems.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    libcrn is document image processing library written in C++11 for Linux, Windows, Mac OsX and Google Android. It is a toolbox that allows to create easily software such as OCRs and layout analysis tools.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 7

    jLDADMM

    A Java package for the LDA and DMM topic models

    The Java package jLDADMM is released to provide alternative choices for topic modeling on normal or short texts. It provides implementations of the Latent Dirichlet Allocation topic model and the one-topic-per-document Dirichlet Multinomial Mixture model (i.e. mixture of unigrams), using collapsed Gibbs sampling. In addition, jLDADMM supplies a document clustering evaluation to compare topic models. See the usage of jLDADMM in its website at http://jldadmm.sourceforge.net/
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    DCTFinder

    DCTFinder

    Extract title and creation time from web page.

    Web pages do not offer reliable metadata concerning their creation date and time. However, getting the document creation time is a necessary step for allowing to apply temporal normalization systems to web pages. DCTFinder is a system that parses a web page and extracts from its content the title and the creation date of this web page. DCTFinder combines heuristic title detection, supervised learning with Conditional Random Fields (CRFs) for document date extraction, and rule-based creation...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    SCAN
    SCAN (Smart Content Aggregation and Navigation) is a universal semantic content aggregator. It combines search, text analysis, tagging and metadata functions to provide new user experience of desktop navigation and document management.
    Downloads: 2 This Week
    Last Update:
    See Project
  • Compliant and Reliable File Transfers Backed by Top Security Certifications Icon
    Compliant and Reliable File Transfers Backed by Top Security Certifications

    Cerberus FTP Server delivers SOC 2 Type II certified security and FIPS 140-2 validated encryption.

    Stop relying on non-certified, legacy file transfer tools that creak under the weight of modern security demands. Get full audit trails, advanced access controls and more supported by an award-winning team of experts. Start your free 25-day trial today.
    Start Free Trial
  • 10
    FALCON - Text Search Java Project

    FALCON - Text Search Java Project

    JSON based text search Java Project

    ----------------- - What is it? - ----------------- The "Falcon Search" is a JAVA API and tool to search inside the documents. It was originally started to search the content in pdf files under the project "HAWK Search". Searching with this tool is query-based not word-based as in most of the document search tools OR document readers. It also takes care of jumbling of words within query and spelling mistakes. Commonly used techniques in this project are Natural Language...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11

    Texalyzer

    Text analyzer

    Analyzes text document using TF-IDF and optionally stopword list, and extracts important keywords.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Unsupervised TXT classifier

    Unsupervised TXT classifier

    Classify any two TXT documents, no training required - JAVA

    ...In a way, this is similar to clustering but not really a clustering algorithm since there is some training involved. The summarizer from Classifier4J has been adjusted to accept two inputs (lets call them A and B). Then, the summarizer gets trained with A to summarize a document B, and vice versa. This extracts a relevant structure for both documents (and thus avoids the over-training) which are then compared using the Vector-Space analysis to give a range of belonging of one document to another (and thus avoids the shortage of information). This method can be used to create the user-defined classes by merging texts of certain categories and then to calculate the relevant distances between the documents, but this is not necessary.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    XmlView
    GUI utility in pure Java for viewing and editing XML content; example of application built with Superficial http://superficial.sourceforge.net
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    SIDoBI is an automatic summarization system for documents in Indonesian language. It is an acronym for Sistem Ikhtisar Dokumen untuk Bahasa Indonesia. SIDoBI is built based on MEAD, a public domain portable multi-document summarization system.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    OpenSHORE is an XML based Semantic Document Repository (SDR) with a free definable meta model that builds up a semantic network from sections and relations in documents. The acronym SHORE means Semantic Hypertext Object Repository.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Maui is a multi-purpose automatic topic indexing algorithm. Given a document, Maui automatically identifies its topics. Depending on the task topics are tags, keywords, keyphrases, vocabulary terms, descriptors or Wikipedia titles.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    RDF-DocMan is a document manager based on a Sesame (RDF repository) backend. Documents are stored in the filesystem and their metadata in a Sesame repository. It was developed for porQual web content generator (also in sf.net).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    T-Rex (Trainable Relation Extraction) is a highly configurable machine learning-based Information Extraction from Text framework, which includes tools for document classification, entity extraction and relation extraction.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    iDocs is a intellectual document work flow with text mining options project.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Irudiko is a library written in C++ for generating Locality Sensitive Hashing sketches from any textual and web document. Mainly designed to work with HTML pages, it has also an optimization support for English or Italian documents.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Graphist uses PHP's GD library to produce data plots, in real time, served up as standard images for consumption by web pages (though such images could be saved for use in other document types).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Flesh is a Java application designed to analyze a document (plain text, rich text, Word documents, and PDFs) and display the difficulty associated with comprehending using the Flesch-Kincaid Grade Level and the Flesch Reading Ease Score.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 23
    Qualiweb aims at providing semantic web metrics for modeling a website visitors needs according to a given taxonomy or document classification. Web metrics provided by Qualiweb give an indication of how successful each of the website topics have been.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    vyasa is a digital library application that incorporates the functions of digital asset and document management systems. It facilitates information retrieval and knowledge discovery by providing comprehensive metadata generation and semantic analysis.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Kriterion is a document retrieval and categorization engine capable of full text searching. There is no need for keyword or context-based information.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next
Auth0 Logo