Showing 29 open source projects for "document analysis"

  • 1
    RAG Anything

    RAG-Anything: All-in-One RAG Framework

    ...The system uses a multi-stage pipeline (e.g., document parsing, content analysis, knowledge graph construction, intelligent retrieval) so queries can navigate across modalities with deeper understanding and relevance.
    Downloads: 7 This Week
  • 2
    elasticsearch-php

    PHP low-level client for Elasticsearch

    Introducing the Elasticsearch DSL library, which provides an object-oriented query builder for the Elasticsearch bundle and the elasticsearch-php client. You can easily build any Elasticsearch query and transform it to an array. This agnostic package is a lightweight wrapper on top of the Elasticsearch PHP client. Its main goal is to allow for easier structuring of queries and indices in your application. It does not aim to hide or replace the functionality of the Elasticsearch PHP client. Feature complete, object...
    Downloads: 0 This Week
  • 3
    OWL

    Optimized Workforce Learning for General Multi-Agent Assistance

    OWL (Optimized Workforce Learning) is a sophisticated open-source framework built on the CAMEL-AI ecosystem for orchestrating teams of AI agents to collaboratively solve complex, real-world tasks with dynamic planning and automation capabilities. Unlike single-agent systems, it treats task completion as a collaborative workforce where agents take on specialized roles (planning, execution, analysis) and coordinate via a modular multi-agent architecture that supports flexible teamwork across domains. OWL delivers state-of-the-art performance on benchmarks like GAIA and emphasizes real-time decision-making, web automation, rich search integration, document parsing, and multi-tool workflows, making it suitable for tasks ranging from information retrieval to interactive automation.
    Downloads: 1 This Week
  • 4
    Vanilla.PDF

    Cross-platform SDK for creating and modifying PDF documents

    ...Vanilla.PDF supports advanced PDF features such as adding CMS (PKCS#7) digital signatures, modifying content streams and metadata, and working with encryption and permissions based on standard PDF security models. It includes tools for parsing PDF internals like cross-reference tables and objects, providing fine-grained document analysis capabilities. The project is unit-tested with continuous integration pipelines, supporting sanitizers for enhanced code quality and stability.
    Downloads: 1 This Week
  • 5
    PDF4QT

    Open source PDF editor

    PDF4QT is an open-source PDF editor for Windows and Linux, based on the Qt framework. It is a modern solution for viewing, editing, and rendering PDF documents, for users and developers alike. For developers, there is a C++ library and a command line tool for use in scripts. For users, there are four applications offering many features. The project is hosted on GitHub and...
    Downloads: 36 This Week
  • 6
    rollama

    Wrap the Ollama API, which allows you to run different LLMs

    ...The package emphasizes reproducibility and privacy by enabling local execution of models, which is especially valuable for sensitive or research-oriented workflows. It supports common LLM tasks such as text generation, annotation, and embedding creation, making it useful for tasks like document analysis and data labeling. The design mirrors familiar R workflows, allowing users to integrate AI capabilities into scripts, notebooks, and data pipelines with minimal friction. It also provides flexibility to extend functionality to any feature supported by the underlying Ollama API.
    Downloads: 1 This Week
  • 7
    Software Copyright Materials Skill

    Skills, a Chinese software copyright application material generator

    Software Copyright Skill is an open-source Codex skill for generating Chinese software copyright application materials from a local software project. It helps developers prepare the documents required for a software copyright filing without relying on paid document-preparation services. The skill reads the real project, guides the user through key confirmations, and produces organized materials that can be reviewed and edited locally. It can generate application-form reference information,...
    Downloads: 3 This Week
  • 8
    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a C++20, AI-driven data processing tool, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, enabling efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from mailboxes, databases, and websites using AI. DocWire SDK aims to expand its capabilities, focusing on versatile data extraction, platform support, and seamless integration with various systems. ...
    Downloads: 8 This Week
  • 9
    PySchool

    Installable / Portable Python Distribution for Everyone.

    PySchool is a free and open-source Python distribution intended primarily for students learning Python and data analysis, but it can also be used by scientists, engineers, and data scientists. It includes more than 150 Python packages (full edition) including numpy, pandas, scipy, sympy, keras, scikit-learn, matplotlib, seaborn, beautifulsoup4...
    Downloads: 115 This Week
  • 10
    UML and SysML with TRAK

    Annotate/link UML and SysML diagrams with TRAK AF elements

    Provides UML (SysML) profiles and an MDG for Sparx Systems EA that let you add TRAK elements to standard UML and SysML diagrams and link to TRAK architecture descriptions. Links TRAK Resource to UseCase as subject or collaborator and to States exhibited.
    Downloads: 1 This Week
  • 11
    Rubberduck

    Every programmer needs a rubberduck. COM add-in for the VBA & VB6 IDE

    Rubberduck aims to bring the VBIDE into this century. Rubberduck understands Classic-VB code like no other add-in, giving it superior static code analysis capabilities that go far above and beyond what is possible with simple text-based analysis. Avoid common pitfalls (some not-so-common) with dozens (100+) of configurable inspections. Gain full control over module and member attributes, create a virtual folder hierarchy, and document modules and procedures, all with special comment annotations. ...
    Downloads: 16 This Week
  • 12
    eslint-config-alloy

    Progressive ESLint config for your React/Vue/TypeScript projects

    The AlloyTeam ESLint config is not only a progressive ESLint config for your React/Vue/TypeScript projects but also a reference for configuring your own personalized ESLint rules. It lets Prettier handle style-related rules, inherits ESLint's philosophy, and helps everyone build their own rules. High degree of automation: advanced rules management, with tests that serve as documentation and as a website. Keeps up with the times, following the latest...
    Downloads: 6 This Week
  • 13
    Common Resource Grep - crgrep

    Common Resource Grep

    CRGREP searches for matching text in databases, various document formats, archives, and other difficult-to-access resources. It is a command line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image metadata, scanned documents, Maven dependencies, and web resources. CRGREP searches resources nested within resources in any combination and to any depth, for example text within a document within a zip archive. Here you...
    Downloads: 5 This Week
  • 14
    Interpret-Text

    State-of-the-art explainers for text-based machine learning models

    ...Interpret-Text incorporates community-developed interpretability techniques for NLP models and a visualization dashboard to view the results. Users can run their experiments across multiple state-of-the-art explainers and easily perform comparative analysis on them. Using these tools, users will be able to explain their machine-learning models globally on each label or locally for each document.
    Downloads: 0 This Week
  • 15
    PHP Language Server

    PHP Implementation of the VS Code Language Server Protocol

    A pure PHP implementation of the open Language Server Protocol. Provides static code analysis for PHP for any IDE.
    Downloads: 5 This Week
  • 16
    cquery

    C/C++ language server supporting multi-million line code base

    C/C++ language server supporting multi-million line code base, powered by libclang. Emacs, Vim, VSCode, and others with language server protocol support. Cross-references, completion, diagnostics, semantic highlighting, and more. cquery is a highly-scalable, low-latency language server for C/C++/Objective-C. It is tested and designed for large codebases like Chromium. cquery provides accurate and fast semantic analysis without interrupting workflow. cquery implements almost the entire...
    Downloads: 1 This Week
  • 17
    HackMyResume

    Generate polished résumés and CVs

    Create polished résumés and CVs in multiple formats from your command line or shell. Author in clean Markdown and JSON, export to Word, HTML, PDF, LaTeX, plain text, and other arbitrary formats. Fight the power, save trees. Compatible with FRESH and JRS resumes. HackMyResume is a dev-friendly, local-only Swiss Army knife for resumes and CVs. Use it to generate HTML, Markdown, LaTeX, MS Word, PDF, plain text, JSON, XML, YAML, print, smoke signal, carrier pigeon, and other arbitrary-format...
    Downloads: 0 This Week
  • 18
    Jericho HTML Parser is a Java library for analysing and manipulating parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
    Downloads: 2 This Week
  • 19
    Gumbo

    An HTML5 parsing library in pure C99

    Gumbo is an implementation of the HTML5 parsing algorithm, written as a pure C99 library with no outside dependencies. It's designed to serve as a building block for other tools and libraries such as linters, validators, templating languages, and refactoring and analysis tools. Execution speed is not a goal: Gumbo gains some by virtue of being written in C, but it is not an important consideration for the intended use-case, and was not a major design factor. Mutability is not a goal either: Gumbo is intentionally designed to turn an HTML document into a parse tree, and free that parse tree all at once. To install the Python bindings, make sure that the C library is installed first, and then run sudo python setup.py install from the root of the distribution. ...
    Downloads: 1 This Week
  • 20
    CMIS Input plugin for Pentaho

    Allows querying Content Management Systems that use the CMIS standard.

    Imagine being able to extract all the metadata of your documents from your Enterprise Content Management System using simple queries, in a query language very close to traditional SQL. Imagine using the extracted information for statistical purposes, for creating reports and, more generally, for analysing your document archives in ways not possible with the tools available until now. All this is possible within the Pentaho Suite, the Open Source Business Intelligence platform, which is suited to the extraction and analysis of structured and semi-structured data. The CMIS Input plugin for Pentaho Data Integration (Kettle) was designed and developed with this goal in mind: it allows querying Content Management Systems that use the CMIS interoperability standard. ...
    Downloads: 0 This Week
  • 21

    WinDbg Uncovered

    Advanced Debugging Techniques in WinDbg

    This project/document was created to give more exposure to advanced debugging and dump file analysis concepts using WinDbg. The document contains real-world scenarios of programming bugs and problems, with the author's explanations.
    Downloads: 0 This Week
  • 22
    A command line tool that converts a custom XML document (xsav) to an SPSS binary file (sav). It is often easy to generate XML files from software, and with this tool an SPSS binary file can easily be generated from them (SPSS is a software package for statistical analysis).
    Downloads: 0 This Week
  • 23
    XmlView
    GUI utility in pure Java for viewing and editing XML content; an example of an application built with Superficial http://superficial.sourceforge.net
    Downloads: 0 This Week
  • 24
    ILEDocs is a documentation tool that helps software developers document their programs in a convenient way, similar to Javadoc.
    Downloads: 0 This Week
  • 25
    OpenSHORE is an XML-based Semantic Document Repository (SDR) with a freely definable metamodel that builds up a semantic network from sections and relations in documents. The acronym SHORE stands for Semantic Hypertext Object Repository.
    Downloads: 0 This Week