Showing 18 open source projects for "semantic documents"

  • 1
    OpenDataLoader PDF

    PDF Parser for AI-ready data. Automate PDF accessibility

    OpenDataLoader PDF is an open-source document processing system designed to convert complex PDF files into structured, AI-ready formats such as Markdown, JSON, and HTML while preserving layout, hierarchy, and semantic meaning. It focuses on enabling downstream use cases like retrieval-augmented generation (RAG), knowledge extraction, and document intelligence pipelines by maintaining accurate reading order and spatial metadata through bounding boxes. The tool combines deterministic parsing methods with an optional hybrid AI-powered mode that improves extraction quality for difficult layouts such as multi-column documents, scanned files, and scientific papers. ...
    Downloads: 15 This Week
  • 2
    CCIL
    An SOA framework for web content classification, clustering and automated interlinking of terms between documents. It will provide an expandable set of services such as semantic search, ranking, retrieval and classification of large-scale web resources.
    Downloads: 0 This Week
  • 3
    DODDLE-OWL

    a Domain Ontology rapiD DeveLopment Environment – OWL extension

    DODDLE-OWL is a domain ontology development tool for the Semantic Web. It reuses existing ontologies and supports the semi-automatic construction of taxonomic and other relationships in domain ontologies from documents.
    Downloads: 1 This Week
  • 4
    OpenSearchServer Search Engine

    An open source search engine with RESTful API and crawlers

    OpenSearchServer is a powerful, enterprise-class search engine. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST/API, Ruby, Rails, Node.js, PHP, Perl), you can quickly and easily integrate advanced full-text search capabilities into your application: full-text search with basic semantics, join queries, boolean queries, facets and filters, document (PDF, Office, etc.) indexing, web scraping, etc. OpenSearchServer runs on...
    Downloads: 0 This Week
  • 5
    Adam

    Adam - information management platform

    Adam is an open-source, cross-platform, mobile, extensible information management platform designed for unified storage, next-generation semantic retrieval and processing of different types of documents. It has an innovative, friendly and comfortable user interface for viewing, editing and organizing documents. Adam can be used for creating a unified personal information pool. It can easily be extended, customized and adapted to user needs, shared with other people, and transported via USB stick or network. ...
    Downloads: 3 This Week
  • 6
    The goal of this project is to provide a reusable library that transforms common file formats into content objects, together with ContentProvider plugins for common file repositories such as Filesystem, CMIS and others, for the iQser GIN Semantic Middleware (www.iqser.com).
    Downloads: 0 This Week
  • 7
    earmark
    *** IMPORTANT NOTICE *** The source code of the earliest version of the EARMARK Data Structure is now available at http://www.github.com/essepuntato/EarmarkDataStructure . Although the SourceForge repository is still active, it covers only old versions of the API and is no longer maintained. *** SERVICE DESCRIPTION *** Extremely Annotational RDF Markup (EARMARK) is an ontological approach to the specification of markup structures on text content. It allows not only documents...
    Downloads: 0 This Week
  • 8
    Porqual is a website generator that manages documents using the Sesame RDF database. It has a rich Flash-based web client focused on usability and accessibility, and it is integrated with the Semantic Web. It is programmed in Java, ActionScript and JavaScript.
    Downloads: 0 This Week
  • 9
    CERIF-TG-Toolbox

    Tool to produce the CERIF deliverables (XML Schemas and XML Semantics)

    ...The CERIF TG Toolbox generates the CERIF-XML Schemas from the database model (maintained as the *.dmx file of TOAD Data Modeler), in both the original style (one namespace/schema per CERIF entity) and the updated style (introduced in 2012: a common namespace, one schema, embedding). It also generates the CERIF-XML instance documents (updated style) of the CERIF Semantic Layer entities, based on a *.xls or *.xlsx file. While it is most useful to the CERIF Task Group member who prepares a CERIF release, it is of use to anyone who wants to experiment with CERIF extensions. The software is written in Java and XSLT. It is made available under the European Union Public License.
    Downloads: 0 This Week
  • 10
    A semantic editor for describing public administration services. It produces ontologies containing a workflow description in OWL notation. It can be used for modelling processes in different fields, but was originally focused on semantically describing government services with their corresponding documents, fees and other information items. Launch the editor with Java Web Start at http://ri.tdf.lv
    Downloads: 0 This Week
  • 11
    OpenSHORE is an XML-based Semantic Document Repository (SDR) with a freely definable metamodel that builds up a semantic network from sections and relations in documents. The acronym SHORE stands for Semantic Hypertext Object Repository.
    Downloads: 0 This Week
  • 12
    Ontea - Pattern-based Semantic Annotation Platform. Ontea searches for or creates semantic metadata from text or documents using pattern-based approaches. The implementation currently includes regular expression (regex) patterns.
    Downloads: 0 This Week
  • 13
    The SAWSDL4J project is an attempt to provide a clean object model for SAWSDL documents. SAWSDL is the specification from W3C that governs the subject of attaching semantic references to standard WSDL descriptions.
    Downloads: 0 This Week
  • 14
    *** PROJECT NOT MAINTAINED ANYMORE *** RDFStats generates statistics for RDF datasets behind SPARQL endpoints and within single RDF documents.
    Downloads: 0 This Week
  • 15
    Crow - Computational Representation Of Whatever. A platform for the integration and mining of complex and distributed data. It represents cross-linked Semantic Web documents as a network of software objects and offers easy ways to filter and sort them.
    Downloads: 0 This Week
  • 16
    A distributed directory/registry for WAN and LAN environments for storing any kind of leased resource (Java proxies, .NET proxies, documents or anything else that can be properly categorized), with resources categorized in a semantic manner.
    Downloads: 0 This Week
  • 17
    SPDL stands for Semantic Personal Digital Library. It is an open source Java project for managing PDF documents. The project allows users to record information about documents and to classify and retrieve them in different ways.
    Downloads: 0 This Week
  • 18
    Yadoda is a personal digital library: users can create their own ontology and a database of digital documents (PDF, PS, MP3, images) that can be enriched with metadata (author, date, title). Users can create semantic relations between documents and navigate them.
    Downloads: 0 This Week