Showing 37 open source projects for "metadata management"

View related business solutions
  • Stop Storing Third-Party Tokens in Your Database Icon
    Stop Storing Third-Party Tokens in Your Database

    Auth0 Token Vault handles secure token storage, exchange, and refresh for external providers so you don't have to build it yourself.

    Rolling your own OAuth token storage can be a security liability. Token Vault securely stores access and refresh tokens from federated providers and handles exchange and renewal automatically. Connected accounts, refresh exchange, and privileged worker flows included.
    Try Auth0 for Free
  • Forever Free Full-Stack Observability | Grafana Cloud Icon
    Forever Free Full-Stack Observability | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 1
    Apache Polaris

    Apache Polaris

    Apache Polaris, the interoperable, open source catalog

    Apache Polaris is an open-source metadata catalog and data management service designed to manage Apache Iceberg tables in modern data lakehouse environments. It provides a centralized catalog that allows multiple compute engines and analytics systems to interact with the same datasets through a standardized interface. By implementing the Iceberg REST catalog API, Polaris enables distributed data platforms to access shared table metadata without tightly coupling storage systems and query engines. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 2
    DataHub

    DataHub

    The Metadata Platform for your Data and AI Stack

    DataHub is an open source metadata platform that helps organizations discover, understand, and trust their data assets at scale. It models data as a richly connected graph spanning datasets, dashboards, pipelines, ML features, and services, so users can explore relationships like lineage and ownership across tools and domains. The platform focuses on continuous metadata ingestion from many sources, treating metadata as a stream that stays fresh as systems change. A modern web UI and search...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    BigQuery Utils

    BigQuery Utils

    Useful scripts, udfs, views, and other utilities for migration

    BigQuery Utils is a large utility repository focused on helping users operate, optimize, and migrate workloads in BigQuery through reusable assets rather than a single application. It brings together scripts, user-defined functions, views, stored procedures, dashboards, notebooks, and supporting tools that address common data warehouse and analytics tasks. The repository is especially useful for organizations that need practical building blocks for migration from other database systems,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Semantic Type Detection

    Semantic Type Detection

    Metadata/data identification Java library

    Metadata/data identification Java library. Identifies Base Type (e.g. Boolean, Double, Long, String, LocalDate, LocalTime, ...) and Semantic Type information (e.g. Gender, Age, Color, Country, ...). Extensive country/language support. Extensible via user-defined plugins. Comprehensive Profiling support. Large set of built-in Semantic Types (extensible via JSON defined plugins). Extensive Profiling metrics (e.g. Min, Max, Distinct, signatures, …) Sufficiently fast to be used inline. See Speed...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • 5
    Genie

    Genie

    Distributed Big Data Orchestration Service

    Genie is a completely open source distributed job orchestration engine developed by Netflix. Genie provides REST-ful APIs to run a variety of big data jobs like Hadoop, Pig, Hive, Spark, Presto, Sqoop and more. It also provides APIs for managing the metadata of many distributed processing clusters and the commands and applications which run on them.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Apache Hudi

    Apache Hudi

    Upserts, Deletes And Incremental Processing on Big Data

    Apache Hudi (pronounced Hoodie) stands for Hadoop Upserts Deletes and Incrementals. Hudi manages the storage of large analytical datasets on DFS (Cloud stores, HDFS or any Hadoop FileSystem compatible storage). Apache Hudi is a transactional data lake platform that brings database and data warehouse capabilities to the data lake. Hudi reimagines slow old-school batch data processing with a powerful new incremental processing framework for low latency minute-level analytics. Hudi provides...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    LogicalDOC Document Management - DMS

    LogicalDOC Document Management - DMS

    smart and open source document management system

    LogicalDOC is both document management and collaboration system. The software is loaded with many functions and allows organizing, index, retrieving, controlling and distributing important business documents securely and safely for any organization and individual. Gone are the days when companies used paper-based processes such as printing, mailing and manual filing of paper documents; our document management system replaces all of this with electronic procedures that allow your...
    Leader badge
    Downloads: 156 This Week
    Last Update:
    See Project
  • 8
    gravitino

    gravitino

    Unified metadata lake for data & AI assets.

    Apache Gravitino is a high-performance, geo-distributed, and federated metadata lake. It manages metadata directly in different sources, types, and regions, providing users with unified metadata access for data and AI assets.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    OpenKM Document Management - DMS

    OpenKM Document Management - DMS

    Document Management System and Content Management System

    OpenKM Community Edition is a free Document Management System (DMS) that helps businesses control the production, storage, management and distribution of electronic documents, boosting effectiveness and productivity. It integrates document management, collaboration and advanced search into one easy-to-use solution, including administration tools for user roles, access control, security levels, activity logs and automation setup. With OpenKM Community Edition you can: Collect information...
    Leader badge
    Downloads: 428 This Week
    Last Update:
    See Project
  • Application Monitoring That Won't Slow Your App Down Icon
    Application Monitoring That Won't Slow Your App Down

    AppSignal's Rust-based agent is lightweight and stable. Already running in thousands of production apps.

    Full APM with errors, performance, logs, and uptime monitoring. 99.999% uptime SLA on the platform itself.
    Start Free
  • 10
    ResCarta

    ResCarta

    Archive your personal history

    ResCarta Toolkit offers an open source solution to creating, storing, viewing, and searching digital collections. Applications in the toolkit let users create and edit metadata, convert data to open standard ResCarta format, index and host collections.
    Leader badge
    Downloads: 40 This Week
    Last Update:
    See Project
  • 11
    Web based cataloging and dedupe application. Highly optimized for processing journal articles. Reads MarcXML and dedupes records using the field 773 combined with a fuzzy search on the title. Written for bibnet.org
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    apache spark data pipeline osDQ

    apache spark data pipeline osDQ

    osDQ dedicated to create apache spark based data pipeline using JSON

    This is an offshoot project of open source data quality (osDQ) project https://sourceforge.net/projects/dataquality/ This sub project will create apache spark based data pipeline where JSON based metadata (file) will be used to run data processing , data pipeline , data quality and data preparation and data modeling features for big data. This uses java API of apache spark. It can run in local mode also. Get json example at https://github.com/arrahtech/osdq-spark How to...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    OpenSearchServer Search Engine

    OpenSearchServer Search Engine

    An open source search engine with RESTFul API and crawlers

    OpenSearchServer is a powerful, enterprise-class, search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST/API , Ruby, Rails, Node.js, PHP, Perl) you will be able to integrate quickly and easily advanced full-text search capabilities in your application: Full-text with basic semantic, join queries, boolean queries, facet and filter, document (PDF, Office, etc.) indexation, web scrapping,etc. OpenSearchServer runs on...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14

    NL2SBVR

    A SBVR editor to generate and edit SBVR Vocabulary and SBVR Rules.

    NL2SBVR 3.0 is a SBVR editor to generate and edit SBVR Vocabulary and SBVR Rules from natural language description . The tool supports Structured English notation of SBVR and also generates XML metadata of SBVR rules in XSD 1.4. Version ------ Release Date ------------Features ---------------------------------------------------------------------------------- 1.0 ------------ Sep, 2013 ------------ SBVR Editing 2.0 ------------Mar, 2016 ------------ Structured English Notation; SBVR...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15

    Informatica DBMetadata

    Java utility that reads the metadata from table(s)

    Dbmetadata is a Java utility that reads the metadata from table(s) in a specified database and creates the Informatica XML to import into the repository. I created this utility when we were migrating to a new platform and needed a quick way to create flatfile and relational sources and targets that matched the DDL of the table. I also needed to use shortcuts. If you use the import table list, it will create one XML file with all of the tables and shortcuts (if a shortcut folder is specified)...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    COAR-DMS

    COAR-DMS

    DMS for linux, C++ library, server, webUI , SOAP

    COAR-DMS is document management system for 32/64 bit. linux. Acts as library, server and tools. Library features: - storage management, free pages recycling - transaction log - indexing: full text, tags, metadata, document attributes - inverted index - versioning, collaboration - document trees, trees versionning - folders - plugins for auth (PAM,LDAP), db, file types plugins - tags - metadata (key value pairs) - object level security, folders documents ACL, - unix like security (rwx), special authorities - from thousands to tens of billions of documents - dashboard (working copies, new documents) - electronic signs - search statement, syntax like SQL - multithreaded, multiprocess library, Servers: - native HTTP server (libmicrohttp) - SOAP server - WebDAV(planed) - Indexer Python API WebUI GWT, JSP, SOAP-API
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    GeoKettle
    GeoKettle is a powerful, metadata-driven spatial ETL (Extract, Transform and Load) tool dedicated to the integration of different data sources for building and updating geospatial databases, data warehouses and services.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 18
    CMIS Input plugin for Pentaho

    CMIS Input plugin for Pentaho

    Allows querying Content Management Systems that use the CMIS.

    Imagine being able to extract from your Enterprise Content Management System, all the metadata of your documents using simple queries with a query language very close to the traditional SQL. Imagine using the information extracted for statistical purposes, for creating reports and, more generally, to analyse your document archives in a way unthinkable until now with the current tools available. All this is possible within the Pentaho Suite, the Open Source Business Intelligence platform, which is useful to the extraction and analysis of structured and semi-structured data. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19

    mydocuments

    Enterprise Content Management System

    ...System architecture follow SOA to ensure scalability. Features Global Unique ID for Documents  Web Based Interfaces Document Folders with UID / Document Sets Text Search Property/Metadata Search Search Options Document Sharing Digital Signatures Security Watermarks Version Management
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Semtinel is an application framework for the semi-automatic creation, maintenance and analysis of hierarchical concept schemes (thesaurus, classification or ontology). Semtinel supports the development of new analysis methods and visualization techniques
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Metadata.net is a collection of website components and tools for creating and using metadata produced by the eResearch Group at ITEE, University of Queensland.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    The “Media Crawler” is an extensible Eclipse RCP based desktop application which will crawl a given file system, extract metadata from files, map metadata to internal schemas and store the metadata in a databse. This project is ANDS-funded.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    MedioVis is concerned with the user-centred development of next generation visual information seeking systems for novice and nonexpert users. Various Browsing-oriented data exploration strategies are supported (overview & detail, filtering, zooming etc.)
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    DocInfoRetriever is a Web_based document full-text search engine based on lucene. It allows you to search the contents and metadata of documents . Supported document formats, likes doc, xls, pdf, odt, jpg...etc.,and torrent files.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Ufnasoft Document Management System (Ufnasoft-DMS), Java Document Management System developed in Java Swing, Servlets, Tomcat, and MySQL. The features are Document History, Version Control, User group based permission, Project Level Permission, Docum
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next
MongoDB Logo MongoDB