Showing 132 open source projects for "metadata"

  • 1
    PDFPatcher

    A versatile toolkit for PDF manipulation

    PDFPatcher (aka “PDF补丁丁”) is a versatile toolkit for PDF manipulation—editing document metadata, bookmarks, page layout, content restrictions, rotation, compression, merging/splitting, image extraction, and more, all within an intuitive interface. Merge/split PDFs or images, preserve or add bookmarks, and set page dimensions. Batch style/color/target changes, regex/XPath search/replace, mid‑page positioning. Modify PDF metadata, page numbers, links, initial view mode, and remove open actions.
    Downloads: 34 This Week
  • 2
    CSV Lint

    CSV Lint plug-in for Notepad++ for syntax highlighting

    CSV Lint is a plug-in for Notepad++ that adds syntax highlighting, CSV validation, automatic column and datatype detection, and support for fixed-width datasets. It can change the datetime format and decimal separator, sort data, count unique values, and convert to XML, JSON, SQL, etc. It is a plug-in for data cleaning and working with messy data files. Use CSV Lint for metadata discovery, technical data validation, and reformatting of tabular data files. It is not meant to replace spreadsheet or statistics programs such as Excel or SPSS; rather, it is a quality-control tool to examine, verify, or polish a dataset before further processing.
    Downloads: 29 This Week
  • 3
    Apache Polaris

    Apache Polaris, the interoperable, open source catalog

    Apache Polaris is an open-source metadata catalog and data management service designed to manage Apache Iceberg tables in modern data lakehouse environments. It provides a centralized catalog that allows multiple compute engines and analytics systems to interact with the same datasets through a standardized interface. By implementing the Iceberg REST catalog API, Polaris enables distributed data platforms to access shared table metadata without tightly coupling storage systems and query engines. ...
    Downloads: 1 This Week
  • 4
    Link-Preview-JS

    Extract web links information: title, description, images, videos, etc

    link-preview-js is a lightweight TypeScript library that extracts metadata from URLs or HTML content to generate rich link previews. By parsing Open Graph tags and other metadata, it retrieves information such as titles, descriptions, images, and videos. Designed primarily for Node.js and mobile environments, it facilitates the creation of link previews similar to those found on social media platforms.
    Downloads: 0 This Week
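
The idea of building a link preview from Open Graph tags can be illustrated with a small standard-library Python sketch. This is not link-preview-js or its API; the names `OpenGraphParser`, `extract_preview`, and `SAMPLE_HTML` are hypothetical and chosen only for this example, and the fallback to the `<title>` element is an assumption about what a reasonable preview extractor might do.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:*"> values and the <title> text."""

    def __init__(self):
        super().__init__()
        self.og = {}            # Open Graph properties, e.g. {"title": ...}
        self.html_title = ""    # text of the <title> element, used as fallback
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            prop = attrs.get("property") or ""
            content = attrs.get("content")
            if prop.startswith("og:") and content is not None:
                # Keep the first value seen for each Open Graph property.
                self.og.setdefault(prop[3:], content)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.html_title += data


def extract_preview(html):
    """Return a dict of preview fields found in an HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    parser.close()
    preview = dict(parser.og)
    # Assumption: fall back to <title> when og:title is missing.
    if "title" not in preview and parser.html_title.strip():
        preview["title"] = parser.html_title.strip()
    return preview


# Hypothetical page used only for illustration.
SAMPLE_HTML = """
<html><head>
<title>Fallback title</title>
<meta property="og:title" content="Example page">
<meta property="og:description" content="A short description.">
<meta property="og:image" content="https://example.com/cover.png">
</head><body></body></html>
"""

preview = extract_preview(SAMPLE_HTML)
print(preview)
```

The sketch works on an HTML string only; fetching the page, following redirects, and handling video tags are left out.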
  • 5
    Joplin

    Open source note taking and to-do app with synchronization

    ...All notes can also be copied, tagged, searched and modified directly from the app or through your own text editor. Notes that are exported from Evernote can be imported into Joplin, be it formatted content, resources, complete metadata or plain Markdown files. When notes are synchronized with cloud services, notebooks, tags and other metadata can easily be moved, inspected or backed up as plain text files. Supported cloud services include Nextcloud, OneDrive, Dropbox and WebDAV. Joplin is available for Windows, Linux, macOS, iOS and Android, with three types: desktop, mobile and terminal. ...
    Downloads: 67 This Week
  • 6
    uv

    An extremely fast Python package and project manager, written in Rust

    An extremely fast Python package and project manager, written in Rust.
    Downloads: 24 This Week
  • 7
    nb-clean

    Clean Jupyter notebooks of outputs, metadata, and empty cells

    nb-clean cleans Jupyter notebooks of cell execution counts, metadata, outputs, and (optionally) empty cells, preparing them for committing to version control. It provides both a Git filter and a pre-commit hook to automatically clean notebooks before they're staged, and can also be used with other version control systems, as a command-line tool, and as a Python library. It can determine whether a notebook is clean, which can serve as a check in continuous integration pipelines. ...
    Downloads: 0 This Week
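
What "cleaning" a notebook means can be sketched in a few lines of standard-library Python. This is an illustration of the idea, not nb-clean's implementation or API: the function names `clean_notebook` and `is_clean` are hypothetical, the notebook is assumed to be an already-parsed nbformat 4 style dict, and only cell-level metadata is cleared.

```python
import copy


def clean_notebook(notebook, remove_empty_cells=False):
    """Return a cleaned copy of a notebook dict (nbformat 4 layout assumed)."""
    cleaned = copy.deepcopy(notebook)
    cells = []
    for cell in cleaned.get("cells", []):
        source = cell.get("source", "")
        # Cell source may be a single string or a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        if remove_empty_cells and not text.strip():
            continue
        cell["metadata"] = {}
        if cell.get("cell_type") == "code":
            cell["execution_count"] = None
            cell["outputs"] = []
        cells.append(cell)
    cleaned["cells"] = cells
    return cleaned


def is_clean(notebook):
    """A notebook is clean when cleaning it would change nothing."""
    return clean_notebook(notebook) == notebook


# Hypothetical notebook used only for illustration.
dirty = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {"collapsed": False},
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["2\n"]}],
            "source": ["print(1 + 1)"],
        },
        {"cell_type": "markdown", "metadata": {}, "source": []},
    ],
}

cleaned = clean_notebook(dirty, remove_empty_cells=True)
print(is_clean(dirty), is_clean(cleaned))
```

A check like `is_clean` is the kind of test a continuous integration job could run to reject notebooks committed with outputs still attached.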
  • 8
    Symfony PropertyInfo

    Extracts information about PHP class' properties using metadata

    Symfony PropertyInfo is a component that extracts information about the properties of PHP classes, such as their names, types, visibility, and documentation. It is particularly useful in scenarios like serialization, form generation, and validation, where understanding the structure of an object is essential. PropertyInfo can fetch data from PHPDoc annotations, reflection, and type hints, offering flexible integration with Symfony and other systems.
    Downloads: 0 This Week
  • 9
    MinerU

    A high-quality tool for converting PDF to Markdown and JSON

    MinerU is an open-source, high-quality document extraction toolkit focused on converting PDFs (and other document formats) into structured Markdown and JSON. It leverages OCR and layout analysis to preserve semantic structure and metadata, ideal for research and data science workflows.
    Downloads: 16 This Week
  • 10
    tika-python

    Python binding to the Apache Tika™ REST services

    A Python port of the Apache Tika library that makes Tika available using the Tika REST Server. This makes Apache Tika available as a Python library, installable via Setuptools, Pip, and Easy Install. To use this library, you need to have Java 7+ installed on your system, as tika-python starts up the Tika REST server in the background. To get this working in a disconnected environment, download a tika server file (both tika-server.jar and tika-server.jar.md5, which can be found here) and set...
    Downloads: 0 This Week
  • 11
    RubyMoney

    A Ruby Library for dealing with money and currency conversion

    ...The library introduces a Money class that encapsulates both the numeric value and the associated currency, ensuring that operations are always context-aware and accurate. It also includes a Money::Currency class that stores metadata such as currency codes, symbols, and formatting rules, enabling consistent handling of international currencies. The library supports arithmetic operations, comparisons, and currency conversion through configurable exchange rate providers, making it suitable for both simple and complex financial systems.
    Downloads: 0 This Week
  • 12
    Metacrafter

    Metadata and data identification tool and Python library

    Python command-line tool and Python engine to label table fields and fields in data files. It can help find meaningful data in your tables and data files, or find personally identifiable information (PII). Metacrafter is a rule-based tool that helps label the fields of tables in databases. It scans tables and finds person names, surnames, middle names, PII data, and basic identifiers such as UUID/GUID. The rules are written as .yaml files and can be easily extended.
    Downloads: 0 This Week
  • 13
    Papermerge

    Open Source Document Management System for Digital Archives

    ...Instead of having piles of paper documents all over your desk, office or drawers - you can quickly scan them and configure your scanner to directly upload to Papermerge DMS. Store, organize and index scanned documents in PDF, JPEG and TIFF formats. Instantly find relevant information using full text, tags and metadata-based search. Papermerge is free and open-source software which means that transparency is the core value of our software development. Source code can be reviewed and improved by anyone from anywhere. Papermerge supports multiple users. Each user can be assigned different permissions to perform only a specific kind of action e.g. view only documents from a specific folder. ...
    Downloads: 10 This Week
  • 14
    JDF.jl

    Julia DataFrames serialization format

    JDF is a DataFrames serialization format with the following goals: fast save and load times, compressed storage on disk, disk-based data manipulation (not yet achieved), and support for machine learning workloads, e.g. mini-batch and sampling (not yet achieved). JDF stores a DataFrame in a folder, with each column stored as a separate file. There is also a metadata.jls file that stores metadata about the original DataFrame. Collectively, the column files, the metadata file, and the folder are called a JDF "file". JDF.jl is a pure-Julia solution, and there are many ways to do nifty things, such as compression and encapsulating the underlying structure of the arrays, that are hard to do in R and Python. For example, Python's numpy arrays are C objects, but all the vector types used in JDF are Julia data types.
    Downloads: 0 This Week
  • 15
    nw_wrld

    nw_wrld is an event-driven sequencer for triggering visuals

    ...The system is designed to be extensible, letting developers plug in new generation rules or tweak parameters with real-time previews so they can iterate rapidly on world design. It also includes utilities to derive metadata from worlds, such as climate distributions, strategic points of interest, and navigable paths, which can be consumed by gameplay systems or AI agents. For teams building games or simulations, nw_wrld provides a reusable foundation that reduces upfront world design costs while enabling endless variety.
    Downloads: 1 This Week
  • 16
    HDF5.jl

    Save and load data in the HDF5 file format from Julia

    HDF5 stands for Hierarchical Data Format v5 and is closely modeled on file systems. In HDF5, a "group" is analogous to a directory, a "dataset" is like a file. HDF5 also uses "attributes" to associate metadata with a particular group or dataset. HDF5 uses ASCII names for these different objects, and objects can be accessed by Unix-like pathnames, e.g., "/sample1/tempsensor/firsttrial" for a top-level group "sample1", a subgroup "tempsensor", and a dataset "firsttrial". For simple types (scalars, strings, and arrays), HDF5 provides sufficient metadata to know how each item is to be interpreted. ...
    Downloads: 0 This Week
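
The group/dataset/attribute layout described above can be modeled with a toy in-memory structure in Python. This sketch does not use HDF5, HDF5.jl, or any HDF5 binding; the classes `Group` and `Dataset` and their methods are hypothetical and exist only to show how Unix-like paths resolve through nested groups and where attributes attach.

```python
class Node:
    """Common base: every group or dataset can carry attributes (metadata)."""

    def __init__(self):
        self.attrs = {}


class Dataset(Node):
    """Leaf node holding data, analogous to a file."""

    def __init__(self, data):
        super().__init__()
        self.data = data


class Group(Node):
    """Container node, analogous to a directory."""

    def __init__(self):
        super().__init__()
        self.children = {}

    def create_group(self, name):
        # Reuse an existing child of that name, otherwise create a new group.
        return self.children.setdefault(name, Group())

    def create_dataset(self, name, data):
        dataset = Dataset(data)
        self.children[name] = dataset
        return dataset

    def __getitem__(self, path):
        # Resolve a Unix-like path one component at a time.
        node = self
        for part in path.strip("/").split("/"):
            if part:
                node = node.children[part]
        return node


# Mirror the example path from the description above.
root = Group()
sensor = root.create_group("sample1").create_group("tempsensor")
trial = sensor.create_dataset("firsttrial", [20.1, 20.4, 20.9])
trial.attrs["units"] = "celsius"   # hypothetical attribute for illustration

found = root["/sample1/tempsensor/firsttrial"]
print(found.data, found.attrs)
```

A real HDF5 file stores typed, on-disk arrays rather than Python lists; the sketch only captures the naming and metadata structure.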
  • 17
    kb

    A minimalist command line knowledge base manager

    ...It was created to solve the common problem of having scattered text files or reference materials on disk that are hard to search or categorize, and it surfaces a simple CLI interface with intuitive commands for adding, viewing, editing, and deleting knowledge items. Each entry in kb can be tagged, categorized, given metadata like author or status, and inspected with full-text search or regex-based grepping, helping users quickly find content even across large knowledge collections. While focused on text content, it also supports non-text artifacts such as PDFs and images, which can still be indexed and referenced, and it integrates with editors specified by the user’s $EDITOR environment variable to make detailed editing seamless.
    Downloads: 0 This Week
  • 18
    PDF Signature

    Free web software for signing PDFs and organizing pages

    Free web software for signing and organizing PDFs, editing their metadata, or compressing them.
    Downloads: 0 This Week
  • 19
    pointblank

    Data quality assessment and metadata reporting for data frames

    ...Sometimes, we want to maintain table information and update it when the table goes through changes. For that, we can use an informant object plus associated functions to help define the metadata entries and present them as a data dictionary.
    Downloads: 0 This Week
  • 20
    Semantic Type Detection

    Metadata/data identification Java library

    Metadata/data identification Java library. Identifies Base Type (e.g. Boolean, Double, Long, String, LocalDate, LocalTime, ...) and Semantic Type information (e.g. Gender, Age, Color, Country, ...). Extensive country/language support. Extensible via user-defined plugins. Comprehensive Profiling support. Large set of built-in Semantic Types (extensible via JSON defined plugins).
    Downloads: 0 This Week
  • 21
    GoldenCheetah

    Performance Software for Cyclists, Runners, Triathletes and Coaches

    ...Upload and Download with many cloud services including Strava, Withings, and Today's Plan. Import and export data to and from a wide range of bike computers and file formats. Track body measures, and equipment use and set your own metadata to track. GoldenCheetah provides tools for users to develop their own metrics, models, and charts. We believe that cyclists and triathletes should be able to download their power data to the computer of their choice, analyze it in whatever way they see fit, and share their methods of analysis with others.
    Downloads: 13 This Week
  • 22
    Cookiecutter

    A cross-platform command-line utility that creates projects

    Cookiecutter is a command-line utility to create projects from customizable and reusable templates. It helps bootstrap new projects with consistent structure, metadata, licensing, CI configs, and more—streamlining setup for a wide range of software projects.
    Downloads: 0 This Week
  • 23
    Labeler

    Label manager for PRs and Issues based on configurable conditions

    An advanced, all-in-one GitHub Action enabling dynamic labeling of both issues and PRs based on a variety of configurable rules—covering metadata like age, author, branch, file changes, draft status, and more. A flexible tool for automating repository hygiene.
    Downloads: 0 This Week
  • 24
    Krylov.jl

    A Julia Basket of Hand-Picked Krylov Methods

    If you use Krylov.jl in your work, please cite it using the metadata given in CITATION.cff.
    Downloads: 0 This Week
  • 25
    Yahoo! Finance market data downloader

    Yahoo! Finance market data downloader

    ...The latest version of yfinance is a complete rewrite of the library, offering a reliable method of downloading historical market data from Yahoo! Finance, at up to 1-minute granularity, in a more Pythonic way. The Ticker module lets you get market data and metadata for a security in a Pythonic way.
    Downloads: 7 This Week