Showing 129 open source projects for "metadata"

  • 1
    Aim

    An easy-to-use & supercharged open-source experiment tracker

    Aim logs all your AI metadata (experiments, prompts, etc) enabling a UI to compare & observe them and SDK to query them programmatically. The Aim standard package comes with all integrations. If you'd like to modify the integration and make it custom, create a new integration package and share with others. Aim is an open-source, self-hosted AI Metadata tracking tool designed to handle 100,000s of tracked metadata sequences.
    Downloads: 2 This Week
  • 2
    Khazix Skills

    Digital Life Kazik Open Source AI Skills Collection

    Khazix Skills project is an automation framework designed to transform GitHub repositories into structured, reusable AI agent skills. It acts as a pipeline that analyzes a repository’s metadata, extracts relevant information such as README content and commit hashes, and converts it into a standardized skill format that can be integrated into agent ecosystems. The system emphasizes lifecycle management by embedding versioning, traceability, and metadata directly into generated skill files, allowing future updates and synchronization with the original repository. ...
    Downloads: 4 This Week
  • 3
    MCP Filesystem Server

    Go server implementing Model Context Protocol (MCP) for filesystem

    Filesystem MCP Server is a Go-based server implementing the Model Context Protocol (MCP) for filesystem operations. It allows for various file and directory manipulations, including reading, writing, moving, and searching files, as well as retrieving file metadata.
    Downloads: 5 This Week
  • 4
    Docspell

    Assist in organizing your piles of documents

    ...This makes adding metadata to your documents a lot easier. For machine learning, it relies on the free (GPL) Stanford Core NLP library.
    Downloads: 0 This Week
  • 5
    mlx

    MLX: An array framework for Apple silicon

    MLX offers a local web interface to browse, download, and run ML models via Hugging Face or local sources. It supports searching by tags or tasks, visualization of model metadata, quick inference demos, automatic setup of runtime environments, and works with PyTorch, TensorFlow, and ONNX. Ideal for researchers exploring and testing models via browser.
    Downloads: 4 This Week
  • 6
    ebook2audiobook

    Generate audiobooks from e-books, voice cloning & 1107+ languages

    ebook2audiobook is a tool to convert legally obtained eBooks (non-DRM) into fully narrated audiobooks, complete with chapters and metadata. It automates the pipeline: it reads the eBook file, splits it into appropriate segments (chapters, paragraphs), uses text-to-speech (TTS) models to synthesize audio, optionally applies voice cloning, and outputs a final audiobook — ideal for people who prefer listening over reading, or for accessibility purposes. The tool supports a wide array of underlying TTS backends (XTTSv2, Bark, VITS, Fairseq, Tacotron2, YourTTS and more), which gives flexibility depending on hardware availability, voice preference, and language. ...
    Downloads: 18 This Week
  • 7
    Claude Skills

    Public repository for Agent Skills

    ...Rather than relying on handcrafted prompts every time, Skills teach an AI agent procedural knowledge and task-specific workflows so it can apply that expertise reliably, whether the task involves document creation, data analysis, design generation, or technical automation. Each Skill lives in its own directory with a SKILL.md file containing metadata and instructions, and can include supplemental scripts or assets that the agent uses to perform complex operations when relevant.
    Downloads: 111 This Week
  • 8
    Metarank

    A low code Machine Learning service that personalizes articles

    ...It’s often considered "too risky" to spend 6+ months on an in-house moonshot project to reinvent the wheel without an experienced team or existing open-source tools. Metarank makes personalization easy not only for Amazon but for everyone else. Ingest historical item listings, clicks and item metadata in our simple JSON format so Metarank can find hidden dependencies in the data. No Machine Learning experience is required: run our CLI tool with a set of features in a YAML configuration. Run the Metarank API service, feed it real-time events and receive a personalized ranking for your items that will boost conversion, click-through rate or any other business-critical metric you define.
    Downloads: 0 This Week
  • 9
    Papermerge

    Open Source Document Management System for Digital Archives

    ...Instead of having piles of paper documents all over your desk, office or drawers - you can quickly scan them and configure your scanner to directly upload to Papermerge DMS. Store, organize and index scanned documents in PDF, JPEG and TIFF formats. Instantly find relevant information using full text, tags and metadata-based search. Papermerge is free and open-source software which means that transparency is the core value of our software development. Source code can be reviewed and improved by anyone from anywhere. Papermerge supports multiple users. Each user can be assigned different permissions to perform only a specific kind of action e.g. view only documents from a specific folder. ...
    Downloads: 17 This Week
  • 10
    TFX

    TFX is an end-to-end platform for deploying production ML pipelines

    ...TFX pipelines can be orchestrated using Apache Airflow and Kubeflow Pipelines. Both the components themselves and the integrations with orchestration systems can be extended. TFX components interact with an ML Metadata backend that keeps a record of component runs, input and output artifacts, and runtime configuration. This metadata backend enables advanced functionality like experiment tracking or warm starting/resuming ML models from previous runs.
    Downloads: 0 This Week
  • 11
    Gollama

    Go manage your Ollama models

    ...The project is aimed at developers and local AI users who frequently work with multiple Ollama models and want a more efficient operational layer for everyday maintenance. Beyond standard model management, Gollama can display metadata such as size, quantization level, model family, and modification date, which helps users compare models quickly. One of its more distinctive capabilities is a VRAM estimation system that can calculate memory requirements, estimate context limits, and help users choose quantization settings that fit available hardware.
    Downloads: 2 This Week
  • 12
    Open Semantic Search

    Open source semantic search and text analytics for large document sets

    ...Open Semantic Search includes an ETL framework that can ingest documents, process them through analysis steps, and enrich the data with extracted information such as named entities and metadata. It also supports optical character recognition to extract text from images and scanned documents, including images embedded inside PDF files. It integrates text mining and analytics capabilities that allow users to examine relationships, topics, and structured data within document collections.
    Downloads: 5 This Week
  • 13
    OpenAI Agent Skills

    Skills Catalog for Codex

    ...It organizes reusable, task-specific workflows, instructions, scripts, and resources into modular skill folders so that an AI agent can reliably perform complex tasks without repeated custom prompting, making agent behavior more predictable and composable. Each skill is defined with clear metadata and instructions organizing how an AI assistant should complete specific tasks ranging from project management to code generation and documentation assistance. The repository supports community contributions, allowing developers to add new skills or update existing ones to keep the catalog relevant and practical for evolving use cases.
    Downloads: 3 This Week
  • 14
    Framelink MCP for Figma

    MCP server enabling AI coding tools to access Figma design data

    ...It allows coding assistants to retrieve structured information from Figma files so they can better translate visual designs into working code. Instead of relying on screenshots or manual descriptions, Figma-Context-MCP accesses layout, styling, and component metadata directly from the Figma API and presents it in a simplified format optimized for AI models. This transformation reduces unnecessary metadata and focuses on the most relevant design attributes, helping AI coding agents produce more accurate UI implementations. Developers can integrate the server with compatible tools such as AI-assisted IDE environments that support MCP-based integrations. ...
    Downloads: 0 This Week
  • 15
    tika-python

    Python binding to the Apache Tika™ REST services

    A Python port of the Apache Tika library that makes Tika available using the Tika REST Server. This makes Apache Tika available as a Python library, installable via Setuptools, Pip and Easy Install. To use this library, you need to have Java 7+ installed on your system, as tika-python starts up the Tika REST server in the background. To get this working in a disconnected environment, download a tika server file (both tika-server.jar and tika-server.jar.md5, which can be found here) and set...
    Downloads: 0 This Week
  • 16
    MCP Server Giphy

    An implementation of Giphy integration with Model Context Protocol

    The MCP Server Giphy is a Model Context Protocol (MCP) server that enables AI models to search, retrieve, and utilize GIFs from the Giphy platform. It facilitates seamless integration of Giphy's vast GIF library into AI applications, enhancing their expressive capabilities.
    Downloads: 0 This Week
  • 17
    MinerU

    A high-quality tool for converting PDF to Markdown and JSON

    MinerU is an open-source, high-quality document extraction toolkit focused on converting PDFs (and other document formats) into structured Markdown and JSON. It leverages OCR and layout analysis to preserve semantic structure and metadata, ideal for research and data science workflows.
    Downloads: 15 This Week
  • 18
    Apache Hamilton

    Helps data scientists define testable self-documenting dataflows

    ...This approach encourages modular, testable, and maintainable data pipelines because each transformation is isolated and easily unit tested. The framework also automatically tracks lineage and metadata about how data is produced, which improves debugging, reproducibility, and transparency in data workflows.
    Downloads: 1 This Week
  • 19
    OpenMLSys-ZH

    Machine Learning Systems: Design and Implementation

    ...It helps bridge language barriers in open machine learning systems by providing side-by-side translation or localized explanations. The repository includes scripts or tooling to keep translation synchronized with upstream changes, versioning, and possibly translation metadata (contributors, timestamp). Users can browse or clone the translated documentation to follow along with the original content, deploy examples, or understand system internals in their preferred language.
    Downloads: 0 This Week
  • 20
    Embedding Atlas

    Tool that provides interactive visualizations for large embeddings

    Embedding Atlas is an open-source tool by Apple that provides scalable, interactive visualizations for large embedding datasets. It enables users to visualize, cross-filter, and search through embeddings alongside rich metadata, all in real time using modern web-based technologies. In addition to the command line tool, Embedding Atlas is available as a Jupyter widget, and its components are available in an npm package. Order-independent transparency ensures accurate rendering despite overlapping points.
    Downloads: 0 This Week
  • 21
    AI Deadlines

    AI conference deadline countdowns

    ...The repository powers a website that displays countdown timers and structured information for top research conferences across subfields such as computer vision, natural language processing, machine learning, and robotics. The project maintains a curated dataset of conferences that includes metadata such as submission deadlines, abstract deadlines, event dates, conference locations, and related information. Researchers and students use the platform to plan their paper submissions and manage academic schedules without manually tracking multiple conference announcements. The repository includes configuration files and data sources that allow contributors to add or update conferences through pull requests, enabling community-driven maintenance.
    Downloads: 0 This Week
  • 22
    Extractous

    Fast and efficient unstructured data extraction

    Extractous is a Rust-based unstructured data extraction library focused on fast local parsing of documents and other content-heavy files. Its purpose is to extract text and metadata efficiently from formats such as PDF, Word, HTML, email archives, images, and more, without depending on external APIs or separate parsing servers. The project emphasizes performance and low memory usage, and its maintainers describe it as a local-first alternative to heavier extraction stacks. For broader format support, the system combines its Rust core with ahead-of-time compiled Apache Tika shared libraries, which allows it to extend parsing coverage while still avoiding traditional server-based overhead. ...
    Downloads: 0 This Week
  • 23
    Stable Diffusion Web UI Extensions

    Extension index for stable-diffusion-webui

    This repository serves as the official index used by the Stable Diffusion Web UI to discover and install extensions. It aggregates metadata for hundreds of community plugins—image utilities, ControlNet tools, upscalers, prompt helpers, animation suites—so users can browse and add capabilities directly from the UI. The index maintains short descriptions, tags, and repository links, enabling quick filtering by purpose or workflow. It also standardizes submission format so extension authors can contribute entries that the Web UI can parse reliably. ...
    Downloads: 0 This Week
  • 24
    Shadcn UI v4 MCP Server

    An MCP server that lets LLMs gain context about shadcn/ui components

    Shadcn UI v4 MCP Server is a Model Context Protocol server that enables AI assistants to access, retrieve, and utilize shadcn/ui component libraries within development workflows, effectively bridging UI component systems with AI-driven coding tools. It provides structured access to component source code, demos, metadata, and reusable UI blocks, allowing AI agents to generate accurate and production-ready interface implementations. The server supports multiple frontend frameworks including React, Svelte, Vue, and React Native, making it highly versatile for cross-platform development. It includes smart caching and efficient GitHub API usage to optimize performance and handle rate limits during component retrieval. ...
    Downloads: 0 This Week
  • 25
    tldw Server

    Your Personal Research Multi-Tool

    ...The name “tldw” reflects the phrase “too long; didn’t watch,” which refers to tools that condense lengthy videos, articles, or documents into concise summaries. The server component typically acts as the core infrastructure that manages summaries, metadata, and retrieval operations for client applications or user interfaces. In practical deployments, a system like this can support AI-powered summarization pipelines that process transcripts, articles, or other long-form material and store condensed versions for easier consumption. The mirrored project hosted on SourceForge exists to preserve the availability of the code and provide an alternative download location for developers and researchers. ...
    Downloads: 0 This Week