Showing 10 open source projects for "format raw"

View related business solutions
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build generative AI apps with Vertex AI. Switch between models without switching platforms.
    Start Free
  • AI-generated apps that pass security review Icon
    AI-generated apps that pass security review

    Stop waiting on engineering. Build production-ready internal tools with AI—on your company data, in your cloud.

    Retool lets you generate dashboards, admin panels, and workflows directly on your data. Type something like “Build me a revenue dashboard on my Stripe data” and get a working app with security, permissions, and compliance built in from day one. Whether on our cloud or self-hosted, create the internal software your team needs without compromising enterprise standards or control.
    Try Retool free
  • 1
    OpenHealth

    OpenHealth

    AI health assistant for private, local data-driven insights mgmt

    ...OpenHealth also includes a data parsing layer that transforms raw medical inputs into structured datasets, making them usable for analysis and AI-driven insights. OpenHealth separates data ingestion, processing, and AI interaction, enabling flexibility in integrating different models and data sources.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 2
    GROBID

    GROBID

    A machine learning software for extracting information

    ...Header extraction and parsing from article in PDF format. The extraction here covers the usual bibliographical information (e.g. title, abstract, authors, affiliations, keywords, etc.). References extraction and parsing from articles in PDF format, around .87 F1-score against on an independent PubMed Central set of 1943 PDF containing 90,125 references, and around .89 on a similar bioRxiv set of 2000 PDF (using the Deep Learning citation model).
    Downloads: 2 This Week
    Last Update:
    See Project
  • 3
    Framelink MCP for Figma

    Framelink MCP for Figma

    MCP server enabling AI coding tools to access Figma design data

    Figma-Context-MCP is an open source server that connects Figma design data with AI-powered coding tools through the Model Context Protocol (MCP). It allows coding assistants to retrieve structured information from Figma files so they can better translate visual designs into working code. Instead of relying on screenshots or manual descriptions, Figma-Context-MCP accesses layout, styling, and component metadata directly from the Figma API and presents it in a simplified format optimized for...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 4
    Deep Lake

    Deep Lake

    Data Lake for Deep Learning. Build, manage, and query datasets

    Deep Lake (formerly known as Activeloop Hub) is a data lake for deep learning applications. Our open-source dataset format is optimized for rapid streaming and querying of data while training models at scale, and it includes a simple API for creating, storing, and collaborating on AI datasets of any size. It can be deployed locally or in the cloud, and it enables you to store all of your data in one place, ranging from simple annotations to large videos.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Forever Free Full-Stack Observability | Grafana Cloud Icon
    Forever Free Full-Stack Observability | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 5
    BIG-bench

    BIG-bench

    Beyond the Imitation Game collaborative benchmark for measuring

    ...Tasks are intentionally heterogeneous: some are multiple-choice with exact scoring, others are free-form generation judged by model-based or human evaluation. The suite provides a common JSON task format and an evaluation harness so research groups can contribute new tasks and reproduce results consistently. It emphasizes robustness analysis—looking at scale trends, calibration, and areas where models systematically fail—to guide model development beyond raw accuracy. BIG-bench is as much a community process as a dataset, encouraging open sharing of tasks and findings to keep evaluations fresh and comprehensive.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    fe4ml-zh

    fe4ml-zh

    Feature Engineering for Machine Learning

    fe4ml-zh is an open-source project that provides a Chinese translation and structured documentation of the book Feature Engineering for Machine Learning. The repository aims to make advanced feature engineering concepts accessible to a broader audience by translating the content and organizing it into readable documentation and code examples. Feature engineering is a critical component of machine learning pipelines because it determines how raw data is transformed into features that...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Piano transcription

    Piano transcription

    Task of transcribing piano recordings into MIDI files

    Piano transcription is an open-source high-resolution piano transcription system by ByteDance that converts raw audio recordings of piano performance into symbolic MIDI files — detecting note onsets, offsets, pitch, velocity, and even pedal usage. The system is implemented in Python (PyTorch) and is capable of accurate transcription of polyphonic piano recordings, even with complex passages and pedal techniques, making it suitable for classical piano music. By using this transcription tool,...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 8
    Image GPT

    Image GPT

    Large-scale autoregressive pixel model for image generation by OpenAI

    Image-GPT is the official research code and models from OpenAI’s paper Generative Pretraining from Pixels. The project adapts GPT-2 to the image domain, showing that the same transformer architecture can model sequences of pixels without altering its fundamental structure. It provides scripts to download pretrained checkpoints of different model sizes (small, medium, large) trained on large-scale datasets and includes utilities for handling color quantization with a 9-bit palette....
    Downloads: 3 This Week
    Last Update:
    See Project
  • 9
    WikiSQL

    WikiSQL

    A large annotated semantic parsing corpus for developing NL interfaces

    A large crowd-sourced dataset for developing natural language interfaces for relational databases. WikiSQL is the dataset released along with our work Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning. Regarding tokenization and Stanza, when WikiSQL was written 3-years ago, it relied on Stanza, a CoreNLP python wrapper that has since been deprecated. If you'd still like to use the tokenizer, please use the docker image. We do not anticipate switching...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Fully Managed MySQL, PostgreSQL, and SQL Server Icon
    Fully Managed MySQL, PostgreSQL, and SQL Server

    Automatic backups, patching, replication, and failover. Focus on your app, not your database.

    Cloud SQL handles your database ops end to end, so you can focus on your app.
    Try Free
  • 10
    Optex Analyzer is a software to analyze and compare algorithms to solve approximately optimization problems. It has a GUI that allows select a set of input files containing raw algorithm results. The analysis is shown with tables and charts.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB