Showing 888 open source projects for "data quality"

  • 1
    LosslessCut

    The Swiss Army knife of lossless video/audio editing

    ...The main feature is lossless trimming and cutting of video and audio files, which is great for saving space by rough-cutting large video files taken from a video camera, GoPro, drone, etc. It lets you quickly extract the good parts from your videos and discard many gigabytes of data without doing a slow re-encode and thereby losing quality. You can also add a music or subtitle track to your video without needing to encode. Everything is extremely fast because it does an almost direct data copy, fueled by FFmpeg, which does all the grunt work. Lossless merge/concatenation of arbitrary files (with identical codec parameters, e.g. from the same camera). ...
    Downloads: 321 This Week
  • 2
    Seurat

    R toolkit for single cell genomics

    Seurat is a comprehensive R toolkit for single-cell genomics analysis, introduced by the Satija Lab at NYGC. It supports quality control, normalization, clustering, integration of multimodal data (e.g., scRNA‑seq, spatial, CITE‑seq), and visualization. Seurat v5 introduces scalable workflows and spatial transcriptomics support, commonly used in academic and industry research for single-cell studies.
    Downloads: 2 This Week
  • 3
    ggpubr

    'ggplot2' Based Publication Ready Plots

    ggpubr is an R package that provides easy-to-use wrapper functions around ggplot2 to create publication-ready visualizations with minimal code. It streamlines plot creation for researchers and analysts, allowing features such as statistical annotation, theme customization, and plot arrangement with fewer lines of code.
    Downloads: 0 This Week
  • 4
    ALVR - Air Light VR

    Stream VR games from your PC to your headset via Wi-Fi

    ...It allows users to run PC-based VR applications while using devices such as standalone headsets, effectively bridging the gap between high-performance desktop VR and portable hardware. The system works by encoding video output from the PC, streaming it over Wi-Fi, and decoding it on the headset in real time, while also transmitting input data such as head tracking and controller movements back to the PC. ALVR supports low-latency streaming through optimized encoding techniques and network configurations, making it suitable for interactive VR experiences. It also includes customizable settings for bitrate, resolution, and performance tuning, allowing users to balance quality and responsiveness. ...
    Downloads: 108 This Week
  • 5
    RenderCV

    LaTeX CV generator from a YAML/JSON input file

    RenderCV is a LaTeX CV/resume framework. It allows you to create a high-quality CV as a PDF from a YAML file with full Markdown syntax support and complete control over the LaTeX code. RenderCV offers built-in LaTeX and Markdown templates ready to produce high-quality CVs. However, the templates are entirely arbitrary and can easily be updated to leverage RenderCV's capabilities with your custom CV themes.
    Downloads: 0 This Week
  • 6
    AWS IoT FleetWise Edge

    AWS IoT FleetWise Edge Agent

    ...Improve electric vehicle (EV) battery range estimates with crowdsourced environmental data, such as weather and driving conditions, from nearby vehicles. Collect select data from nearby vehicles and use it to notify drivers of changing road conditions, such as lane closures or construction. Use near real-time data to proactively detect and mitigate fleet-wide quality issues.
    Downloads: 0 This Week
  • 7
    Compose.jl

    Declarative vector graphics

    Compose is a declarative vector graphics library written in Julia. It is designed to simplify the creation of complex graphics and serves as the basis of Gadfly, the statistical graphics and data visualization package.
    Downloads: 0 This Week
  • 8
    PDFCraft

    PDFCraft is a free, privacy-focused PDF toolkit

    ...But beyond manual editing, it also offers a programmable layer so developers can write scripts to batch process documents, generate templated reports, or extract structured data from PDFs for integration in workflows. The design emphasizes quality and compatibility: output PDFs render accurately across readers, preserve metadata, and support interactive elements like hyperlinks and form fields.
    Downloads: 21 This Week
  • 9
    JSON for Modern C++

    JSON that's part of C++

    ...While there may be dozens of JSON libraries out there, JSON for C++ stands out with a focus on three things: an intuitive syntax, trivial integration and serious testing. Using the operator magic of modern C++, this library makes JSON feel like a first class data type. With trivial integration, the entire code is made up of a single header file json.hpp, no dependencies, no complex build system required. It's been heavily unit-tested covering 100% of the code, and follows the Core Infrastructure Initiative (CII) best practices to ensure the highest quality at all times. Among its many features are JSON pointers, JSON patches, Iterators, SAX parsing and various container operations.
    Downloads: 13 This Week
  • 10
    Best-of Python

    A ranked list of awesome Python open-source libraries

    ...Correctly generate plurals, ordinals, indefinite articles; convert numbers. Libraries for loading, collecting, and extracting data from a variety of data sources and formats. Libraries for data batch- and stream-processing, workflow automation, job scheduling, and other data pipeline tasks.
    Downloads: 1 This Week
  • 11
    Matplot++

    Matplot++: A C++ Graphics Library for Data Visualization

    Data visualization can help programmers and scientists identify trends in their data and efficiently communicate these results to their peers. Modern C++ is being used for a variety of scientific applications, and this environment can benefit considerably from graphics libraries that address the typical design goals of scientific data visualization.
    Downloads: 3 This Week
  • 12
    LuxTTS

    A high-quality rapid TTS voice cloning model

    LuxTTS is an open-source text-to-speech (TTS) system focused on delivering high-quality, rapid voice synthesis and voice cloning that runs extremely fast and efficiently on consumer hardware. It implements a lightweight architecture based on ZipVoice and optimized sampling techniques so that it can generate speech at speeds up to roughly 150 times real-time on a single GPU and faster than real-time on CPU, all while producing audio at high fidelity with 48 kHz quality. ...
    Downloads: 4 This Week
  • 13
    LaTeX2e Kernel Code Repository
    LaTeX is a high-quality typesetting system; it includes features designed for the production of technical and scientific documentation. LaTeX is the de facto standard for the communication and publication of scientific documents. LaTeX is available as free software.
    Downloads: 1 This Week
  • 14
    Matplotlib

    matplotlib: plotting with Python

    Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. Matplotlib makes easy things easy and hard things possible. Matplotlib ships with several add-on toolkits, including 3D plotting with mplot3d, axes helpers in axes_grid1 and axis helpers in axisartist. A large number of third party packages extend and build on Matplotlib functionality, including several higher-level plotting interfaces (seaborn, HoloViews, ggplot, ...), and a...
    Downloads: 9 This Week
  • 15
    Laminar

    Open-source all-in-one platform for engineering AI products

    Laminar is an open source all-in-one platform for engineering best-in-class LLM products. Data governs the quality of your LLM application. Laminar helps you collect it, understand it, and use it. When you trace your LLM application, you get a clear picture of every step of execution and simultaneously collect invaluable data. You can use it to set up better evaluations, as dynamic few-shot examples, and for fine-tuning. All traces are sent in the background via gRPC with minimal overhead. ...
    Downloads: 27 This Week
  • 16
    Another Redis Desktop Manager

    A faster, better and more stable Redis desktop manager

    Quality-of-life features include JSON viewers, search and filter tools, favorite connections, and dark mode. For everyday operations and troubleshooting, it offers a friendlier alternative to the command line without hiding Redis’s power.
    Downloads: 64 This Week
  • 17
    BestBlogs

    A collection of top programming

    BestBlogs is an open-source project designed to aggregate, organize, and surface high-quality blog content from across the web, helping users discover valuable articles in a structured and accessible way. The platform focuses on curating content based on relevance, quality, and usefulness rather than simply indexing large volumes of information, making it particularly useful for developers, researchers, and knowledge seekers.
    Downloads: 0 This Week
  • 18
    AntV Infographic

    Declarative engine for generating AI-powered infographic visuals

    AntV Infographic is a declarative infographic generation and rendering framework designed to transform structured data into visually rich infographic outputs. It provides a custom domain-specific language that allows developers and AI systems to describe infographic layouts in a concise and human-readable syntax. It focuses on simplifying data storytelling by enabling fast creation of professional-quality visuals without requiring complex design workflows.
    Downloads: 0 This Week
  • 19
    Synthea Patient Generator

    Synthetic Patient Population Simulator

    Synthea™ is an open-source, synthetic patient generator that models the medical history of synthetic patients. Our mission is to provide high-quality, synthetic, realistic but not real, patient data and associated health records covering every aspect of healthcare. The resulting data is free from cost, privacy, and security restrictions, enabling research with Health IT data that is otherwise legally or practically unavailable. The models used to generate synthetic patients are informed by numerous academic publications. ...
    Downloads: 2 This Week
  • 20
    DataProfiler

    Extract schema, statistics and entities from datasets

    DataProfiler is an AI-powered tool for automatic data analysis and profiling, designed to detect patterns, anomalies, and schema inconsistencies in structured and unstructured datasets. It is a Python library designed to make data analysis, monitoring, and sensitive data detection easy. Loading data: with a single command, the library automatically formats and loads files into a DataFrame. Profiling the data: the library identifies the schema, statistics, entities (PII / NPI), and...
    Downloads: 0 This Week
  • 21
    Toloka-Kit

    Toloka-Kit is a Python library for working with Toloka API

    ...There’s no need to validate JSON files and work with them directly. Supports both synchronous and asynchronous (via async/await) execution. Streaming support: build complex pipelines that send and receive data in real time. For example, you can pass data between two related projects: one for data labeling, and another for its validation. The AutoQuality feature automatically finds the best-fitting quality control rules for your project.
    Downloads: 0 This Week
  • 22
    Apache InLong

    Apache InLong - a one-stop integration framework for massive data

    ...InLong was originally built at Tencent, where it has served online businesses for more than 8 years, supporting massive data reporting services (more than 80 trillion pieces of data per day) in big data scenarios. The platform integrates 5 modules: Ingestion, Convergence, Caching, Sorting, and Management, so that the business only needs to provide data sources, data service quality, data landing clusters, and data landing formats.
    Downloads: 0 This Week
  • 23
    Elide

    Elide is a Java library that lets you stand up a GraphQL/JSON-API

    ...Make instances of your new model accessible through a top-level collection or restrict access only through relationships to other models. And that's it: you are ready to deploy and query your data with JSON or GraphQL requests. Quickly build and deploy production-quality web services that expose your data as a service. Elide APIs support complex filtering rules, sorting, pagination, subscriptions, and text search. Controlling access to your data is as simple as defining your rules and annotating your models.
    Downloads: 0 This Week
  • 24
    Union Pandera

    Light-weight, flexible, expressive statistical data testing library

    ...Validate the functions that produce your data by automatically generating test cases for them. Integrate seamlessly with the Python ecosystem. Overcome the initial hurdle of defining a schema by inferring one from clean data, then refine it over time. Identify the critical points in your data pipeline, and validate data going in and out of them. Build confidence in the quality of your data by defining schemas for complex data objects.
    Downloads: 0 This Week
  • 25
    AutoViz

    Automatically Visualize any dataset, any size

    AutoViz is a Python data visualization library designed to automate exploratory data analysis by generating multiple visualizations with minimal code. The primary goal of the project is to help data scientists and analysts quickly understand patterns, relationships, and anomalies within datasets without manually writing complex plotting code. With a single command, the library can automatically generate dozens of charts and graphs that reveal insights into the structure and quality of the data.
    Downloads: 0 This Week