Showing 888 open source projects for "data quality"

  • 1
    Cap

    Open source Loom alternative. Beautiful, shareable screen recordings

    ...Cap emphasizes user ownership and privacy by allowing recordings to be stored in custom S3 buckets, ensuring data security.
    Downloads: 8 This Week
  • 2
    Computer Science courses video lectures

    List of Computer Science courses with video lectures

    This repository is a curated list of full-length computer science video lecture series across many universities and MOOC platforms, helping learners assemble their own curriculum. The list spans foundational topics like algorithms, data structures, operating systems, computer networks, machine learning, and more, all delivered via lectures rather than just textual tutorials. The contributor guidelines encourage adding high-quality courses (not just casual tutorials) so the list remains academically oriented. Because it’s updated and community maintained, the collection grows with new offerings and helps learners evaluate what courses are available before starting. ...
    Downloads: 5 This Week
  • 3
    Cesium

    An open-source JavaScript library for world-class 3D globes and maps

    CesiumJS is an open source JavaScript library for creating world-class 3D globes and maps with the best possible performance, precision, visual quality, and ease of use. Developers across industries, from aerospace to smart cities to drones, use CesiumJS to create interactive web apps for sharing dynamic geospatial data. Built on open formats, CesiumJS is designed for robust interoperability and scaling for massive datasets. CesiumJS is released under the Apache 2.0 license and is free for both commercial and non-commercial use. ...
    Downloads: 5 This Week
  • 4
    Vanilla.PDF

    Cross-platform SDK for creating and modifying PDF documents

    Vanilla.PDF is a modern, high-performance, open-source C++17 SDK designed for creating, editing, signing, and analyzing PDF documents across multiple platforms. It requires no external runtime dependencies, making it lightweight and ideal for embedding into desktop applications, servers, or automation pipelines. The SDK offers full cross-platform support including Windows, Linux, macOS, and Android, with builds available for major compilers and architectures. Vanilla.PDF supports advanced...
    Downloads: 1 This Week
  • 5
    APIPark

    APIPark is the #1 open-source AI Gateway and Developer Portal

    ...API lifecycle management helps standardize the process of managing APIs, including traffic forwarding, load balancing, and managing different versions of publicly accessible APIs. This improves API quality and maintainability.
    Downloads: 0 This Week
  • 6
    refinery

    Open-source choice to scale, assess and maintain natural language data

    The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact. You are one of the people we've built refinery for. refinery helps you to build better NLP models in a data-centric approach. Semi-automate your labeling, find low-quality subsets in your training data, and monitor your data in one place. refinery doesn't get rid of manual labeling, but it makes sure that your valuable time is spent well. ...
    Downloads: 0 This Week
  • 7
    Awesome Privacy

    A curated list of privacy & security-focused software and services

    Awesome Privacy is a curated directory of privacy-respecting alternatives to mainstream apps and services, organized across many categories like browsers, search, email, messaging, cloud storage, and operating systems. It aims to help you choose tools that reduce tracking, fingerprinting, and data collection without sacrificing usability. Each entry highlights the project’s core properties—such as open source status, end-to-end encryption, and platform availability—so you can evaluate trade-offs quickly. Because product landscapes change fast, the list emphasizes ongoing maintenance and community discussion around quality and trust. ...
    Downloads: 0 This Week
  • 8
    Reflex

    Interactive programs without callbacks or side-effects

    Reflex apps automatically react to changing data. This keeps every interaction current, accurately representing the relationship between your data and the real world. Reflex components are modular and reusable. If your requirements change, your app can quickly and easily be reworked. The modularity of Reflex lets you iterate quickly, without wasting code. Reflex has been built to seamlessly support interfaces on desktop, mobile, web, and other platforms, all in Haskell. ...
    Downloads: 0 This Week
  • 9
    Amplication

    Amplication is an open‑source development tool

    Amplication is an open‑source development tool. It helps you develop quality Node.js applications without spending time on repetitive coding tasks. Easily create data models and configure role‑based access control with a simple and intuitive UI or CLI. Continuously push the generated application to your GitHub repository. Get a Docker container with your database, a Node.js application, and a React client.
    Downloads: 0 This Week
  • 10
    DINOv2

    PyTorch code and models for the DINOv2 self-supervised learning method

    DINOv2 is a self-supervised vision learning framework that produces strong, general-purpose image representations without using human labels. It builds on the DINO idea of student–teacher distillation and adapts it to modern Vision Transformer backbones with a carefully tuned recipe for data augmentation, optimization, and multi-crop training. The core promise is that a single pretrained backbone can transfer well to many downstream tasks—from linear probing on classification to retrieval, detection, and segmentation—often requiring little or no fine-tuning. The repository includes code for training, evaluating, and feature extraction, with utilities to run k-NN or linear evaluation baselines to assess representation quality.
    Downloads: 3 This Week
  • 11
    state-in-url

    Share complex React state between any components and sync to the URL

    Easily share complex state between unrelated React components, with IDE autocomplete and TypeScript validation, and without hassle or boilerplate. state-in-url provides simple state management with optional URL sync; it is TS-friendly and Next.js compatible. Most users don't care about URLs, so you can use them to store your app state.
    Downloads: 0 This Week
  • 12
    AWS MCP Servers

    Helping you get the most out of AWS, wherever you use MCP

    AWS MCP Servers are a collection of remotely hosted, fully-managed Model Context Protocol (MCP) servers by AWS, providing AI applications with real-time access to AWS documentation, API references, best practices, and infrastructure-management capabilities via natural-language workflows. An MCP Server is a lightweight program that exposes specific capabilities through the standardized Model Context Protocol. Host applications (such as chatbots, IDEs, and other AI tools) have MCP clients that...
    Downloads: 1 This Week
  • 13
    sqlite-utils

    Python CLI utility and library for manipulating SQLite databases

    ...The project also embraces an ecosystem of plugins, so you can add custom SQL functions, extra commands, or UIs (including a terminal UI) via separate packages. Because it’s designed by someone who uses SQLite heavily in real projects, the tool includes many small quality-of-life features.
    Downloads: 0 This Week
  • 14
    QAnything

    Question and Answer based on Anything

    QAnything is a local knowledge-base question-answering system designed to let users ask questions over many kinds of files and databases. It supports offline installation, making it useful for organizations that need private document analysis without sending data to external services. Users can upload local files and receive fast, reliable answers based on the indexed content. The system supports formats such as PDF, Word, PowerPoint, Excel, Markdown, email, text, images, CSV, and web links. Its retrieval process uses a two-stage vector and reranking approach to maintain answer quality as the knowledge base grows. ...
    Downloads: 0 This Week
  • 15
    Helicone

    Open source LLM-Observability Platform for Developers

    Open source LLM-Observability Platform for Developers. One-line integration for monitoring, metrics, evals, agent tracing, prompt management, playground, and more. Supports OpenAI SDK, Vercel AI SDK, Anthropic SDK, LiteLLM, LlamaIndex, LangChain, and others.
    Downloads: 0 This Week
  • 16
    MaxKB

    Open-source platform for building enterprise-grade agents

    MaxKB (Max Knowledge Brain) is an open-source platform for building enterprise-grade AI agents with strong knowledge retrieval, RAG pipelines, and workflow orchestration. It focuses on practical deployments such as customer support, internal knowledge bases, research assistants, and education, bundling tools for data ingestion, chunking, embedding, retrieval, and answer synthesis. The system exposes flexible tool-use (including MCP), supports multi-model backends, and provides dashboards for...
    Downloads: 2 This Week
  • 17
    PaperQA2

    High accuracy RAG for answering questions from scientific documents

    PaperQA2 is a package for doing high-accuracy retrieval augmented generation (RAG) on PDFs or text files, with a focus on the scientific literature. See our 2024 paper for examples of PaperQA2's superhuman performance in scientific tasks like question answering, summarization, and contradiction detection. In this example we take a folder of research paper PDFs, automatically get their metadata, including citation counts and a retraction check, then parse and cache PDFs into a...
    Downloads: 1 This Week
  • 18
    jimmer

    A revolutionary ORM framework for both Java and Kotlin.

    ...CDC solutions decoupled from specific caching technologies are transparent to business code. Requires no special prior knowledge - veterans of any ORM can quickly and painlessly migrate. The learning curve is scientifically friendly, serving as a high-quality reference for new learners of ORM usage.
    Downloads: 1 This Week
  • 19
    LLM Course

    Course to get into Large Language Models (LLMs)

    ...The materials also cover inference optimization and quantization to make serving LLMs feasible on commodity GPUs or even CPUs, which is crucial for side projects and startups. Evaluation is treated as a first-class topic, with examples of automatic and human-in-the-loop methods to catch regressions and verify quality beyond simple loss values. By the end, students have a mental model and a practical toolkit for iterating on datasets, training configs, etc.
    Downloads: 0 This Week
  • 20
    DeepSeek-V3.2-Exp

    An experimental version of DeepSeek model

    DeepSeek-V3.2-Exp is an experimental release of the DeepSeek model family, intended as a stepping stone toward the next generation architecture. The key innovation in this version is DeepSeek Sparse Attention (DSA), a sparse attention mechanism that aims to optimize training and inference efficiency in long-context settings without degrading output quality. According to the authors, they aligned the training setup of V3.2-Exp with V3.1-Terminus so that benchmark results remain largely...
    Downloads: 3 This Week
  • 21
    MedicalGPT

    MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training

    MedicalGPT trains large medical GPT models with a ChatGPT training pipeline, implementing secondary pre-training, supervised fine-tuning, reward modeling, and reinforcement learning.
    Downloads: 0 This Week
  • 22
    The Open Source Computer Science Degree

    Video discussing this curriculum

    ...It is widely used by self-learners and developers seeking to strengthen foundational knowledge. Overall, it democratizes access to high-quality computer science education.
    Downloads: 0 This Week
  • 23
    Spiral Framework

    High-Performance PHP Framework for large scale applications

    Born out of real-world software development projects, Spiral Framework is a modern PHP framework designed to power faster, cleaner, superior software development. Due to its design and sophisticated application server, Spiral Framework will execute your code up to 10 times faster than Laravel or Symfony without compromising code quality or compatibility with commonly-used libraries. Spiral Framework provides all the tools you need to write secure applications with embedded encryption, CSRF...
    Downloads: 0 This Week
  • 24
    Deepkit Framework

    A new full-featured and high-performance TypeScript framework

    A new full-featured and high-performance TypeScript framework for enterprise applications. High-quality TypeScript libraries and next-gen backend framework. Leverage TypeScript types to the fullest, in completely new ways. A runtime type system for JavaScript, powered by TypeScript types. Deepkit's type compiler makes it possible, for the first time, to use the whole TypeScript type system in any JavaScript runtime, enabling completely new ways of writing data-driven applications. ...
    Downloads: 0 This Week
  • 25
    TypeORM

    ORM for TypeScript and JavaScript (ES7, ES6, ES5)

    ...Its goal is to always support the latest JavaScript features and provide additional features that help you develop any kind of application that uses databases, from small applications with a few tables to large-scale enterprise applications with multiple databases. TypeORM supports both the Active Record and Data Mapper patterns (your choice), unlike all other JavaScript ORMs currently in existence, which means you can write high-quality, loosely coupled, scalable, maintainable applications in the most productive way. TypeORM is highly influenced by other ORMs, such as Hibernate, Doctrine and Entity Framework. ...
    Downloads: 0 This Week