Showing 12 open source projects for "data quality"

  • 1
    Synthetic Data Kit

    Tool for generating high-quality synthetic datasets

    Synthetic Data Kit is a CLI-centric toolkit for generating high-quality synthetic datasets to fine-tune Llama models, with an emphasis on producing reasoning traces and QA pairs that line up with modern instruction-tuning formats. It ships an opinionated, modular workflow that covers ingesting heterogeneous sources (documents, transcripts), prompting models to create labeled examples, and exporting to fine-tuning schemas with minimal glue code.
    Downloads: 0 This Week
  • 2
    Synthetic Data Vault (SDV)

    Synthetic Data Generation for tabular, relational and time series data

    The Synthetic Data Vault (SDV) is a synthetic data generation ecosystem of libraries that allows users to easily learn single-table, multi-table and time series datasets and later generate new synthetic data that has the same format and statistical properties as the original dataset. Synthetic data can then be used to supplement, augment and in some cases replace real data when training machine learning models. Additionally, it enables the testing of machine learning or other data dependent...
    Downloads: 0 This Week
  • 3
    Gretel Synthetics

    Synthetic data generators for structured and unstructured text

    Unlock unlimited possibilities with synthetic data. Share, create, and augment data with cutting-edge generative AI. Generate unlimited data in minutes with synthetic data delivered as-a-service. Synthesize data that are as good or better than your original dataset, and maintain relationships and statistical insights. Customize privacy settings so that data is always safe while remaining useful for downstream workflows. Ensure data accuracy and privacy confidently with expert-grade reports....
    Downloads: 0 This Week
  • 4
    SDGym

    Benchmarking synthetic data generation methods

    ...You can also customize the process to include your own work. Select any of the publicly available datasets from the SDV project, or input your own data. Choose from any of the SDV synthesizers and baselines, or write your own custom machine learning model. In addition to performance and memory usage, you can also measure synthetic data quality and privacy through a variety of metrics. Install SDGym using pip or conda. We recommend using a virtual environment to avoid conflicts with other software on your device.
    Downloads: 0 This Week
  • 5
    Synthea Patient Generator

    Synthetic Patient Population Simulator

    Synthea™ is an open-source, synthetic patient generator that models the medical history of synthetic patients. Our mission is to provide high-quality, synthetic, realistic but not real, patient data and associated health records covering every aspect of healthcare. The resulting data is free from cost, privacy, and security restrictions, enabling research with Health IT data that is otherwise legally or practically unavailable. The models used to generate synthetic patients are informed by numerous academic publications. ...
    Downloads: 1 This Week
  • 6
    DBFeeder

    Highly Customizable Test Data Generator

    DBFeeder is a great tool for generating synthetic test data for Oracle databases, and it is ideal for companies that want to outsource development. Thanks to its original approach, the data is highly customizable and even satisfies the primary and foreign key constraints of tables.
    Downloads: 0 This Week
  • 7
    benerator is a framework for creating realistic and valid high-volume test data, used for load and performance testing and showcase setup. Data is generated from an easily configurable metadata model and exported to databases, XML, CSV or flat files.
    Downloads: 1 This Week
  • 8
    Ava: Testdata Xsl

    Generates test data based on Excel: creates XML, Excel, CSV, HTML, SQL, and more

    This test data generation tool takes an Excel sheet as its primary input. The second important parameter is the number of test records to produce. The Excel data is reused for as long as data is needed. The tool is highly parameterizable through XSL scripts. Data can be created, updated, modified, and finally exported in a format of your choice. Main functions: (1) Generates test data (Excel, XSL, XML) (2) Exports generated test data in multiple formats (CSV, Excel, HTML,...
    Downloads: 0 This Week
  • 9

    A Data Generator

    A tool to generate synthetic test data useful for record matchers

    With the growing amount of information from multiple sources, it has become very hard to relate information to the correct real-life entities. Record matching software tries to solve this with machine learning techniques. To do this effectively, it is necessary to train the record matcher with proper test data that is identical to real-life data. Hence, there is a need for a data generator to create the synthetic data used for evaluating the quality and capability of record matching software. A data generator creates quality test data that accounts for the various real-life data glitches introduced through means such as human data entry, voice dictation, and data scanning. ...
    Downloads: 0 This Week
  • 10
    DATA Gen™

    DATA Gen™ - Test Data Generator to generate realistic test data.

    DATA Gen™ Test Data Generator offers facilities to automate the task of creating test data for new or existing databases. It helps lower the programming effort required, while reducing manual test data generation errors and the ripple effect that they cause on production systems, users and maintenance.
    Downloads: 0 This Week
  • 11
    JRandO is a test data generator, or more precisely a test object generator framework. It can be used in JUnit tests or in performance tests (e.g. using JMeter). It may also be useful for anonymizing data or in a simulation environment.
    Downloads: 0 This Week
  • 12
    Sample code for JRandO project. (testdata generator, test data generator, test object generator, simulation)
    Downloads: 0 This Week