Showing 16 open source projects for "data quality"

  • 1
    Arize Phoenix

    Uncover insights, surface problems, monitor, and fine-tune your LLM

    Phoenix provides ML insights at lightning speed with zero-config observability for model drift, performance, and data quality. Phoenix is an open-source ML observability library designed for the notebook. The toolset is designed to ingest model inference data for LLMs, CV, NLP, and tabular datasets. It allows data scientists to quickly visualize their model data, monitor performance, track down issues and insights, and easily export data for model improvement. Deep learning models (CV, LLM, and generative) are an amazing technology that will power many future ML use cases. ...
    Downloads: 16 This Week
  • 2
    FiftyOne

    The open-source tool for building high-quality datasets

    The open-source tool for building high-quality datasets and computer vision models. Nothing hinders the success of machine learning systems more than poor-quality data. And without the right tools, improving a model can be time-consuming and inefficient. FiftyOne supercharges your machine learning workflows by enabling you to visualize datasets and interpret models faster and more effectively.
    Downloads: 1 This Week
  • 3
    Diffgram

    Training data (data labeling, annotation, workflow) for all data types

    From ingesting data to exploring it, annotating it, and managing workflows, Diffgram is a single application that will improve your data labeling and bring all aspects of training data under a single roof. Diffgram is the world’s first truly open source training data platform that focuses on giving its users an unlimited experience. It aims to reduce your data labeling bills and increase your training data quality.
    Downloads: 2 This Week
  • 4
    Pandas Profiling

    Create HTML profiling reports from pandas DataFrame objects

    ...Mostly global details about the dataset (number of records, number of variables, overall missingness and duplicates, memory footprint). Comprehensive and automatic list of potential data quality issues (high correlation, skewness, uniformity, zeros, missing values, and constant values, among others).
    Downloads: 0 This Week
  • 5
    AutoViz

    Automatically Visualize any dataset, any size

    AutoViz is a Python data visualization library designed to automate exploratory data analysis by generating multiple visualizations with minimal code. The primary goal of the project is to help data scientists and analysts quickly understand patterns, relationships, and anomalies within datasets without manually writing complex plotting code. With a single command, the library can automatically generate dozens of charts and graphs that reveal insights into the structure and quality of the data.
    Downloads: 0 This Week
  • 6
    DeepVariant

    DeepVariant is an analysis pipeline that uses a deep neural network

    ...DeepTrio extends DeepVariant's functionality, allowing it to utilize the power of neural networks to predict genomic variants in trios or duos. See this page for more details and instructions on how to run DeepTrio. Out-of-the-box use for PCR-positive samples and low-quality sequencing runs, and easy adjustments for different sequencing technologies and non-human species.
    Downloads: 1 This Week
  • 7
    fastdup

    An unsupervised and free tool for image and video dataset analysis

    fastdup is a powerful free tool designed to rapidly extract valuable insights from your image and video datasets. It helps you improve the quality of your dataset images and labels and reduce your data operations costs at scale.
    Downloads: 0 This Week
  • 8
    Pedalboard

    A Python library for audio

    ...It supports the most popular audio file formats and a number of common audio effects out of the box and also allows the use of VST3® and Audio Unit formats for loading third-party software instruments and effects. pedalboard was built by Spotify’s Audio Intelligence Lab to enable using studio-quality audio effects from within Python and TensorFlow. Internally at Spotify, pedalboard is used for data augmentation to improve machine learning models and to help power features like Spotify’s AI DJ and AI Voice Translation. pedalboard also helps in the process of content creation, making it possible to add effects to audio without using a Digital Audio Workstation.
    Downloads: 0 This Week
  • 9
    FLAML

    A fast library for AutoML and tuning

    ...It frees users from selecting learners and hyperparameters for each learner. For common machine learning tasks like classification and regression, it quickly finds quality models for user-provided data with low computational resources. It supports both classical machine learning models and deep neural networks. It is easy to customize or extend. Users can find their desired customizability from a smooth range: minimal customization (computational resource budget), medium customization (e.g., scikit-style learner, search space, and metric), or full customization (arbitrary training and evaluation code). ...
    Downloads: 0 This Week
  • 10
    crème de la crème of AI courses

    This repository is a curated collection of links to various courses

    crème de la crème of AI courses is an open-source repository that serves as a curated directory of high-quality educational resources related to artificial intelligence, machine learning, and modern data science. The project aggregates links to online courses, tutorials, lecture series, and learning materials from universities, research labs, and independent educators. The repository organizes courses by topic, difficulty level, format, and release year, allowing learners to quickly identify relevant material depending on their experience and interests. ...
    Downloads: 0 This Week
  • 11
    DPM-Solver

    Fast ODE Solver for Diffusion Probabilistic Model Sampling

    DPM-Solver is a machine learning research implementation focused on accelerating the sampling process in diffusion probabilistic models used for generative AI tasks. Diffusion models are powerful generative systems capable of producing high-quality images and other data, but traditional sampling methods often require hundreds or thousands of computational steps. The project introduces a specialized numerical solver designed to approximate the diffusion process using a small number of high-order integration steps. By reformulating the sampling problem as the solution of a diffusion-related ordinary differential equation, the solver can produce high-quality samples much more efficiently. ...
    Downloads: 0 This Week
  • 12
    CausalNex

    A Python library that helps data scientists to infer causation

    CausalNex is a Python library that uses Bayesian Networks to combine machine learning and domain expertise for causal reasoning. You can use CausalNex to uncover structural relationships in your data, learn complex distributions, and observe the effect of potential interventions.
    Downloads: 0 This Week
  • 13
    NLP Best Practices

    Natural Language Processing Best Practices & Examples

    In recent years, natural language processing (NLP) has seen quick growth in quality and usability, and this has helped to drive business adoption of artificial intelligence (AI) solutions. In the last few years, researchers have been applying newer deep learning methods to NLP. Data scientists started moving from traditional methods to state-of-the-art (SOTA) deep neural network (DNN) algorithms which use language models pretrained on large text corpora.
    Downloads: 0 This Week
  • 14
    Image Quality Assessment

    Convolutional Neural Networks to predict aesthetic quality of images

    Image Quality Assessment is an open-source deep learning project that implements neural models for predicting the aesthetic and technical quality of digital images. The repository provides an implementation inspired by the NIMA (Neural Image Assessment) research approach, which uses convolutional neural networks trained on human-annotated datasets to estimate image quality scores.
    Downloads: 0 This Week
  • 15
    textgenrnn

    Easily train your own text-generating neural network

    With textgenrnn you can easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code. It uses a modern neural network architecture that applies techniques such as attention-weighting and skip-embedding to accelerate training and improve model quality. Train on and generate text at either the character level or the word level. Configure RNN size, the number of RNN layers, and whether to use bidirectional RNNs. Train on any generic input text...
    Downloads: 0 This Week
  • 16
    MatchZoo

    Facilitating the design, comparison and sharing of deep text models

    The goal of MatchZoo is to provide a high-quality codebase for deep text matching research, such as document retrieval, question answering, conversational response ranking, and paraphrase identification. With a unified data processing pipeline, simplified model configuration, and automatic hyper-parameter tuning, MatchZoo is flexible and easy to use. Preprocess your input data in three lines of code, and keep track of the parameters to be passed into the model. ...
    Downloads: 0 This Week