Showing 12 open source projects for "talend data quality"

View related business solutions
  • Cut Data Warehouse Costs up to 54% with BigQuery Icon
    Cut Data Warehouse Costs up to 54% with BigQuery

    Migrate from Snowflake, Databricks, or Redshift with free migration tools. Exabyte scale without the Exabyte price.

    BigQuery delivers up to 54% lower TCO than cloud alternatives. Migrate from legacy or competing warehouses using free BigQuery Migration Service with automated SQL translation. Get serverless scale with no infrastructure to manage, compressed storage, and flexible pricing—pay per query or commit for deeper discounts. New customers get $300 in free credit.
    Try BigQuery Free
  • Cut Cloud Costs with Google Compute Engine Icon
    Cut Cloud Costs with Google Compute Engine

    Save up to 91% with Spot VMs and get automatic sustained-use discounts. One free VM per month, plus $300 in credits.

    Save on compute costs with Compute Engine. Reduce your batch jobs and workload bill 60-91% with Spot VMs. Compute Engine's committed use offers customers up to 70% savings through sustained use discounts. Plus, you get one free e2-micro VM monthly and $300 credit to start.
    Try Compute Engine
  • 1
    FiftyOne

    FiftyOne

    The open-source tool for building high-quality datasets

    The open-source tool for building high-quality datasets and computer vision models. Nothing hinders the success of machine learning systems more than poor-quality data. And without the right tools, improving a model can be time-consuming and inefficient. FiftyOne supercharges your machine learning workflows by enabling you to visualize datasets and interpret models faster and more effectively.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    Diffgram

    Diffgram

    Training data (data labeling, annotation, workflow) for all data types

    From ingesting data to exploring it, annotating it, and managing workflows. Diffgram is a single application that will improve your data labeling and bring all aspects of training data under a single roof. Diffgram is world’s first truly open source training data platform that focuses on giving its users an unlimited experience. This is aimed to reduce your data labeling bills and increase your Training Data Quality.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    Pandas Profiling

    Pandas Profiling

    Create HTML profiling reports from pandas DataFrame objects

    ...Mostly global details about the dataset (number of records, number of variables, overall missigness and duplicates, memory footprint). Comprehensive and automatic list of potential data quality issues (high correlation, skewness, uniformity, zeros, missing values, constant values, between others).
    Downloads: 2 This Week
    Last Update:
    See Project
  • 4
    Arize Phoenix

    Arize Phoenix

    Uncover insights, surface problems, monitor, and fine tune your LLM

    Phoenix provides ML insights at lightning speed with zero-config observability for model drift, performance, and data quality. Phoenix is an Open Source ML Observability library designed for the Notebook. The toolset is designed to ingest model inference data for LLMs, CV, NLP and tabular datasets. It allows Data Scientists to quickly visualize their model data, monitor performance, track down issues & insights, and easily export to improve. Deep Learning Models (CV, LLM, and Generative) are an amazing technology that will power many of future ML use cases. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Ship AI Apps Faster with Vertex AI Icon
    Ship AI Apps Faster with Vertex AI

    Go from idea to deployed AI app without managing infrastructure. Vertex AI offers one platform for the entire AI development lifecycle.

    Ship AI apps and features faster with Vertex AI—your end-to-end AI platform. Access Gemini 3 and 200+ foundation models, fine-tune for your needs, and deploy with enterprise-grade MLOps. Build chatbots, agents, or custom models. New customers get $300 in free credit.
    Try Vertex AI Free
  • 5
    fastdup

    fastdup

    An unsupervised and free tool for image and video dataset analysis

    fastdup is a powerful free tool designed to rapidly extract valuable insights from your image & video datasets. Assisting you to increase your dataset images & labels quality and reduce your data operations costs at an unparalleled scale.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    DeepVariant

    DeepVariant

    DeepVariant is an analysis pipeline that uses a deep neural networks

    ...DeepTrio extends DeepVariant's functionality, allowing it to utilize the power of neural networks to predict genomic variants in trios or duos. See this page for more details and instructions on how to run DeepTrio. Out-of-the-box use for PCR-positive samples and low quality sequencing runs, and easy adjustments for different sequencing technologies and non-human species.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    FLAML

    FLAML

    A fast library for AutoML and tuning

    ...It frees users from selecting learners and hyperparameters for each learner. For common machine learning tasks like classification and regression, it quickly finds quality models for user-provided data with low computational resources. It supports both classical machine learning models and deep neural networks. It is easy to customize or extend. Users can find their desired customizability from a smooth range: minimal customization (computational resource budget), medium customization (e.g., scikit-style learner, search space, and metric), or full customization (arbitrary training and evaluation code). ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 8
    Pedalboard

    Pedalboard

    A Python library for audio

    ...It supports the most popular audio file formats and a number of common audio effects out of the box and also allows the use of VST3® and Audio Unit formats for loading third-party software instruments and effects. pedalboard was built by Spotify’s Audio Intelligence Lab to enable using studio-quality audio effects from within Python and TensorFlow. Internally at Spotify, pedalboard is used for data augmentation to improve machine learning models and to help power features like Spotify’s AI DJ and AI Voice Translation. pedalboard also helps in the process of content creation, making it possible to add effects to audio without using a Digital Audio Workstation.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    CausalNex

    CausalNex

    A Python library that helps data scientists to infer causation

    CausalNex is a Python library that uses Bayesian Networks to combine machine learning and domain expertise for causal reasoning. You can use CausalNex to uncover structural relationships in your data, learn complex distributions, and observe the effect of potential interventions.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 99.99% Uptime for MySQL and PostgreSQL on Google Cloud Icon
    99.99% Uptime for MySQL and PostgreSQL on Google Cloud

    Enterprise Plus edition delivers sub-second maintenance downtime and 2x read/write performance. Built for critical apps.

    Cloud SQL Enterprise Plus gives you a 99.99% availability SLA with near-zero downtime maintenance—typically under 10 seconds. Get 2x better read/write performance, intelligent data caching, and 35 days of point-in-time recovery. Supports MySQL, PostgreSQL, and SQL Server with built-in vector search for gen AI apps. New customers get $300 in free credit.
    Try Cloud SQL Free
  • 10
    NLP Best Practices

    NLP Best Practices

    Natural Language Processing Best Practices & Examples

    In recent years, natural language processing (NLP) has seen quick growth in quality and usability, and this has helped to drive business adoption of artificial intelligence (AI) solutions. In the last few years, researchers have been applying newer deep learning methods to NLP. Data scientists started moving from traditional methods to state-of-the-art (SOTA) deep neural network (DNN) algorithms which use language models pretrained on large text corpora.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    textgenrnn

    textgenrnn

    Easily train your own text-generating neural network

    With textgenrnn you can easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code. A modern neural network architecture that utilizes new techniques as attention-weighting and skip-embedding to accelerate training and improve model quality. Train on and generate text at either the character-level or word-level. Configure RNN size, the number of RNN layers, and whether to use bidirectional RNNs. Train on any generic input text...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    MatchZoo

    MatchZoo

    Facilitating the design, comparison and sharing of deep text models

    The goal of MatchZoo is to provide a high-quality codebase for deep text matching research, such as document retrieval, question answering, conversational response ranking, and paraphrase identification. With the unified data processing pipeline, simplified model configuration and automatic hyper-parameters tunning features equipped, MatchZoo is flexible and easy to use. Preprocess your input data in three lines of code, keep track parameters to be passed into the model. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB