Data processing for and with foundation models
SDG is a specialized framework
The open-source tool for building high-quality datasets
Uncover insights, surface problems, monitor, and fine-tune your LLM
Synthetic Data Generation for tabular, relational and time series data
Training data (data labeling, annotation, workflow) for all data types
The standard data-centric AI package for data quality and ML
Create HTML profiling reports from pandas DataFrame objects
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Benchmarking synthetic data generation methods
Open-source DORA metrics platform for engineering teams
Curated list of classic, high-quality computer science books
A high-quality tool for converting PDF to Markdown and JSON
Synthetic data curation for post-training and data extraction
Flexible Photo Recrafting While Preserving Your Identity
AWS IoT FleetWise Edge Agent
Open-source all-in-one platform for engineering AI products
Automatically Visualize any dataset, any size
A collection of top programming
Declarative engine for generating AI-powered infographic visuals
Extract schema, statistics and entities from datasets
Local-first, open-source alternative to Anthropic's Claude Design
Toloka-Kit is a Python library for working with the Toloka API
A high-quality rapid TTS voice cloning model
Collaborative & Open-Source Quality Assurance for all AI models