Data processing for and with foundation models
SDG is a specialized framework
The open-source tool for building high-quality datasets
Uncover insights, surface problems, monitor, and fine-tune your LLM
Training data (data labeling, annotation, workflow) for all data types
The standard data-centric AI package for data quality and ML
Create HTML profiling reports from pandas DataFrame objects
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Benchmarking synthetic data generation methods
Open-source DORA metrics platform for engineering teams
Curated list of classic, high-quality computer science books
A high-quality tool for converting PDF to Markdown and JSON
Synthetic data curation for post-training and data extraction
Flexible Photo Recrafting While Preserving Your Identity
AWS IoT FleetWise Edge Agent
Open-source all-in-one platform for engineering AI products
A collection of top programming
Declarative engine for generating AI-powered infographic visuals
A high-quality rapid TTS voice cloning model
Extract schema, statistics and entities from datasets
Toloka-Kit is a Python library for working with the Toloka API
Automatically visualize any dataset, any size
ETL framework to index data for AI, such as RAG
An unsupervised and free tool for image and video dataset analysis
Claude Code skill for generating production-quality SVG+PNG technical