Data processing for and with foundation models
SDG is a specialized framework
Uncover insights, surface problems, monitor, and fine-tune your LLM
The open-source tool for building high-quality datasets
Training data (data labeling, annotation, workflow) for all data types
The standard data-centric AI package for data quality and ML
Create HTML profiling reports from pandas DataFrame objects
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Benchmarking synthetic data generation methods
Open-source DORA metrics platform for engineering teams
A high-quality tool for converting PDF to Markdown and JSON
Curated list of classic, high-quality computer science books
Synthetic data curation for post-training and data extraction
Flexible Photo Recrafting While Preserving Your Identity
AWS IoT FleetWise Edge Agent
Automatically visualize any dataset, any size
A collection of top programming
Declarative engine for generating AI-powered infographic visuals
Extract schema, statistics and entities from datasets
A high-quality rapid TTS voice cloning model
Toloka-Kit is a Python library for working with Toloka API
ETL framework to index data for AI, such as RAG
Unsplash images made available for research and machine learning
An unsupervised and free tool for image and video dataset analysis
Claude Code skill for generating production-quality SVG+PNG technical