The open-source tool for building high-quality datasets
Benchmarking synthetic data generation methods
Uncover insights, surface problems, monitor, and fine-tune your LLM
Training data (data labeling, annotation, workflow) for all data types
Autonomous research from idea to paper. Chat an Idea. Get a Paper 🦞
Create HTML profiling reports from pandas DataFrame objects
Curated list of classic, high-quality computer science books
3D reconstruction software
The standard data-centric AI package for data quality and ML
Focus on prompting and generating
The official Python SDK for the ElevenLabs API
Open Source Document Management System for Digital Archives
Stable Diffusion web UI
Asynchronous multi-platform robot framework written in Python
Multi-agent autonomous startup system for Claude Code
Develop software autonomously
A curated list of data mining papers about fraud detection
Tooling for the Common Objects In 3D dataset
Create Customized Software using Natural Language Idea
Open source machine learning framework to automate text conversations
DeepCode: Open Agentic Coding
Open-source choice to scale, assess and maintain natural language data
General proxy performance testing tool based on Clash using Telegram
A ranked list of awesome machine learning Python libraries
SDG is a specialized framework