Git-based data version control for machine learning workflows
Deep Research framework, combining language models with tools
Label Studio is a multi-type data labeling and annotation tool
Open source annotation tool for machine learning practitioners
An industrial-grade federated learning framework
Renderer for the harmony response format to be used with gpt-oss
AI coding assistant skill (Claude Code, Codex, OpenCode, OpenClaw)
The open-source tool for building high-quality datasets
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models
A high-quality tool for converting PDF to Markdown and JSON
PaddlePaddle End-to-End Development Toolkit
A Python Automated Machine Learning tool that optimizes ML
ExtractThinker is a Document Intelligence library for LLMs
Effortless data labeling with AI support from Segment Anything
Create custom engineering agents for your codebase
The Clay Foundation Model - An open source AI model and interface
Multi-Agent daTa geneRation Infra and eXperimentation framework
GLM-4 series: Open Multilingual Multimodal Chat LMs
Democratizing AI scientists with ToolUniverse
Extract schema, statistics and entities from datasets
Build GenAI applications quickly and easily
Comprehensive paid advertising audit & optimization skill
An unsupervised and free tool for image and video dataset analysis
PandasAI is a Python library that integrates generative AI