[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences
LLM abstractions that aren't obstructions
Python framework for adversarial attacks and data augmentation
Full-text IPFS-friendly and WASM-compatible Search in Rust
A general purpose syntax highlighter in pure Go
OCR model for complex documents with layout-aware structured outputs
A high-quality PDF-to-Markdown tool based on large language models
Long-form streaming TTS system for multi-speaker dialogue generation
RAG-Anything: All-in-One RAG Framework
Marrying Grounding DINO with Segment Anything & Stable Diffusion
PersonaPlex code
Open-Sora: Democratizing Efficient Video Production for All
Simple, Pythonic building blocks to evaluate LLM applications
Multi-lingual large voice generation model, providing inference
Framework for building, orchestrating, and deploying AI agents
Python crawler for collecting and downloading Sina Weibo user data
A single Gradio + React WebUI with extensions for ACE-Step
Build Vision Agents quickly with any model or video provider
Authoring Books and Technical Documents with R Markdown
Using AI models to automatically provide commentary and edit videos
Qwen3-ASR is an open-source series of ASR models
An open source implementation of CLIP
A Multi-Modal World Model for Reconstruction, Generation, and Simulation
A wiki system with complex functionality for simple integration
Algorithms for outlier, adversarial and drift detection