Fast and memory-efficient exact attention
Voice Recognition to Text Tool
HunyuanVideo: A Systematic Framework For Large Video Generation Model
RAPIDS Machine Learning Library
A high-quality rapid TTS voice cloning model
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Fast, flexible and easy to use probabilistic modelling in Python
A nearly-live implementation of OpenAI's Whisper
Sharp Monocular Metric Depth in Less Than a Second
Fast inference engine for Transformer models
Trainable, memory-efficient, and GPU-friendly PyTorch reproduction
Making large AI models cheaper, faster and more accessible
CodeGeeX2: A More Powerful Multilingual Code Generation Model
MNN is a blazing fast, lightweight deep learning framework
ChatGLM2-6B: An Open Bilingual Chat LLM
Software that uses AI to perform real-time voice conversion
Effortless data labeling with AI support from Segment Anything
The GPU-powered AI application database
High-performance neural network inference framework for mobile
C++ library for high performance inference on NVIDIA GPUs
Unified web UI for training and running open models locally
A generic, simple and fast implementation of DeepMind's AlphaZero
[NeurIPS2025 Spotlight] Quantized Attention
Lightning fast C++/CUDA neural network framework
Multi-lingual large voice generation model, providing inference