A simple local web interface that uses ChatTTS to synthesize speech from text
Clone a voice in 5 seconds to generate arbitrary speech in real time
Ready-to-use OCR with 80+ supported languages
Ling is a MoE LLM provided and open-sourced by InclusionAI
Controllable and fast Text-to-Speech for over 7000 languages
The most intuitive, flexible way for researchers to build models
A straightforward method for training your LLM
Photorealistic Synthetic Dataset for Holistic Indoor Scene
Code for running inference with the SAM 3D Body Model (3DB)
Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation
Powerful open-source image generation model
Code for the paper "Language Models are Unsupervised Multitask Learners"
Inference code for Llama models
Semantic segmentation models, datasets & losses implemented in PyTorch
Large-scale autoregressive pixel model for image generation by OpenAI
Tools to download and clean up Common Crawl data
Deep learning person re-identification in PyTorch
PyTorch implementation of BigGAN with pretrained weights
3D ResNets for Action Recognition (CVPR 2018)