Official DeiT repository
[CVPR 2025 Best Paper Award] VGGT
Foundational Models for State-of-the-Art Speech and Text Translation
Provides code for running inference with the Segment Anything Model
Memory-efficient and performant finetuning of Mistral's models
Analyze computation-communication overlap in V3/R1
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Diffusion Transformer with Fine-Grained Chinese Understanding
Transformers4Rec is a flexible and efficient library
The React for Voice and Chat: build apps for Alexa, Google Assistant
Bailing is a voice dialogue robot similar to GPT-4o
Open-source text-to-speech tool that supports extra-long text
Workflow and speech recognition app
Build Vision Agents quickly with any model or video provider
An open-source text-to-speech system built by inverting Whisper
Lightning-fast, on-device TTS, running natively via ONNX
Speech-AI-Forge is a project developed around TTS generation models
Interface for OuteTTS models
A free + OSS logo generator powered by Flux on Together AI
Learn AI and LLMs from scratch using free resources
Reverse-engineered Python API for Google Gemini web app
Get a ChatGPT plugin up and running in under 5 minutes
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Concatenate a directory full of files into a single prompt
Scalable machine learning for time series forecasting