VITS2 backbone with multilingual-bert
A fast TTS architecture with conditional flow matching
SOTA discrete acoustic codec models with 40/75 tokens per second
Controllable and fast Text-to-Speech for over 7000 languages
One-click deployment (including offline integration package)
A TTS model capable of generating ultra-realistic dialogue
Pokee Deep Research Model Open Source Repo
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
Volcano Engine Reinforcement Learning for LLMs
DeepMind model for tracking arbitrary points across videos & robotics
An alignment auditing agent capable of exploring alignment hypotheses
Code for Mesh R-CNN, ICCV 2019
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
Language modeling in a sentence representation space
Run Llama 2 inference in one file of pure C
The repository provides code for running inference with SAM 2
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
kaldi-asr/kaldi is the official location of the Kaldi project
Evals is a framework for evaluating LLMs and LLM systems
The ChatGPT Retrieval Plugin lets you easily find personal documents
Educational framework exploring multi-agent orchestration
Designed for text embedding and ranking tasks
Implementation of the Surya Foundation Model for Heliophysics
Revolutionizes the way users interact with AutoGen
A machine learning library for detecting anomalies in signals