A TTS model capable of generating ultra-realistic dialogue
Repo of Qwen2-Audio chat & pretrained large audio language model
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Official Repo For "Sa2VA: Marrying SAM2 with LLaVA"
OCR expert VLM powered by Hunyuan's native multimodal architecture
Large-language-model & vision-language-model based on Linear Attention
Chinese Llama-3 LLMs developed from Meta Llama 3
A state-of-the-art open visual language model
Audiocraft is a library for audio processing and generation
High-Resolution Image Synthesis with Latent Diffusion Models
LLM powered fuzzing via OSS-Fuzz
The ChatGPT Retrieval Plugin lets you easily find personal documents
Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation
Build high-performance AI models with modular building blocks
Open Multilingual Multimodal Chat LMs
A Conversational Speech Generation Model
TechNews365 OS Admin AI integrates a 100% local AI voice assistant!
Powerful open source image generation model
Release for Improved Denoising Diffusion Probabilistic Models
Official DeiT repository
Towards Human-Level Text-to-Speech through Style Diffusion
Code for the paper "Language Models are Unsupervised Multitask Learners"
A fast, powerful, and simple hierarchical vision transformer
VITS2 backbone with multilingual-bert
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project