Empowering Code Generation with OSS-Instruct
An Open-source Framework for Data-centric Language Agents
Long-form streaming TTS system for multi-speaker dialogue generation
Pre-trained Deep Learning models and demos
Advanced techniques for RAG systems
This repository contains the official implementation of FastVLM
Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
An open-source photo thumbnail service by globo.com
SOTA discrete acoustic codec models with 40/75 tokens per second
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
GitLab automatic code review tool based on large models
Multi-Agent daTa geneRation Infra and eXperimentation framework
Interface for OuteTTS models
A TTS model capable of generating ultra-realistic dialogue
Chinese and English multimodal conversational language model
Static code analysis
Towards Human-Sounding Speech
Python CLI utility and library for manipulating SQLite databases
Qwen3-omni is a natively end-to-end, omni-modal LLM
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
The toolkit to test, validate, and evaluate your models and surface
AI-based photo editing website for changing image background
Turn your website into a GIF
Chinese Llama-3 LLMs developed from Meta Llama 3