Enable AI to control your desktop, mobile and HMI devices
Generating Immersive, Explorable, and Interactive 3D Worlds
The Open Source Cowork Desktop to Unlock Your Exceptional Productivity
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
Qwen2.5-VL is the multimodal large language model series
Create UIs for your machine learning model in Python in 3 minutes
Claude Code skill that researches any topic across Reddit + X
A research prototype of a human-centered web agent
Get started with building Fullstack Agents using Gemini 2.5 & LangGraph
ChatGLM2-6B: An Open Bilingual Chat LLM
gpt-oss-120b and gpt-oss-20b are two open-weight language models
Easy-to-use LLM fine-tuning framework (LLaMA-2, BLOOM, Falcon
An unsupervised and free tool for image and video dataset analysis
Build Vision Agents quickly with any model or video provider
A batteries-included library for building AI-powered software
Turn your website into a GIF
Diffusion Transformer with Fine-Grained Chinese Understanding
Visual Instruction Tuning: Large Language-and-Vision Assistant
Speech-AI-Forge is a project developed around TTS generation models
A high-quality rapid TTS voice cloning model
Private chat with local GPT with documents, images, video, etc.
One-click deployment (including offline integration package)
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Smart Thermodynamic Modeling with Graph Neural Networks