Open Source Speech Language Model
State-of-the-art (SoTA) text-to-video pre-trained model
A security scanner for custom LLM applications
Qwen3 is the large language model series developed by the Qwen team
An advanced paper search agent powered by large language models
LLM-based reinforcement learning audio editing model
Text- and image-to-video generation: CogVideoX (2024) and CogVideo
Modular AI image and video generation web UI with extensible tools
Weaving the Digital Agent Galaxy
Long-form streaming TTS system for multi-speaker dialogue generation
Video understanding codebase from FAIR for reproducing video models
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Management of Yandex Station and other smart home devices
Diversity-driven optimization and large-model reasoning ability
Repo for the Qwen2-Audio chat and pretrained large audio language models
StudioOllamaUI is a local, portable interface for Ollama
Implementation of NÜWA, an attention network for text-to-video synthesis
Large-scale linear classification, regression and ranking in Python
Dual LSTM Encoder for Dialog Response Generation