Code for Mesh R-CNN, ICCV 2019
An AI-powered security review GitHub Action using Claude
INT4/INT5/INT8 and FP16 inference on CPU for the RWKV language model
Qwen-Image is a powerful image generation foundation model
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Tiny vision language model
The official PyTorch implementation of Google's Gemma models
Programmatic access to the AlphaGenome model
A state-of-the-art open visual language model
Tongyi Deep Research, the leading open-source deep research agent
26M function-calling model that runs on very small devices
Open Source Speech Language Model
Long-form streaming TTS system for multi-speaker dialogue generation
Hunyuan Translation Model Version 1.5
Block Diffusion for Ultra-Fast Speculative Decoding
Multimodal embedding and reranking models built on Qwen3-VL
Implementation of "MobileCLIP", CVPR 2024
High-resolution models for human tasks
Video understanding codebase from FAIR for reproducing video models
CLIP: predict the most relevant text snippet given an image
Ling is a MoE LLM provided and open-sourced by InclusionAI
A Unified Framework for Text-to-3D and Image-to-3D Generation
Personalize Any Characters with a Scalable Diffusion Transformer
Achieving a 3x+ generation speedup on reasoning tasks
Ultra-Efficient LLMs on End Devices