Tool for visualizing and tracking your machine learning experiments
HunyuanVideo: A Systematic Framework For Large Video Generation Model
An elegant PyTorch implementation of transformers
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models
A simple screen-parsing tool for pure vision-based GUI agents
Fast inference engine for Transformer models
From Images to High-Fidelity 3D Assets
DSPy: The framework for programming—not prompting—language models
The largest collection of PyTorch image encoders / backbones
BitNet: Scaling 1-bit Transformers for Large Language Models
The repository provides code for running inference with SAM 2
One minute of voice data is enough to train a good TTS model
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
AirLLM: 70B inference with a single 4GB GPU
Inference script for Oasis 500M
Trainable, memory-efficient, and GPU-friendly PyTorch reproduction
Distribute and run LLMs with a single file
Research code artifacts for Code World Model (CWM)
A TTS model capable of generating ultra-realistic dialogue
Global weather forecasting model using graph neural networks and JAX
OCR expert VLM powered by Hunyuan's native multimodal architecture
Ready-to-use OCR with 80+ supported languages
Qwen3.5 is the large language model series developed by the Qwen team
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Image generation model with a single-stream diffusion transformer