26M function-calling model that runs on incredibly small devices
Official repository for LTX-Video
Awesome multilingual OCR toolkits based on PaddlePaddle
Implementation of "MobileCLIP" (CVPR 2024)
Native and Compact Structured Latents for 3D Generation
Python inference and LoRA trainer package for the LTX-2 audio–video model
Tiny vision language model
Z80-μLM is a 2-bit quantized language model
The most powerful local music generation model
AlphaFold 3 inference pipeline
Python SDK for Claude Agent
PyTorch code and models for the DINOv2 self-supervised learning method
Stable Diffusion with Core ML on Apple Silicon
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Unified Multimodal Understanding and Generation Models
Multimodal embedding and reranking models built on Qwen3-VL
Text and image to video generation: CogVideoX and CogVideo
Generate Any 3D Scene in Seconds
Foundation Models for Time Series
RGBD video generation model conditioned on camera input
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
GLM-4 series: Open Multilingual Multimodal Chat LMs
Open-source large language model family from Tencent Hunyuan
DeepMind model for tracking arbitrary points across videos & robotics