HunyuanVideo: A Systematic Framework For Large Video Generation Model
DeepVariant is an analysis pipeline that uses a deep neural network
Geometric deep learning extension library for PyTorch
Jittor is a high-performance deep learning framework
Tooling for the Common Objects In 3D dataset
Code for Mesh R-CNN, ICCV 2019
Open source framework for deep learning satellite and aerial imagery
A simple but complete full-attention transformer
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Machine Learning Pipelines for Kubeflow
Generate 3D objects conditioned on text or images
MMEditing is a low-level vision toolbox based on PyTorch
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis
A fast, powerful, and simple hierarchical vision transformer
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
Integrate ChatGPT into your own Discord bot
CLIP + FFT/DWT/RGB = text to image/video
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
MII makes low-latency and high-throughput inference possible
The standard data-centric AI package for data quality and ML
Create HTML profiling reports from pandas DataFrame objects
Refer and Ground Anything Anywhere at Any Granularity
PyTorch code and models for V-JEPA self-supervised learning from video
PyTorch code and models for the DINOv2 self-supervised learning
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming