Video understanding codebase from FAIR for reproducing video models
PyTorch code and models for the DINOv2 self-supervised learning method
An AI-powered security review GitHub Action using Claude
Dataset of GPT-2 outputs for research in detection, biases, and more
DeepSeek Coder: Let the Code Write Itself
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Multimodal Diffusion with Representation Alignment
Chat & pretrained large audio language model proposed by Alibaba Cloud
A series of math-specific large language models in the Qwen2 series
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
Qwen2.5-VL is a multimodal large language model series
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Real-time behaviour synthesis with MuJoCo, using Predictive Control
Foundation Models for Time Series
Uncommon Objects in 3D dataset
Hackable and optimized Transformers building blocks
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Programmatic access to the AlphaGenome model
The Clay Foundation Model - An open source AI model and interface
Phi-3.5 for Mac: Locally-run Vision and Language Models
Official code for Style Aligned Image Generation via Shared Attention
The official PyTorch implementation of Google's Gemma models
DeepMind model for tracking arbitrary points across videos & robotics
Implementation of "MobileCLIP" (CVPR 2024)