AlphaFold 3 inference pipeline
Release for Improved Denoising Diffusion Probabilistic Models
Ling is a MoE LLM provided and open-sourced by InclusionAI
A Unified Framework for Text-to-3D and Image-to-3D Generation
Multimodal-Driven Architecture for Customized Video Generation
Large Multimodal Models for Video Understanding and Editing
Generating Immersive, Explorable, and Interactive 3D Worlds
A series of math-specific large language models in our Qwen2 family
DeepSeek Coder: Let the Code Write Itself
gpt-oss-120b and gpt-oss-20b are two open-weight language models
Official implementation of Watermark Anything with Localized Messages
CLIP: predict the most relevant text snippet given an image
Pushing the Limits of Mathematical Reasoning in Open Language Models
Repo of Qwen2-Audio, a chat & pretrained large audio language model
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
ICLR 2024 Spotlight: curation/training code, metadata, distribution
Capable of understanding text, audio, vision, video
VMZ: Model Zoo for Video Modeling
Towards Real-World Vision-Language Understanding
Foundation Models for Time Series
Hackable and optimized Transformers building blocks
tiktoken is a fast BPE tokeniser for use with OpenAI's models
A state-of-the-art open visual language model
High-resolution models for human tasks
Chat & pretrained large audio language model proposed by Alibaba Cloud