GLM-4-Voice | End-to-End Chinese-English Conversational Model
Let's make video diffusion practical
Hackable and optimized Transformers building blocks
tiktoken is a fast BPE tokeniser for use with OpenAI's models
A Family of Open Sourced Music Foundation Models
Agentic, Reasoning, and Coding (ARC) foundation models
High-resolution models for human tasks
A Unified Framework for Text-to-3D and Image-to-3D Generation
Multimodal-Driven Architecture for Customized Video Generation
Visual Causal Flow
Reference PyTorch implementation and models for DINOv3
Qwen3-ASR is an open-source series of ASR models
Open-Source Financial Large Language Models
FAIR Sequence Modeling Toolkit 2
Research code artifacts for Code World Model (CWM)
Programmatic access to the AlphaGenome model
Inference code for scalable emulation of protein equilibrium ensembles
Models for object and human mesh reconstruction
Generate Any 3D Scene in Seconds
gpt-oss-120b and gpt-oss-20b are two open-weight language models
An experimental version of the DeepSeek model
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
An AI-powered security review GitHub Action using Claude
CogView4, CogView3-Plus and CogView3 (ECCV 2024)