Generating Immersive, Explorable, and Interactive 3D Worlds
Robust Speech Recognition Across Languages and Dialects
Bidirectional token-classification model for identifiable info
Video Object and Interaction Deletion
Renderer for the harmony response format to be used with gpt-oss
Large language model and vision-language model based on Linear Attention
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Designed for text embedding and ranking tasks
The Clay Foundation Model - An open source AI model and interface
Unified Multimodal Understanding and Generation Models
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Tiny vision language model
Inference code for scalable emulation of protein equilibrium ensembles
Inference script for Oasis 500M
Open-weight, large-scale hybrid-attention reasoning model
Hunyuan Translation Model Version 1.5
Tool for exploring and debugging transformer model behaviors
CLIP: predict the most relevant text snippet given an image
Open Source Speech Language Model
Open-source industrial-grade ASR models
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
NetEase Youdao's open-source embedding and reranker models
Ultra-Efficient LLMs on End Devices
Pretrained time-series foundation model developed by Google Research
PyTorch code and models for the DINOv2 self-supervised learning method