Chinese and English multimodal conversational language model
Fast-stable-diffusion + DreamBooth
VMZ: Model Zoo for Video Modeling
CLIP: Predict the most relevant text snippet given an image
A Multi-Modal World Model for Reconstructing, Generating, Simulation
Project Lyra: Open Generative 3D World Models
Foundation model for image generation
Large-language-model & vision-language-model based on Linear Attention
Diversity-driven optimization and large-model reasoning ability
Open-source deep-learning framework
Phi-3.5 for Mac: Locally-run Vision and Language Models
Open-source large language model family from Tencent Hunyuan
An Efficient Agentic Model for Computer Use
Robust Speech Recognition Across Languages, Dialects
Official code base for LeWorldModel: Stable End-to-End Joint-Embedding
Pretrained time-series foundation model developed by Google Research
PyTorch code and models for the DINOv2 self-supervised learning
DeepMind model for tracking arbitrary points across videos & robotics
Designed for text embedding and ranking tasks
Miso TTS is an 8-billion-parameter, highly emotive text-to-speech model
The Clay Foundation Model - An open-source AI model and interface
Open-source image generative foundation model
Open-source industrial-grade ASR models
Ling is a MoE LLM provided and open-sourced by InclusionAI
NetEase Youdao's open-source embedding and reranker models