A Unified Framework for Text-to-3D and Image-to-3D Generation
Fast and universal 3D reconstruction model for versatile tasks
A PyTorch library for implementing flow matching algorithms
GPT-4V-level open-source multimodal model based on Llama3-8B
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
The ChatGPT Retrieval Plugin lets you easily find personal documents
Ling is a MoE LLM provided and open-sourced by InclusionAI
Kimi K2 is the large language model series developed by Moonshot AI
Qwen3-TTS is an open-source series of TTS models
Chat & pretrained large vision language model
Phi-3.5 for Mac: Locally-run Vision and Language Models
Revolutionizing Database Interactions with Private LLM Technology
A Family of Open-Source Music Foundation Models
Open-source framework for intelligent speech interaction
Implementation of the Surya Foundation Model for Heliophysics
Chat & pretrained large audio language model proposed by Alibaba Cloud
Large language model and vision-language model based on linear attention
Safety reasoning models built upon gpt-oss
New set of lightweight, state-of-the-art open foundation models
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
High-Fidelity and Controllable Generation of Textured 3D Assets
Multi-modal large language model designed for audio understanding
Large Multimodal Models for Video Understanding and Editing
The official PyTorch implementation of Google's Gemma models
State-of-the-art Image & Video CLIP, Multimodal Large Language Models