Hunyuan Translation Model Version 1.5
Block Diffusion for Ultra-Fast Speculative Decoding
Revolutionizing Database Interactions with Private LLM Technology
Industrial-level controllable zero-shot text-to-speech system
Sharp Monocular Metric Depth in Less Than a Second
An AI-powered security review GitHub Action using Claude
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
A Powerful Native Multimodal Model for Image Generation
The official PyTorch implementation of Google's Gemma models
Open-source large language model family from Tencent Hunyuan
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Generate Any 3D Scene in Seconds
CodeGeeX2: A More Powerful Multilingual Code Generation Model
The Clay Foundation Model - An open-source AI model and interface
GPT-4V-level open-source multi-modal model based on Llama3-8B
Generating Immersive, Explorable, and Interactive 3D Worlds
Foundation model for image generation
Official repository for LTX-Video
CLIP: Predict the most relevant text snippet given an image
ChatGLM-6B: An Open Bilingual Dialogue Language Model
OCR expert VLM powered by Hunyuan's native multimodal architecture
A state-of-the-art open visual language model
Chinese and English multimodal conversational language model
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning