ChatGLM-6B: An Open Bilingual Dialogue Language Model
Capable of understanding text, audio, vision, video
Qwen-Image is a powerful image generation foundation model
gpt-oss-120b and gpt-oss-20b are two open-weight language models
Tool for exploring and debugging transformer model behaviors
High-Resolution Image Synthesis with Latent Diffusion Models
Repo of Qwen2-Audio chat & pretrained large audio language model
Example Discord bot written in Python that uses the completions API
Revolutionizing Database Interactions with Private LLM Technology
Let's make video diffusion practical
Sharp Monocular Metric Depth in Less Than a Second
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Tongyi Deep Research, the Leading Open-source Deep Research Agent
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Inference code for scalable emulation of protein equilibrium ensembles
Fast and Universal 3D reconstruction model for versatile tasks
Pokee Deep Research Model Open Source Repo
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
Global weather forecasting model using graph neural networks and JAX
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
High-resolution models for human tasks
GPT4V-level open-source multi-modal model based on Llama3-8B
Renderer for the harmony response format to be used with gpt-oss
Inference framework for 1-bit LLMs
Tooling for the Common Objects In 3D dataset