A Powerful Native Multimodal Model for Image Generation
Revolutionizing Database Interactions with Private LLM Technology
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Achieving 3x+ generation speedup on reasoning tasks
Advancing Open-source World Models
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Programmatic access to the AlphaGenome model
Video Object and Interaction Deletion
Open-source deep-learning framework
Code for running inference with the SAM 3D Body Model (3DB)
Sharp Monocular Metric Depth in Less Than a Second
Provides convenient access to the Anthropic REST API from any Python 3
DeepSeek Coder: Let the Code Write Itself
Phi-3.5 for Mac: Locally-run Vision and Language Models
LTX-Video Support for ComfyUI
Multimodal Diffusion with Representation Alignment
Easy Docker setup for Stable Diffusion with user-friendly UI
FAIR Sequence Modeling Toolkit 2
PyTorch code and models for the DINOv2 self-supervised learning
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Controllable & emotion-expressive zero-shot TTS
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Open-source framework for intelligent speech interaction
GLM-4 series: Open Multilingual Multimodal Chat LMs
An Efficient Agentic Model for Computer Use