Miso TTS is an 8-billion-parameter, highly emotive text-to-speech model
Recovering the Visual Space from Any Views
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
A theoretical reconstruction of the Claude Mythos architecture
Qwen-Image is a powerful image generation foundation model
Python bindings for llama.cpp
HY-Motion model for 3D character animation generation
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Advancing Open-source World Models
A Powerful Native Multimodal Model for Image Generation
CodeGeeX2: A More Powerful Multilingual Code Generation Model
Revolutionizing Database Interactions with Private LLM Technology
Open-source large language model family from Tencent Hunyuan
A Systematic Framework for Interactive World Modeling
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Provides convenient access to the Anthropic REST API from any Python 3
Multimodal Diffusion with Representation Alignment
Open-Source Financial Large Language Models
tiktoken is a fast BPE tokeniser for use with OpenAI's models
A Customizable Image-to-Video Model based on HunyuanVideo
A Multi-Modal World Model for Reconstructing, Generating, and Simulating
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Large Multimodal Models for Video Understanding and Editing
RGBD video generation model conditioned on camera input