Diffusion Transformer with Fine-Grained Chinese Understanding
Contexts Optical Compression
Official implementation of DreamCraft3D
Sharp Monocular Metric Depth in Less Than a Second
Multimodal model achieving SOTA performance
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Large language model and vision-language model based on linear attention
Detect faces in an image
Software that can generate photos from paintings
Small 3B-base multimodal model ideal for custom AI on edge hardware
Compact 8B multimodal instruct model optimized for edge deployment
Efficient 14B multimodal instruct model with edge deployment and FP8