Awesome multilingual OCR toolkits based on PaddlePaddle
Contexts Optical Compression
Accurate × Fast × Comprehensive
OCR expert VLM powered by Hunyuan's native multimodal architecture
Visual Causal Flow
Repo for the Qwen2-Audio chat and pretrained large audio language models
Official inference repo for FLUX.2 models
A Powerful Native Multimodal Model for Image Generation
Official code for Style Aligned Image Generation via Shared Attention
Code release for ConvNeXt V2 model