Awesome multilingual OCR toolkits based on PaddlePaddle
Generating Immersive, Explorable, and Interactive 3D Worlds
Library for OCR-related tasks powered by Deep Learning
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
Web interface for generating images using Stable Diffusion models
Open source machine learning framework to automate text conversations
A community-supported supercharged version of paperless
Chat & pretrained large audio language model proposed by Alibaba Cloud
Capable of understanding text, audio, vision, and video
Designed for text embedding and ranking tasks
Qwen3-Omni is a natively end-to-end, omni-modal LLM
State-of-the-art TTS model under 25MB
Chat & pretrained large vision language model
Repo of Qwen2-Audio chat & pretrained large audio language model
Han Language Processing
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Stable Diffusion built into Blender
Alfred workflow using ChatGPT, DALL·E 2 and other models for chatting
Python tool for converting files and office documents to Markdown
SOTA open-source TTS
Implementation of Make-A-Video, a new SOTA text-to-video generator
Qwen2.5-VL is the multimodal large language model series
The official repo of Qwen chat & pretrained large language model
⚡ Building applications with LLMs through composability ⚡
Lightweight package to simplify LLM API calls