Awesome multilingual OCR toolkits based on PaddlePaddle
Python SDK for Claude Agent
Visual Causal Flow
From Images to High-Fidelity 3D Assets
Video Object and Interaction Deletion
Qwen3.5 is the large language model series developed by the Qwen team
RGBD video generation model conditioned on camera input
Open Source Speech Language Model
ChatGPT interface with better UI
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
A multimodal model for brain response prediction
Claude Code mirror, a one-stop open-source relay service
Controllable & emotion-expressive zero-shot TTS
Long-form streaming TTS system for multi-speaker dialogue generation
Open-source framework for intelligent speech interaction
State-of-the-art LLM and coding model
Audio foundation model excelling in audio understanding
Qwen3-ASR is an open-source series of ASR models
Contexts Optical Compression
Pushing the Limits of Mathematical Reasoning in Open Language Models
Analyze computation-communication overlap in V3/R1
Foundational Models for State-of-the-Art Speech and Text Translation
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
Real-time behaviour synthesis with MuJoCo, using Predictive Control
Example Discord bot written in Python that uses the completions API