State-of-the-art Image & Video CLIP, Multimodal Large Language Models
Qwen3-Coder is the code version of Qwen3
Venom is the most complete JavaScript library for WhatsApp
The leading agent orchestration platform for Claude
A Foundation Model for the Language of Financial Markets
Multilingual Document Layout Parsing in a Single Vision-Language Model
Code release for Cut and Learn for Unsupervised Object Detection
The standard data-centric AI package for data quality and ML
Automatic SQL injection and database takeover tool
Bailing is a voice dialogue robot similar to GPT-4o
Statistical machine intelligence and learning engine
Chinese XLNet pre-trained model
Framework for building neural networks
Refer and Ground Anything Anywhere at Any Granularity
Convert AI papers to GUI
End-to-end speech processing toolkit
PyTorch code and models for VJEPA2 self-supervised learning from video
Language modeling in a sentence representation space
Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition
Qwen3-Omni is a natively end-to-end, omni-modal LLM
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Multi-modal large language model designed for audio understanding
Refactoring ChatBot+LLM, GPT-3.5-turbo, ChatGPT Bot/Voice Assistant
Chat & pretrained large vision-language model
Graphical User Interface Face Anonymization Tool