Determined, deep learning training platform
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
LLM-based agent for general purpose software engineering tasks
Large Multimodal Models for Video Understanding and Editing
On-device Speech-to-Intent engine powered by deep learning
Powering Amazon custom machine learning chips
Venom is the most complete JavaScript library for WhatsApp
Open-source MCP server that gives your coding agent
Large-language-model & vision-language-model based on Linear Attention
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Inference script for Oasis 500M
Document Image Parsing via Heterogeneous Anchor Prompting
Framework for building neural networks
StreamSpeech is a seamless model for offline speech recognition
Toolkit for audio, music, and speech generation
Advanced techniques for RAG systems
The best ChatGPT that $100 can buy
A secure sandbox environment for malware developers and red teamers
A Model Context Protocol server for searching and analyzing arXiv
4M: Massively Multimodal Masked Modeling
Guiding Instruction-based Image Editing via Multimodal Large Language Models
Refer and Ground Anything Anywhere at Any Granularity
Supercharge Your LLM with the Fastest KV Cache Layer
A Model Context Protocol (MCP) Gateway & Registry
The official Meta Llama 3 GitHub site