Run PyTorch LLMs locally on servers, desktop and mobile
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
MCP integration platforms for AI agents to use tools at any scale
Building a Secure and Interoperable Future for AI-Driven Payments
The AI-powered coding wizard
An open-source end-to-end VLM-based GUI Agent
Talk to Your AI Agents from Anywhere
On-device Speech-to-Intent engine powered by deep learning
UI-TARS-desktop version that can operate on your local personal device
Reading book source
A trainable PyTorch reproduction of AlphaFold 3
Implementation of "MobileCLIP" CVPR 2024
Python Crypto Bot (PyCryptoBot)
A toolkit to optimize Keras & TensorFlow ML models for deployment
Multimodal Agents as Smartphone Users, an LLM-based multimodal agent
Adds support for Yandex Smart Home (Alice voice assistant)
A Python tool that uses GPT-4, FFmpeg, and OpenCV
Diffusion model (SD, Flux, Wan, Qwen Image, Z-Image, ...) inference
Real-World Centric Foundation GUI Agents
A neural network that transforms a design mock-up into a static website
InvokeAI is a leading creative engine for Stable Diffusion models
Offline speech recognition API for Android, iOS, Raspberry Pi
Data science on data without acquiring a copy
Build Vision Agents quickly with any model or video provider
The most powerful Android RPA agent framework