Open-source browser using agents
ContextGem: Effortless LLM extraction from documents
Benchmarking Multimodal Agents for Open-Ended Tasks
Visual Instruction Tuning: Large Language-and-Vision Assistant
SOTA Open Source TTS
Get started with building fullstack agents using Gemini 2.5 and LangGraph
AI-powered video clipping and highlight generation
Get a ChatGPT plugin up and running in under 5 minutes
AIMET is a library that provides advanced quantization and compression
Bash is all you need: write a Claude Code in only 16 lines of code
Controllable & emotion-expressive zero-shot TTS
Real-World Centric Foundation GUI Agents
Context data platform for building observable, self-learning AI agents
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
Multi-lingual large voice generation model, providing inference
Pokee Deep Research Model Open Source Repo
Inference Llama 2 in one file of pure C
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
RL research on Android devices
A coding-free framework built on PyTorch
Evaluation and Tracking for LLM Experiments
JAX-based neural network library
Fast, flexible and easy to use probabilistic modelling in Python
The most intuitive, flexible way for researchers to build models
Fast, powerful, git-native ticket tracking in a single bash script