Driving with Graph Visual Question Answering
A system for agentic LLM-powered data processing and ETL
Multimodal Agents as Smartphone Users, an LLM-based multimodal agent
Fast, powerful, git-native ticket tracking in a single bash script
95% token savings. 155x faster queries. 16 languages
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Agent S: an open agentic framework that uses computers like a human
An efficient forwarding service designed for LLMs
A step-by-step guide to build your own AI agent
Framework for building AI-powered interactive digital humans and agents
Tools for merging pretrained large language models
Real-World Centric Foundation GUI Agents
Uncommon Objects in 3D dataset
Data Lake for Deep Learning. Build, manage, and query datasets
Python package built to ease deep learning on graphs
A Universal Customization Method for Single and Multi Conditioning
Build cross-modal and multimodal applications on the cloud
High-Fidelity and Controllable Generation of Textured 3D Assets
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Automatically translates the text of a video based on a subtitle file
Private chat with local GPT with documents, images, video, etc.
Full stack AI software engineer
Mice speech to text with MX Cinnamon OS ISO
Deep universal probabilistic programming with Python and PyTorch
GPU environment management and cluster orchestration