Real-time, voice-interactive digital human
Concatenate a directory full of files into a single prompt
A Multi-Modal World Model for Reconstruction, Generation, and Simulation
GEO-first SEO skill for Claude Code
All-in-one AI productivity platform with agents, workflows, and IM
Unifying 3D Mesh Generation with Language Models
AI Agent designed to interact with ROS1- and ROS2-based robotics systems
Learn to build your Second Brain AI assistant with LLMs
Outcome-driven agent development framework that evolves
Provider-agnostic, open-source evaluation infrastructure
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
SOTA discrete acoustic codec models with 40/75 tokens per second
Unified Multimodal Understanding and Generation Models
AI discovers 520,000 stable inorganic crystal structures for research
Code for Mesh R-CNN, ICCV 2019
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
PyTorch code and models for VJEPA2 self-supervised learning from video
An AI-powered security review GitHub Action using Claude
An open-source, end-to-end VLM-based GUI Agent
A Powerful Native Multimodal Model for Image Generation
Educational framework exploring multi-agent orchestration
Designed for text embedding and ranking tasks
A family of math-specific large language models in our Qwen2 series
Leveraging BERT and c-TF-IDF to create easily interpretable topics
Simplest working implementation of StyleGAN2