Real-time voice interactive digital human
Context-aware desktop AI assistant that understands screen content
Concatenate a directory full of files into a single prompt
Marrying Grounding DINO with Segment Anything & Stable Diffusion
Official Implementation for "Recursive Multi-Agent Systems"
Making RAG Simpler with Small and Open-Sourced Language Models
Bailing is a voice dialogue robot similar to GPT-4o
Open-Sora: Democratizing Efficient Video Production for All
Extension of Google Research’s PaperBanana
Retrieval and Retrieval-augmented LLMs
A Pioneering Open-Source Alternative to GPT-4o
Python tool for crawling and extracting structured data from news sites
Large-language-model & vision-language-model based on Linear Attention
Implementation of Make-A-Video, a new SOTA text-to-video generator
Autoregressive Model Beats Diffusion
StarVector is a foundation model for SVG generation
General-purpose image editing model that delivers high-fidelity
Diffusion Transformer with Fine-Grained Chinese Understanding
Phi-3.5 for Mac: Locally-run Vision and Language Models
Python CLI utility and library for manipulating SQLite databases
Python crawler for collecting and downloading Sina Weibo user data
"Big Model" trains a visual multimodal VLM with 26M parameters
AI-Powered Personalized Learning Assistant
Community-maintained approach to improving access to GitHub services
A system for agentic LLM-powered data processing and ETL