Open-source NLP guide with models, methods, and real use cases
A suite of advanced multi-modal LLMs
Open Source Speech Language Model
Integrating large models into scenario-based applications
SQL-Driven RAG Engine
AI-assisted storyboard and video generation tool
Framework for building real-time voice and multimodal AI agents
Open-source multi-speaker long-form text-to-speech model
Towards Human-Sounding Speech
Knowledge Graph Generation from Any Text
The Python library for real-time communication
Run local LLMs like Llama, DeepSeek, Kokoro, etc. inside your browser
Semantic search and document parsing tools for the command line
Automated translation solution for visual novels
Web-based tool that converts GitHub repository contents
Build Vision Agents quickly with any model or video provider
Improve your resumes with Resume Matcher
Fast multimodal LLM for real-time voice interaction and AI apps
Diffusion Transformer with Fine-Grained Chinese Understanding
Large language model and vision-language model based on linear attention
Visual Causal Flow
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
AI tool that turns Hacker News posts into daily podcast updates
AI tool for automatic batch short video creation and editing
Running large language models on a single GPU