Build multimodal language agents for fast prototyping and production
borb is a library for reading, creating and manipulating PDF files
Han Language Processing
A library to help you make the most out of your Pixoo 64
LLM
A sound cloning tool with a web interface, using your voice
Qwen3-omni is a natively end-to-end, omni-modal LLM
Create videos with Stable Diffusion
Fast multimodal LLM for real-time voice interaction and AI apps
Open Source Speech Language Model
Multimodal-Driven Architecture for Customized Video Generation
A Web UI for easy subtitle generation using the Whisper model
LaTeX source and supporting code for Think Python, 2nd edition
21 Lessons, Get Started Building with Generative AI
Lightning-fast, on-device TTS, running natively via ONNX
Unified web UI for training and running open models locally
Audiocraft is a library for audio processing and generation
Multimodal AI chat app with dynamic conversation routing
Automated translation solution for visual novels
PersonaPlex code
Collection of Gemma 3 variants that are trained for performance
A modular graph-based Retrieval-Augmented Generation (RAG) system
Controllable & emotion-expressive zero-shot TTS
An easy-to-use backup tool for GNU/Linux using rsync in the back end
Network analysis in Python