Make bilingual EPUB books using AI translation
OCR model for complex documents with layout-aware structured outputs
Framework for building real-time voice and multimodal AI agents
The most powerful local music generation model
Personal mini-web in text
The most accurate natural language detection library for Python
Toolkit for conversational AI
Using AI models to automatically provide commentary and edit videos
Capable of understanding text, audio, vision, and video
Open source machine learning framework to automate text conversations
An easy-to-use backup tool for GNU/Linux using rsync as the back end
High-Resolution Image Synthesis with Latent Diffusion Models
Deep Research framework, combining language models with tools
Context-aware desktop AI assistant that understands screen content
StreamSpeech is a seamless model for offline speech recognition
A Web UI for easy subtitle generation using the Whisper model
A Multi-Modal World Model for Reconstructing, Generating, and Simulating
End-to-end speech processing toolkit
A Unified Framework for Text-to-3D and Image-to-3D Generation
A very simple framework for state-of-the-art NLP
Ark pixel font - Open source Pan-CJK pixel font
A high-quality PDF to Markdown tool based on a large language model
Long-form streaming TTS system for multi-speaker dialogue generation
Stable Diffusion built into Blender
Turn words into chords