OCR model for complex documents with layout-aware structured outputs
Document (PDF, Word, PPTX ...) extraction and parsing API
Open speech-to-speech models and pipelines by Hugging Face
Visual Causal Flow
Contexts Optical Compression
MCScanX: Multiple Collinearity Scan toolkit X version
PDFCraft is a free, privacy-focused PDF toolkit
Use Microsoft Edge's online text-to-speech service from Python
Automated YouTube Shorts pipeline
Clean network diagrams. One-time setup, zero upkeep
AI tool for automatic batch short video creation and editing
A TTS that fits in your CPU (and pocket)
Self-hosted collection of powerful web-based tools for everyday tasks
Modular AI image and video generation web UI with extensible tools
Skills shared by Baoyu for improving daily work efficiency with Claude
Integrating large models into scenario-based applications
Qwen3-ASR is an open-source series of ASR models
Automated translation solution for visual novels
95% token savings. 155x faster queries. 16 languages
AI-assisted storyboard and video generation tool
End-to-end speech processing toolkit
Audiocraft is a library for audio processing and generation
Semantic search and document parsing tools for the command line
Python library for scraping and analyzing online news articles easily
General-purpose image editing model that delivers high-fidelity