Document (PDF, Word, PPTX ...) extraction and parsing API
Mozc - a Japanese Input Method Editor designed for multi-platform use
A robust, efficient, low-latency speech-to-text library
Qwen3-ASR is an open-source series of ASR models
Library for OCR-related tasks powered by Deep Learning
Underthesea - Vietnamese NLP Toolkit
Han Language Processing
Turn colors into words
Open source AI model for generating full songs from lyrics prompts
SOTA Open Source TTS
Autoregressive Model Beats Diffusion
Paste Markdown and AI responses into Word and Excel instantly
A community-supported supercharged version of paperless
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
Easy-to-use and powerful NLP library with Awesome model zoo
A very simple framework for state-of-the-art NLP
Rich is a Python library for rich text and beautiful formatting
Implementation of Make-A-Video, a new SOTA text-to-video generator
Turn words into colors
Open source libraries and APIs to build custom preprocessing pipelines
Models for the spaCy Natural Language Processing (NLP) library
Stanford NLP Python library for many human languages
Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
An open-source end-to-end VLM-based GUI Agent
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD