Multilingual Automatic Speech Recognition with word-level timestamps
A PyTorch-based Speech Toolkit
Formula recognition based on LaTeX-OCR and ONNXRuntime
A full spaCy pipeline and models for scientific/biomedical documents
Open Source Computer Vision Library
Voice Recognition to Text Tool
Replace OpenAI GPT with another LLM in your app
Training data (data labeling, annotation, workflow) for all data types
Crowdsourcing platform for full text transcription and tagging
Book_4_Matrix Power | The Iris Book: From Addition, Subtraction
Han Language Processing
Toolkit for conversational AI
CLI tool to extract (meta)data from PDF and manipulate PDF files
Framework for building real-time voice and multimodal AI agents
Open source AI VTuber platform with voice chat and Live2D avatars
OCRmyPDF adds an OCR text layer to scanned PDF files
Accurate × Fast × Comprehensive
Enhances Tesseract OCR output using LLMs (local or API)
Recognition and resolution of numbers, units, date/time, etc.
A high-quality tool for converting PDF to Markdown and JSON
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX
Open speech-to-speech models and pipelines by Hugging Face
A proof-of-concept Jupyter extension which converts English queries
2D and 3D face alignment library built using PyTorch
Semantic search and workflows for medical/scientific papers