High-accuracy RAG for answering questions from scientific documents
A simple native web interface that uses ChatTTS to synthesize text
Towards Human-Level Text-to-Speech through Style Diffusion
A Powerful Native Multimodal Model for Image Generation
Python bindings for MuPDF's rendering library
Open source machine learning framework to automate text conversations
Underthesea - Vietnamese NLP Toolkit
An open-source toolkit for monitoring Large Language Models (LLMs)
Video-based AI memory library. Store millions of text chunks in MP4
Tools such as a web browser, computer access, and a code runner for LLMs
Network analysis in Python
CLIP: predict the most relevant text snippet given an image
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Dataset of GPT-2 outputs for research in detection, biases, and more
Knowledge Agents and Management in the Cloud
A fast, powerful, CommonMark compliant, extensible Markdown processor
Label Studio is a multi-type data labeling and annotation tool
Toolkit for conversational AI
Capable of understanding text, audio, vision, and video
JavaScript parser and stringifier for YAML
Controllable and fast Text-to-Speech for over 7000 languages
Implementation of Video Diffusion Models
A nearly-live implementation of OpenAI's Whisper
PDF to Markdown with vision models
Main repository for the Sphinx documentation builder