Clone a voice in 5 seconds to generate arbitrary speech in real-time
Label Studio is a multi-type data labeling and annotation tool
Library for OCR-related tasks powered by Deep Learning
Audiocraft is a library for audio processing and generation
Persian NLP Toolkit
Web interface for generating images using Stable Diffusion models
Compute distance between sequences
Python bindings for MuPDF's rendering library
Generating Immersive, Explorable, and Interactive 3D Worlds
Dataset of GPT-2 outputs for research in detection, biases, and more
Implementation of Phenaki Video, which uses MaskGIT
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Qwen-Image is a powerful image generation foundation model
Open source no-code system for text annotation and building of text
High accuracy RAG for answering questions from scientific documents
Underthesea - Vietnamese NLP Toolkit
An open-source toolkit for monitoring Large Language Models (LLMs)
Crowdsourcing platform for full text transcription and tagging
CLIP: predict the most relevant text snippet given an image
Toolkit for conversational AI
The most accurate natural language detection library for Python
Stanford NLP Python library for many human languages
MTEB: Massive Text Embedding Benchmark
Math OCR model that outputs LaTeX and markdown
Implementation of Video Diffusion Models