Open-source no-code system for text annotation and building of text
OCR software, free and offline
Official inference repo for FLUX.1 models
Code for running inference and finetuning with the SAM 3 model
CLIP: predict the most relevant text snippet given an image
Spark-TTS Inference Code
Wan2.2: Open and Advanced Large-Scale Video Generative Model
An Open Source text-to-speech system built by inverting Whisper
A simple, high-quality voice conversion tool focused on ease of use
Wan2.1: Open and Advanced Large-Scale Video Generative Model
SOTA Open Source TTS
High-Resolution Image Synthesis with Latent Diffusion Models
A Powerful Native Multimodal Model for Image Generation
Library for OCR-related tasks powered by Deep Learning
Official inference repo for FLUX.2 models
Code for the paper "Evaluating Large Language Models Trained on Code"
Offline text-to-speech synthesis for Python
NLP Cloud serves high-performance pre-trained or custom models for NER
Ready-to-use OCR with 80+ supported languages
Offline inference engine for art, real-time voice conversations
Designed for text embedding and ranking tasks
Audiocraft is a library for audio processing and generation
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Sample code and notebooks for Generative AI on Google Cloud
Collection of Gemma 3 variants that are trained for performance