Less Code, Lower Barrier, Faster Deployment
NLTK Source
A full spaCy pipeline and models for scientific/biomedical documents
Language modeling in a sentence representation space
Video-based AI memory library. Store millions of text chunks in MP4
Reading book source
The Classical Language Toolkit
Topic Modelling for Humans
A fast TTS architecture with conditional flow matching
The simplest, fastest repository for training/finetuning models
Your Fully-Automated Personal AI Assistant
A New Axis of Sparsity for Large Language Models
Code release for Cut and Learn for Unsupervised Object Detection
Traditional Mandarin LLMs for Taiwan
Chinese XLNet pre-trained model
Omnilingual ASR: Open-Source Multilingual Speech Recognition
Chinese Llama-3 LLMs developed from Meta Llama 3
SOTA discrete acoustic codec models with 40/75 tokens per second
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
A subtitle generator for Japanese Adult Videos.
Aligns tokens in two versions of a text with differing tokenization.
Unofficial Parallel WaveGAN
Resources, corpora, and tools for Chinese natural language processing
All-in-one text de-duplication
Code release for "Detecting Twenty-thousand Classes