NLTK Source
Curated list of datasets and tools for post-training
Information hub for our project training the largest possible LLMs
Language modeling in a sentence representation space
Your Fully-Automated Personal AI Assistant
A fast TTS architecture with conditional flow matching
A New Axis of Sparsity for Large Language Models
Code release for Cut and Learn for Unsupervised Object Detection
Traditional Mandarin LLMs for Taiwan
Chinese XLNet pre-trained model
SOTA discrete acoustic codec models with 40/75 tokens per second
Chinese Llama-3 LLMs developed from Meta Llama 3
Quick guide, especially for trending instruction finetuning datasets
Unofficial Parallel WaveGAN
Resources, corpora, and tools for Chinese natural language processing
Code release for "Detecting Twenty-thousand Classes
Editing large language models within 10 seconds
Repo for external large-scale work
RAG on Paul Graham's essays
A list of accessible speech corpora for ASR, TTS
Classical piano MIDI dataset
Original PyTorch implementation of Cross-lingual Language Model
A Chinese information extraction tool
DeepMind's Tacotron-2 TensorFlow implementation
THIS PROJECT MIGRATED TO https://gitlab.com/mwetoolkit/mwetoolkit3/