A minimal implementation of diffusion models of text: learns a diffusion model of a given text corpus, allowing to generate text samples from the learned model. The main idea was to retain just enough code to allow training a simple diffusion model and generating samples, remove image-related terms, and make it easier to use. To train a model, run scripts/train.sh. By default, this will train a model on the simple corpus. However, you can change this to any text file using the --train_data argument. Note that you may have to increase the sequence length (--seq_len) if your corpus is longer than the simple corpus. The other default arguments are set to match the best setting I found for the simple corpus.
Features
- Training from scratch on the greetings dataset
- Experiments with using pre-trained models and embeddings
- Controllable Generation
- A minimal implementation of diffusion models of text
- Generate text samples from the learned model
- Opportunities for further minimization
License
MIT LicenseFollow Minimal text diffusion
Other Useful Business Software
$300 Free Credits for Your Google Cloud Projects
Launch your next project with $300 in free Google Cloud credits—no strings attached. Test, build, and deploy without risk. Use your credits across the entire Google Cloud platform to find what works best for your needs. After your credits are used, continue with always-free tier services. Only pay when you're ready to scale. Sign up in minutes and start exploring.
Rate This Project
Login To Rate This Project
User Reviews
Be the first to post a review of Minimal text diffusion!