A minimal implementation of diffusion models of text: learns a diffusion model of a given text corpus, allowing to generate text samples from the learned model. The main idea was to retain just enough code to allow training a simple diffusion model and generating samples, remove image-related terms, and make it easier to use. To train a model, run scripts/train.sh. By default, this will train a model on the simple corpus. However, you can change this to any text file using the --train_data argument. Note that you may have to increase the sequence length (--seq_len) if your corpus is longer than the simple corpus. The other default arguments are set to match the best setting I found for the simple corpus.

Features

  • Training from scratch on the greetings dataset
  • Experiments with using pre-trained models and embeddings
  • Controllable Generation
  • A minimal implementation of diffusion models of text
  • Generate text samples from the learned model
  • Opportunities for further minimization

Project Samples

Project Activity

See All Activity >

License

MIT License

Follow Minimal text diffusion

Minimal text diffusion Web Site

Other Useful Business Software
$300 Free Credits for Your Google Cloud Projects Icon
$300 Free Credits for Your Google Cloud Projects

Start building on Google Cloud with $300 in free credits. No commitment, no credit card required until you're ready to scale.

Launch your next project with $300 in free Google Cloud credits—no strings attached. Test, build, and deploy without risk. Use your credits across the entire Google Cloud platform to find what works best for your needs. After your credits are used, continue with always-free tier services. Only pay when you're ready to scale. Sign up in minutes and start exploring.
Start Free Trial
Rate This Project
Login To Rate This Project

User Reviews

Be the first to post a review of Minimal text diffusion!

Additional Project Details

Programming Language

Python

Related Categories

Python AI Text Generators, Python Generative AI

Registered

2023-03-23