VQGAN-CLIP has been in vogue for generating art using deep learning. Searching the r/deepdream subreddit for VQGAN-CLIP yields quite a number of results. Basically, VQGAN can generate high-fidelity images, while CLIP scores how well an image matches a text caption. Combined, VQGAN-CLIP takes a text prompt from human input and iterates to generate images that fit the prompt.

Thanks to the generosity of creators sharing notebooks on Google Colab, the VQGAN-CLIP technique has seen widespread circulation. However, for regular usage across multiple sessions, I prefer a local setup that can be started up rapidly, hence this simple Streamlit app for generating VQGAN-CLIP images in a local environment. Be advised that you need a beefy GPU with lots of VRAM to generate images large enough to be interesting (hello, Quadro owners!).
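Since the achievable image size is bounded by GPU memory, a quick way to see what you are working with before launching the apps is a check like the one below. This is a minimal sketch, assuming PyTorch with CUDA is installed; the 8 GiB threshold is only an illustrative ballpark, not a hard requirement of this repo.

```python
# Quick check that a CUDA GPU is visible and how much VRAM it has.
# The 8 GiB threshold below is illustrative, not a requirement of the app.
import torch

if not torch.cuda.is_available():
    print("No CUDA GPU detected; output sizes will be severely limited on CPU.")
else:
    props = torch.cuda.get_device_properties(0)
    vram_gb = props.total_memory / 1024**3
    print(f"GPU: {props.name}, VRAM: {vram_gb:.1f} GiB")
    if vram_gb < 8:
        print("Under ~8 GiB of VRAM; expect to be limited to small output sizes.")
```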
## Features
- VQGAN-CLIP: `streamlit run app.py` launches the web app on `localhost:8501` if the port is available
- CLIP-guided diffusion: `streamlit run diffusion_app.py` launches the web app on `localhost:8501` if the port is available
- Gallery viewer: `python gallery.py` launches a gallery viewer on `localhost:5000`
- In the web app, select settings on the sidebar, key in the text prompt, and click Run to generate images using VQGAN-CLIP (see the sidebar-plus-prompt sketch after this list)
- CLIP-guided diffusion works the same way in its own web app: pick settings, enter a prompt, and run
- Generated output can be browsed with the gallery viewer (a minimal gallery sketch also follows this list)
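To make the workflow concrete, here is a minimal sketch of the sidebar-plus-prompt interaction pattern described above. It is not the actual `app.py`; the widget names and default values are illustrative assumptions, and the real app wires these values into the VQGAN-CLIP generation loop.

```python
# Minimal sketch of a Streamlit sidebar-plus-prompt UI.
# Widget names and defaults are illustrative, not the app's actual parameters.
import streamlit as st

st.sidebar.title("Settings")
width = st.sidebar.number_input("Width", value=256, step=64)    # output width in px
height = st.sidebar.number_input("Height", value=256, step=64)  # output height in px
steps = st.sidebar.number_input("Iterations", value=300, step=50)

prompt = st.text_input("Text prompt", "a watercolor painting of a city at dusk")

if st.button("Run"):
    # The real app would run the VQGAN-CLIP loop here instead of printing.
    st.write(f"Would generate a {width}x{height} image for {prompt!r} over {steps} steps")
```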
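And here is a hypothetical sketch of what a gallery viewer like `gallery.py` could look like: a tiny Flask app that lists generated images on `localhost:5000`. The output directory name and routes are assumptions for illustration, not the repo's actual layout.

```python
# Hypothetical gallery viewer: serves images from an output directory on port 5000.
# The "output" directory name and the routes are assumptions, not the repo's layout.
from pathlib import Path
from flask import Flask, send_from_directory

OUTPUT_DIR = Path("output")  # assumed location of generated images
app = Flask(__name__)

@app.route("/")
def index():
    # Render a bare-bones page of thumbnails for every PNG in the output directory.
    imgs = sorted(OUTPUT_DIR.glob("*.png"))
    tags = "".join(f'<img src="/images/{p.name}" width="256">' for p in imgs)
    return f"<html><body>{tags}</body></html>"

@app.route("/images/<path:name>")
def image(name):
    return send_from_directory(OUTPUT_DIR, name)

if __name__ == "__main__":
    app.run(port=5000)
```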