... as the denoising network) To train DALLE-2 is a 3 step process, with the training of CLIP being the most important. To train CLIP, you can either use x-clip package, or join the LAION discord, where a lot of replication efforts are already underway. Then, you will need to train the decoder, which learns to generate images based on the image embedding coming from the trained CLIP.