Aphantasia
CLIP + FFT/DWT/RGB = text to image/video
... (including multi-language from SBERT), continuous mode to process phrase lists (e.g. illustrating lyrics), pan/zoom motion with smooth interpolation, direct RGB pixel optimization (very stable), depth-based 3D look (courtesy of deKxi, based on AdaBins), and complex queries:
text and/or image as main prompts, plus separate text prompts for style and for topics to subtract (avoid). The process can be started or resumed from saved parameters or from an image.
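
As a rough illustration of how main, style, and subtract prompts might be combined into one objective, here is a minimal sketch. It is not taken from the Aphantasia code: the function names `cosine_similarity` and `combined_loss`, the weights, and the plain-list embeddings are all assumptions made for the example. In practice the embeddings would come from a CLIP image and text encoder.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def combined_loss(image_emb, main_emb, style_emb=None, subtract_emb=None,
                  style_weight=1.0, subtract_weight=1.0):
    """Hypothetical objective: lower is better.

    Rewards similarity to the main and style prompts and
    penalizes similarity to the subtract (avoid) prompt.
    """
    loss = -cosine_similarity(image_emb, main_emb)
    if style_emb is not None:
        loss -= style_weight * cosine_similarity(image_emb, style_emb)
    if subtract_emb is not None:
        loss += subtract_weight * cosine_similarity(image_emb, subtract_emb)
    return loss

# Toy 2-D "embeddings" standing in for real CLIP outputs (assumed values).
image = [1.0, 0.0]
print(combined_loss(image, main_emb=[1.0, 0.0], subtract_emb=[0.0, 1.0]))
```

An optimizer would then adjust the image parameters (FFT, DWT, or RGB) to reduce this value; the sign convention here is only one possible choice.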