Karlo is a text-conditional image generation model based on OpenAI's unCLIP architecture with the improvement over the standard super-resolution model from 64px to 256px, recovering high-frequency details only in the small number of denoising steps.

We train all components from scratch on 115M image-text pairs including COYO-100M, CC3M, and CC12M. In the case of Prior and Decoder, we use ViT-L/14 provided by OpenAI’s CLIP repository. Unlike the original implementation of unCLIP, we replace the trainable transformer in the decoder into the text encoder in ViT-L/14 for efficiency. In the case of the SR module, we first train the model using the DDPM objective in 1M steps, followed by additional 234K steps to fine-tune the additional component.

Project Samples

Project Activity

See All Activity >

License

CreativeML Open RAIL-M

Follow Karlo

Karlo Web Site

Other Useful Business Software
Go From AI Idea to AI App Fast Icon
Go From AI Idea to AI App Fast

One platform to build, fine-tune, and deploy ML models. No MLOps team required.

Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.
Try Free
Rate This Project
Login To Rate This Project

User Reviews

Be the first to post a review of Karlo!

Additional Project Details

Programming Language

Python

Related Categories

Python AI Image Generators

Registered

2023-06-08