This project implements a photorealistic text-to-image generation and inpainting system using diffusion-based generative models. The system uses the pretrained OpenAI models GLIDE (Guided Language to Image Diffusion for Generation and Editing) and CLIP (Contrastive Language–Image Pre-training).
Please feel free to contact me at sindhurakshit@yahoo.com for the Colab notebook of this system.
Photorealistic text-to-image generation -
An interactive GUI for inpainting lets you experiment with a number of source images. Changing the mask location, the mask size, and the guiding text adds to the fun and variety.
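
To make the roles of mask location and mask size concrete, here is a minimal sketch of how a rectangular inpainting mask could be built from those two settings. It is not the code from the notebook: the function name `make_rect_mask` and its parameters are hypothetical, and the convention that 1.0 marks pixels to keep and 0.0 marks pixels to regenerate is an assumption that should be checked against the model code actually used.

```python
def make_rect_mask(height, width, top, left, mask_h, mask_w):
    """Build a height x width mask as a nested list of floats.

    Assumed convention (hypothetical, verify against the real pipeline):
    1.0 = keep the source pixel, 0.0 = let the model regenerate it.
    """
    # Clip the rectangle so a mask dragged past the border stays valid.
    row_start = max(0, top)
    row_end = min(height, top + mask_h)
    col_start = max(0, left)
    col_end = min(width, left + mask_w)

    # Start from "keep everything", then zero out the masked rectangle.
    mask = [[1.0] * width for _ in range(height)]
    for r in range(row_start, row_end):
        for c in range(col_start, col_end):
            mask[r][c] = 0.0
    return mask


def masked_fraction(mask):
    """Return the share of pixels the model would be asked to fill in."""
    total = sum(len(row) for row in mask)
    zeros = sum(1 for row in mask for value in row if value == 0.0)
    return zeros / total


if __name__ == "__main__":
    # Example: an 8x8 image with a 3x4 mask whose top-left corner is at (2, 1).
    demo = make_rect_mask(8, 8, top=2, left=1, mask_h=3, mask_w=4)
    for row in demo:
        print(" ".join(str(int(v)) for v in row))
    print("masked fraction:", masked_fraction(demo))
```

Moving `top` and `left` changes where the model repaints, while `mask_h` and `mask_w` control how much of the source image is handed over to the guiding text. In a real pipeline the nested list would typically be converted to a tensor with the same spatial size as the model input.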