Paper: HawkI: Homography & Mutual Information Guidance for 3D-free Single Image to Aerial View (arXiv March 2024)

Please cite our paper if you find it useful.

@article{kothandaraman2023aerialbooth,
  title={AerialBooth: Mutual Information Guidance for Text Controlled Aerial View Synthesis from a Single Image},
  author={Kothandaraman, Divya and Zhou, Tianyi and Lin, Ming and Manocha, Dinesh},
  journal={arXiv preprint arXiv:2311.15478},
  year={2023}
}

Using the code

Datasets: The AerialBooth-Real and AerialBooth-Syn datasets are in the ./dataset/ folder.
Models: The PyTorch code for the models is in the ./models/ folder.
  models/aerialbooth - Model definition for AerialBooth
  models/aerialbooth_viewarg - Provides support for generating any arbitrary text-controlled view
      models/mutual_information - Functions for computing mutual information and earth mover's distance
  models/aerialdiffusion_lora - Model definition for Aerial Diffusion LoRA
  models/dreambooth_lora - Model definition for DreamBooth LoRA
  models/imagic - Model definition for Imagic LoRA
Training scripts:
  Use train_aerialbooth_batch.py to run the optimization and generate the aerial-view image for a given input image.
  Use train_aerialbooth_view.py to run the optimization and generate arbitrary text-controlled views of a given input image.
Computing the quantitative metrics:
  Use eval_metrics_best_batch to compute the CLIP, SSCD, and DINO scores of the generated images.
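
For readers unfamiliar with the two quantities that models/mutual_information is described as computing, the following is a minimal NumPy sketch of a histogram-based mutual information estimate and a 1D earth mover's distance between two images. It is not the code from this repository: the function names, the bins parameter, and the assumption that pixel values lie in [0, 1] are all illustrative choices.

```python
import numpy as np


def mutual_information(a, b, bins=32):
    """Histogram-based mutual information estimate (in nats).

    Hypothetical helper, not the repository's implementation.
    `a` and `b` must contain the same number of elements.
    """
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    if a.shape != b.shape:
        raise ValueError("inputs must have the same number of elements")

    # Joint histogram, normalized into a joint probability table.
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    pxy = joint / joint.sum()

    # Marginals, kept 2D so their product has the same shape as pxy.
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)

    # Sum only over non-empty cells to avoid log(0).
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


def earth_movers_distance(a, b, bins=32, value_range=(0.0, 1.0)):
    """1D earth mover's distance between the value histograms of a and b.

    Assumes values fall inside `value_range`. For 1D histograms on shared,
    equal-width bins, the distance is the summed absolute difference of the
    two cumulative distributions, scaled by the bin width.
    """
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    ha, edges = np.histogram(a, bins=bins, range=value_range)
    hb, _ = np.histogram(b, bins=bins, range=value_range)
    pa = ha / ha.sum()
    pb = hb / hb.sum()
    width = edges[1] - edges[0]
    return float(np.sum(np.abs(np.cumsum(pa) - np.cumsum(pb))) * width)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((64, 64))    # stand-in for a grayscale image in [0, 1]
    other = rng.random((64, 64))  # an unrelated image
    print("MI(img, img):   ", mutual_information(img, img))
    print("MI(img, other): ", mutual_information(img, other))
    print("EMD(img, other):", earth_movers_distance(img, other))
```

An image compared with itself should give a much higher mutual information than two unrelated images, and an earth mover's distance of zero. The histogram estimator is not differentiable, so a guidance term used during diffusion sampling would need a different formulation than this sketch.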

Dependencies

torch
cv2
diffusers
numpy
scipy
accelerate
packaging
transformers
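
Before running the training scripts, it can help to confirm that the modules listed above are importable. The snippet below is a small sketch using only the standard library; the helper name missing_modules is hypothetical and not part of this repository. Note that the cv2 module is distributed on PyPI as the opencv-python package.

```python
from importlib.util import find_spec

# Import names taken from the dependency list above.
REQUIRED = [
    "torch",
    "cv2",
    "diffusers",
    "numpy",
    "scipy",
    "accelerate",
    "packaging",
    "transformers",
]


def missing_modules(names=REQUIRED):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

The check only confirms that each module can be located; it does not verify versions, which the README does not specify.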

Method

Acknowledgements

This codebase borrows heavily from https://github.com/huggingface/diffusers/blob/main/examples/community/imagic_stable_diffusion.py.