splatter360

Official implementation of Splatter-360: Generalizable 360 Gaussian Splatting for Wide-baseline Panoramic Images

News

2024.12.15: We uploaded the preprocessing code for the HM3D and Replica datasets.

Citation

If you find this repo useful, please give it a star.

If you use our code, please cite the following BibTeX entry:

@article{chen2024splatter,
  title={Splatter-360: Generalizable 360$^{\circ}$ Gaussian Splatting for Wide-baseline Panoramic Images},
  author={Chen, Zheng and Wu, Chenming and Shen, Zhelun and Zhao, Chen and Ye, Weicai and Feng, Haocheng and Ding, Errui and Zhang, Song-Hai},
  journal={arXiv preprint arXiv:2412.06250},
  year={2024}
}

Installation

To get started, create a conda virtual environment using Python 3.10+ and install the requirements:

conda create -n splat360 python=3.10
conda activate splat360
pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt

Acquiring Datasets

Replica: Download replica_dataset.zip (RGB and depth files) and replica_dataset_pt.zip (scene indices) from BaiduNetDisk or OneDrive and unzip both into the same directory. Then set dataset.roots and dataset.rgb_roots in config/experiment/replica.yaml to match your storage directory.
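
The unpacking step can be sketched as below. This is only an illustration: the function name `unpack_replica` and the destination path are hypothetical, and it assumes `unzip` is installed and both archives sit in the current directory. It refuses to extract anything unless both archives are present, so a partial download does not leave a half-populated dataset directory.

```shell
# Hypothetical helper: extract both Replica archives into one directory.
# Usage: unpack_replica /path/to/replica
unpack_replica() {
    dest=$1
    # Check that both archives exist before touching the destination.
    for archive in replica_dataset.zip replica_dataset_pt.zip; do
        if [ ! -f "$archive" ]; then
            echo "missing archive: $archive" >&2
            return 1
        fi
    done
    mkdir -p "$dest" || return 1
    # -q: quiet, -o: overwrite existing files without prompting.
    for archive in replica_dataset.zip replica_dataset_pt.zip; do
        unzip -q -o "$archive" -d "$dest" || return 1
    done
}
```

After extraction, the directory passed to `unpack_replica` is the one to reference from config/experiment/replica.yaml.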

HM3D: Because the HM3D training set is too large to host (about 2-3 TB), we provide the preprocessing code used to build our training and test sets so that you can generate the training set yourself. (We encourage future researchers to refine our preprocessing code to reduce the storage footprint.)

We will upload our HM3D test set.

Running the Code

Evaluation

To render novel views and compute evaluation metrics from a pretrained model, run the following:

# eval on HM3D
output_dir="./outputs/splat360_log_depth_near0.1-100k/"
checkpoint_path="./checkpoints/hm3d.ckpt"
CUDA_VISIBLE_DEVICES=0 python -m src.main \
    +experiment=hm3d \
    model.encoder.shim_patch_size=8 \
    model.encoder.downscale_factor=8 \
    model.encoder.depth_sampling_type="log_depth" \
    output_dir=$output_dir \
    dataset.near=0.1 \
    mode="test" \
    dataset/view_sampler=evaluation \
    checkpointing.load=$checkpoint_path \
    dataset.view_sampler.index_path="assets/evaluation_index_hm3d.json" \
    test.eval_depth=true

  • The rendered novel views will be stored under outputs/test.

To render videos from a pretrained model, run the following:

# HM3D render video
output_dir="./outputs/splat360_log_depth_near0.1-100k/"
checkpoint_path="./checkpoints/hm3d.ckpt"
CUDA_VISIBLE_DEVICES=0 python -m src.main \
    +experiment=hm3d \
    model.encoder.shim_patch_size=8 \
    model.encoder.downscale_factor=8 \
    model.encoder.depth_sampling_type="log_depth" \
    output_dir=$output_dir \
    dataset.near=0.1 \
    mode="test" \
    dataset/view_sampler=evaluation \
    checkpointing.load=$checkpoint_path \
    dataset.view_sampler.index_path="assets/evaluation_index_hm3d_video.json" \
    test.save_video=true \
    test.save_image=false \
    test.compute_scores=false \
    test.eval_depth=true

Training



# download the pretrained backbone weights from UniMatch and save them to 'checkpoints/'
wget 'https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmdepth-scale1-resumeflowthings-scannet-5d9d7964.pth' -P checkpoints

# download the pretrained Depth-Anything-V2-Small weights and save them to 'checkpoints/'
wget https://huggingface.co/depth-anything/Depth-Anything-V2-Small/resolve/main/depth_anything_v2_vits.pth -P checkpoints


# Our models are trained on 8 V100 (32GB) GPUs.
max_steps=100000
output_dir="./outputs/splat360_log_depth_near0.1-100k/"
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m src.main \
     +experiment=hm3d data_loader.train.batch_size=1 \
     model.encoder.shim_patch_size=8 \
     model.encoder.downscale_factor=8 \
     trainer.max_steps=$max_steps \
     model.encoder.depth_sampling_type="log_depth" \
     output_dir=$output_dir \
     dataset.near=0.1
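
Before launching a long training run, it can help to confirm that both pretrained weight files actually landed in checkpoints/. The sketch below is a hypothetical pre-flight check, not part of the repository: the function name `check_pretrained_weights` is made up for illustration, and the file names are simply the basenames of the two download URLs above, which is what `wget -P checkpoints` writes.

```shell
# Hypothetical pre-flight check: verify both downloaded weight files exist
# and are non-empty. Usage: check_pretrained_weights [checkpoint_dir]
check_pretrained_weights() {
    ckpt_dir=${1:-checkpoints}
    missing=0
    for name in \
        gmdepth-scale1-resumeflowthings-scannet-5d9d7964.pth \
        depth_anything_v2_vits.pth
    do
        # -s is true only if the file exists and has a size greater than zero,
        # which also catches downloads that were interrupted at zero bytes.
        if [ ! -s "$ckpt_dir/$name" ]; then
            echo "missing or empty: $ckpt_dir/$name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Running `check_pretrained_weights checkpoints` returns a non-zero status and lists the offending files if either download is missing.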

Cross-Dataset Generalization

We use the default model trained on HM3D for cross-dataset evaluation. To evaluate it on another dataset, e.g., Replica, run the following command:

# eval on Replica
output_dir="./outputs/splat360_log_depth_near0.1-100k/"
checkpoint_path="./checkpoints/hm3d.ckpt"
CUDA_VISIBLE_DEVICES=0 python -m src.main \
    +experiment=replica \
    model.encoder.shim_patch_size=8 \
    model.encoder.downscale_factor=8 \
    model.encoder.depth_sampling_type="log_depth" \
    output_dir=$output_dir \
    dataset.near=0.1 \
    mode="test" \
    dataset/view_sampler=evaluation \
    checkpointing.load=$checkpoint_path \
    dataset.view_sampler.index_path="assets/evaluation_index_replica.json" \
    test.eval_depth=true
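
The HM3D and Replica evaluation commands differ only in the experiment name and the index file, so the shared overrides can be generated in one place. The following is a minimal sketch, assuming none of the values contain spaces; the helper name `splat360_test_args` is hypothetical and not part of the repository. It only prints the overrides, one per line, and leaves launching to the caller.

```shell
# Hypothetical helper: print the overrides shared by the evaluation commands.
# Usage: splat360_test_args <experiment> <index_path>
splat360_test_args() {
    experiment=$1
    index_path=$2
    printf '%s\n' \
        "+experiment=$experiment" \
        "model.encoder.shim_patch_size=8" \
        "model.encoder.downscale_factor=8" \
        "model.encoder.depth_sampling_type=log_depth" \
        "dataset.near=0.1" \
        "mode=test" \
        "dataset/view_sampler=evaluation" \
        "dataset.view_sampler.index_path=$index_path" \
        "test.eval_depth=true"
}

# Example invocation (shown as a comment so this block has no side effects):
#   CUDA_VISIBLE_DEVICES=0 python -m src.main \
#       $(splat360_test_args replica assets/evaluation_index_replica.json) \
#       output_dir="$output_dir" checkpointing.load="$checkpoint_path"
```

The command substitution is deliberately left unquoted so the shell splits the printed lines into separate arguments. Keeping the overrides in one function also avoids the line-continuation mistakes that are easy to make when copying long multi-line commands.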

Acknowledgements

The project is largely based on pixelSplat, MVSplat, and PanoGRF and has incorporated numerous code snippets from UniMatch. Many thanks to these projects for their excellent contributions!