
Commit

update website
baegwangbin committed Mar 2, 2024
1 parent 92f29aa commit 6b1199f
Showing 2 changed files with 20 additions and 20 deletions.
Binary file added docs/img/fig_comparison.png
docs/index.html — 40 changes: 20 additions & 20 deletions
@@ -24,7 +24,7 @@
<nav class="navbar is-light" role="navigation" aria-label="main navigation">
<div class="container is-max-desktop">
<div class="navbar-brand">
<a class="navbar-item" href="https://www.imperial.ac.uk/dyson-robotics-lab/">
<a class="navbar-item" href="https://www.imperial.ac.uk/dyson-robotics-lab/" target="_blank" rel="noopener noreferrer">
<img src="img/logo/logo-dyson.png" alt="Dyson Robotics Lab" style="height: 2.0rem;">
</a>
<a role="button" class="navbar-burger" aria-label="menu" aria-expanded="false" data-target="navbarBasicExample">
@@ -35,12 +35,12 @@
</div>
<div id="navbarBasicExample" class="navbar-menu">
<div class="navbar-start">
<a class="navbar-item" href="https://www.imperial.ac.uk/">
<a class="navbar-item" href="https://www.imperial.ac.uk/" target="_blank" rel="noopener noreferrer">
<img src="img/logo/logo-imperial.png" alt="Imperial College London" style="height: 1.0rem;">
</a>
</div>
<div class="navbar-end">
<a class="navbar-item" href="https://cvpr.thecvf.com/Conferences/2024">
<a class="navbar-item" href="https://cvpr.thecvf.com/Conferences/2024" target="_blank" rel="noopener noreferrer">
<img src="img/logo/logo-cvpr.png" alt="CVPR 2024" style="height: 2.0rem;">
</a>
</div>
@@ -60,26 +60,26 @@ <h1 class="title is-2 is-size-3-mobile is-spaced has-text-centered">
</p>
<p class="subtitle is-6 has-text-centered authors mt-5" style="line-height: 1.5;">
<span>
<a href="https://www.baegwangbin.com">Gwangbin&nbsp;Bae</a>
<a href="https://www.baegwangbin.com" target="_blank" rel="noopener noreferrer">Gwangbin&nbsp;Bae</a>
</span>
<span>
<a href="https://www.doc.ic.ac.uk/~ajd/">Andrew&nbsp;J.&nbsp;Davison</a>
<a href="https://www.doc.ic.ac.uk/~ajd/" target="_blank" rel="noopener noreferrer">Andrew&nbsp;J.&nbsp;Davison</a>
</span>
</p>
<p class="subtitle is-6 has-text-centered authors mt-5" style="line-height: 1.5;">
Dyson Robotics Lab, Imperial College London
</p>
</div>
<div class="container is-max-desktop has-text-centered mt-5">
<a href="https://github.com/baegwangbin/DSINE/raw/main/paper.pdf" class="button is-rounded is-link is-light mr-2">
<a href="https://github.com/baegwangbin/DSINE/raw/main/paper.pdf" class="button is-rounded is-link is-light mr-2" target="_blank" rel="noopener noreferrer">
<span class="icon"><i class="fas fa-file-pdf"></i></span>
<span>Paper</span>
</a>
<a href="https://arxiv.org/" class="button is-rounded is-link is-light mr-2">
<a href="https://arxiv.org/" class="button is-rounded is-link is-light mr-2" target="_blank" rel="noopener noreferrer">
<span class="icon"><i class="ai ai-arxiv"></i></span>
<span>arXiv (coming soon)</span>
</a>
<a href="https://github.com/baegwangbin/DSINE" class="button is-rounded is-link is-light">
<a href="https://github.com/baegwangbin/DSINE" class="button is-rounded is-link is-light" target="_blank" rel="noopener noreferrer">
<span class="icon"><i class="fab fa-github"></i></span>
<span>Code</span>
</a>
@@ -96,7 +96,7 @@ <h1 class="title is-4">
<ul>
<li>We discuss the inductive biases needed for surface normal estimation and propose to (1) utilize the <b>per-pixel ray direction</b> and (2) estimate the surface normals by <b>learning the relative rotation between nearby pixels</b>.</li>
<li>With the right inductive biases, models can be trained with far fewer images. Our model is trained only on <b>160K images, for 12 hours, on a single NVIDIA 4090 GPU</b>.
- In comparison, <a href="https://docs.omnidata.vision/pretrained.html#Pretrained-Models/">Omnidata V2</a> (which is based on the <a href="https://github.com/isl-org/DPT">DPT</a> architecture) is trained on <b>12M images, for 2 weeks, on four NVIDIA V100 GPUs</b>.
+ In comparison, <a href="https://docs.omnidata.vision/pretrained.html#Pretrained-Models/" target="_blank" rel="noopener noreferrer">Omnidata V2</a> (which is based on the <a href="https://github.com/isl-org/DPT" target="_blank" rel="noopener noreferrer">DPT</a> architecture) is trained on <b>12M images, for 2 weeks, on four NVIDIA V100 GPUs</b>.
</li>
</ul>
</p>
@@ -110,7 +110,7 @@ <h1 class="title is-4">
Demo
</h1>
<div class="content has-text-justified-desktop">
- <p>The input videos are from <a href="https://davischallenge.org/">DAVIS</a>. The predictions are made per-frame.</p>
+ <p>The input videos are from <a href="https://davischallenge.org/" target="_blank" rel="noopener noreferrer">DAVIS</a>. The predictions are made per-frame.</p>
</div>
<iframe style="display: block; margin: auto;" width="768" height="432" src="https://www.youtube.com/embed/8_tCSWVK4VM" frameborder="0" allowfullscreen></iframe>
</div>
@@ -122,13 +122,13 @@ <h1 class="title is-4">
</h1>
<div class="content has-text-justified-desktop">
<p>In recent years, the usefulness of surface normal estimation methods has been demonstrated in various areas of computer vision, including
<a href="https://github.com/lllyasviel/ControlNet-v1-1-nightly">image generation</a>,
<a href="https://sites.google.com/view/monograsp">object grasping</a>,
<a href="https://shikun.io/projects/prismer">multi-task learning</a>,
<a href="https://baegwangbin.github.io/IronDepth/">depth estimation</a>,
<a href="https://nicer-slam.github.io/">simultaneous localization and mapping</a>,
<a href="https://www.ollieboyne.com/FOUND/">human body shape estimation</a>,
and <a href="https://florianlanger.github.io/SPARC/">CAD model alignment</a>.
<a href="https://github.com/lllyasviel/ControlNet-v1-1-nightly" target="_blank" rel="noopener noreferrer">image generation</a>,
<a href="https://sites.google.com/view/monograsp" target="_blank" rel="noopener noreferrer">object grasping</a>,
<a href="https://shikun.io/projects/prismer" target="_blank" rel="noopener noreferrer">multi-task learning</a>,
<a href="https://baegwangbin.github.io/IronDepth/" target="_blank" rel="noopener noreferrer">depth estimation</a>,
<a href="https://nicer-slam.github.io/" target="_blank" rel="noopener noreferrer">simultaneous localization and mapping</a>,
<a href="https://www.ollieboyne.com/FOUND/" target="_blank" rel="noopener noreferrer">human body shape estimation</a>,
and <a href="https://florianlanger.github.io/SPARC/" target="_blank" rel="noopener noreferrer">CAD model alignment</a>.
However, despite the growing demand for accurate surface normal estimation models, there has been little discussion on the right inductive biases needed for the task.</p>
</div>
</div>
@@ -152,7 +152,7 @@ <h1 class="title is-4">
</center>
<div class="content has-text-justified-desktop">
<p>
- It also gives us the range of normals that would be <b style="color:blue">visible</b>, effectively <b>halving the output space</b>. We incorporate such a bias by introducing a <b>Ray ReLU</b> activation.
+ It also gives us the range of normals that would be <b>visible</b>, effectively <b>halving the output space</b>. We incorporate such a bias by introducing a <b>Ray ReLU</b> activation.
</p>
<p>
We also propose to recast surface normal estimation as <b>rotation estimation</b>. At first, this may sound like we are over-complicating the task.
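A worked reading of the visibility claim in the hunk above (an illustration only, not necessarily the paper's exact Ray ReLU formulation): for a pixel whose viewing ray points from the camera into the scene along the unit vector r, a surface visible at that pixel must face the camera, so its unit normal n satisfies

$$ \mathbf{n} \cdot \mathbf{r} < 0 $$

which restricts the valid outputs to an open hemisphere of the unit sphere — the sense in which knowing the per-pixel ray direction halves the output space.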
@@ -200,8 +200,8 @@ <h1 class="title is-4">
Results
</h1>
<div class="content has-text-justified-desktop">
- <p>Here we provide a comparison between <a href="https://docs.omnidata.vision/pretrained.html#Pretrained-Models/">Omnidata V2</a> (left) and ours (right).
- The input images (shown at the top-left corner) are in-the-wild images from the <a href="https://oasis.cs.princeton.edu/">OASIS</a> dataset.
+ <p>Here we provide a comparison between <a href="https://docs.omnidata.vision/pretrained.html#Pretrained-Models/" target="_blank" rel="noopener noreferrer">Omnidata V2</a> (left) and ours (right).
+ The input images (shown at the top-left corner) are in-the-wild images from the <a href="https://oasis.cs.princeton.edu/" target="_blank" rel="noopener noreferrer">OASIS</a> dataset.
Despite being trained on significantly fewer images, our model shows stronger generalization capability.
</p>
</div>
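The edit applied throughout docs/index.html is a single link-hardening pattern: each external <a> gains target="_blank" plus rel="noopener noreferrer", so it opens in a new tab while the opened page loses access to window.opener and no Referer header is sent. A minimal before/after sketch of the pattern (the URL and link text are placeholders, not lines from this commit):

<!-- before: the link opens in the same tab -->
<a href="https://example.org/">External link</a>

<!-- after: the link opens in a new tab; rel="noopener" blocks window.opener access
     (reverse tabnabbing) and rel="noreferrer" additionally suppresses the Referer header -->
<a href="https://example.org/" target="_blank" rel="noopener noreferrer">External link</a>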
