<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="description"
content="Gripper-Aware Grasping: End-Effector Shape Context for Cross-Gripper Generalization">
<meta name="author" content="Alina Sarmiento,
Anthony Simeonov,
Pulkit Agrawal">
<title>Gripper-Aware Grasping: End-Effector Shape Context for Cross-Gripper Generalization</title>
<!-- Bootstrap core CSS -->
<!--link href="bootstrap.min.css" rel="stylesheet"-->
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0/css/bootstrap.min.css"
integrity="sha384-Gn5384xqQ1aoWXA+058RXPxPg6fy4IWvTNh0E263XmFcJlSAwiGgFAW/dAiS6JXm" crossorigin="anonymous">
<!-- Custom styles for this template -->
<link href="offcanvas.css" rel="stylesheet">
</head>
<body>
<div class="jumbotron jumbotron-fluid">
<div class="container"></div>
<h2>Gripper-Aware Grasping:<br> End-Effector Shape Context for Cross-Gripper Generalization</h2>
<h3>IROS IPPC 2023</h3>
<hr>
<p class="authors">
<a href="https://alinasarmiento.github.io/"> Alina Sarmiento</a>,
<a href="https://anthonysimeonov.github.io/"> Anthony Simeonov</a>,
<a href="http://people.csail.mit.edu/pulkitag/"> Pulkit Agrawal</a></br>
Massachusetts Institute of Technology</br>
</p>
<div class="btn-group" role="group" aria-label="Top menu">
<a class="btn btn-primary" href="https://arxiv.org/abs/2307.04751">Paper</a>
<a class="btn btn-primary" href="https://github.com/anthonysimeonov/rpdiff">Code</a>
</div>
</div>
</div>
<div class="container">
<div class="section">
<div class="vcontainer">
<iframe class='video' src="https://www.youtube.com/embed/x9noTl_aqu0" frameborder="0"
allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture"
allowfullscreen></iframe>
</div>
<hr>
<p>
In cluttered, unmodeled environments, many learned manipulation pipelines rely on inherent
knowledge of the robot and end-effector extents to predict or solve for feasible grasp poses
and motion plans. However, these models become specific to a single robot geometry and cannot effectively
generalize to other end-effectors with different feasible grasp distributions.
We present Gripper-Aware GraspNet, a learned pipeline for grasping and manipulating unknown objects in highly
occluded environments, conditioned on gripper geometry.
Our method builds on prior work in learned 6D grasp generation, which was limited to specific
gripper geometries, and can predict grasps that exploit a wide range of gripper extents. We
demonstrate cluttered tabletop picking from a single-view point cloud, showing in simulation that the
predicted grasps make use of the full extents of different end-effectors. We also observe a qualitative
improvement in grasp diversity when using different grippers in the real world.
</p>
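<p>
The sketch below is a minimal illustration of the core idea, not the authors' released code: a grasp-prediction
network that takes an explicit encoding of the gripper's own point cloud as an additional input, so the same
weights can be conditioned on different end-effector geometries. All class and function names are hypothetical.
</p>
<pre><code>
# Minimal sketch (hypothetical names, not the released codebase):
# condition a 6-DoF grasp predictor on an encoding of the gripper geometry.
import torch
import torch.nn as nn


class PointEncoder(nn.Module):
    """Encodes a (B, N, 3) point cloud into a fixed-size feature vector."""

    def __init__(self, out_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
            nn.Linear(128, out_dim),
        )

    def forward(self, points):           # points: (B, N, 3)
        feats = self.mlp(points)         # per-point features: (B, N, out_dim)
        return feats.max(dim=1).values   # permutation-invariant pooling


class GripperConditionedGraspHead(nn.Module):
    """Predicts a grasp (translation + 6D rotation representation) and a
    success score from concatenated scene and gripper features."""

    def __init__(self, feat_dim=256):
        super().__init__()
        self.scene_enc = PointEncoder(feat_dim)
        self.gripper_enc = PointEncoder(feat_dim)   # stand-in for a shape-context encoding
        self.head = nn.Sequential(
            nn.Linear(2 * feat_dim, 256), nn.ReLU(),
            nn.Linear(256, 3 + 6 + 1),              # xyz, rot6d, score
        )

    def forward(self, scene_pts, gripper_pts):
        z = torch.cat([self.scene_enc(scene_pts),
                       self.gripper_enc(gripper_pts)], dim=-1)
        out = self.head(z)
        return out[:, :3], out[:, 3:9], out[:, 9]


if __name__ == "__main__":
    model = GripperConditionedGraspHead()
    scene = torch.rand(2, 2048, 3)     # single-view scene point cloud
    gripper = torch.rand(2, 512, 3)    # points sampled on the gripper surface
    t, r6d, score = model(scene, gripper)
    print(t.shape, r6d.shape, score.shape)
</code></pre>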
</div>
<div class="section">
<h2>Grasp Generation On Different Grippers</h2>
<hr>
<p>
Grasp generation across various gripper extents.
</p>
<h3>Mug/Rack-multi</h3>
<div class="row justify-content-center">
<div class="col-md-4">
<video width="100%" playsinline="" autoplay="" loop="" preload="" muted="">
<source src="img/mugrack_miscviz_k5_2_c.mp4" type="video/mp4">
</video>
</div>
<div class="col-md-4">
<video width="100%" playsinline="" autoplay="" loop="" preload="" muted="">
<source src="img/mugrack_miscviz_k5_3_c.mp4" type="video/mp4">
</video>
</div>
<div class="col-md-4">
<video width="100%" playsinline="" autoplay="" loop="" preload="" muted="">
<source src="img/mugrack_miscviz_k5_1_c.mp4" type="video/mp4">
</video>
</div>
</div>
<h3>Can/Cabinet</h3>
<div class="row justify-content-center">
<div class="col-md-4">
<video width="100%" playsinline="" autoplay="" loop="" preload="" muted="">
<source src="img/cancabinet_miscviz_k5_2_c.mp4" type="video/mp4">
</video>
</div>
<div class="col-md-4">
<video width="100%" playsinline="" autoplay="" loop="" preload="" muted="">
<source src="img/cancabinet_miscviz_k5_3_c.mp4" type="video/mp4">
</video>
</div>
<div class="col-md-4">
<video width="100%" playsinline="" autoplay="" loop="" preload="" muted="">
<source src="img/cancabinet_miscviz_k5_1_c.mp4" type="video/mp4">
</video>
</div>
</div>
</div>
<div class="section">
<h2>Real-world Multi-modal Rearrangement via Pick-and-Place</h2>
<hr>
<p>
Rearrangement in the real world using the Franka Panda arm. Each task features scene
objects that offer multiple placement locations. RPDiff is used to produce a set of
candidate placements, and one of the predicted solutions is executed. Running several
executions in sequence shows the ability to find diverse solutions. Our
neural network is trained in simulation and deployed directly in the real world
(we do observe some performance gap due to the sim-to-real distribution shift).
</p>
<!-- <h3>Book/Bookshelf</h3> -->
<div class="row align-items-center">
<div class="col justify-content-center text-center">
<h3>Book/Bookshelf</h3>
<video width="60%" playsinline="" autoplay="" loop="" preload="" muted="">
<source src="img/book_bookshelf_seq1_8x_rdc_lout.mp4" type="video/mp4">
</video>
</div>
</div>
<!-- <h3>Mug/Rack-multi</h3> -->
<div class="row align-items-center">
<div class="col justify-content-center text-center">
<h3>Mug/Rack-multi</h3>
<video width="60%" playsinline="" autoplay="" loop="" preload="" muted="">
<source src="img/mug_rack_multi_seq1_8x_lout.mp4" type="video/mp4">
</video>
</div>
</div>
<!-- <h3>Can/Cabinet</h3> -->
<div class="row align-items-center">
<div class="col justify-content-center text-center">
<h3>Can/Cabinet</h3>
<video width="60%" playsinline="" autoplay="" loop="" preload="" muted="">
<source src="img/can_cabinet_seq1_10x_lout.mp4" type="video/mp4">
</video>
</div>
</div>
</div>
<div class="section">
<h2>External Related Projects</h2>
<hr>
<p>
Check out other projects related to diffusion models, iterative prediction, and rearrangement.<br>
</p>
<div class='row vspace-top'>
<div class="col-sm-3">
<img src='img/external/structdiff.png' class='img-fluid'>
</div>
<div class="col">
<div class='paper-title'>
<a href="https://structdiffusion.github.io/">StructDiffusion: Language-Guided Creation of Physically-Valid Structures using Unseen Objects</a>
</div>
<div>
Combines a diffusion model and an object-centric transformer to construct structures from
partial-view point clouds and high-level language goals, such as "set the table" and "make a line".
Using one multi-task model, this approach builds physically-valid structures without
step-by-step instructions.
</div>
</div>
</div>
</div>
<div class="section">
<h2>Paper</h2>
<hr>
<div>
<div class="list-group">
<a href="https://arxiv.org/abs/2307.04751"
class="list-group-item">
<img src="img/paper-thumb.png" style="width:100%; margin-right:-20px; margin-top:-10px;">
</a>
</div>
</div>
</div>
<div class="section">
<h2>Bibtex</h2>
<hr>
<div class="bibtexsection">
@article{simeonov2023rpdiff,
author = {Sarmiento, Alina
and Simeonov, Anthony
and Agrawal, Pulkit},
title = {Gripper-Aware Grasping: End-Effector Shape Context
for Cross-Gripper Generalization},
journal={arXiv preprint arXiv:2307.04751},
year={2023}
}
</div>
</div>
<hr>
<!-- <footer>
<h2>Acknowledgements</h2>
<p>
We would like to thank NVIDIA Seattle Robotics Lab members and the MIT Improbable AI Lab for their valuable feedback and support in developing this project.
In particular, we would like to acknowledge Idan Shenfeld, Anurag Ajay, and Antonia Bronars for helpful suggestions on improving the clarity of the draft.
This work was partly supported by Sony Research Awards and Amazon Research Awards. Anthony Simeonov is supported in part by the NSF Graduate Research Fellowship.
</p>
<p>Send feedback and questions to <a href="https://anthonysimeonov.github.io">Anthony Simeonov</a></p>
<div class="row justify-content-center">
<p>Website template recycled from <a href="https://www.vincentsitzmann.com/siren/">SIREN</a></p>
</div>
</footer> -->
</div>
<script src="https://code.jquery.com/jquery-3.5.1.slim.min.js"
integrity="sha384-DfXdz2htPH0lsSSs5nCTpuj/zy4C+OGpamoFVy38MVBnE+IbbVYUew+OrCXaRkfj"
crossorigin="anonymous"></script>
<script src="https://cdn.jsdelivr.net/npm/popper.js@1.16.0/dist/umd/popper.min.js"
integrity="sha384-Q6E9RHvbIyZFJoft+2mJbHaEWldlvI9IOYy5n3zV9zzTtmI3UksdQRVvoxMfooAo"
crossorigin="anonymous"></script>
<script src="https://stackpath.bootstrapcdn.com/bootstrap/4.5.0/js/bootstrap.min.js"
integrity="sha384-OgVRvuATP1z7JjHLkuOU7Xw704+h835Lr+6QL9UvYjZE3Ipu6Tp75j7Bh/kR0JKI"
crossorigin="anonymous"></script>
</body>
</html>