Poster - SUN RGB-D

SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite
Shuran Song
Motivation
Samuel P. Lichtenberg
Example Scenes
Jianxiong Xiao
Data Capturing
RGB-D (39.0)
PLACESCNN +
RBF kernel
Acknowledgement
This work is supported by gift funds from Intel.
3. Side
4. Front
3. Side
4. Front
Kinect v2
3D annotaion
dining room
2D segmentation
3750
Manhattan Box (0.99)
Convex Hull (0.90)
Geometric Context (0.27)
Ground Truth
Manhattan Box (0.811)
Convex Hull (0.85)
Geometric Context (0.57)
Manhattan Box (0.72)
Convex Hull (0.43)
Room Layout
Estimation
bathroom
5000
RGB-D (23.0)
Ground Truth
Ground Truth
bathtub
Examples of 2D and 3D Annotation
17712
D (20.1)
bed
bookshelf
box
chair
counter
desk
door
dresser
garbage bin
lamp
monitor
night stand
Geometric Context (0.61)
pillow
sink
sofa
table
tv
toilet
Ground truth
Kinect v1
3D annotaion
office
Asus Xtion
2D segmentation
RGB (19.7)
Semantic
Segmentation
(a) object distribution
(b) scene distribution
Kinect v2
SUN3D (ASUS Xtion)
Intel RealSense
NYUv2 (Kinect v1)
2500
bathroom(6.4%)
others(8.0%)
office
(11.0%)
0
IoU: 77.0 Rr: 0.25 Rg: 0.25 Pg: 0.5
IoU 63.9 Rr: 0.333 Rg: 0.667 Pg:1
IoU: 53.1 Rr: 0.111 Rg : 0.111 Pg: 0.5
IoU:60 Rr: 0.50 Rg : 0.0.50 Pg: 0.5
IoU: 53.1 Rr: 0.333 Rg: 0.333 Pg: 0.125
IoU: 78.8 Rr: 1 Rg: 1 Pg: 0.5
IoU: 57.3 Rr :0.33 Rg: 0.667 Pg:0.125
IoU 50.7 Rr: 0.333 Rg: 0.333 Pg : 0.375
IoU: 54.6 Rr : 0.333 Rg : 0.333 Pg: 0.125
rest space(6.3%)
classroom
(9.3%)
1250
Total Scene
Understanding
IoU 72.9 Rr: 0.333 Rg: 0.667 Pg: 0.667
Sliding Shapes
[NYU] Indoor Segmentation and Support Inference from RGBD Images. N. Silberman, P. Kohli, D. Hoiem and R. Fergus. In ECCV, 2012.
[SUN3D] SUN3D: A database of big spaces reconstructed using SfM and object
labels. J. Xiao, A. Owens, and A. Torralba. In ICCV, 2013.
[B3DO] A category-level 3-d object dataset: Putting the kinect to work. A.
Janoch, S. Karayev, Y. Jia, J. T. Barron, M. Fritz, K. Saenko, and T. Darrell. In ICCV
Workshop, 2011.
2. Top
furniture store
(11.3%)
Statistics of Semantic Annotation
living room(6.0%)
kitchen(5.6%)
corridor(3.8%)
lab(3.0%)
conference room(2.6%)
dining area(2.4%)
dining room(2.3%)
discussion area(2.0%)
home office(1.9%)
study space(1.9%)
library(1.4%)
lecture theatre(1.2%)
computer room(1.0%)
bedroom(12.6%)
Effective free space
Outside the room
Inside some objects
Beyond cutoff distance
3D RCNN
References
1. Image
kitchen
SUN RGB-D
color
9,+%
:33.%;1<3=0%
Intel Realsense
raw depth
)*%+,-.,/0123/% )*%+,-.,/0123/%
'*%456,70%
rest space
Annotation Tool for 3D Object and 3D Room Layout
Xtion
Kinect v1
Kinect v2
0.5
4
4.5
7.1× 1.4× 2 11× 2.3× 2.7 9.8× 2.7× 2.7
2.5W USB
12.96W
115W
640× 480 640× 480
512× 424
640× 480 640× 480 1920× 1080
GIST +
RBF kernel
kitchen
lab
lecture theatre
library
living room
Kinect v2 and Battery Capturing Setup
RealSense
weight (pound)
0.077
size (inch) 5.2× 0.25× 0.75
power
2.5W USB
depth resolution
628× 468
color resolution 1920× 1080
refined depth
:;<&=($
>446(?@64$
2. Top
ch
a
ta ir
b
de le
pi sk
llo
w
so
fa
be
d
ga ca bo
rb bi x
ag ne
e t
b
so la in
fa mp
ch
ai
r
m she
on lf
dr ito
aw r
fra er
m
sid s e
e ink
ta
b
pa le
pe
r
tra
ni sh tv
gh c
t s an
ta
n
bo d
bo d ok
o o
co k sh or
m el
p f
dr ute
es r
cu ser
rta
to in
ile
ra t
c
ki k
tc e b k
he yb ag
n oa
ca rd
bi
n
co
ffe c et
e pu
t
w f abl
hi ri e
te dg
bo e
di p ard
ni rin
ng t
ta er
b
st le
o
bo ol
t
to tle
w
e
cu l
ha
co ng p pla p
m in ain nt
pu g ti
te cab ng
r m in
on et
ito
un mir r
kn ror
o
la wn
p
st
ee b top
l c oa
ab rd
in
e
tra t
ov y
be en
cl nch
ot
he
bo s
ki c w
tc oo l
ca hen ke
rto to r
n ol
bo
x
,#!$-./01$
!&"''(%
#%
bathroom
bedroom
classroom
computer room
conference room
corridor
dining area
dining room
discussion area
furniture store
3D Object
Detection and
Pose Estimation
1. Image
RGB-D Sensors
raw points
,23&$
,&4567$89'&$
!"#$%&'()$*+$
!"##$%
!%
RealSense
Annotation
bedroom
SUN3D
refined points
B3DO
We introduce SUN RGB-D, a Pascal-scale RGB-D scene
understanding dataset, which has 2D and 3D annotation for both objects and rooms.
:;<&=($A65&$
83/,%
-66B$>446(?@64$ 83/,%
D (27.7)
kitchen
lab
lecture theatre
library
living room
Scene
Classification
conference room classroom
LSUN
10,195,373
home office
ImageNet
131,067
log10
Another problem of existing RGB-D datasets is that
most of them are only labeled in 2D.
NYU depth V2
RGB (38.1)
rest space
RGB datasets
PASCAL VOC
11,530
bathroom
bedroom
classroom
computer room
conference room
corridor
dining area
dining room
discussion area
furniture store
home office
SUNRGBD
10,335
NYU
1,449
Benchmark Tasks
Example Objects
RGB-D sensors have also enabled rapid progress for
scene understanding. However, the small dataset size
has become a major bottleneck.
RGB-D
Data & Code:
http://rgbd.cs.princeton.edu
# matched pairs with a correct category label
Precision =
# prediction boxes
Recall
Free Space IoU
=
# matched pairs with a correct category label
# ground truth boxes
Precision Recall for all objects