poster - Center for Machine Perception

Efficient Image Detail Mining
Andrej Mikulík
Filip Radenović
Ondřej Chum
Jiří Matas
Center for Machine Perception, Czech Technical University in Prague
Novel Image Mining Problem Formulations
Given a query and a dataset, for every pixel in the query image:
i. Find the database image with the maximum resolution depicting the pixel
ii. Find the frequency with which it is photographed in detail
[Figure: Highest Resolution Transform (HRT); labels: detail size, 37.3x, 27.0x, 22.8x, 21.9x, 21.6x]
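The two per-pixel quantities defined above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each retrieved detail image has already been mapped into the query and is summarised by an axis-aligned footprint plus a scale-change factor. The names `Detail`, `detail_maps` and `min_scale`, the default value 1.0 for uncovered pixels, and the use of a raw count as "frequency" are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Detail:
    """A retrieved detail image, described in query-image pixel coordinates."""
    x0: int       # left edge of the footprint (inclusive)
    y0: int       # top edge of the footprint (inclusive)
    x1: int       # right edge of the footprint (exclusive)
    y1: int       # bottom edge of the footprint (exclusive)
    scale: float  # scale change relative to the query, e.g. 21.6


def detail_maps(width, height, details, min_scale=1.0):
    """Return (hrt, freq) as lists of rows.

    hrt[y][x]  : largest scale change among details covering pixel (x, y);
                 1.0 (assumed: the query itself) where nothing covers it.
    freq[y][x] : number of details with scale >= min_scale covering (x, y).
    """
    hrt = [[1.0] * width for _ in range(height)]
    freq = [[0] * width for _ in range(height)]
    for d in details:
        # Clip the footprint to the query image before rasterising it.
        for y in range(max(d.y0, 0), min(d.y1, height)):
            for x in range(max(d.x0, 0), min(d.x1, width)):
                if d.scale > hrt[y][x]:
                    hrt[y][x] = d.scale
                if d.scale >= min_scale:
                    freq[y][x] += 1
    return hrt, freq


if __name__ == "__main__":
    found = [Detail(0, 0, 2, 2, 5.0), Detail(1, 1, 3, 3, 20.0)]
    hrt, freq = detail_maps(3, 3, found, min_scale=2.0)
    for row in hrt:
        print(row)
```

A real system would rasterise the projected quadrilateral of each detail image rather than a bounding box; the rectangle is used here only to keep the sketch short.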
Related Approaches
A. Large scale image retrieval
O. Chum, J. Philbin, J. Sivic, M. Isard, and A. Zisserman. Total recall: Automatic query
expansion with a generative feature model for object retrieval. In Proc. ICCV, 2007
Bag-of-Words (BOW) Representation → Term Frequency – Inverse Document Frequency (TF-IDF) Scoring → Spatial Verification (SP) → Query Expansion (QE)
[Figure legend: 0–1%, 1–3%, 3–10%]
Geometric Consistency Test
• No region of interest provided
• Seemingly harmless regions, such as railings in the corner of the image,
can expand into an enormous number of false positive images
• Novel mechanism for detecting and eliminating inconsistencies:
  - Aj,i: mapping from result image i to result image j, obtained during
    this stage
  - Aq,i and Aq,j: mappings from result images i and j to the query
    image q, respectively, obtained in the initial zoom-in query
  - Aq,i ≈ Aq,j Aj,i holds for a consistent pair of result images i and j
[Figure: mappings Aq,i and Aq,j between the query and result images]
[Figure: example results by rank. EASY: 1, 2, 64, 32, 65. DIFFICULT: 1, 2048, 16384, 81368]
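The consistency relation Aq,i ≈ Aq,j Aj,i can be checked numerically. The sketch below is one possible reading, not the authors' test. It assumes the mappings are affine and stored as 2x3 row lists, and it measures agreement by comparing where the direct map and the chained map send a few points of image i. The helper names and the pixel tolerance `tol` are hypothetical.

```python
import math


def compose(a, b):
    """Return the affine map 'a after b' (b is applied first); both are 2x3."""
    (a00, a01, a02), (a10, a11, a12) = a
    (b00, b01, b02), (b10, b11, b12) = b
    return [
        [a00 * b00 + a01 * b10, a00 * b01 + a01 * b11, a00 * b02 + a01 * b12 + a02],
        [a10 * b00 + a11 * b10, a10 * b01 + a11 * b11, a10 * b02 + a11 * b12 + a12],
    ]


def apply_affine(a, point):
    """Map a 2D point through the 2x3 affine matrix a."""
    x, y = point
    return (a[0][0] * x + a[0][1] * y + a[0][2],
            a[1][0] * x + a[1][1] * y + a[1][2])


def is_consistent(a_qi, a_qj, a_ji, points_i, tol=10.0):
    """Check A_qi ~ A_qj A_ji on sample points of image i.

    tol is an assumed threshold, in query-image pixels, on the distance
    between the directly mapped and the chained positions of each point.
    """
    chained = compose(a_qj, a_ji)
    for p in points_i:
        x1, y1 = apply_affine(a_qi, p)
        x2, y2 = apply_affine(chained, p)
        if math.hypot(x1 - x2, y1 - y2) > tol:
            return False
    return True


if __name__ == "__main__":
    a_ji = [[2.0, 0.0, 10.0], [0.0, 2.0, 20.0]]   # image i -> image j
    a_qj = [[0.1, 0.0, 5.0], [0.0, 0.1, 5.0]]     # image j -> query
    a_qi = [[0.2, 0.0, 6.0], [0.0, 0.2, 7.0]]     # image i -> query
    corners = [(0, 0), (100, 0), (0, 100), (100, 100)]
    print(is_consistent(a_qi, a_qj, a_ji, corners))
```

Comparing mapped points rather than matrix entries keeps the tolerance in pixels, which is easier to interpret than a threshold on the matrix coefficients.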
B. Large scale zooming retrieval
A. Mikulik, O. Chum, and J. Matas. Image retrieval for online browsing in large image
collections. In Similarity Search and Applications, 2013
Bag-of-Words (BOW) Representation → TF-IDF scoring in Document at a Time (DAAT) order* → Spatial Verification (SP) + re-ranking based on scale change → Query Expansion (QE)
* Scores are re-weighted to prefer the desired change in scale (zoom-in or zoom-out)
Region of interest provided
Experimental Dataset
• 620,000 images
• 1.3 × 10^9 features – Hessian-affine features described by SIFT
• 16M visual words, two-level k-means vocabulary
• 6 landmarks – manually annotated
Performance and Results
Our Approach: Hierarchical Query Expansion
Large Scale Zooming Retrieval → Detail Image Clustering → Geometrical Consistency Test → Expanded Zoom-in Query on Every Cluster
Detail Image Clustering
1. Find a pixel in the query covered by the largest number of images
2. Select the image with the highest scale change as a cluster seed
3. Add images with at least 50% overlap with the seed to the cluster
4. Remove the cluster and, if there are more images, go to step 1
5. Each cluster is subject to the geometric consistency test
[Figure: Notre Dame]
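Steps 1 to 4 of the clustering can be sketched as a greedy loop. This is an illustration under stated assumptions, not the poster's implementation. Footprints of detail images in the query are taken to be axis-aligned, half-open rectangles. "Overlap" is interpreted as intersection over union, which the poster does not specify. The seed is chosen among the images covering the most-covered pixel. The names `Detail`, `iou` and `cluster_details` are hypothetical. Step 5, the geometric consistency test, would then be run on each returned cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detail:
    name: str
    x0: float     # footprint in query coordinates: [x0, x1) x [y0, y1)
    y0: float
    x1: float
    y1: float
    scale: float  # scale change relative to the query


def area(d):
    return max(d.x1 - d.x0, 0) * max(d.y1 - d.y0, 0)


def iou(a, b):
    """Intersection over union of two footprints (assumed overlap measure)."""
    w = min(a.x1, b.x1) - max(a.x0, b.x0)
    h = min(a.y1, b.y1) - max(a.y0, b.y0)
    inter = max(w, 0) * max(h, 0)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def cluster_details(details, min_overlap=0.5):
    remaining = list(details)
    clusters = []
    while remaining:
        # Step 1: most-covered point. For half-open rectangles it suffices
        # to test points (x0 of one footprint, y0 of another).
        best_cover = []
        for x in sorted({d.x0 for d in remaining}):
            for y in sorted({d.y0 for d in remaining}):
                cover = [d for d in remaining
                         if d.x0 <= x < d.x1 and d.y0 <= y < d.y1]
                if len(cover) > len(best_cover):
                    best_cover = cover
        if not best_cover:          # only degenerate footprints are left
            best_cover = remaining
        # Step 2: seed = highest scale change among the covering images.
        seed = max(best_cover, key=lambda d: d.scale)
        # Step 3: gather images overlapping the seed by at least min_overlap.
        cluster = [d for d in remaining
                   if d is seed or iou(d, seed) >= min_overlap]
        clusters.append(cluster)
        # Step 4: remove the cluster and repeat while images remain.
        remaining = [d for d in remaining if d not in cluster]
    return clusters


if __name__ == "__main__":
    images = [
        Detail("a", 0, 0, 10, 10, 5.0),
        Detail("b", 1, 1, 10, 10, 8.0),
        Detail("c", 50, 50, 60, 60, 3.0),
        Detail("d", 0, 0, 40, 40, 2.0),
    ]
    for group in cluster_details(images):
        print([d.name for d in group])
```

The candidate-point search is cubic in the number of images and is meant only to make step 1 explicit; a rasterised coverage map would be the more practical choice for large result sets.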