
Fast image retrieval in
large image collections
POEME
Neelanjan Bhowmik, Kamel Guissous
http://recherche.ign.fr/labos/matis/
{neelanjan.bhowmik, kamel.guissous}@ign.fr
Introduction
With the rapid growth of image content, it is imperative to develop effective search processes,
such as Content-Based Image Retrieval (CBIR), to access voluminous, complex & unstructured data.
Contribution 1: Fusion of Inverted Indices (FII)
– An efficient fusion method for image detectors and descriptors is proposed for image retrieval.
– FII, built on the Inverted Multi-Index approach, can fuse any number of image detectors and
descriptors by integrating their responses to a query into finer subdivisions.
Contribution 2: Visual Saliency
– Visual saliency is incorporated during image retrieval to retain the most informative parts of the
query images and discard redundant information.
– Only the keypoints inside the salient region are considered.
Applications:
– Query-by-example image retrieval from large image datasets, specifically museum collections.
– Image-based localization from a large-scale street-view repository.
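The poster does not detail the FII algorithm; as a minimal sketch under assumed helper names (`quantize` standing in for codebook assignment), fusing the responses of several per-descriptor inverted indices could look like:

```python
from collections import defaultdict

def build_inverted_index(descriptors_per_image, quantize):
    """Map each codebook word to the set of images whose local
    descriptors were quantized to that word."""
    index = defaultdict(set)
    for image_id, descs in descriptors_per_image.items():
        for d in descs:
            index[quantize(d)].add(image_id)
    return index

def fii_search(query_descs_by_type, indices, quantizers):
    """Fuse the responses of several inverted indices (one per
    descriptor type) by accumulating votes per database image."""
    votes = defaultdict(int)
    for dtype, query_descs in query_descs_by_type.items():
        index, quantize = indices[dtype], quantizers[dtype]
        for d in query_descs:
            for image_id in index.get(quantize(d), ()):
                votes[image_id] += 1
    # rank database images by fused vote count (most votes first)
    return sorted(votes, key=votes.get, reverse=True)
```

This is only an illustration of index fusion by voting; the actual FII integrates responses within the finer subdivisions of an inverted multi-index.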
[Figure: global workflow. Features are extracted from the image dataset and from the query image; indexing and similarity measurement return the images most similar to the query.]
Context
– CBIR analyzes the content of the image rather than its metadata.
– Visual saliency started as a biologically inspired process for focusing visual attention on the
parts of an image that contain relevant information, thus reducing the complexity of scene analysis.
– Distinctive features (e.g. shape, color, texture) are extracted from images to measure their resemblance.
– Fusing detectors and descriptors describes the image content with richer information.
Visual Saliency
– A novel visual saliency approach is proposed, based on the orientations of the segments detected in the image.
– Segments are detected with the LSD (Line Segment Detector) algorithm.
– The segment directions (one, two and multiple main directions) in a local window are studied in order to
generate a saliency map for each image, based on the analysis of the distribution of these orientations.
– The saliency maps are used to accelerate and improve the performance of query-based image retrieval.
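The poster does not give the saliency computation itself; the final filtering step, keeping only the keypoints that fall in the salient region, can be sketched as follows (the threshold value is an assumption, not taken from the poster):

```python
def filter_keypoints(keypoints, saliency_map, threshold=0.5):
    """Keep only the keypoints that fall inside the salient
    region of the image.

    keypoints    -- iterable of (x, y) pixel coordinates
    saliency_map -- 2D array of saliency values in [0, 1]
    threshold    -- assumed cut-off separating salient pixels
    """
    kept = []
    for x, y in keypoints:
        if saliency_map[y][x] >= threshold:
            kept.append((x, y))
    return kept
```

Discarding the non-salient keypoints shrinks the set of primitives to index and match, which is what reduces the retrieval time reported in the evaluations.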
[Figure: FII indexing and retrieval pipeline. Offline stage: detectors and descriptors (A, B) are applied to the image dataset to build the training matrices (TMA, TMB), the codebooks (CBA, CBB) and the inverted unique indices (IUIA, IUIB). Online stage: the query image (Q) is filtered by its visual saliency map; the remaining primitives are described by descriptors A and B (QA1..QAn, QB1..QBn), matched by k-NN search against the inverted indices, and a multi-sequence voting step returns the retrieved images.]

Evaluations
– Image retrieval experiments are conducted with the FII proposal, both without and with
visual saliency information.
– Mean average precision (mAP) is reported to measure quality across multiple queries, by
averaging each query's average precision over the retrieved images that are relevant to the query.
– The best retrieval results are achieved by fusing the SIFT, SURF & Shape Context (SC) descriptors
together with visual saliency information.
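For reference, the mAP measure used in the tables below can be computed as in this short sketch:

```python
def average_precision(retrieved, relevant):
    """Average precision for one query: mean of the precision
    values at each rank where a relevant image is retrieved,
    normalized by the number of relevant images."""
    hits, precisions = 0, []
    for rank, image_id in enumerate(retrieved, start=1):
        if image_id in relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(results):
    """mAP: mean of the per-query average precision.
    results -- list of (retrieved_list, relevant_set) pairs."""
    return sum(average_precision(r, rel) for r, rel in results) / len(results)
```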
mAP & average retrieval time with the fusion of different descriptors (Paris_DB†)

                  Reference (all points)    Randomly selected points
Descriptors        mAP     Time(s)           mAP     Time(s)   % of points used
SIFT               0.498   0.211             -       -         -
SIFT-SURF          0.544   0.419             -       -         -
SIFT-SURF-SC       0.540   0.736             0.480   0.635     88%
SURF-SC            0.522   0.408             -       -         -
mAP & average retrieval time with the fusion of different descriptors, with visual saliency information (Paris_DB†)

                  Two + Multi (d]), 88% of points    One + Two + Multi (d]), 98% of points
Descriptors        mAP     Time(s)                    mAP     Time(s)
SIFT-SURF          0.564   0.359                      0.546   0.406
SIFT-SURF-SC       0.582   0.637                      0.583   0.755
SURF-SC            0.561   0.343                      0.534   0.383
† Paris_DB: a public benchmark of 6412 images collected from Flickr by searching for 12 particular Paris landmarks.
d] (direction): the number of main directions in a local window.
Conclusions
– The proposed fusion approach has demonstrated its superiority compared with the state of the art.
– The fusion strategy brings distinctiveness to the nearest-neighbor search, thus enhancing performance.
– Combining different image characteristics through visual saliency enhances the content representation.
– The use of visual saliency removes irrelevant information, so the retrieval computation time decreases
while accuracy increases.

Project Organization
– Project ANR CONTINT "POEME", European Project KET ENIAC "THINGS2DO".
– Work carried out at the MATIS Lab, IGN/SR & Nicéphore Cité, Chalon-sur-Saône.
– Under the supervision of Valérie Gouet-Brunet, Head of MATIS Lab, IGN.
Journée de la Recherche 2015