What Makes an Image Look Good?

Objective Image and Video Quality Assessment –
SSIM, SSIMplus, High Dynamic Range Image Tone
Mapping, and Multi-Exposure Image Fusion
Zhou Wang
Department of Electrical and Computer Engineering
University of Waterloo
2015
The University of Waterloo
• 50+ Years of Growth
– Founded in 1957
– 35,000+ students, 1,100+ Faculty
• Reputation of Innovation
– World’s largest Co-op program
– Inventor-Owned I.P. Policy
– 20%+ Canadian university spin-offs
– 20 best places in the world to launch a business (Fortune Magazine)
– Open Text, Maplesoft, Blackberry,
Dalsa, Desire2Learn, Miovision, …
• ICT at UWaterloo
– Largest Engineering/CS/Math programs in Canada
– Largest amount of research funding in ICT in Canada
– 20+ Canada Research Chairs, IEEE Fellows, …
• Most Innovative University in Canada, 23 straight years
Prof. Wang’s Collaborators and Team
Past/Current Collaborators
• Prof. Alan C. Bovik, UT-Austin
• Prof. Eero P. Simoncelli, NYU
• Prof. David Zhang, HK PolyU
• Prof. Lei Zhang, HK PolyU
• Prof. Edward Vrscay, U Waterloo
• Prof. Guangzhe Fan, U Waterloo
• Prof. David Koff, McMaster U
• Prof. Jiying Zhao, U Ottawa
• Prof. Wen Gao, Peking U
• Prof. Siwei Ma, Peking U
• Prof. Weisi Lin, NTU
• Dr. Hamid R. Sheikh, Samsung
• Dr. Ligang Lu, IBM
• Dr. Mehul Sampat, UCSF
• Dr. Umesh Rajashekar, NYU
• ……
Past Team Members
• 4 Postdocs
• 8 PhD Students
• 8 Master’s Students
• 4 Visiting PhD Students
• 10+ Undergrad Research Assistants/Research Associates/Visiting Scholars
Current Team
• 4 Postdocs
• 5 PhD Students
• 2 Master’s Students
• 1 Visiting PhD Student
• 1 Research Associate
• 2 Undergrad Research Assistants
Research Focus
Image/Video Perceptual Communication
[Diagram: Transmitter (signal source; sensing & recording) → Channel (processing, storage & transmission) → Receiver (reconstruction & displaying; perception & understanding)]
Research Focus
• Perceptual image/video quality assessment (I/VQA)
– Full-reference/reduced-reference/no-reference I/VQA
– I/VQA for compression, blockiness, blur/sharpness, denoising,
interpolation/superresolution, tone-mapping, fusion, windowing,
color-to-gray conversion, dehazing, retargeting, medical imaging, …
– I/VQA for 3D, high-dynamic-range, high-frame-rate, screen content
• Perceptual image/video optimization
– Perceptually inspired compression
– Perceptually inspired denoising, deblurring, reconstruction,
interpolation/superres, tone-mapping, fusion, retargeting, …
– Perceptually inspired recognition/classification
• Perceptual image/video transmission
– Quality-aware image/video
– Perceptual quality-driven adaptive streaming
SSIM and SSIMplus
Image/Video Quality Assessment (I/VQA)
• The structural similarity (SSIM) index (2004)
[Figure panels: Original | Compressed | Absolute error map | SSIM map]
– Most cited/used “perceptual I/VQA metric” in the literature/industry
– Better quality predictor than MSE/PSNR, fast, intuitive quality map,
differentiable, locally convex, transform invariance, distance metric, …
– Extendable: MS-SSIM, IW-SSIM, CW-SSIM, FSIM, RR-SSIM, …
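As a rough illustration of how an SSIM score and its quality map arise, the sketch below evaluates the standard SSIM formula on non-overlapping blocks. This is a simplification: the function name `ssim_blocks`, the 8x8 block size, and the use of plain block statistics in place of a sliding weighted window are assumptions made for brevity, not the reference implementation.

```python
import numpy as np

def ssim_blocks(x, y, block=8, dynamic_range=255.0, k1=0.01, k2=0.03):
    """Return (mean SSIM, SSIM map) computed on non-overlapping square blocks."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    if x.ndim != 2 or x.shape != y.shape:
        raise ValueError("expected two grayscale images of the same shape")
    if min(x.shape) < block:
        raise ValueError("image is smaller than one block")
    # Stabilizing constants from the SSIM definition.
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    h = (x.shape[0] // block) * block
    w = (x.shape[1] // block) * block

    def to_blocks(img):
        # Crop to a multiple of the block size and flatten each block.
        b = img[:h, :w].reshape(h // block, block, w // block, block)
        return b.transpose(0, 2, 1, 3).reshape(h // block, w // block, -1)

    bx, by = to_blocks(x), to_blocks(y)
    mu_x, mu_y = bx.mean(axis=-1), by.mean(axis=-1)
    var_x, var_y = bx.var(axis=-1), by.var(axis=-1)
    cov_xy = (bx * by).mean(axis=-1) - mu_x * mu_y
    ssim_map = ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    return float(ssim_map.mean()), ssim_map

# Usage on synthetic data: an image compared with itself, then with a noisy copy.
rng = np.random.default_rng(0)
ref = rng.uniform(0, 255, size=(64, 64))
noisy = np.clip(ref + rng.normal(0, 20, ref.shape), 0, 255)
score_same, _ = ssim_blocks(ref, ref)
score_noisy, qmap = ssim_blocks(ref, noisy)
```

The second return value plays the role of the "SSIM map" on the slide: one local quality value per block, so low-quality regions can be located rather than only summarized.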
SSIM as a “Generic” I/VQA Model
Original, MSE=0, SSIM=1
MSE=144, SSIM=0.913
MSE=144, SSIM=0.988
MSE=142, SSIM=0.662
MSE=144, SSIM=0.694
MSE=144, SSIM=0.840
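The numbers above rest on the fact that very different distortions can share the same MSE. A minimal sketch of that arithmetic, using synthetic data rather than the images from the slide (the helper names `mse` and `psnr` are illustrative):

```python
import numpy as np

def mse(x, y):
    """Mean squared error between two images."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    return float(np.mean((x - y) ** 2))

def psnr(x, y, peak=255.0):
    """Peak signal-to-noise ratio in dB, assuming an 8-bit peak value by default."""
    err = mse(x, y)
    return float("inf") if err == 0 else float(10.0 * np.log10(peak ** 2 / err))

rng = np.random.default_rng(0)
ref = rng.uniform(20, 200, size=(64, 64))

# Distortion A: constant luminance shift of 12 gray levels, so MSE = 12^2 = 144.
shifted = ref + 12.0
# Distortion B: every pixel moved by +12 or -12 at random, so MSE is also 144.
noisy = ref + 12.0 * rng.choice([-1.0, 1.0], size=ref.shape)

mse_a, mse_b = mse(ref, shifted), mse(ref, noisy)
psnr_a, psnr_b = psnr(ref, shifted), psnr(ref, noisy)
```

Both distortions give MSE = 144 and therefore the same PSNR of about 26.5 dB, even though a uniform brightness shift and pixel-level noise look nothing alike. This is the behavior the slide contrasts with SSIM.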
MAD Competition – MSE vs. SSIM
Limitations of SSIM
• Limitations in All Real Applications
– Reference images may not be available
– Reference images may not be of perfect quality (noisy, blurry,
compressed, etc.)
– Meanings of scores not intuitive
• Limitations in Image Acquisition Applications
– Reference images may not have the same dynamic range
– Reference images may not have the same exposure levels
– There may be multiple reference images (with different
exposures, resolutions, spatial alignment, etc.)
• Limitations in Visual Communication Applications
– Network and receiving device conditions (packet loss, delay,
bandwidth, decoding speed, buffer, etc.) not considered
– Display device/resolution/viewing condition not considered
From Quality to Quality-of-Experience (QoE)
[Diagram: Video Hosting Server → video stream → Network → Consumer Device → video; two points marked "M", labeled Video Quality Assessment and Video QoE Assessment]
• Understand the experience of end consumers
– Critical for user engagement
• Optimize the experience of end consumers
– Optimize video preparation at hosting server
– Optimize resource allocation in the network
– Optimize streaming strategy at the receiver
Main Challenge: A Meaningful QoE Model
• Desirable properties
– High accuracy
– High speed
– Meaningful assessment across display device/viewing condition
– Meaningful assessment across display resolution
– Meaningful assessment across content
– Localized quality indicator
• Are current VQA methods satisfactory?
– How do PSNR, SSIM, MS-SSIM, MOVIE, VQM perform?
– How do commercial products perform?
SSIMplus
• Comparisons with SSIM
– Built upon SSIM philosophy; more advanced perceptual modeling
– Higher accuracy
– Higher speed
– Display device/viewing condition adapted assessment
– Display resolution adapted assessment
– Strong cross-content property
– Localized quality indicator
• For more details
https://ece.uwaterloo.ca/~z70wang/research/ssimplus/
– Demo …
Subjective Test
• Video content
– 1920x1080 and 1136x640, 24fps, compressed by H.264
• Display device and viewing condition
• Subjective experiment
– 30 naïve subjects
– Compute mean opinion scores (MOSs) after removing outliers
[Rehman, Zeng & Wang, Electronic Imaging ‘15]
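The slide does not say how outliers were identified. As a sketch only, the snippet below assumes a simple per-video rule (discard any score more than two standard deviations from that video's mean) before averaging. The function name `mos_after_outlier_removal`, the threshold, and the synthetic scores are all assumptions, not the procedure used in the cited study.

```python
import numpy as np

def mos_after_outlier_removal(scores, z_thresh=2.0):
    """scores: array of shape (subjects, videos). Returns one MOS per video."""
    scores = np.asarray(scores, dtype=np.float64)
    mean = scores.mean(axis=0)
    std = scores.std(axis=0, ddof=1)
    # Keep a score if it lies within z_thresh standard deviations of the
    # per-video mean (assumed rule; the small tolerance covers std == 0).
    keep = np.abs(scores - mean) <= z_thresh * std + 1e-12
    kept_sum = np.where(keep, scores, 0.0).sum(axis=0)
    return kept_sum / keep.sum(axis=0)

# Synthetic example: 30 subjects, 2 videos. One subject rates video 0 far lower.
scores = np.column_stack([np.full(30, 80.0), np.full(30, 50.0)])
scores[0, 0] = 10.0
mos = mos_after_outlier_removal(scores)
```

With the single stray rating excluded, the MOS of the first video returns to the value the remaining 29 subjects agreed on.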
Test Result: SSIM
Test Result: PSNR
Test Result: VQM
Test Result: MOVIE
Test Result: Tektronix DMOS
Test Result: Video Clarity DMOS
Test Result: SSIMplus
[Rehman, Zeng & Wang, Electronic Imaging ‘15]
Test Result: All
PLCC: Pearson’s correlation coefficient
MAE: mean absolute error
RMS: root mean squared error
SRCC: Spearman’s rank-order correlation coefficient
KRCC: Kendall’s rank-order correlation coefficient
[Rehman, Zeng & Wang, Electronic Imaging ‘15]
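For readers unfamiliar with these criteria, the following sketch computes all five from a vector of objective scores and the corresponding MOS values. It uses NumPy only. The function name `evaluate_metric` is illustrative, ranks are computed without tie averaging, and KRCC is computed as Kendall's tau-a, which are simplifying assumptions. MAE and RMS are only meaningful when the objective scores are already on the same scale as the MOS.

```python
import numpy as np

def _ranks(v):
    """Ranks 0..n-1; ties are broken by order of appearance (no averaging)."""
    order = np.argsort(v, kind="stable")
    ranks = np.empty(len(v), dtype=np.float64)
    ranks[order] = np.arange(len(v))
    return ranks

def evaluate_metric(pred, mos):
    """Compare objective scores `pred` against subjective scores `mos`."""
    pred = np.asarray(pred, dtype=np.float64)
    mos = np.asarray(mos, dtype=np.float64)
    n = len(pred)
    plcc = np.corrcoef(pred, mos)[0, 1]
    # Spearman: Pearson correlation of the rank vectors.
    srcc = np.corrcoef(_ranks(pred), _ranks(mos))[0, 1]
    # Kendall tau-a: (concordant - discordant) / number of pairs.
    # Summing over ordered pairs counts each pair twice, hence n * (n - 1).
    dp = np.sign(pred[:, None] - pred[None, :])
    dm = np.sign(mos[:, None] - mos[None, :])
    krcc = (dp * dm).sum() / (n * (n - 1))
    mae = np.mean(np.abs(pred - mos))
    rms = np.sqrt(np.mean((pred - mos) ** 2))
    return {"PLCC": float(plcc), "SRCC": float(srcc), "KRCC": float(krcc),
            "MAE": float(mae), "RMS": float(rms)}

# Made-up scores for five videos, for illustration only.
result = evaluate_metric([10, 20, 30, 40, 50], [12, 18, 33, 41, 49])
```

Higher PLCC, SRCC and KRCC and lower MAE and RMS indicate closer agreement between the objective metric and human opinion.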
High Dynamic Range (HDR) Image Tone Mapping (TMO)
and Multi-Exposure Image Fusion (MEF)
HDR Tone Mapping vs. MEF
[Diagram labels: HDR image construction, TMO, MEF]
IQA Problem: Which TMO is the Best?
Reference image: HDR, Test image: LDR
No effective solution in the literature
Tone Mapped Image Quality Index (TMQI)
Tone mapped images and multi-scale quality maps indicating local
structural detail loss and distortions
[Yeganeh & Wang, IEEE Trans Image Proc. ‘13]
Application of TMQI:
Parameter Tuning in TMO
[Drago ‘03]
b = 0.1, TMQI = 0.89
b = 0.8, TMQI = 0.90
b = 1, TMQI = 0.84
[Yeganeh & Wang, IEEE Trans Image Proc. ‘13]
Application of TMQI:
Fusion of Tone Mapped Images
TMO1, TMQI = 0.80
TMO2, TMQI = 0.87
Local quality weighted averaging
Fused image, TMQI = 0.94
[Yeganeh & Wang, IEEE Trans Image Proc. ‘13]
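One plausible reading of "local quality weighted averaging" is a per-pixel weighted mean in which each tone-mapped image is weighted by its own local quality map. The sketch below assumes quality maps of the same size as the images. TMQI's maps are multi-scale, so the procedure in the cited paper may well differ, and the name `fuse_by_local_quality` is illustrative.

```python
import numpy as np

def fuse_by_local_quality(images, quality_maps, eps=1e-8):
    """Per-pixel weighted average of images, weighted by their local quality."""
    imgs = np.stack([np.asarray(i, dtype=np.float64) for i in images])
    weights = np.stack([np.asarray(q, dtype=np.float64) for q in quality_maps])
    if imgs.shape != weights.shape:
        raise ValueError("each image needs a quality map of the same shape")
    # Negative quality values are clipped; eps avoids division by zero
    # where every map is zero.
    weights = np.clip(weights, 0.0, None) + eps
    weights /= weights.sum(axis=0, keepdims=True)
    return (weights * imgs).sum(axis=0)

# Toy example: image A is "good" on the left half, image B on the right half.
a = np.full((4, 4), 100.0)
b = np.full((4, 4), 200.0)
qa = np.zeros((4, 4)); qa[:, :2] = 1.0
qb = np.zeros((4, 4)); qb[:, 2:] = 1.0
fused = fuse_by_local_quality([a, b], [qa, qb])
```

In the toy example the fused image takes its left half from A and its right half from B, which is how fusing two tone-mapped images could score higher than either input.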
Application of TMQI:
Optimal Design of TMOs
Given HDR image X, find the Maximum Structural
Fidelity (MSF) tone-mapped image:
Assuming smooth behavior of the SF function, use
gradient ascent method
[Yeganeh & Wang, ICASSP ’13]
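The formula on this slide did not survive extraction. From the surrounding text, the problem and a generic gradient ascent update can plausibly be written as follows, where SF(X, Y) denotes the structural fidelity between the HDR image X and a candidate tone-mapped image Y. The iteration index k and the step size \lambda are notation introduced here, not taken from the slide.

```latex
\hat{Y}_{\mathrm{MSF}} = \arg\max_{Y} \; \mathrm{SF}(X, Y)

Y_{k+1} = Y_k + \lambda \, \nabla_{Y} \mathrm{SF}(X, Y) \Big|_{Y = Y_k}
```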
[Figure panels: initial image by Gamma (2.2) mapping | 10 iterations | 30 iterations | 50 iterations | 100 iterations]
[Ma, Yeganeh, Zeng & Wang, ’14]
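The progression shown here (a gamma-mapped starting point refined over many iterations) can be sketched generically. In the code below, `gamma_init` assumes that "Gamma (2.2) mapping" means normalizing the HDR values and raising them to the power 1/2.2. The objective is a deliberately trivial stand-in with a known gradient, not the structural fidelity measure or TMQI. Function names, step size and target are all illustrative.

```python
import numpy as np

def gamma_init(hdr, gamma=2.2):
    """Assumed initial mapping: normalize to [0, 1], apply 1/gamma, scale to 8 bits."""
    hdr = np.asarray(hdr, dtype=np.float64)
    return 255.0 * (hdr / hdr.max()) ** (1.0 / gamma)

def gradient_ascent(y0, grad_fn, step=0.5, iterations=100):
    """Repeatedly move along the gradient, keeping pixels in the 8-bit range."""
    y = np.array(y0, dtype=np.float64)
    for _ in range(iterations):
        y = np.clip(y + step * grad_fn(y), 0.0, 255.0)
    return y

# Stand-in objective f(Y) = -0.5 * sum((Y - target)^2), with gradient target - Y.
rng = np.random.default_rng(0)
hdr = rng.uniform(0.01, 1000.0, size=(16, 16))
y0 = gamma_init(hdr)
target = np.full_like(y0, 128.0)
result = gradient_ascent(y0, lambda y: target - y, step=0.5, iterations=100)
```

With a real quality model, `grad_fn` would return the gradient of that model with respect to the tone-mapped image, and intermediate iterates would correspond to the 10, 30, 50 and 100 iteration panels.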
IQA Problem: Which MEF is the Best?
MEF-IQA
Multi-exposure fused images and multi-scale quality maps indicating
local structural detail loss and distortions
[Ma, Zeng & Wang, ’15]
Application of MEF-IQA:
Parameter Tuning in MEF
[Ma, Zeng & Wang, ’15]
Application of MEF-IQA:
Optimal Design of MEF Algorithms
by [Song, TIP’12], one of the top
performers in the literature
by proposed method
[Ma & Wang, ’15]
THE END
Thank you!
https://ece.uwaterloo.ca/~z70wang/