Objective Image and Video Quality Assessment – SSIM, SSIMplus, High Dynamic Range Image Tone Mapping, and Multi-Exposure Image Fusion

Zhou Wang
Department of Electrical and Computer Engineering
University of Waterloo
2015

The University of Waterloo
• 50+ Years of Growth
  – Founded in 1957
  – 35,000+ students, 1,100+ faculty
• Reputation of Innovation
  – World’s largest co-op program
  – Inventor-owned I.P. policy
  – 20%+ Canadian university spin-offs
  – 20 best places in the world to launch a business – Fortune Magazine
  – Open Text, Maplesoft, Blackberry, Dalsa, Desire2Learn, Miovision, …
• ICT at UWaterloo
  – Largest Engineering/CS/Math programs in Canada
  – Largest amount of research funding in ICT in Canada
  – 20+ Canada Research Chairs, IEEE Fellows, …
Most Innovative University in Canada, 23 straight years

Prof. Wang’s Collaborators and Team

Past/Current Collaborators
• Prof. Alan C. Bovik, UT-Austin
• Prof. Eero P. Simoncelli, NYU
• Prof. David Zhang, HK PolyU
• Prof. Lei Zhang, HK PolyU
• Prof. Edward Vrscay, U Waterloo
• Prof. Guangzhe Fan, U Waterloo
• Prof. David Koff, McMaster U
• Prof. Jiying Zhao, U Ottawa
• Prof. Wen Gao, Peking U
• Prof. Siwei Ma, Peking U
• Prof. Weisi Lin, NTU
• Dr. Hamid R. Sheikh, Samsung
• Dr. Ligang Lu, IBM
• Dr. Mehul Sampat, UCSF
• Dr. Umesh Rajashekar, NYU
• ……

Past Team Members
• 4 Postdocs
• 8 PhD Students
• 8 Master’s Students
• 4 Visiting PhD Students
• 10+ Undergrad Research Assistants/Research Associates/Visiting Scholars

Current Team
• 4 Postdocs
• 5 PhD Students
• 2 Master’s Students
• 1 Visiting PhD Student
• 1 Research Associate
• 2 Undergrad Research Assistants

Research Focus: Image/Video Perceptual Communication
• Transmitter: signal source, sensing & recording
• Channel: processing, storage & transmission
• Receiver: reconstruction & displaying, perception & understanding

Research Focus
• Perceptual image/video quality assessment (I/VQA)
  – Full-reference/reduced-reference/no-reference I/VQA
  – I/VQA for compression, blockiness, blur/sharpness, denoising, interpolation/superresolution, tone-mapping, fusion, windowing, color-to-gray conversion, dehazing, retargeting, medical imaging, …
  – I/VQA for 3D, high-dynamic-range, high-frame-rate, screen content
• Perceptual image/video optimization
  – Perceptually inspired compression
  – Perceptually inspired denoising, deblurring, reconstruction, interpolation/superres, tone-mapping, fusion, retargeting, …
  – Perceptually inspired recognition/classification
• Perceptual image/video transmission
  – Quality-aware image/video
  – Perceptual quality-driven adaptive streaming

SSIM and SSIMplus

Image/Video Quality Assessment (I/VQA)
• The structural similarity (SSIM) index (2004)
  [Figure panels: Original, Compressed, Absolute error map, SSIM map]
  – Most cited/used “perceptual I/VQA metric” in the literature/industry
  – Better quality predictor than MSE/PSNR, fast, intuitive quality map, differentiable, locally convex, transform invariance, distance metric, …
  – Extendable: MS-SSIM, IW-SSIM, CW-SSIM, FSIM, RR-SSIM, …

SSIM as a “Generic” I/VQA Model
[Figure panels, one image each:]
• Original, MSE=0, SSIM=1
• MSE=144, SSIM=0.913
• MSE=144, SSIM=0.988
• MSE=142, SSIM=0.662
• MSE=144, SSIM=0.694
• MSE=144, SSIM=0.840

MAD Competition – MSE vs. SSIM

Limitations of SSIM
• Limitations in All Real Applications
  – Reference images may not be available
  – Reference images may not be of perfect quality (noisy, blurry, compressed, etc.)
  – Meanings of scores are not intuitive
• Limitations in Image Acquisition Applications
  – Reference images may not have the same dynamic range
  – Reference images may not have the same exposure levels
  – There may be multiple reference images (with different exposures, resolutions, spatial alignment, etc.)
• Limitations in Visual Communication Applications
  – Network and receiving device conditions (packet loss, delay, bandwidth, decoding speed, buffer, etc.) not considered
  – Display device/resolution/viewing condition not considered

From Quality to Quality-of-Experience (QoE)
[Diagram: Video Hosting Server, video stream, Network, Consumer Device; measurement points for Video Quality Assessment and Video QoE Assessment]
• Understand the experience of end consumers
  – Critical for user engagement
• Optimize the experience of end consumers
  – Optimize video preparation at hosting server
  – Optimize resource allocation in the network
  – Optimize streaming strategy at the receiver

Main Challenge: A Meaningful QoE Model
• Desirable properties
  – High accuracy
  – High speed
  – Meaningful assessment across display device/viewing condition
  – Meaningful assessment across display resolution
  – Meaningful assessment across content
  – Localized quality indicator
• Are current VQA methods satisfactory?
  – How do PSNR, SSIM, MS-SSIM, MOVIE, VQM perform?
  – How do commercial products perform?
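
The equal-MSE panels on the "SSIM as a 'Generic' I/VQA Model" slide can be reproduced in spirit with a few lines of NumPy. The sketch below is a simplified illustration, not the reference SSIM implementation: it evaluates the SSIM formula on non-overlapping 8x8 blocks with unweighted statistics (the published index uses a sliding Gaussian-weighted window), and the synthetic test image, the helper names (`mse`, `psnr`, `block_ssim`), and the two distortions are assumptions made for this example.

```python
import numpy as np


def mse(ref, test):
    """Mean squared error between two images of equal shape."""
    return float(np.mean((ref - test) ** 2))


def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(ref, test)
    return float("inf") if err == 0 else float(10.0 * np.log10(peak ** 2 / err))


def block_ssim(ref, test, block=8, peak=255.0, k1=0.01, k2=0.03):
    """Mean SSIM over non-overlapping blocks (simplified: no Gaussian window)."""
    c1, c2 = (k1 * peak) ** 2, (k2 * peak) ** 2
    # Crop so that both dimensions are multiples of the block size.
    h = ref.shape[0] - ref.shape[0] % block
    w = ref.shape[1] - ref.shape[1] % block

    def tiles(img):
        # Rearrange the image into one row of pixels per block.
        t = img[:h, :w].reshape(h // block, block, w // block, block)
        return t.transpose(0, 2, 1, 3).reshape(-1, block * block)

    x, y = tiles(ref), tiles(test)
    mx, my = x.mean(axis=1), y.mean(axis=1)
    vx, vy = x.var(axis=1), y.var(axis=1)
    cxy = ((x - mx[:, None]) * (y - my[:, None])).mean(axis=1)
    ssim = ((2 * mx * my + c1) * (2 * cxy + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )
    return float(ssim.mean())


# Synthetic reference image (an assumption; any grayscale image would do).
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:128, 0:128]
ref = 128.0 + 60.0 * np.sin(xx / 7.0) * np.cos(yy / 11.0)

# Distortion A: uniform brightness shift of 12 gray levels, so MSE = 144.
shifted = ref + 12.0

# Distortion B: zero-mean noise rescaled so that its MSE is also 144.
noise = rng.standard_normal(ref.shape)
noise -= noise.mean()
noise *= 12.0 / np.sqrt(np.mean(noise ** 2))
noisy = ref + noise

mse_shift, mse_noise = mse(ref, shifted), mse(ref, noisy)
ssim_shift, ssim_noise = block_ssim(ref, shifted), block_ssim(ref, noisy)
print(f"shift: MSE={mse_shift:.1f}  PSNR={psnr(ref, shifted):.2f} dB  SSIM={ssim_shift:.3f}")
print(f"noise: MSE={mse_noise:.1f}  PSNR={psnr(ref, noisy):.2f} dB  SSIM={ssim_noise:.3f}")
```

Both distorted images have the same MSE by construction, so MSE and PSNR cannot tell them apart, whereas the block SSIM score should come out higher for the brightness shift (structure preserved) than for the additive noise.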

SSIMplus
• Comparisons with SSIM
  – Built upon SSIM philosophy; more advanced perceptual modeling
  – Higher accuracy
  – Higher speed
  – Display device/viewing condition adapted assessment
  – Display resolution adapted assessment
  – Strong cross-content property
  – Localized quality indicator
• For more details: https://ece.uwaterloo.ca/~z70wang/research/ssimplus/
  – Demo …

Subjective Test
• Video content
  – 1920x1080 and 1136x640, 24 fps, compressed by H.264
• Display device and viewing condition
• Subjective experiment
  – 30 naïve subjects
  – Compute mean opinion scores (MOSs) after removing outliers
[Rehman, Zeng & Wang, Electronic Imaging ’15]

Test Results (one plot slide per metric) [Rehman, Zeng & Wang, Electronic Imaging ’15]
• Test Result: SSIM
• Test Result: PSNR
• Test Result: VQM
• Test Result: MOVIE
• Test Result: Tektronix DMOS
• Test Result: Video Clarity DMOS
• Test Result: SSIMplus

Test Result: All
• PLCC: Pearson’s correlation coefficient
• MAE: mean absolute error
• RMS: root mean squared error
• SRCC: Spearman’s rank-order correlation coefficient
• KRCC: Kendall’s rank-order correlation coefficient
[Rehman, Zeng & Wang, Electronic Imaging ’15]

High Dynamic Range (HDR) Image Tone Mapping (TMO) and Multi-Exposure Image Fusion (MEF)

HDR Tone Mapping vs. MEF
[Diagram labels: HDR image, Construction, MEF, TMO]

IQA Problem: Which TMO is the Best?
• Reference image: HDR; test image: LDR
• No effective solution in the literature

Tone Mapped Image Quality Index (TMQI)
[Figure: tone-mapped images and multi-scale quality maps indicating local structural detail loss and distortions]
[Yeganeh & Wang, IEEE Trans Image Proc. ’13]

Application of TMQI: Parameter Tuning in TMO [Drago ’03]
• b = 0.1, TMQI = 0.89
• b = 0.8, TMQI = 0.90
• b = 1, TMQI = 0.84
[Yeganeh & Wang, IEEE Trans Image Proc. ’13]

Application of TMQI: Fusion of Tone Mapped Images
• TMO1, TMQI = 0.80
• TMO2, TMQI = 0.87
• Local quality weighted averaging
• Fused image, TMQI = 0.94
[Yeganeh & Wang, IEEE Trans Image Proc. ’13]

Application of TMQI: Optimal Design of TMOs
• Given an HDR image X, find the Maximum Structural Fidelity (MSF) tone-mapped image
• Assuming smooth behavior of the SF function, use the gradient ascent method
[Yeganeh & Wang, ICASSP ’13]
[Figure panels: initial image by Gamma (2.2) mapping; 10 iterations; 30 iterations; 50 iterations; 100 iterations]
[Ma, Yeganeh, Zeng & Wang, ’14]

IQA Problem: Which MEF is the Best?

MEF-IQA
[Figure: multi-exposure fused images and multi-scale quality maps indicating local structural detail loss and distortions]
[Ma, Zeng & Wang, ’15]

Application of MEF-IQA: Parameter Tuning in TMO
[Ma, Zeng & Wang, ’15]

Application of MEF-IQA: Optimal Design of MEF Algorithms
[Figure panels: result by [Song, TIP’12], one of the top performers in the literature; result by proposed method]
[Ma & Wang, ’15]

THE END
Thank you!
https://ece.uwaterloo.ca/~z70wang/