DCT-based Image/Video Compression: New Design Perspectives


DCT-based Image/Video Compression: New Design Perspectives

by

Chang Sun

A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Doctor of Philosophy in Electrical and Computer Engineering

Waterloo, Ontario, Canada, 2014

© Chang Sun 2014

I hereby declare that I am the sole author of this thesis. This is a true copy of the thesis, including any required final revisions, as accepted by my examiners.

I understand that my thesis may be made electronically available to the public.

Abstract

To push the envelope of DCT-based lossy image/video compression, this thesis is motivated to revisit design of some fundamental blocks in image/video coding, ranging from source modelling, quantization table, quantizers, to entropy coding. Firstly, to better handle the heavy tail phenomenon commonly seen in DCT coefficients, a new model dubbed transparent composite model (TCM) is developed and justified. Given a sequence of DCT coefficients, the TCM first separates the tail from the main body of the sequence, and then uses a uniform distribution to model DCT coefficients in the heavy tail, while using a parametric distribution to model DCT coefficients in the main body. The separation boundary and other distribution parameters are estimated online via maximum likelihood (ML) estimation. Efficient online algorithms are proposed for parameter estimation and their convergence is also proved. When the parametric distribution is truncated Laplacian, the resulting TCM dubbed Laplacian TCM (LPTCM) not only achieves superior modeling accuracy with low estimation complexity, but also has a good capability of nonlinear data reduction by identifying and separating a DCT coefficient in the heavy tail (referred to as an outlier) from a DCT coefficient in the main body (referred to as an inlier). This in turn opens up opportunities for it to be used in DCT-based image compression.

Secondly, quantization table design is revisited for image/video coding where soft decision quantization (SDQ) is considered. Unlike conventional approaches where quantization table design is bundled with a specific encoding method, we assume optimal SDQ encoding and design a quantization table for the purpose of reconstruction. Under this assumption, we model transform coefficients across different frequencies as independently distributed random sources and apply the Shannon lower bound to approximate the rate distortion function of each source. We then show that a quantization table can be optimized in a way that the resulting distortion complies with certain behaviour, yielding the so-called optimal distortion profile scheme (OptD). Guided by this new theoretical result, we present an efficient statistical-model-based algorithm using the Laplacian model to design quantization tables for DCT-based image compression. When applied to standard JPEG encoding, it provides more than 1.5 dB performance gain (in PSNR), with almost no extra burden on complexity. Compared with the state-of-the-art JPEG quantization table optimizer, the proposed algorithm offers an average 0.5 dB gain with computational complexity reduced by a factor of more than 2000 when SDQ is off, and a 0.1 dB performance gain or more with 85% of the complexity reduced when SDQ is on.

Thirdly, based on the LPTCM and OptD, we further propose an efficient non-predictive DCT-based image compression system, where the quantizers and entropy coding are completely re-designed, and the relative SDQ algorithm is also developed. The proposed system achieves overall coding results that are among the best and similar to those of H.264 or HEVC intra (predictive) coding, in terms of rate vs visual quality. On the other hand, in terms of rate vs objective quality, it significantly outperforms baseline JPEG by more than 4.3 dB on average, with a moderate increase on complexity, and ECEB, the state-of-the-art non-predictive image coding, by 0.75 dB when SDQ is off, with the same level of computational complexity, and by 1 dB when SDQ is on, at the cost of extra complexity. In comparison with H.264 intra coding, our system provides an overall 0.4 dB gain or so, with dramatically reduced computational complexity. It offers comparable or even better coding performance than HEVC intra coding in the high-rate region or for complicated images, but with only less than 5% of the encoding complexity of the latter. In addition, our proposed DCT-based image compression system also offers a multiresolution capability, which, together with its comparatively high coding efficiency and low complexity, makes it a good alternative for real-time image processing applications.

Acknowledgements

Before a long list of people to whom I am indebted for making this thesis possible, my deepest appreciation and gratitude are reserved for my supervisor, Professor En-hui Yang, for his invaluable and constant guidance throughout my Ph.D. studies at the University of Waterloo. Through years of extensive training from Professor Yang, I not only became inspired to do cutting-edge research, but was also profoundly influenced to think more precisely, logically, and sharply. Beyond the academic training, he also provides care to my personal life and empowers me to persevere through all expected, inevitable, and unforeseeable obstacles.

I am extremely grateful to all my examining committee members, formed by distinguished scholars. I would like to thank Professor Zhou Wang, Professor George Freeman, and Professor Xinzhi Liu, for their valuable comments for my comprehensive exam and their commitment to my Ph.D. thesis defence. I would also like to thank Professor Jie Liang from Simon Fraser University, for his commitment to serve as my external examining committee member.

My sincere thanks also go to previous and current members of the Multimedia Communications Laboratory, with many of whom I have forged strong friendships and collaborations, including Dr. Da-ke He, Dr. Wei Sun, Dr. Xiang Yu, Dr. Jin Meng, Dr. Lin Zheng, Dr. Mehdi Torbatian, Dr. Shenghao Yang, Krzysztof Hebel, James Ho, Yuhan Zhou, Hui Zha, Fei Teng, Jie Zhang, Krishna Rapaka, Nan Hu, Yueming Gao, and Mahshad Eslamifar.

Finally, I give my gratitude to my father, Guangyu Sun, my mother, Ruimin Ge, and my wife, Huijuan Ding, for their unconditional love, understanding, encouragement and support to my Ph.D. studies.

To Huijuan and Ethan

Table of Contents

List of Tables
List of Figures

1 Introduction
  1.1 Thesis motivation
  1.2 Thesis contributions
  1.3 Thesis organization

2 Transparent Composite Model for DCT coefficients
  2.1 Literature review
    2.1.1 Models in the literature for DCT coefficients
    2.1.2 Measurement for modelling accuracy
  2.2 Heavy tail observations in DCT coefficients
  2.3 Continuous Transparent Composite Model
    2.3.1 Description of general continuous TCMs
    2.3.2 ML estimate of TCM parameters
    2.3.3 LPTCM
    2.3.4 GGTCM
  2.4 Discrete Transparent Composite Model
    2.4.1 GMTCM
    2.4.2 ML Estimate of GMTCM parameters
  2.5 Experimental results on Tests of modelling Accuracy
    2.5.1 Test conditions and test materials
    2.5.2 Overall comparisons for each image
    2.5.3 Comparisons of χ² among three models for individual frequencies
  2.6 Data reduction capability of Transparent Composite Model
  2.7 Chapter Summary

3 Quantization Table Design Revisited for Image/Video compression
  3.1 Literature review
  3.2 Quantization table design–Problem formulation
  3.3 Quantization table design–Problem solution
  3.4 Application to DCT-based Image Compression
  3.5 Experimental results
  3.6 Chapter summary

4 An Efficient DCT-based Image Compression System Based on Laplacian Transparent Composite Model
  4.1 Literature review
  4.2 Transparent composite quantizers
    4.2.1 Constrained dead-zone quantizer design–Problem formulation
    4.2.2 Constrained dead-zone quantizer design–Problem solution
  4.3 Transparent composite coding
    4.3.1 Context-adaptive layer-based bi-level image coding
    4.3.2 Context-adaptive layer-based composite arithmetic coding
  4.4 Soft-decision quantization design
    4.4.1 Layer-based SDQ design–Problem formulation
    4.4.2 Layer-based SDQ design–Problem solution
  4.5 Multiresolution capability of the proposed image compression system
  4.6 Experimental results
    4.6.1 Experimental results of the CDZQ
    4.6.2 Experimental results of the CALBIC
    4.6.3 Experimental results of the TCC
    4.6.4 Experimental results of the proposed image coding system–subjective tests
    4.6.5 Experimental results of the proposed image coding system–objective tests
  4.7 Chapter review

5 Conclusion and future work
  5.1 Conclusion
  5.2 Future work

Bibliography

List of Tables

2.1 Comparing Cauchy model with GGD (continuous DCT).
2.2 Comparing LPTCM with GGD (continuous DCT).
2.3 Overall comparisons between the GMTCM and GG model for all images coded using JPEG with QF 100.
2.4 Overall comparisons between the GMTCM and GG model for all images coded using JPEG with QF 90.
2.5 Overall comparisons between the GMTCM and GG model for all images coded using JPEG with QF 80.
2.6 Overall comparisons between the GMTCM and GG model for all images coded using JPEG with QF 70.
3.1 PSNR performance comparison of different Q-table design methods for baseline JPEG encoding for 512 × 512 Airplane (F16)
3.2 PSNR performance comparison of different Q-table design methods for baseline JPEG encoding for 512 × 512 GoldHill
3.3 PSNR performance comparison of different Q-table design methods for baseline JPEG encoding for 512 × 512 Lena
3.4 PSNR performance comparison of different Q-table design methods for baseline JPEG encoding for 512 × 512 Dome
3.5 PSNR performance comparison of different Q-table design methods for baseline JPEG encoding for 720p Stockholm (1st frame)
3.6 PSNR performance comparison of different Q-table design methods for baseline JPEG encoding for 1080p Kimono (1st frame)
3.7 PSNR performance comparison of different quantization table design methods for ARL and ECEB encoding for 512 × 512 Airplane (F16)
3.8 PSNR performance comparison of different quantization table design methods for ARL and ECEB encoding for 512 × 512 GoldHill
3.9 PSNR performance comparison of different quantization table design methods for ARL and ECEB encoding for 512 × 512 Lena
3.10 PSNR performance comparison of different quantization table design methods for ARL and ECEB encoding for 512 × 512 Dome
3.11 PSNR performance comparison of different quantization table design methods for ARL and ECEB encoding for 720p Stockholm (1st frame)
3.12 PSNR performance comparison of different quantization table design methods for ARL and ECEB encoding for 1080p Kimono (1st frame)
3.13 Computer running time (in milliseconds) of different quantization table design methods and other encoding components for baseline JPEG encoding for 512 × 512 images
3.14 Computer running time (in milliseconds) of all encoding components for ARL or/and ECEB for 512 × 512 images
4.1 MCL definition in the proposed image compression system
4.2 Rate comparison for OTF image of 512 × 512 Lena
4.3 Rate comparison for OTF image of 512 × 512 GoldHill
4.4 Lossless coding rate comparison for 512 × 512 Lena
4.5 Lossless coding rate comparison for 512 × 512 GoldHill
4.6 The predefined q
