Highly Parallel Mode Decision Method for HEVC
Jun Zhang, Feng Dai, Yike Ma, and Yongdong Zhang
Picture Coding Symposium (PCS), 2013

Outline
◦ Introduction
◦ Data dependency analysis and removing
◦ Proposed method
◦ Implementation and results
◦ Conclusion

Introduction #1
Coding Tree Unit
◦ CU, TU, PU
◦ Depth

Introduction #2
Motion vector prediction
◦ Motion merge
◦ AMVP

Introduction #3
Context-adaptive binary arithmetic coding (CABAC)

Introduction #4
Motion estimation region (MER)
◦ Square
◦ Generally, the size of a MER is 32×32
[Figure: four MERs]

Data dependency analysis and removing #1
Data dependency between neighboring PUs
◦ The current CU can start motion vector prediction only after B0, B1, B2, A0, and A1 are available.

Data dependency analysis and removing #2
Solution: Merge estimation region [12]
◦ If a neighboring PU and the current PU belong to the same MER, the neighboring PU is treated as unavailable for spatial MVP derivation in the merge/skip MVP list construction process.
[Figure: four MERs]

Data dependency analysis and removing #3
HEVC uses only one entropy coding method: CABAC.

Data dependency analysis and removing #4
To run mode decision (MD) in parallel in a MER containing multiple CUs, the encoder must handle the absence of context models (CMs) for each CU: the neighboring CUs are being encoded at the same time, so the accumulated CMs are not yet available.

Data dependency analysis and removing #5
MD for all CUs in the same MER shares the same set of CMs, trained up to the previous MER.

Data dependency analysis and removing #6
Neighboring coding mode information is needed for CM selection, or context modeling, for a bin, which creates additional dependencies among CUs.

Data dependency analysis and removing #7

Proposed method #1
The proposed parallel MD method is based on MER; MD for all CUs in the same MER can be fully parallelized.
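
The MER availability rule from the data dependency analysis can be illustrated with a small sketch. This is not the authors' code: the function names are hypothetical, the neighbor positions follow the usual HEVC layout of A0, A1, B0, B1, B2 around a PU, and a 32×32 MER (log2 size 5) is assumed. Picture-boundary and slice availability checks are deliberately left out.

```python
def neighbour_positions(x, y, w, h):
    """Luma positions of the five spatial neighbours of a PU at (x, y), size w x h."""
    return {
        "A0": (x - 1, y + h),       # below-left
        "A1": (x - 1, y + h - 1),   # left
        "B0": (x + w, y - 1),       # above-right
        "B1": (x + w - 1, y - 1),   # above
        "B2": (x - 1, y - 1),       # above-left
    }


def same_mer(pos_a, pos_b, log2_mer_size=5):
    """True if both positions fall inside the same square MER."""
    (xa, ya), (xb, yb) = pos_a, pos_b
    return (xa >> log2_mer_size == xb >> log2_mer_size
            and ya >> log2_mer_size == yb >> log2_mer_size)


def available_spatial_neighbours(x, y, w, h, log2_mer_size=5):
    """Names of neighbours that stay usable for the merge/skip list.

    A neighbour in the same MER as the current PU is treated as unavailable.
    Only the MER rule is modelled here (assumption: everything else is available).
    """
    cur = (x, y)
    return sorted(
        name
        for name, pos in neighbour_positions(x, y, w, h).items()
        if not same_mer(cur, pos, log2_mer_size)
    )


# An 8x8 PU on the left edge of its MER: the neighbours above it lie in the same MER.
print(available_spatial_neighbours(32, 48, 8, 8))  # ['A0', 'A1', 'B2']
```

A PU in the interior of a MER, such as an 8×8 PU at (40, 40), loses all five spatial candidates under this rule, which is what removes the dependency between PUs of the same MER.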

Proposed method #2
Parallel processing among CUs
◦ MD for all potential CUs within the same MER, including CUs of the same and of different splitting depths, is computed concurrently, i.e., all nodes in the quadtree perform MD in parallel.

Proposed method #3
Parallel processing within a CU
◦ For a given CU, many PU partition modes can be used, and each yields an RD cost with the corresponding coding information after ME and TU splitting computation. There are no explicit dependencies between these PU partition modes, so they can be evaluated concurrently and independently.

Proposed method #4
Parallel ME among PUs
◦ We propose that all PUs in a CU perform ME concurrently, including merge mode estimation and regular motion estimation. Because all CUs in the same MER conduct MD concurrently, ME for all PUs in the same MER effectively runs in parallel.

Implementation and results #1

Implementation and results #2

Conclusion
◦ Small bitrate increase and high encoding speedup.
◦ Dependencies are removed by using MERs.
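
The parallel processing among CUs described in the proposed method can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `cus_in_mer`, `toy_rd_cost`, `parallel_md`, and `best_cost` are hypothetical names, the cost function is a placeholder rather than a real RD computation, and the bottom-up split comparison is the usual quadtree decision added here only to show how the independently computed costs could be combined. Every CU reads the same frozen context-model snapshot, mirroring the shared set of CMs.

```python
from concurrent.futures import ThreadPoolExecutor
from types import MappingProxyType


def cus_in_mer(x0, y0, mer_size=32, min_cu=8):
    """Enumerate every potential CU (all quadtree nodes) inside one MER."""
    cus = []
    size = mer_size
    while size >= min_cu:
        for y in range(y0, y0 + mer_size, size):
            for x in range(x0, x0 + mer_size, size):
                cus.append((x, y, size))
        size //= 2
    return cus


def toy_rd_cost(cu, cm_snapshot):
    """Placeholder cost (assumption): depends only on the CU and the frozen CMs."""
    x, y, size = cu
    return size * size * cm_snapshot["bits_per_pixel"] + ((x + y) % 16)


def parallel_md(x0, y0, cm_snapshot, mer_size=32, min_cu=8):
    """Evaluate all CUs of a MER concurrently against one shared CM snapshot."""
    cus = cus_in_mer(x0, y0, mer_size, min_cu)
    # Threads show the structure only; a real encoder would use native threads or a GPU.
    with ThreadPoolExecutor() as pool:
        costs = list(pool.map(lambda cu: toy_rd_cost(cu, cm_snapshot), cus))
    return dict(zip(cus, costs))


def best_cost(costs, x, y, size, min_cu=8):
    """Bottom-up choice between coding a CU whole or splitting it into four."""
    own = costs[(x, y, size)]
    if size == min_cu:
        return own
    half = size // 2
    split = sum(
        best_cost(costs, x + dx, y + dy, half, min_cu)
        for dy in (0, half)
        for dx in (0, half)
    )
    return min(own, split)


# Frozen snapshot standing in for the CMs trained up to the previous MER.
snapshot = MappingProxyType({"bits_per_pixel": 0.5})
costs = parallel_md(0, 0, snapshot)
print(len(costs), best_cost(costs, 0, 0, 32))
```

With a 32×32 MER and an 8×8 minimum CU size, the quadtree has 1 + 4 + 16 = 21 nodes, so 21 cost evaluations are submitted at once. The split decision runs only after all of them have finished.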