Highly Parallel Mode Decision Method for HEVC

Jun Zhang, Feng Dai, Yike Ma, and Yongdong Zhang
Picture Coding Symposium (PCS), 2013
Outline
Introduction
Data dependency analysis and removal
Proposed method
Implementation and results
Conclusion

Introduction #1

Coding Tree Unit
◦ CU, TU, PU
◦ Depth
Introduction #2

Motion vector prediction
◦ Motion merge
◦ AMVP
Introduction #3

Context-adaptive binary arithmetic coding (CABAC)
Introduction #4

Merge estimation region (MER)
◦ Square
◦ Generally, the size of a MER is 32×32
[Figure: four regions, each labeled MER]
Data dependency analysis and removal #1

Data dependency between neighboring PUs
◦ The current CU can start motion vector prediction only after
its spatial neighbors B0, B1, B2, A0, and A1 are available.
Data dependency analysis and removal #2

Solution: merge estimation region [12]
◦ If a neighboring PU and the current PU belong
to the same MER, the neighboring PU is treated
as unavailable for spatial MVP derivation when
the merge/skip MVP list is constructed.
[Figure: four regions, each labeled MER]
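The MER rule above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the neighbor positions follow the usual HEVC layout (A0 and A1 to the left, B0, B1, and B2 above), the function names are hypothetical, and a 32×32 MER (log2 size 5) is assumed. Checks against the right and bottom picture boundaries and against coding order are omitted.

```python
def spatial_neighbors(x, y, w, h):
    """Sample positions of the five spatial merge candidates of a PU
    whose top-left corner is (x, y) and whose size is w x h."""
    return {
        "A0": (x - 1, y + h),
        "A1": (x - 1, y + h - 1),
        "B0": (x + w, y - 1),
        "B1": (x + w - 1, y - 1),
        "B2": (x - 1, y - 1),
    }


def in_same_mer(pos_a, pos_b, log2_mer_size=5):
    """True if both sample positions fall into the same MER."""
    return ((pos_a[0] >> log2_mer_size) == (pos_b[0] >> log2_mer_size)
            and (pos_a[1] >> log2_mer_size) == (pos_b[1] >> log2_mer_size))


def usable_merge_neighbors(x, y, w, h, log2_mer_size=5):
    """Neighbors that may still feed the merge/skip candidate list.

    A neighbor is dropped if it lies left of or above the picture, or if
    it lies in the same MER as the current PU.
    """
    return {
        name: pos
        for name, pos in spatial_neighbors(x, y, w, h).items()
        if pos[0] >= 0 and pos[1] >= 0
        and not in_same_mer((x, y), pos, log2_mer_size)
    }
```

For an 8×8 PU at (40, 40), all five neighbors lie inside the MER that starts at (32, 32), so none is usable. For an 8×8 PU at (32, 40), only the left-side neighbors A0, A1, and B2 remain.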
Data dependency analysis and removal #2
Data dependency analysis and removal #3

HEVC uses only one entropy coding method: CABAC
Data dependency analysis and removal #4

To run mode decision (MD) in parallel inside a MER that
contains multiple CUs, the encoder must handle the absence
of context models (CMs) for each CU: the neighboring CUs are
still being encoded, so their accumulated CMs are not yet
available.
Data dependency analysis and removal #5

MD for all CUs in the same MER shares the same set of CMs,
namely the CMs as trained up to the end of the previous MER.
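The shared-CM idea can be illustrated with a toy rate estimator. This is a sketch under strong assumptions: each CM is reduced to a single probability that the bin equals 1 (real CABAC uses a finite-state probability model), and the names `bin_cost`, `estimate_cu_bits`, and `md_rates_for_mer` are hypothetical. The point is only that every CU reads the same frozen snapshot, so the estimates do not depend on the order in which CUs are processed.

```python
import math
from types import MappingProxyType


def bin_cost(bin_value, p_one):
    """Estimated bits for one bin, given P(bin == 1) from its CM."""
    return -math.log2(p_one if bin_value else 1.0 - p_one)


def estimate_cu_bits(cu_bins, cms):
    """Rate estimate for one CU; reads the CMs but never updates them."""
    return sum(bin_cost(value, cms[name]) for name, value in cu_bins)


def md_rates_for_mer(cus, cms_after_previous_mer):
    """Estimate the rate of every CU in a MER from one shared CM set."""
    # Read-only snapshot of the CMs trained up to the previous MER.
    shared = MappingProxyType(dict(cms_after_previous_mer))
    # Each iteration is independent, so the loop could run in parallel.
    return [estimate_cu_bits(cu, shared) for cu in cus]
```

With `cms = {"skip_flag": 0.5, "merge_flag": 0.25}`, a CU coded as `[("skip_flag", 0), ("merge_flag", 1)]` is estimated at 1 + 2 = 3 bits regardless of which other CUs in the MER were estimated before it.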
Data dependency analysis and removal #6

Neighboring coding mode information is
needed for CM selection or context
modeling of a bin, which creates
additional dependencies among CUs.
Data dependency analysis and removal #7
Proposed method #1

The proposed parallel MD method is based on the MER:
MD for all CUs in the same MER can be fully parallelized.
Proposed method #2

Parallel processing among CUs
◦ MD for all potential CUs within the same MER,
including CUs of the same and of different splitting depths,
is computed concurrently, i.e. all nodes in the
quadtree perform MD in parallel.
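A minimal sketch of this CU-level parallelism follows. It enumerates every quadtree node inside one MER and hands each node to a worker. The 8×8 minimum CU size, the function names, and the `md_one_cu` callback are assumptions made for illustration, and the thread pool only demonstrates that the tasks are independent; it is not a statement about how the authors implemented or measured their encoder.

```python
from concurrent.futures import ThreadPoolExecutor


def cus_in_mer(x0, y0, mer_size=32, min_cu_size=8):
    """All candidate CUs (x, y, size) of every depth inside one MER."""
    cus = []
    size = mer_size
    while size >= min_cu_size:
        for y in range(y0, y0 + mer_size, size):
            for x in range(x0, x0 + mer_size, size):
                cus.append((x, y, size))
        size //= 2  # go one quadtree depth deeper
    return cus


def parallel_md(cus, md_one_cu, workers=4):
    """Run the (hypothetical) per-CU mode decision for all CUs at once."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with cus.
        return dict(zip(cus, pool.map(md_one_cu, cus)))
```

With a 32×32 MER and an 8×8 minimum CU this yields 1 + 4 + 16 = 21 CUs, all of which can be evaluated without waiting for one another.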
Proposed method #3

Parallel processing within a CU
◦ For a given CU, many PU partition modes can be
used, and each one yields an RD cost with the
corresponding coding information after ME and TU
splitting computation. There are no explicit
dependencies between these PU partition modes, so
they can be evaluated concurrently and independently.
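The evaluation within a CU can be sketched as a parallel map over the partition modes followed by a minimum over the RD costs. This is only an illustration: `evaluate_mode` is a hypothetical callback standing in for ME plus TU splitting, and the mode list is restricted to the symmetric HEVC partitions for brevity.

```python
from concurrent.futures import ThreadPoolExecutor

# Symmetric HEVC PU partition modes; asymmetric modes are left out here.
PART_MODES = ["2Nx2N", "2NxN", "Nx2N", "NxN"]


def best_partition(cu, evaluate_mode, modes=PART_MODES):
    """Evaluate every partition mode concurrently and keep the cheapest.

    evaluate_mode(cu, mode) is assumed to return the RD cost of coding
    the CU with that partition mode.
    """
    with ThreadPoolExecutor() as pool:
        costs = list(pool.map(lambda mode: evaluate_mode(cu, mode), modes))
    # Pair each cost with its mode and pick the smallest cost.
    return min(zip(costs, modes))
```

The only synchronization point is the final `min`, which needs all RD costs; the evaluations themselves never exchange data.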
Proposed method #4

Parallel ME among PUs
◦ We propose that all PUs in a CU perform ME
concurrently, including merge mode estimation and
regular motion estimation. Because all CUs in the
same MER conduct MD concurrently, ME for all PUs
in the same MER effectively runs in parallel.
Implementation and results #1
Implementation and results #2
Conclusion
Small bitrate increase and high encoding speedup.
Dependencies are removed by means of the MER.
