`
`
Joint Video Exploration Team (JVET)
of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11
13th Meeting: Marrakech, MA, 9-18 Jan. 2019
`
Document: JVET-M0088-v32
`
Title: CE10-related: LIC restriction for pipeline structure
Status: Input Document
Purpose: Proposal
Author(s) or Contact(s):
Kiyofumi Abe, Tel: +81-8f7-3083-417), Email: abe.kiyo@jp.panasonic.com
Tadamasa Toma, Email: toma.tadamasa@jp.panasonic.com
Jingya Li, Email: jingya.li@sg.panasonic.com
Che-Wei Kuo, Email: cheweikuo@sg.panasonic.com
Virginie Drugeon, Email: virginie.drugeon@eu.panasonic.com
`
Source: Panasonic
`
`Abstract
`
This contribution is based on the LIC described in CE10.5.2 (JVET-M0087). CE10.5.2 can achieve the
basic pipeline structure with LIC, but it still has a memory bandwidth issue and a stage cycle issue
arising from pipeline restrictions. This contribution provides a solution for these issues by
introducing LIC restrictions for the combination with VPDU split, ATMVP, and MH-Intra. Simulation
results reportedly show that the proposed LIC provides 0.49% BD-rate gain for RA and 0.47% BD-rate
gain for LDB.
`
1 Introduction
`
CE10.5.2 (JVET-M0087 [1]) proposes to introduce LIC by implementing all LIC processes in the
reconstruction stage, achieving a pipeline structure without a latency issue. Figure 1 shows an example
of the decoder pipeline structure. In this structure, the feedback of the neighboring reconstructed
image for LIC can be closed within the reconstruction stage, as is the case for the intra process.
However, it is necessary to consider the combinations with other tools that increase the bandwidth and
stage cycles related to the LIC process. We identified three tool combinations that have a significant
impact on this issue. This contribution proposes to introduce restrictions for these tool combinations.
`
(1) VPDU split process

(2) Combination with ATMVP

(3) Combination with MH-Intra
`
`
`
`
Figure 1: Example of decoder pipeline structure.
`
2 Proposed item 1: VPDU restriction
`
`Problem
`
The LIC parameter is calculated using the top and left reconstructed neighboring samples of the current
CU and the top and left reconstructed neighboring samples of the reference block.
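
The contribution does not spell out how the LIC parameter is obtained from these samples. As a minimal
sketch, assuming the JEM-style linear model (prediction scaled by `a` and offset by `b`, fitted by
least squares), the derivation could look as follows. The function names, the floating-point
arithmetic, and the 10-bit clipping are illustrative assumptions, not the CE10.5.2 derivation itself.

```python
from typing import List, Sequence, Tuple


def derive_lic_params(ref_neighbors: Sequence[int],
                      cur_neighbors: Sequence[int]) -> Tuple[float, float]:
    """Fit cur ~= a * ref + b by ordinary least squares (assumed model)."""
    n = len(ref_neighbors)
    if n == 0 or n != len(cur_neighbors):
        raise ValueError("neighbor templates must be non-empty and equally long")
    sum_x = sum(ref_neighbors)
    sum_y = sum(cur_neighbors)
    sum_xx = sum(x * x for x in ref_neighbors)
    sum_xy = sum(x * y for x, y in zip(ref_neighbors, cur_neighbors))
    denom = n * sum_xx - sum_x * sum_x
    if denom == 0:
        # Flat reference template: fall back to an offset-only model.
        return 1.0, (sum_y - sum_x) / n
    a = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_y - a * sum_x) / n
    return a, b


def apply_lic(pred: Sequence[int], a: float, b: float, bit_depth: int = 10) -> List[int]:
    """Apply the linear model to motion-compensated samples and clip to the sample range."""
    max_val = (1 << bit_depth) - 1
    return [min(max(int(round(a * p + b)), 0), max_val) for p in pred]
```

For example, reference neighbors `[100, 120, 140, 160]` and current neighbors `[210, 250, 290, 330]`
yield `a = 2` and `b = 10` under this model.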
`
Figure 2 shows the relationship of the neighboring samples when the CU size is equal to or less than the
VPDU size. The top and left neighboring samples of the reference block are already included in the MC
access memory area of the current CU. Therefore, there is no impact on the memory bandwidth, and the
number of stage cycles does not increase.
`
[Figure 2 labels: MC access memory; Reference block; Current CU]

Figure 2: Relationship of the neighboring samples when CU size is equal to or less than VPDU size.
`
On the other hand, Figure 3 shows the relationship of the neighboring samples when the CU size is larger
than the VPDU size. The MC access memory area of VPDU1 includes only the area around the reference
block of VPDU1. Therefore, some of the neighboring samples used for LIC parameter calculation require
additional memory access and increase the bandwidth of the VPDU1 processing stage.
`
`
`
`
`
`
`
[Figure 3 labels: VPDU1; VPDU2; VPDU3; VPDU4; Reference block of VPDU1; Current CU]

Figure 3: Relationship of the neighboring samples when CU size is larger than VPDU size.
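
The difference between the two cases can be checked with simple rectangle arithmetic. The sketch below
is only an illustration: it assumes a 64x64 VPDU and an MC fetch window that extends 3 samples before
and 4 samples after the reference block (as an 8-tap interpolation filter would need). Neither margin
is stated in this contribution, and all names are hypothetical.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x0: int  # inclusive
    y0: int  # inclusive
    x1: int  # exclusive
    y1: int  # exclusive


def mc_window(ref_x: int, ref_y: int, w: int, h: int,
              before: int = 3, after: int = 4) -> Rect:
    """Memory area fetched for motion compensation of a w x h block (assumed margins)."""
    return Rect(ref_x - before, ref_y - before, ref_x + w + after, ref_y + h + after)


def template_inside(window: Rect, ref_x: int, ref_y: int, w: int, h: int) -> bool:
    """True if the top row and left column next to the reference block lie in the window."""
    top_ok = (window.y0 <= ref_y - 1 < window.y1
              and window.x0 <= ref_x and ref_x + w <= window.x1)
    left_ok = (window.x0 <= ref_x - 1 < window.x1
               and window.y0 <= ref_y and ref_y + h <= window.y1)
    return top_ok and left_ok


# CU no larger than the VPDU: the LIC template is already covered by the MC fetch.
small = template_inside(mc_window(100, 200, 64, 64), 100, 200, 64, 64)

# 128x128 CU processed per 64x64 VPDU: the fetch for VPDU1 does not cover the
# full-CU template, so extra memory access would be needed.
large = template_inside(mc_window(100, 200, 64, 64), 100, 200, 128, 128)
```

Under these assumptions `small` is true and `large` is false, which matches the situations drawn in
Figure 2 and Figure 3.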
`
`Solution
`
The proposed method uses only the top and left neighboring samples of VPDU1 for both the current block
and the reference block, and shares the calculated LIC parameter with the other VPDUs in the current
CU. With this method, the memory bandwidth of VPDU1 does not change, and the LIC parameter calculation
can be removed for the other VPDUs.
`
[Figure 4 labels: MC access memory of VPDU1; VPDU1; VPDU2; VPDU3; VPDU4; Reference block of VPDU1;
Current CU]

Figure 4: Proposed LIC method for VPDU split process.
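
The restriction on the template can be sketched as follows. This is an illustration under stated
assumptions: a 64x64 VPDU, VPDU1 located at the top-left corner of the CU, and no subsampling of the
neighboring samples. The function names are hypothetical.

```python
from typing import List, Tuple

VPDU_SIZE = 64  # assumed VPDU width and height in luma samples


def lic_template_positions(cu_x: int, cu_y: int, cu_w: int, cu_h: int,
                           restrict_to_vpdu1: bool = True) -> List[Tuple[int, int]]:
    """Positions of the top and left neighboring samples used for the LIC parameter."""
    w = min(cu_w, VPDU_SIZE) if restrict_to_vpdu1 else cu_w
    h = min(cu_h, VPDU_SIZE) if restrict_to_vpdu1 else cu_h
    top = [(cu_x + i, cu_y - 1) for i in range(w)]
    left = [(cu_x - 1, cu_y + j) for j in range(h)]
    return top + left


def vpdus_in_cu(cu_w: int, cu_h: int) -> int:
    """Number of VPDUs that share the single LIC parameter derived from VPDU1."""
    return max(1, cu_w // VPDU_SIZE) * max(1, cu_h // VPDU_SIZE)
```

For a 128x128 CU the restricted template has 128 samples instead of 256, and the one derived parameter
is reused by all four VPDUs. For CUs no larger than the VPDU the template is unchanged.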
`
3 Proposed item 2: ATMVP restriction
`
`Problem
`
ATMVP has almost the same problem as the VPDU split process. Each sub-block has a different MV, and MC
is applied to each sub-block. However, the MC access memory area of each sub-block does not include all
neighboring samples of the reference block used for LIC parameter calculation. Therefore, it requires
additional memory access and bandwidth, as with the VPDU issue.
`
Solution
`
We propose to disable LIC when ATMVP is selected as a sub-block merge mode. In the case of the VPDU
split process, a LIC parameter can be shared by all VPDUs in the current CU since all VPDUs have the
same MV. In the case of ATMVP, however, it is difficult to share a LIC parameter among all sub-blocks
since each sub-block has a different MV. Therefore, we propose to disable LIC instead of sharing a LIC
parameter.
`
`
4 Proposed item 3: MH-Intra restriction
`
`Problem
`
Figure 5 shows the module communication related to MH-Intra at the reconstruction stage. It is assumed
that the multiple blocks included in a 64x64 area are processed in one pipeline stage. For the inter
process, MC is already executed in the previous stage, and the inter prediction image is stored in the
memory implemented at the reconstruction stage. No waiting communication between intra and inter is
needed since the intra prediction module does not need to wait for the inter process.

[Figure 5 labels: 64x64 unit pipeline stage; No waiting; Intra Pred]

Figure 5: Intra and inter module communication of MH-Intra.
`
On the other hand, Figure 6 shows the module communication related to MH-Intra with LIC at the
reconstruction stage. Both the LIC module and the intra prediction module have to be implemented in
the reconstruction stage, and they need to wait for each other since their processing end timings are
different. If the CU size is small, the number of waits increases, and the additional cycles of the
reconstruction stage increase unacceptably.

[Figure 6 labels: 64x64 unit pipeline stage]

Figure 6: Intra and inter module communication of MH-Intra with LIC.
`
`Solution
We propose to disable MH-Intra when LIC is selected for small CUs. Table 1 shows the relationship
between the CU size for disabling MH-Intra with LIC and the worst case of the number of waits in a
64x64 unit. In the case of CE10.5.2 (no restriction), the worst case of the number of waits is 64 since
the minimum CU size of MH-Intra is 8x8. We propose to limit the number of waits in a 64x64 unit to at
most 8. It means that MH-Intra with LIC is disabled when the CU size is equal to or less than 16x16.
Based on the simulation results, we consider this condition a good balance between coding efficiency
and the increase in cycles.
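
The worst-case count follows from how many CUs of the smallest permitted size fit into one 64x64 unit.
A minimal sketch, assuming one wait per CU that combines MH-Intra with LIC (the helper name is
hypothetical):

```python
PIPELINE_UNIT = 64  # 64x64 unit processed in one pipeline stage


def worst_case_waits(min_cu_w: int, min_cu_h: int, unit: int = PIPELINE_UNIT) -> int:
    """Number of CUs of the smallest permitted size that fit into one pipeline unit."""
    return (unit // min_cu_w) * (unit // min_cu_h)


# 8x8 is the minimum MH-Intra CU size named in the text; the other sizes are
# illustrative examples of larger minimum sizes.
for w, h in [(8, 8), (16, 16), (16, 32), (32, 32)]:
    print(f"{w}x{h}: {worst_case_waits(w, h)} waits per 64x64 unit")
```

With an 8x8 minimum this gives 64 waits, the unrestricted figure quoted for CE10.5.2. Raising the
smallest permitted size lowers the count accordingly.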
`
Table 1: Worst case of the number of waits in a 64x64 unit.

CU size for disabling MH-Intra with LIC

Proposed
`
`5 Summary of combinations of LIC and other inter tools
`
Table 2 shows the combination of LIC and other inter tools in CE10.5.2 and in this proposal. The
combination in CE10.5.2 is determined by the straightforward approach based on the JEM implementation.
On top of it, the two exclusive methods described in the previous sections have been added to the
proposal.
`
Table 2: Combination with other inter tools.

Columns: Inter tool; Combination with LIC (CE10.5.2, Proposal)
Affine: N (same as JEM implementation)
ATMVP: N (this proposal, always exclusive)
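
The two exclusive restrictions from the previous sections can be written as simple predicates. This is
a sketch under stated assumptions: the 16x16 threshold is compared by sample count (width times
height), which is one possible reading of "equal to or less than 16x16", and the type and function
names are hypothetical.

```python
from dataclasses import dataclass

# Assumption: a CU is "equal to or less than 16x16" if it has at most 256 samples.
MH_INTRA_LIC_DISABLED_MAX_SAMPLES = 16 * 16


@dataclass(frozen=True)
class CuModes:
    width: int
    height: int
    atmvp: bool = False  # ATMVP selected as the sub-block merge mode
    lic: bool = False    # LIC requested for this CU


def lic_allowed(cu: CuModes) -> bool:
    """Item 2: LIC is disabled whenever ATMVP is selected."""
    return not cu.atmvp


def mh_intra_allowed(cu: CuModes) -> bool:
    """Item 3: MH-Intra is disabled for small CUs that actually use LIC."""
    uses_lic = cu.lic and lic_allowed(cu)
    if not uses_lic:
        return True
    return cu.width * cu.height > MH_INTRA_LIC_DISABLED_MAX_SAMPLES
```

Other constraints on MH-Intra, such as its own minimum CU size, are not modeled here.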
`
`
`
`
`
`6 Simulation results
Table 3 shows the simulation results of the proposed LIC. The simulation is conducted using VTM-3.0
with the test conditions described in JVET-L1010 [2]. These results show that the BD-rate of the
proposed LIC is an interesting gain considering the trade-off with decoder complexity.
`
Table 3: Simulation results of proposed LIC

Random Access Main 10, over VTM-3.0
| Class | Y | U | V | EncT | DecT |
| Class A1 | -0.88% | | | 154% | |
| Class A2 | 0.32% | 0.13% | 0.20% | 128% | 100% |
| Class B | -0.54% | -0.22% | -0.23% | 131% | 99% |
| Class C | -0.28% | 0.30% | 0.20% | 137% | 101% |
| Class E | | | | | |
| Overall | | 0.29% | -0.35% | | |
| Class D | 0.01% | 0.09% | | | |
| Class F | 0.85% | -0.55% | 0.49% | | |

Low Delay B Main 10, over VTM-3.0
| Class | Y | U | V | EncT | DecT |
| Class B | 0.70% | 0.93% | 0.51% | 148% | 98% |
| Class C | 0.48% | 0.44% | 0.49% | | |
`
`7 Additional results (informative)
`
Table 4 shows the simulation results of CE10.5.2 LIC. These results show that the restrictions proposed
by this contribution have a total of 0.07% loss in RA and 0.04% loss in LDB. Item 1 (VPDU restriction)
and Item 2 (ATMVP restriction) have no loss; most of the loss comes from Item 3 (MH-Intra restriction).
`
Table 4: Simulation results of CE10.5.2 LIC

Random Access Main 10, over VTM-3.0
| Class | Y | U | V | EncT | DecT |
| Class A1 | -0.97% | -0.60% | -0.91% | 151% | 99% |
| Class A2 | -0.38% | 0.14% | -0.22% | 126% | 100% |
| Class B | | 0.22% | -0.24% | 134% | |

Low Delay B Main 10, over VTM-3.0
| Class | Y | U | V | EncT | DecT |
| Class B | 0.77% | 0.85% | -0.90% | 149% | 100% |
| Class C | 0.49% | 0.22% | -0.64% | | 102% |
`
Table 5 shows the simulation results of disabling MH-Intra for all CUs using LIC on top of the proposed
LIC. This condition can simplify the combination of LIC and MH-Intra, but the cost is 0.04% loss in RA
and 0.03% loss in LDB compared to the proposed LIC.
`
Table 5: Simulation results of proposed LIC with disabling MH-Intra for all CUs using LIC.

Random Access Main 10, over VTM-3.0
| Class | Y | U | V | EncT | DecT |
| Class A1 | -0.78% | | | 151% | |
| Class A2 | -0.30% | -0.15% | 0.22% | 128% | 101% |
| Class B | -0.49% | 0.19% | -0.32% | 131% | 99% |
| Class C | 0.28% | -0.28% | 0.16% | 137% | 101% |
| Class E | | | | | |
| Overall | | | | 135% | 100% |
| Class D | 0.01% | | | 133% | 100% |
| Class F | 0.62% | | | 128% | 101% |

Low Delay B Main 10, over VTM-3.0
| Class | Y | U | V | EncT | DecT |
| Class B | -0.68% | | | 148% | 100% |
| Class C | -0.44% | | | 160% | 102% |
| Class E | 0.03% | | | 105% | 103% |

`
Table 6 shows the simulation results of the proposed LIC with an encoder speed-up algorithm. In this
algorithm, the cost calculation of LIC is skipped when the CU size is equal to or less than 128
samples. These results show that this encoder speed-up algorithm can reduce encoding time by 9% at the
cost of 0.03% loss in RA, and by 11% at the cost of 0.06% loss in LDB. We think there is more room to
reduce encoding time, and this algorithm is one of the candidates.
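
The skip condition can be sketched as follows. This is a minimal illustration, assuming "CU size" means
width times height in luma samples and that skipping the cost calculation means the encoder does not
test LIC for that CU. The names are hypothetical and do not refer to VTM functions.

```python
from typing import List

LIC_COST_SKIP_MAX_SAMPLES = 128  # threshold stated in the text


def evaluate_lic_cost(cu_width: int, cu_height: int) -> bool:
    """True if the encoder should run the LIC cost calculation for this CU."""
    return cu_width * cu_height > LIC_COST_SKIP_MAX_SAMPLES


def lic_flag_candidates(cu_width: int, cu_height: int) -> List[bool]:
    """LIC flag values tested in the mode decision (assumed encoder behavior)."""
    return [False, True] if evaluate_lic_cost(cu_width, cu_height) else [False]
```

Under this reading an 8x16 CU (128 samples) is tested without LIC only, while a 16x16 CU is tested with
and without LIC.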
`
Table 6: Simulation results of proposed LIC with encoder speed-up algorithm.

Random Access Main 10, over VTM-3.0
| Class | Y | U | V | EncT | DecT |
| Class A1 | 0.90% | -0.55% | 0.85% | 144% | 99% |
| Class A2 | -0.30% | -0.20% | -0.18% | 119% | 104% |
| Class B | 0.51% | 0.11% | 0.23% | 124% | 99% |
| Class C | 0.20% | 0.15% | -0.24% | 123% | 101% |
| Class E | | | | | |
| Overall | | | | | |
| Class D | 0.04% | | | | |
| Class F | -0.71% | -0.50% | | | |

Low Delay B Main 10, over VTM-3.0
| Class | Y | U | V | EncT | DecT |

`
8 Conclusion
`
This contribution is based on the LIC described in CE10.5.2 (JVET-M0087). CE10.5.2 can achieve the
basic pipeline structure with LIC, but it still has a memory bandwidth issue and a stage cycle issue
arising from pipeline restrictions. This contribution provides a solution for these issues by
introducing LIC restrictions for the combination with VPDU split, ATMVP, and MH-Intra. Simulation
results reportedly show that the proposed LIC provides 0.49% BD-rate gain for RA and 0.47% BD-rate
gain for LDB. We believe that the proposed LIC in this contribution should be adopted into the test
model.
`
9 References
`
`
[1] K. Abe, T. Toma, J. Li, C.-W. Kuo, V. Drugeon, “CE10: Low pipeline latency LIC (test 10.5.2)”,
JVET-M0087, JVET 13th Meeting: Marrakech, MA, 9-18 Jan. 2019.
[2] F. Bossen, J. Boyce, K. Suehring, X. Li, V. Seregin, “JVET common test conditions and software
reference configurations for SDR video”, JVET-L1010, JVET 12th Meeting: Macao, CN, 3-12 Oct.
2018.
`
`10 Patent rights declaration(s)
`
Panasonic Corporation may have current or pending patent rights relating to the technology
described in this contribution and, conditioned on reciprocity, is prepared to grant licenses under
reasonable and non-discriminatory terms as necessary for implementation of the resulting ITU-T
Recommendation | ISO/IEC International Standard (per box 2 of the ITU-T/ITU-R/ISO/IEC
patent statement and licensing declaration form).
`
`
`
`Sataenen i574 eh oyee GS Sk 2
`
`VAL
`
Preview document JVET-M0088 for Marrakech meeting (MPEG number m45345)

Document information

Submitted by: Abe Kiyofumi
Title: CE10-related: LIC restriction for pipeline structure
Authors: K. Abe, T. Toma, J. Li, C.-W. Kuo, V. Drugeon (Panasonic)

The document is not yet approved by the Chair.

JVET-M0088 (version 2 - date 2019-01-03 04:00:05)
JVET-M0088 (version 3 - date 2019-01-04 09:13:78)
JVET-M0088 (version 4 - date 2019-01-13 12:42:46)
`