International Journal of Science and Research (IJSR) ISSN: 2319-7064 SJIF (2022): 7.942

# Robust Decoder Hardware System Improvements for Tuned Predictions

#### Apoorva Reddy Proddutoori

San Diego

Email: apoorvaproddutoori[at]gmail.com

**Abstract:** To deliver UHD video services on portable devices with limited battery power, it is crucial to develop multi-core, dedicated HEVC hardware decoders that support tile- and wavefront-based parallel processing. This approach divides each frame into multiple picture partitions, which can then be processed simultaneously by multiple hardware decoder cores. However, parallelizing in-loop filtering (ILF) at tile boundaries is challenging for multi-core HEVC hardware decoders because of the data dependency between samples in different tiles. Distributed Video Coding (DVC) shifts the computational complexity of traditional video coding from the encoder to the decoder; in particular, it addresses the complexity of the mode decision (MD) algorithm in H.264/AVC video coding by transferring the MD algorithm from the encoder to the decoder. In this review, we present an efficient control method for ILF across tile boundaries in multi-core HEVC hardware decoders. With this approach, a decoder core can continue processing the subsequent coding tree unit (CTU) without waiting for other decoder cores to finish their ILF processing for neighboring CTUs in other tiles, and no additional in-loop filters are needed for ILF across tile boundaries.

**Keywords:** Decoder, H.265/HEVC, H.264/AVC, In-Loop Filtering, 4K30 UHD, multi-core

## 1. Introduction

Predictive video coding creates the prediction frame from previously (de)coded frames, which serve as reference frames and are also available in the decoder buffer. At the encoder, this arrangement allows the most accurate prediction to be selected, minimizing the distortion between the prediction and the original frame. Since the original frame is only available at the encoder, this selection cannot be replicated at the decoder without additional information. The quality of the reference frames and the degree of temporal correlation directly impact the quality of the prediction frame. This, in turn, affects the number of bits required to code each frame for a specific target quality, ultimately determining the rate-distortion (RD) performance.

The H.264/AVC video coding standard has taken a significant step forward in enhancing compression efficiency. However, the quest for even higher video compression factors continues, as evidenced by the recent Call for Proposals on video compression technology, jointly launched by MPEG and ITU-T. In this pursuit, there have been numerous proposals in recent years to surpass the capabilities of the state-of-the-art H.264/AVC standard in video compression factors. Predictive video coding leverages temporal correlation by generating a prediction frame based on advanced motion estimation and compensation techniques. This process involves coding only the residual difference between the original frame and the prediction, along with auxiliary information such as motion vectors and mode information, which are then sent to the decoder.
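The residual-coding loop described above can be illustrated with a minimal sketch. It assumes integer-pel full-search motion estimation with a sum-of-absolute-differences (SAD) criterion on plain Python lists; the function names (`encode_block`, `decode_block`, `get_block`) and the search range are hypothetical choices for illustration, not part of any standard or of the systems reviewed here.

```python
def get_block(frame, top, left, size):
    """Return the size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def encode_block(original, reference, top, left, size, search=2):
    """Encoder side: find the best motion vector and the residual to transmit."""
    target = get_block(original, top, left, size)
    height, width = len(reference), len(reference[0])
    best = None  # (cost, (dy, dx))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate block falls outside the reference frame
            cost = sad(target, get_block(reference, y, x, size))
            if best is None or cost < best[0]:
                best = (cost, (dy, dx))
    mv = best[1]
    prediction = get_block(reference, top + mv[0], left + mv[1], size)
    residual = [[t - p for t, p in zip(t_row, p_row)]
                for t_row, p_row in zip(target, prediction)]
    return mv, residual


def decode_block(residual, reference, mv, top, left, size):
    """Decoder side: rebuild the block from the reference, the vector, and the residual."""
    prediction = get_block(reference, top + mv[0], left + mv[1], size)
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, residual)]
```

Only the motion vector and the residual cross the channel in this sketch; the decoder reproduces the prediction from its own copy of the reference frame, which is why the encoder and decoder must hold identical reference frames.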

The trend of this era is toward high-quality digital video, and video content has consequently become the most popular media application. According to Cisco, roughly 88% of internet traffic in 2022 was expected to be video, since most internet users rely on mobile devices (phones, laptops, ...) and prefer video content, especially at high spatial resolutions such as UHD, 2K, 4K, and 8K. This value is more than ten times higher than it was in 2005. High-resolution video, on the other hand, places heavy demands on communication channels and device capacity. Accordingly, the compression techniques of the previous standard, High Efficiency Video Coding (HEVC), have become insufficient to handle this quality on many levels, such as time, storage, and transmission bandwidth.



Figure 1: Video Encoding

## 2. Traceback: Conventional H.264/AVC Video Codec

The H.264/AVC video coding standard makes block selection more flexible in order to reduce the matching error when comparing blocks. It offers seven block-size modes (16x16, 16x8, 8x16, 8x8, 8x4, 4x8, and 4x4), as shown in Fig. 1, so that motion estimation and compensation can choose among macroblock partitions with more flexibility. This improves performance at the encoder but greatly raises its computational complexity. The mode decision algorithm depends on the complexity of the frames in the video sequence: as a general rule, complex regions are represented with smaller macroblock partitions and smooth regions with larger ones.
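One common way to formalize mode decision is a Lagrangian rate-distortion cost, J = D + λR, evaluated for every candidate block mode. The sketch below assumes that formulation; the source does not state which cost function is used, and the distortion and rate numbers are made-up values chosen only to show how smooth and textured regions end up with different partitions.

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian cost J = D + lambda * R (an assumed, commonly used criterion)."""
    return distortion + lam * rate_bits


def choose_mode(candidates, lam):
    """Pick the block mode with the lowest cost.

    `candidates` maps a mode name to a (distortion, rate_bits) pair.
    """
    return min(candidates, key=lambda mode: rd_cost(*candidates[mode], lam))


# Hypothetical measurements for two macroblocks (illustrative values only).
smooth_region = {"16x16": (120, 20), "8x8": (100, 60), "4x4": (95, 180)}
textured_region = {"16x16": (900, 20), "8x8": (400, 60), "4x4": (150, 180)}

print(choose_mode(smooth_region, lam=1.0))    # large partition wins
print(choose_mode(textured_region, lam=1.0))  # small partition wins
```

Evaluating every mode for every macroblock is what makes the encoder expensive, which is the cost that decoder-side mode decision tries to move away from the encoder.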

An H.264 encoder transforms HDMI (HD) audio and video data into an IP format suitable for transmission over an IP network. Conversely, a decoder reverses this process, converting the data back into HDMI signals. The flexibility of H.264 lies in its ability to send video signals from one encoder to multiple decoders at the same time. For instance, a single video stream can be sent to a display, a video screen, and a digital signage system all at once.

### 1) Advantages of Using H.264

- Efficient use of limited bandwidth and superior video monitoring. H.264 was developed to deliver high-quality video transmission with reduced bandwidth needs and lower delay compared to conventional video standards like MPEG-2. It employs an efficient codec that delivers high-quality visuals with minimal data usage.
- Lower bit rate. H.264's data usage is about 80% less than that of Motion JPEG video, and the reduction is estimated at 50% or more compared to MPEG-2. In practice, H.264 can deliver better image quality at the same data rate, or the same image quality at a lower data rate.
- Less need for video storage space. H.264 requires much less storage space for video compared to other standards, which is crucial for facilitating smooth video transmission over the internet.
- Video compatibility across different vendors. Since H.264 is based on standards, it offers a video compatibility solution that is not vendor-specific. This means users can combine H.264 devices from various manufacturers without concerns about compatibility or proprietary issues.

#### Volume 13 Issue 10, October 2024 Fully Refereed | Open Access | Double Blind Peer Reviewed Journal www.ijsr.net



### 2) H.265/HEVC In-Loop Filtering

A key characteristic of HEVC is its use of a quadtree coding framework. Within HEVC, a CTU corresponds to a block of pixels of the maximum size, together with the hierarchical structures needed to decode those pixels. Pictures or picture partitions are divided into CTUs. Consequently, creating a pipeline structure at the CTU level is simple for a single-core HEVC hardware decoder. If the decoding pipeline's processing unit block (PUB) is a coding unit (CU), coordinating the pipeline's processing becomes challenging because CUs within a CTU vary in size, ranging from 8x8 to 64x64. Moreover, the prediction unit (PU) and transform unit (TU) in HEVC, which also vary in size and have data dependencies, are likewise problematic as PUBs for the decoding pipeline. Thus, a CTU-level pipeline framework is suitable for the design of HEVC hardware decoders.
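The variable CU sizes that make a CU-level pipeline awkward follow directly from the quadtree. A minimal sketch of the recursive partitioning is given below, assuming a 64x64 CTU and an 8x8 minimum CU (sizes HEVC permits); `split_ctu` and the `should_split` callback are hypothetical names, and in a real decoder the split decisions come from parsed split flags rather than from a callback.

```python
def split_ctu(x, y, size, should_split, min_cu=8):
    """Recursively partition a CTU into CUs.

    Returns a list of (x, y, size) tuples in quadtree (z-scan) order.
    `should_split(x, y, size)` stands in for the split flag read from the bitstream.
    """
    if size > min_cu and should_split(x, y, size):
        half = size // 2
        cus = []
        for offset_y in (0, half):
            for offset_x in (0, half):
                cus.extend(split_ctu(x + offset_x, y + offset_y, half,
                                     should_split, min_cu))
        return cus
    return [(x, y, size)]


# Example: keep splitting only the block that contains the top-left corner.
cus = split_ctu(0, 0, 64, lambda x, y, size: (x, y) == (0, 0))
for cu in cus:
    print(cu)
```

In this example a single CTU yields CUs of three different sizes, so the work per CU is uneven, while the work per CTU always covers the same 64x64 area. That regularity is what makes the CTU a convenient pipeline unit.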

The CTU-level pipeline structure for a single-core HEVC hardware decoder includes the ED, IQT, ILF, and DPS stages. ED decodes syntax elements and performs input data derivations. IQT performs inverse quantization and inverse transform for each TU in a CTU. The ILF stage performs deblocking filtering (DBF) followed by sample adaptive offset (SAO) filtering on reconstructed samples at the CTU level. DPS stores decoded pictures to the current decoded picture buffer in an external memory. Because of the data dependency between samples, the output of each stage becomes misaligned with the boundaries of the current Coding Tree Block (CTB).
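To make the CTU-level pipelining concrete, the following sketch prints which CTU occupies each stage in each time slot. It is a simplified model under stated assumptions: the stage order ED, IQT, ILF, DPS is assumed from the description above, every stage is assumed to take exactly one slot per CTU, and `pipeline_schedule` is a hypothetical helper rather than part of any decoder design.

```python
STAGES = ["ED", "IQT", "ILF", "DPS"]  # assumed stage order


def pipeline_schedule(num_ctus, stages=STAGES):
    """Return one dict per time slot mapping stage name -> CTU index in that stage."""
    schedule = []
    for slot in range(num_ctus + len(stages) - 1):
        occupancy = {}
        for depth, stage in enumerate(stages):
            ctu = slot - depth  # a CTU enters stage `depth` that many slots after ED
            if 0 <= ctu < num_ctus:
                occupancy[stage] = ctu
        schedule.append(occupancy)
    return schedule


for slot, occupancy in enumerate(pipeline_schedule(3)):
    print(slot, occupancy)
```

Once the pipeline is full, one CTU completes per slot, so N CTUs take N + 3 slots with four stages instead of 4N slots without pipelining.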



Figure 3: H.265/HEVC Encoding

### 3) Entropy Standards

Tracing back to H.261, entropy coding in traditional video compression standards has predominantly used variable length coding, with numerous code tables designed for specific parameters. As adaptive techniques gained popularity, context-based adaptive coding gradually took over. Meanwhile, the complexity of the code tables in traditional standards became unacceptable with the advent of HD (High Definition) requirements, and arithmetic coding emerged as a viable solution. Entropy coding in the main video compression standards is summarized below:

- a) MPEG-1/2: Entropy coding in MPEG-1 and MPEG-2 is very similar and uses simple variable length coding. The code tables, however, differ between parameters.
- b) MPEG-4: Entropy coding in MPEG-4 texture coding is akin to that in MPEG-2, employing variable length coding. Arithmetic coding is used in MPEG-4 shape coding.
- c) H.264: H.264 marks a turning point by introducing adaptive entropy coding. It employs three types of entropy coding: Exp-Golomb, CAVLC, and CABAC. Exp-Golomb is a specific case of the UVLC used in H.26L. CAVLC exploits statistical relations within the context and codes separate sub-models for coefficients, the trailing ones sign flag, level prefix, level suffix, total zeros, and more. CABAC is a simplified form of arithmetic coding that streamlines the probability and multiplication models to reduce computational complexity.
- d) AVS: AVS strikes a good balance between high efficiency and low complexity. Entropy coding in H.264 employs one-order Exp-Golomb for coding parameters with large dynamic ranges.
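As a concrete instance of the variable length codes listed above, the sketch below implements the zero-order (k = 0) Exp-Golomb code for unsigned integers, which is the form H.264 uses for its `ue(v)` syntax elements. Bits are represented as a Python string of `'0'` and `'1'` for readability; a real decoder would read from a packed bitstream, and the function names are illustrative.

```python
def exp_golomb_encode(value):
    """Encode a non-negative integer as a zero-order Exp-Golomb bit string."""
    if value < 0:
        raise ValueError("value must be non-negative")
    info = bin(value + 1)[2:]               # binary representation of value + 1
    return "0" * (len(info) - 1) + info     # prefix of leading zeros, then the info bits


def exp_golomb_decode(bits):
    """Decode one codeword from the start of `bits`.

    Returns (value, number_of_bits_consumed).
    """
    zeros = 0
    while zeros < len(bits) and bits[zeros] == "0":
        zeros += 1
    length = 2 * zeros + 1                  # zeros, the marker '1', then `zeros` info bits
    if length > len(bits):
        raise ValueError("truncated codeword")
    return int(bits[zeros:length], 2) - 1, length


for n in range(5):
    print(n, exp_golomb_encode(n))
```

Small values get short codewords (0 maps to a single bit), which suits syntax elements whose small values are the most frequent, and no code table has to be stored.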

### 4) Proposed Controlled ILF Method

The CTU-level pipeline structure outlined in Section 2 can be used to build a high-performance HEVC decoder. Performance can be improved further by exploiting the data-level parallelism inherent in decoding picture partitions. Tiles, which are configured by the HEVC encoder, can be decoded simultaneously by multiple decoder cores, each using the CTU-level pipeline structure, except for the ILF across tile boundaries.
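The mapping from CTUs to tiles, and hence to decoder cores, can be sketched as follows. The example assumes uniformly spaced tiles with boundaries computed by integer division (the rule HEVC uses for uniform spacing), 64x64 CTUs, a 3840x2160 picture, and a 2x2 tile grid with one tile per core; the helper names are hypothetical.

```python
from bisect import bisect_right


def uniform_boundaries(size_in_ctus, num_tiles):
    """Tile boundaries in CTU units for uniformly spaced tiles."""
    return [(i * size_in_ctus) // num_tiles for i in range(num_tiles + 1)]


def tile_of_ctu(ctu_x, ctu_y, col_bd, row_bd):
    """Index (raster order) of the tile containing the CTU at (ctu_x, ctu_y)."""
    col = bisect_right(col_bd, ctu_x) - 1
    row = bisect_right(row_bd, ctu_y) - 1
    return row * (len(col_bd) - 1) + col


def touches_tile_boundary(ctu_x, ctu_y, col_bd, row_bd):
    """True if the CTU is adjacent to an interior tile boundary."""
    inner_cols, inner_rows = col_bd[1:-1], row_bd[1:-1]
    return (ctu_x in inner_cols or ctu_x + 1 in inner_cols or
            ctu_y in inner_rows or ctu_y + 1 in inner_rows)


CTU = 64
width_in_ctus = -(-3840 // CTU)    # ceiling division: 60
height_in_ctus = -(-2160 // CTU)   # ceiling division: 34
col_bd = uniform_boundaries(width_in_ctus, 2)
row_bd = uniform_boundaries(height_in_ctus, 2)
print(col_bd, row_bd)
print(tile_of_ctu(30, 16, col_bd, row_bd))
```

Only the CTUs for which `touches_tile_boundary` is true need any cross-core coordination for ILF; every other CTU can be filtered entirely within its own core.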

A straightforward method for handling ILF across tile boundaries is to add extra hardware for in-loop filters. This strategy, though, has the drawback of increased hardware cost and latency. An alternative strategy is to schedule the synchronization of decoder cores at tile boundaries. In this scenario, a decoder core processing one tile waits for another decoder core to finish decoding certain parts of the adjacent tile before performing the ILF across the tile boundary. However, this approach can slow down the overall decoding speed because of the overhead cycles required for synchronization, as in the case of H.264/AVC DBF on a many-core platform.

To efficiently control a multi-core HEVC hardware decoder without additional in-loop filters or decoder synchronization for ILF across tile boundaries, a boundary CTU status index (BCSI) is defined. The BCSI is an index indicating the pipeline processing status of a CTU that is adjacent to one or more tile boundaries. The sample regions inside and around the current CTB in a tile that are subject to ILF across tile boundaries are determined by checking one or more BCSIs of the adjacent CTUs in other tiles. The BCSI of a CTU is safely checked and updated while the CTU is processed in the decoder pipeline.
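The idea of checking and updating a BCSI without stalling a core can be sketched in software as below. This is a loose illustration under explicit assumptions: the three status values, the `BcsiTable` class, and the rule that whichever core finishes second filters the shared edge are all hypothetical, since the encoding of the BCSI and the exact region selection are not specified here. A lock stands in for whatever atomic access mechanism the hardware provides.

```python
import threading
from enum import IntEnum


class Bcsi(IntEnum):
    """Hypothetical status values for a boundary CTU."""
    PENDING = 0         # CTU not yet reconstructed
    RECONSTRUCTED = 1   # samples available, shared edge not yet filtered
    EDGE_FILTERED = 2   # shared edge has been filtered


class BcsiTable:
    """Shared status table; each check-and-update happens under one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._status = {}

    def status(self, ctu):
        with self._lock:
            return self._status.get(ctu, Bcsi.PENDING)

    def finish_reconstruction(self, ctu, neighbor):
        """Record that `ctu` is reconstructed and decide who filters the shared edge.

        Returns True if the calling core should filter the edge now (the neighbor
        in the other tile is already reconstructed). Returns False if the edge is
        left to the neighbor's core; the caller moves on to its next CTU either way.
        """
        with self._lock:
            if self._status.get(neighbor, Bcsi.PENDING) == Bcsi.RECONSTRUCTED:
                self._status[ctu] = Bcsi.EDGE_FILTERED
                self._status[neighbor] = Bcsi.EDGE_FILTERED
                return True
            self._status[ctu] = Bcsi.RECONSTRUCTED
            return False


table = BcsiTable()
print(table.finish_reconstruction("tile0_ctu", "tile1_ctu"))  # neighbor not ready yet
print(table.finish_reconstruction("tile1_ctu", "tile0_ctu"))  # second core filters the edge
```

In this sketch neither core ever blocks on the other: the first one to arrive records its status and continues, and the second one finds the neighbor ready and filters the shared edge using its own existing filter.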

## 3. Conclusion

The reviewed Distributed Video Coding (DVC) and PB-based DVC approaches show significant performance improvements and a notable reduction in decoder complexity compared to traditional DVC strategies. Additionally, PB-based DVC efficiently transfers the computation and mode decision processes, which are among the most complex aspects of current H.264/AVC video compression, from the encoder to the decoder. This shift reduces the complexity of the encoder, bringing it close to that of H.264/AVC intra coding, while the proposed decoder complexity is over a hundred times lower than that of the latest DISCOVER codec.

A second line of work enhances H.264/AVC predictive video coding, with a focus on the B-slice coding mode, by exploiting decoder side information, a common technique in many existing distributed video coding strategies. Two methods for creating the side information were used to build the additional reference. The performance evaluation indicates the superiority of this coding technique, with bitrate savings of up to 9.8% and average bitrate savings of 5.89%. Future research aims to investigate side information generation by extrapolation, with the goal of improving the H.264/AVC P-slice coding mode. It is widely recognized that tile-based parallelization is crucial for the performance of HEVC in handling large-size video (4K/8K-UHD).

ILF across tile boundaries must be handled correctly to maintain coding efficiency and visual quality. In this study, we presented an efficient ILF control method for tile boundaries in multi-core HEVC decoder systems. For adjacent tiles, the proposed method identifies the regions that can be filtered independently in each decoder core, taking the parallel processing of multiple tiles into account. This approach eliminates the need for additional in-loop filters at tile boundaries, as it uses the existing filters in each decoder core. To validate the method, it was implemented in a quad-core HEVC decoder for 4K-UHD video on a prototype FPGA board. The preliminary results show that the quad-core HEVC decoder achieves speedups proportional to the number of cores, with negligible overhead.
