JISE




Journal of Information Science and Engineering, Vol. 26 No. 2, pp. 409-426


On Parallelizing H.264/AVC Rate-Distortion Optimization Baseline Profile Encoder


JING-XIN WANG+, YUNG-CHANG CHIU++, ALVIN W. Y. SU+ AND CE-KUEN SHIEH++
+Department of Computer Science and Information Engineering 
++Department of Electrical Engineering 
National Cheng Kung University 
Tainan, 701 Taiwan


    An H.264/AVC encoder can incorporate many coding schemes, such as rate-distortion optimization (RDO), to improve its compression performance, but these dramatically raise computational complexity. In an H.264/AVC RDO encoder, most of the computation time is spent calculating the rate-distortion cost used to choose the optimal coding mode for both inter and intra coding. Parallel computation is one way to speed up the encoder. However, to maintain the coding efficiency established by the JM encoder, calculating rate-distortion costs requires a large amount of reference data from previously coded adjacent macroblocks, which is an undesirable property for any parallel computing strategy. Both the volume of this reference data and the frequency with which it is transmitted between processing nodes slow down the entire encoding process. It may therefore become necessary to drop part of the reference data and to transmit less often in order to reduce traffic. To investigate this problem, this study implements the H.264/AVC RDO encoder using three different parallel schemes. Each scheme is run on a software DSM-based (distributed shared memory) PC cluster consisting of one to five PCs (one master node and up to four slave processing nodes). The amount of data exchanged among processing nodes is analyzed for each scheme, and PSNR performance and speedup results are reported. Experiments show that coding gain falls considerably as more reference data is dropped. In lower bit rate cases, performance is reduced to the level of a regular H.264 encoder. Nevertheless, this paper provides a good reference for implementing such an encoder on a cluster computing system.
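The mode decision described in the abstract can be illustrated with a minimal sketch, assuming the commonly used Lagrangian form of the rate-distortion cost, J = D + λR, where D is the distortion, R the rate in bits, and λ the Lagrange multiplier. The mode names, the `ModeCandidate` type, and all numeric values below are hypothetical and chosen only for illustration; they are not taken from the paper or from the JM encoder.

```python
from dataclasses import dataclass


@dataclass
class ModeCandidate:
    """One candidate coding mode for a macroblock (hypothetical type)."""
    name: str
    distortion: float  # D: e.g. squared error against the original block
    rate_bits: float   # R: bits needed to code the block in this mode


def rd_cost(candidate: ModeCandidate, lam: float) -> float:
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return candidate.distortion + lam * candidate.rate_bits


def choose_mode(candidates: list[ModeCandidate], lam: float) -> ModeCandidate:
    """Return the candidate with the smallest rate-distortion cost."""
    return min(candidates, key=lambda c: rd_cost(c, lam))


# Illustrative numbers only (assumed, not measured).
candidates = [
    ModeCandidate("SKIP", distortion=900.0, rate_bits=1.0),
    ModeCandidate("INTER_16x16", distortion=400.0, rate_bits=40.0),
    ModeCandidate("INTRA_4x4", distortion=250.0, rate_bits=120.0),
]

# A small lambda favors low distortion; a large lambda favors low rate.
best_low_lambda = choose_mode(candidates, lam=1.0)
best_mid_lambda = choose_mode(candidates, lam=5.0)
best_high_lambda = choose_mode(candidates, lam=20.0)
```

In a real encoder, obtaining D and R for each candidate requires actually coding and reconstructing the macroblock, and that step depends on data from already coded neighboring macroblocks. This dependence is why the cost evaluation is expensive to distribute across nodes.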


Keywords: H.264/AVC, rate-distortion optimization, distributed shared memory system, parallel video encoder, cluster computing system
