A high-performance VLSI architecture for the H.264/AVC context-adaptive variable-length decoder (CAVLD) is proposed to reduce computation time. The overall computation is pipelined, and parallel processing is employed for high performance. For the run_before computation, the values of the input symbols are estimated in parallel to check in advance whether their computation can be skipped. The estimation module uses a degree of parallelism of 4, which can be changed easily. Experimental results show that, when four symbols are estimated in parallel, the performance of the run_before computation improves by 134% on average, while the area of the VLSI implementation increases by only 12% compared with a sequential method. H.264/AVC is an essential technology for the multimedia engines of many consumer electronics applications, such as D-TVs and mobile devices, and the proposed method contributes to improving the performance of those applications.
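
To illustrate why some run_before computations can be skipped at all, the following is a minimal software sketch of the sequential run_before loop as it is commonly described for H.264/AVC CAVLC residual decoding. It is not the proposed architecture and does not model the parallel estimation module. The function name `decode_run_before` and the `symbols` iterator, which stands in for the variable-length table lookup and yields already-decoded run_before values, are assumptions made for this example.

```python
from typing import Iterator, List, Tuple


def decode_run_before(
    total_coeff: int, total_zeros: int, symbols: Iterator[int]
) -> Tuple[List[int], int]:
    """Sequential run_before loop (illustrative sketch, not the paper's hardware).

    `symbols` is a hypothetical stand-in for the bitstream parser: each call to
    next() returns one already-decoded run_before value.
    Returns the list of runs and the number of slots that needed no parsing.
    """
    runs: List[int] = []
    skipped = 0
    zeros_left = total_zeros

    # Every coefficient except the last one has a run_before slot.
    for _ in range(total_coeff - 1):
        if zeros_left > 0:
            # A symbol must actually be parsed from the bitstream.
            run = next(symbols)
            if run > zeros_left:
                raise ValueError("run_before exceeds the remaining zeros")
        else:
            # No zeros remain, so the run is implicitly 0 and parsing is skipped.
            run = 0
            skipped += 1
        runs.append(run)
        zeros_left -= run

    # The last coefficient takes whatever zeros remain.
    if total_coeff > 0:
        runs.append(zeros_left)
    return runs, skipped
```

In this sketch, a block with five coefficients and three zeros whose first two parsed symbols are 2 and 1 needs no further parsing for the remaining slots. Knowing such cases ahead of time is the kind of information a look-ahead over several symbols could exploit.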