JISE




Journal of Information Science and Engineering, Vol. 22 No. 5, pp. 1145-1162


Automatic Closed Caption Detection and Filtering in MPEG Videos for Video Structuring


Duan-Yu Chen, Ming-Ho Hsiao and Suh-Yin Lee
Department of Computer Science and Information Engineering 
National Chiao Tung University 
Hsinchu, 300 Taiwan 
E-mail: {dychen; mhhsiao; sylee}@csie.nctu.edu.tw


    Video structuring is the process of extracting the temporal structure of video sequences and is a crucial step in video content analysis, especially for sports videos. It involves detecting temporal boundaries, identifying meaningful segments of a video, and then building a compact representation of the video content. In this paper, we propose a novel mechanism that automatically parses sports videos in the compressed domain and then constructs a concise table of video content from the superimposed closed captions and the semantic classes of video shots. First, shot boundaries are detected efficiently using GOP-based video segmentation. Color-based shot identification is then used to automatically identify meaningful shots. For closed caption localization, we propose an efficient approach that first detects caption frames in the meaningful shots. These caption frames, rather than every frame, are then used as targets for detecting closed captions based on long-term consistency, without any size constraint. In addition, to discriminate captions of interest automatically, we propose a novel tool, a font size detector, that recognizes the font size of closed captions using compressed data in MPEG videos. Experimental results show the effectiveness and feasibility of the proposed mechanism.


Keywords: caption frame detection, closed caption detection, font size differentiation, video structuring, video segmentation
