JISE


Journal of Information Science and Engineering, Vol. 5 No. 4, pp. 437-448

A Unification-Based Approach to Lexicography for Machine Translation Systems

Shu-Chuan Chen, Mei-Hui Wang and Keh-Yih Su⁺
BTC R&D Center
2F, 28 R&D Road II
Science-Based Industrial Park
Hsinchu, Taiwan, Republic of China
⁺Department of Electrical Engineering
National Tsing Hua University
Hsinchu, Taiwan, Republic of China

An operational machine translation system encounters a variety of texts even when its domain is restricted to a specific field. This diversity makes word sense ambiguity and customized translation difficult to handle. This paper presents a unification-based method for lexicography that lessens the problem in two ways: by dividing the system dictionaries into hierarchically structured general, technical, and customer dictionaries, and by unifying the lexical information held in these dictionaries. A detailed discussion and an example of the unification technique are given. Other advantages of using unification are also noted: time spent on dictionary construction is saved; dictionary storage space is minimized; the integrity of the distinct dictionaries is preserved; customized vocabulary is provided without resorting to a run-time dictionary; and the choice of which dictionaries to unify is kept open. In addition, because categorial ambiguity may arise as a result of unification, a score function is added to resolve it. With these advantages, the unification approach to lexicography is regarded as a viable way to enhance the translation performance of a practical machine translation system.

Keywords: machine translation system, word sense ambiguity, customized translation, disambiguation, lexicography, lexicon, general dictionary, technical dictionary, customer dictionary, hierarchical structure of dictionaries, unification
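
As an illustration only, the layered lookup described in the abstract might be sketched as follows in Python. This is not the authors' implementation: the entry format (a category mapped to a flat feature dictionary), the function names unify_entries, lookup and best_category, the sample word, and the level-based score are all assumptions made for this sketch. It also uses an override-style merge, in which the more specific dictionary wins on conflicting features, rather than strict unification, which would fail on a conflict.

```python
from typing import Dict, List, Optional

# Hypothetical entry format: category -> flat feature dictionary.
Entry = Dict[str, Dict[str, str]]
Dictionary = Dict[str, Entry]


def unify_entries(base: Entry, override: Entry) -> Entry:
    """Merge two entries; features from `override` win on conflict.

    Assumption: an override-style merge stands in for the paper's
    unification step, whose exact definition is not given here.
    """
    merged = {cat: dict(feats) for cat, feats in base.items()}
    for cat, feats in override.items():
        merged.setdefault(cat, {}).update(feats)
    return merged


def lookup(word: str, dictionaries: List[Dictionary]) -> Entry:
    """Unify a word's entries from the most general to the most specific dictionary."""
    result: Entry = {}
    for dictionary in dictionaries:
        if word in dictionary:
            result = unify_entries(result, dictionary[word])
    return result


def best_category(word: str, dictionaries: List[Dictionary]) -> Optional[str]:
    """Hypothetical score function: prefer the category that appears in
    the most specific dictionary (a higher list index scores higher)."""
    scores: Dict[str, int] = {}
    for level, dictionary in enumerate(dictionaries):
        for category in dictionary.get(word, {}):
            scores[category] = level
    return max(scores, key=scores.get) if scores else None


# Made-up sample data, ordered general -> technical -> customer.
general = {"monitor": {"verb": {"sense": "watch"},
                       "noun": {"sense": "supervisor"}}}
technical = {"monitor": {"noun": {"sense": "display unit"}}}
customer = {"monitor": {"noun": {"target": "customer-preferred term"}}}

layers = [general, technical, customer]
entry = lookup("monitor", layers)
category = best_category("monitor", layers)
```

In this sketch the source dictionaries are never modified, which mirrors the abstract's point that the integrity of the distinct dictionaries is preserved, and the list passed to lookup determines which dictionaries take part in unification.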
