JISE




Journal of Information Science and Engineering, Vol. 36 No. 2, pp. 309-322


Unsupervised Weighting of Transfer Rules in Rule-Based Machine Translation using Maximum-Entropy Approach


SEVILAY BAYATLI1, SEFER KURNAZ1, ABOELHAMD ALI2,
JONATHAN NORTH WASHINGTON3 AND FRANCIS M. TYERS4,5
1Department of Electrical and Computer Engineering
Altınbaş Üniversitesi
Istanbul, 34217 Turkey

2Department of Computer and Systems Engineering
Alexandria University
Alexandria, 11432 Egypt

3Linguistics Department, Swarthmore College
Swarthmore, PA 19081 USA

4School of Linguistics, Higher School of Economics
Moscow, 101000 Russia

5Department of Linguistics
Indiana University
Bloomington, IN 47405 USA
E-mail: sewale.taha@ogr.atlinbas.edu.tr; sefer.kurnaz@altinbas.edu.tr;
aboelhamd.abotreka@gmail.com; jonathan.washington@swarthmore.edu; ftyers@iu.edu.tr


In this paper we present an unsupervised method for learning a model that disambiguates the selection of structural transfer rules in a rule-based machine translation (MT) system. In rule-based MT systems, transfer rules are the component responsible for converting source-language morphological and syntactic structures into target-language structures. These transfer rules function by matching a source-language pattern of lexical items and applying a sequence of actions. There can, however, be more than one potential sequence of actions for each source-language pattern. Our model consists of a set of maximum-entropy (or logistic regression) classifiers, one trained for each source-language pattern, which select the highest-probability sequence of rules for a given sequence of patterns. We perform experiments on the Kazakh-Turkish language pair, a low-resource pair of morphologically rich languages, and compare our model to two reference MT systems: a rule-based system in which transfer rules are applied in a left-to-right longest-match manner, and a state-of-the-art system based on the neural encoder-decoder architecture. Our system outperforms both of these reference systems on three widely used metrics for machine translation evaluation.
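
The idea of keeping one maximum-entropy classifier per source-language pattern can be illustrated with a minimal Python sketch. It is not the implementation described in the paper: the pattern name, rule identifiers, feature names, and weights below are all hypothetical, training is omitted, and each pattern is resolved independently rather than by scoring whole rule sequences.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical model layout (an assumption, not taken from the paper):
# one weight table per source-language pattern, keyed by (feature, rule_id).
Weights = Dict[Tuple[str, str], float]


def rule_probabilities(weights: Weights, rules: List[str],
                       features: List[str]) -> Dict[str, float]:
    """Maximum-entropy (softmax) distribution over candidate rules."""
    # Linear score of each rule: sum of the weights of the active features.
    scores = {r: sum(weights.get((f, r), 0.0) for f in features) for r in rules}
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores.values())
    exps = {r: math.exp(s - top) for r, s in scores.items()}
    z = sum(exps.values())
    return {r: e / z for r, e in exps.items()}


def select_rule(models: Dict[str, Weights], pattern: str,
                rules: List[str], features: List[str]) -> str:
    """Pick the most probable rule using the classifier for this pattern.

    A pattern without a trained classifier gets a uniform distribution,
    so the first candidate rule is returned.
    """
    probs = rule_probabilities(models.get(pattern, {}), rules, features)
    return max(rules, key=lambda r: probs[r])


# Hypothetical weights for a single ambiguous pattern with two candidate rules.
MODELS: Dict[str, Weights] = {
    "n n": {
        ("w1", "rule_a"): 1.2,
        ("w1", "rule_b"): -0.3,
        ("w2", "rule_b"): 0.8,
    }
}

if __name__ == "__main__":
    print(select_rule(MODELS, "n n", ["rule_a", "rule_b"], ["w1"]))
    print(select_rule(MODELS, "n n", ["rule_a", "rule_b"], ["w2"]))
```

In this simplified form, the lexical items matched by a pattern act as the active features, and the rule whose weights best fit those items wins. Choosing the highest-probability sequence of rules across several adjacent patterns would additionally require combining the per-pattern probabilities, which this sketch leaves out.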


Keywords: machine translation, weighting, structural transfer rules, ambiguous rules, disambiguation
