JISE




Journal of Information Science and Engineering, Vol. 40 No. 2, pp. 317-339


Embedding-based Two-Stage Entity Alignment for Cross-Lingual Knowledge Graphs


YUXIANG SUN1 AND YONGJU LEE2,+
1Software Technology Research Center
2School of Computer Science and Engineering
Kyungpook National University
Daegu, 41566 Republic of Korea
E-mail: syx921120@gmail.com; yongju@knu.ac.kr


In knowledge graph alignment, most researchers focus only on heterogeneous entities, thereby ignoring the impact of homogeneous entities on matching accuracy and efficiency. In this study, we propose a two-stage strategy whose stages handle homogeneous and heterogeneous entities, respectively. In the first stage, an embedding-based semantic clustering algorithm divides the semantics into multiple clusters, which are then paired according to centroid distance. Homogeneous entities are then matched by combining linked lists and k-dimensional trees. In the second stage, we propose an embedding-based graph convolutional neural network (E-GCN) model that assigns different weights to relations based on the homogeneous entities aligned in the first stage. Among the GCN-based models compared, the E-GCN model achieved the best entity alignment (EA) accuracy, and its training time was reduced by 43.2%. Experimental results show that the proposed two-stage method significantly improves EA performance compared with state-of-the-art EA models. Moreover, the Canberra semantic distance proved most suitable for representing the similarity between entities, and the exponential linear unit (ELU) activation function accelerated the convergence of the E-GCN model.
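
As a purely illustrative aid, the sketch below shows the standard definitions of the Canberra distance and the ELU activation named in the abstract, together with one simple way to pair clusters by centroid distance. The greedy nearest-centroid rule, the function names, and the toy two-dimensional embeddings are assumptions made for this example; the abstract does not specify the authors' pairing procedure, and this is not their implementation.

```python
import math


def canberra(u, v):
    """Canberra distance: sum of |u_i - v_i| / (|u_i| + |v_i|).

    Terms where both coordinates are zero are skipped (treated as 0).
    """
    total = 0.0
    for a, b in zip(u, v):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def elu(x, alpha=1.0):
    """Exponential linear unit: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def centroid(vectors):
    """Coordinate-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def pair_clusters(clusters_a, clusters_b):
    """Pair each cluster of graph A with the cluster of graph B whose
    centroid is nearest under the Canberra distance.

    Assumption: a greedy nearest-centroid rule; several clusters of A
    may map to the same cluster of B.
    """
    cents_b = {name: centroid(vs) for name, vs in clusters_b.items()}
    pairs = {}
    for name_a, vs in clusters_a.items():
        ca = centroid(vs)
        pairs[name_a] = min(cents_b, key=lambda nb: canberra(ca, cents_b[nb]))
    return pairs


# Hypothetical toy embeddings for two small graphs.
clusters_a = {
    "a_people": [[1.0, 0.1], [0.9, 0.2]],
    "a_places": [[0.1, 1.0], [0.2, 0.8]],
}
clusters_b = {
    "b_places": [[0.15, 0.9]],
    "b_people": [[1.0, 0.15]],
}
pairs = pair_clusters(clusters_a, clusters_b)
```

With these toy inputs, `pairs` maps `a_people` to `b_people` and `a_places` to `b_places`. Within each paired cluster, a nearest-neighbour search (for example with a k-d tree) could then propose candidate matches, which limits comparisons to entities in semantically similar clusters.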


Keywords: knowledge graph embedding, entity alignment, graph convolutional network, heterogeneous entities, homogeneous entities, semantic distance calculation
