AUTOMATIC EXTRACTION OF LEMMA-BASED BILINGUAL DICTIONARIES

Author: Ibrahim Mohamed Hassan Saleh

Available to order

45.36 €

regular price: 50.40 €

About the book

This academic work presents an approach to automatically extracting and filtering a lemma-based Arabic-English dictionary from parallel corpora. To this end, it uses Machine Learning algorithms to filter out Arabic-English lemma pairs that were wrongly extracted from the parallel corpus as good translation pairs. It also uses highly accurate morphological analyzers and generators for Arabic to overcome the morphological ambiguity of Arabic words. A comparison of the automatically generated dictionary with a manually built dictionary widely used in Arabic Computational Linguistics applications shows that the generated dictionary complements the manual one to a high degree in coverage. The comparison also shows that the generated dictionary (1) has reasonable recall and high precision, (2) is significantly more comprehensive in terms of the Arabic-English lemma pairs it covers, and (3) has high potential for future improvement.

  • Publisher: LAP LAMBERT Academic Publishing
  • Year of publication: 2010
  • Format: Paperback
  • Dimensions: 220 x 150 mm
  • Language: English
  • ISBN: 9783838357522
