
Develop a Part-of-Speech Tagger and a Tagger-Maker

Author: Jiayun Han


Available to order

36.99 €

regular price: 41.10 €

About the book

This project aims to build an efficient, scalable, portable, and trainable part-of-speech tagger. Using 98% of Penn Treebank-3 as training data, it first builds a raw tagger based on Bayes' theorem, a hidden Markov model, and the Viterbi algorithm. A reinforcement machine learning algorithm and contextual transformation rules are then applied to increase the tagger's accuracy. The tagger's final accuracy on the testing data is 96.51%, and its speed is about 26,000 words per second on a computer with two gigabytes of random access memory and two 3.00 GHz Pentium duo processors. The tagger's portability and trainability are demonstrated by the tagger-maker's success in building a new tagger from a corpus annotated with a tagset different from that of Penn Treebank.
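For readers unfamiliar with the approach named in the description, the following is a minimal sketch of a bigram hidden Markov model tagger decoded with the Viterbi algorithm. It is not the book's implementation: the function names (`train`, `log_prob`, `viterbi`), the add-one smoothing, and the toy corpus are all assumptions made for illustration, and the reinforcement learning and transformation-rule stages are not shown.

```python
import math
from collections import Counter, defaultdict

START = "<s>"  # hypothetical sentence-start marker, not from the book


def train(tagged_sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    trans = defaultdict(Counter)  # trans[prev_tag][tag] -> count
    emit = defaultdict(Counter)   # emit[tag][word] -> count
    vocab = set()
    for sentence in tagged_sentences:
        prev = START
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    return trans, emit, vocab


def log_prob(counter, key, n_outcomes):
    """Add-one smoothed log probability (an assumed smoothing choice)."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))


def viterbi(words, trans, emit, vocab):
    """Return the most probable tag sequence for words under the model."""
    if not words:
        return []
    tags = list(emit)
    n_tags = len(tags)
    n_words = len(vocab) + 1  # one extra slot for unknown words

    # best[i][tag] = (log score of best path ending in tag, previous tag)
    best = [{
        t: (log_prob(trans[START], t, n_tags)
            + log_prob(emit[t], words[0], n_words), None)
        for t in tags
    }]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            e = log_prob(emit[t], words[i], n_words)
            column[t] = max(
                (best[i - 1][p][0] + log_prob(trans[p], t, n_tags) + e, p)
                for p in tags
            )
        best.append(column)

    # Follow the back-pointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]


# Toy corpus using Penn Treebank-style tags, for illustration only.
corpus = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
    [("a", "DT"), ("dog", "NN"), ("sleeps", "VBZ")],
]
trans, emit, vocab = train(corpus)
result = viterbi(["the", "cat", "barks"], trans, emit, vocab)
print(result)
```

Because `train` only counts whatever tags appear in its input, the same code works unchanged on a corpus with a different tagset, which is the general idea behind a trainable tagger.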

  • Publisher: LAP LAMBERT Academic Publishing
  • Year of publication: 2013
  • Format: Paperback
  • Dimensions: 220 x 150 mm
  • Language: English
  • ISBN: 9783659376221
