Title
Ship-lemmatagger: Building an NLP toolkit for a Peruvian native language
Date Issued
2017
Access level
restricted access
Resource Type
conference paper
Author(s)
Pereira-Noriega J.
Mercado-Gonzales R.
Publisher(s)
Springer Verlag
Abstract
Natural Language Processing (NLP) deals with the understanding and generation of text by computer programs. Many different functions are used in this area, but some of them underpin the rest. These core functions concern the morphological processing of the language (such as lemmatization) and the automatic identification of part-of-speech tags. This paper therefore describes the implementation of a basic NLP toolkit for a new language, focusing on the features mentioned above and testing them on a corpus built for the purpose. The results obtained exceeded expectations and could be used for more complex tasks such as machine translation. © Springer International Publishing AG 2017.
Start page
473
End page
481
Volume
10415 LNAI
Number
3
Language
English
Scopus EID
2-s2.0-85028645758
Source
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ISSN of the container
0302-9743
ISBN of the container
9783319642055
Conference
20th International Conference on Text, Speech and Dialogue, TSD 2017
Sponsor(s)
Acknowledgments. The authors thank the linguistic team whose effort made the corpus annotation possible, and acknowledge the support of the “Consejo Nacional de Ciencia, Tecnología e Innovación Tecnológica” (CONCYTEC Perú) under contract 225-2015-FONDECYT.
Sources of information:
Directorio de Producción Científica