Title
Coh-Metrix-Esp: A Complexity Analysis tool for documents written in Spanish
Date Issued
01 January 2016
Access level
metadata only access
Resource Type
conference paper
Author(s)
Publisher(s)
European Language Resources Association (ELRA)
Abstract
Text Complexity Analysis is a useful task in Education. For example, it can help teachers select appropriate texts for their students according to their educational level. This task requires the analysis of several text features that people mostly assess manually (e.g. syntactic complexity, word variety). In this paper, we present a tool for Complexity Analysis called Coh-Metrix-Esp. It is the Spanish version of Coh-Metrix and can calculate 45 readability indices. We analyse how these indices behave in a corpus of "simple" and "complex" documents, and also use them as features in a binary complexity classifier for texts in Spanish. After some experiments with machine learning algorithms, we obtained an F-measure of 0.9 for a corpus that contains tales for kids and adults and an F-measure of 0.82 for a corpus of texts written for students of Spanish as a foreign language.
Start page
4694
End page
4698
Language
English
OECD Knowledge area
Linguistics
Systems and communications engineering
Subjects
Scopus EID
2-s2.0-85037116955
Resource of which it is part
Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016
ISBN of the container
978-295174089-1
Conference
10th International Conference on Language Resources and Evaluation, LREC 2016
Sources of information
Directorio de Producción Científica
Scopus