Title
Yet another ranking function for automatic multiword term extraction
Date Issued
01 January 2014
Access level
open access
Resource Type
journal article
Author(s)
University of Montpellier 2
Publisher(s)
Springer Verlag
Abstract
Term extraction is an essential task in domain knowledge acquisition. We propose two new measures to extract multiword terms from a domain-specific text. The first measure combines linguistic and statistical information. The second measure is graph-based and assesses the importance of a multiword term within a domain. Existing measures often address only some of the problems related to term extraction, e.g., noise, silence, low frequency, large corpora, and the complexity of the multiword term extraction process. Instead, we focus on managing the entire set of problems, e.g., detecting rare terms and overcoming the low frequency issue. By comparing the two proposed measures with state-of-the-art reference measures, we show that they outperform the precision results previously reported for automatic multiword extraction.
Start page
52
End page
64
Volume
8686
Language
English
OECD Knowledge area
Information Sciences
Scopus EID
2-s2.0-84921666450
Source
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ISSN of the container
0302-9743
Sponsor(s)
Université de Montpellier - UM
Acknowledgments. This work was supported in part by the French National Research Agency under the JCJC program, grant ANR-12-JS02-01001, as well as by the University of Montpellier 2 and CNRS.
Sources of information:
Directorio de Producción Científica
Scopus