Title
Towards automatic building of term hierarchies from large patent datasets
Date Issued
03 April 2017
Access level
metadata only access
Resource Type
conference paper
Author(s)
Publisher(s)
Institute of Electrical and Electronics Engineers Inc.
Abstract
Term hierarchies are structures that represent semantic relations among terms, usually of the hyperonym and hyponym type (generality and specificity, respectively). Many scenarios may benefit from the knowledge contained in term hierarchies. In the patent genre in particular, there is a significant demand for knowledge representation for Information Retrieval purposes, and few works have addressed this demand. In this work we propose a three-stage strategy for building term hierarchies from patents. In the first stage, terms were extracted through noun-phrase identification; in the second stage, terms were organized hierarchically through n-gram decomposition; finally, in the third stage, the term hierarchy was enriched with term embedding information, specifically from the Word2Vec model. This strategy was applied to patents from the United States Patent and Trademark Office (USPTO) collection. Each term in the hierarchy generated a set of associated patents. For the evaluation we applied two strategies to the sets of associated documents: one based on the clustering degree of the sets, and the other based on the proportions of IPC (International Patent Classification) categories within the sets. Results show that the produced term hierarchy efficiently captures generic and specific concepts.
Language
English
OECD Knowledge area
Computer science
Scopus EID
2-s2.0-85018433213
ISBN
9789526839783
Source
Proceedings of the AINL FRUCT 2016 Conference
Resource of which it is part
Proceedings of the AINL FRUCT 2016 Conference
Conference
5th Artificial Intelligence and Natural Language FRUCT Conference, AINL FRUCT 2016
Sources of information: Directorio de Producción Científica Scopus