Title
The impact of network sampling on relational classification
Date Issued
01 January 2016
Access level
metadata only access
Resource Type
conference paper
Author(s)
Berton L.
Vega-Oliveros D.A.
Da Silva A.T.
De Andrade Lopes A.
University of São Paulo
Publisher(s)
CEUR-WS
Abstract
Many real-world networks, such as the Internet, social networks, and biological networks, are massive in size, which makes many processing and analysis tasks difficult. For this reason, it is necessary to apply a sampling process that reduces the network size without losing relevant network information. In this paper, we propose a new and intuitive sampling method that exploits the following centrality measures: degree, k-core, clustering, eccentricity, and structural holes. In our experiments, we delete 30% and 50% of the vertices from the original network and evaluate our proposal on six real-world networks in a relational classification task using six different classifiers. Classification results achieved on the sampled graphs generated by our proposal are similar to those obtained on the entire graphs. In most cases, our proposal reduced the original graphs by up to 50% of their original number of edges. Moreover, the execution time of the classifier's learning step is shorter on the sampled graph.
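The sampling idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which vertices are deleted or how the five centrality measures are combined, so the version below is only an assumption: it uses degree alone and drops the lowest-degree vertices. The names `sample_by_degree` and `edge_count` and the toy graph are hypothetical and do not come from the paper.

```python
def sample_by_degree(adj, drop_fraction):
    """Drop the lowest-degree vertices and return the induced subgraph.

    adj: dict mapping each vertex to the set of its neighbours (undirected).
    drop_fraction: share of vertices to delete, e.g. 0.3 or 0.5.
    Assumption: low-degree vertices are removed first; the paper may differ.
    """
    n_drop = int(len(adj) * drop_fraction)
    # Rank vertices by ascending degree; ties are broken by vertex label.
    ranked = sorted(adj, key=lambda v: (len(adj[v]), v))
    keep = set(ranked[n_drop:])
    # Keep only edges whose two endpoints both survive.
    return {v: adj[v] & keep for v in keep}


def edge_count(adj):
    """Count undirected edges (each edge appears in two neighbour sets)."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2


# Toy graph: vertex 0 is a hub linked to 1..5, plus one extra edge 1-2.
graph = {
    0: {1, 2, 3, 4, 5},
    1: {0, 2},
    2: {0, 1},
    3: {0},
    4: {0},
    5: {0},
}

sampled = sample_by_degree(graph, 0.5)
print(sorted(sampled), edge_count(graph), edge_count(sampled))
```

On this toy graph, deleting 50% of the vertices removes the three leaf vertices and leaves the triangle formed by vertices 0, 1, and 2. Swapping the ranking key for another measure (k-core, clustering, eccentricity, or a structural-hole score) would follow the same pattern.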
Start page
62
End page
72
Volume
1743
Language
English
OECD Knowledge area
Information sciences; Statistics, Probability
Scopus EID
2-s2.0-85006176764
Source
CEUR Workshop Proceedings
ISSN of the container
1613-0073
Conference
3rd Annual International Symposium on Information Management and Big Data, SIMBig 2016
Sources of information: Directorio de Producción Científica Scopus