Title
Big data analytics for critical information classification in online social networks using classifier chains
Date Issued
2022
Access level
metadata only access
Resource Type
journal article
Author(s)
Silva D.H.
Maziero E.G.
Saadi M.
Rosa R.L.
Silva J.C.
Igorevich K.K.
Federal University of Lavras
Publisher(s)
Springer
Abstract
Industrial and academic organizations use online social networks (OSNs) for different purposes, including social and economic ones. OSNs have become a new means of obtaining information from people about their preferences and interests. Due to the large volume of user-generated content, researchers use various techniques, such as sentiment analysis or data mining, to evaluate this information automatically. Although sentiment analysis of OSN content is performed with different methods, obtaining highly reliable results remains difficult, mainly because user profile information, such as gender and age, is lacking. In this work, a novel dataset is built, which contains the writing characteristics of 160,000 users of the Twitter OSN. Before creating classification models with Machine Learning (ML) techniques, feature transformation and feature selection methods are applied to determine the most relevant set of characteristics. To create the models, the Classifier Chain (CC) transformation technique and different ML algorithms are applied to the training set. Simulation results show that the Random Forest, XGBoost and Decision Tree algorithms obtain the best performance. In the testing phase, these algorithms reached Hamming Loss values of 0.033, 0.033, and 0.034, respectively, and all of them reached the same micro-averaged F1 value of 0.976. Therefore, our proposal, based on a multidimensional learning technique using the CC transformation, outperforms other similar proposals.
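The two evaluation measures named in the abstract, Hamming Loss and micro-averaged F1, can be illustrated with a minimal sketch. It assumes each prediction is a row of binary label indicators; the function names and the toy label matrices are hypothetical and are not taken from the article or its dataset.

```python
from typing import Sequence

Matrix = Sequence[Sequence[int]]  # rows = samples, columns = binary labels


def hamming_loss(y_true: Matrix, y_pred: Matrix) -> float:
    """Fraction of individual label slots that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def micro_f1(y_true: Matrix, y_pred: Matrix) -> float:
    """F1 computed from counts pooled over all labels and samples."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (made up for illustration): 4 samples, 3 binary labels each.
Y_TRUE = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
Y_PRED = [[1, 0, 1], [0, 1, 1], [1, 0, 0], [0, 0, 1]]

if __name__ == "__main__":
    print(f"Hamming loss: {hamming_loss(Y_TRUE, Y_PRED):.3f}")
    print(f"Micro F1:     {micro_f1(Y_TRUE, Y_PRED):.3f}")
```

In this toy case 2 of the 12 label slots are wrong, so the Hamming loss is 2/12, and the pooled counts (5 true positives, 1 false positive, 1 false negative) give a micro-averaged F1 of 10/12. Lower Hamming loss and higher micro F1 both indicate better multi-label predictions.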
Start page
626
End page
641
Volume
15
Issue
1
Language
English
OECD Knowledge area
Media and socio-cultural communication; Telecommunications
Scopus EID
2-s2.0-85122895297
Source
Peer-to-Peer Networking and Applications
ISSN of the container
19366442
Sources of information: Directorio de Producción Científica Scopus