Title
Classification of Breast Cancer and Breast Neoplasm Scenarios Based on Machine Learning and Sequence Features from lncRNAs–miRNAs-Diseases Associations
Date Issued
01 December 2021
Access level
metadata only access
Resource Type
journal article
Publisher(s)
Springer Science and Business Media Deutschland GmbH
Abstract
Non-coding RNAs, such as long non-coding RNAs (lncRNAs) and microRNAs (miRNAs), have an undeniable influence on several diseases, for example in the formation of neoplasms and in cancer scenarios. However, research in this area is challenged by the scarcity of validated datasets and by imbalance in the data. We found that research on the associations between miRNAs, lncRNAs, and diseases is limited or is carried out separately for each RNA type. In addition, few investigations combine Machine Learning models with genomic sequence features extracted from miRNAs and lncRNAs, compared with those that use genomic expression data or Deep Learning techniques. In this paper, we propose a framework that uses supervised and unsupervised machine learning models with genomic sequence features, such as k-mers, sequence alignments, and folding energy values, to validate the association of miRNAs and lncRNAs with breast cancer and neoplasm scenarios. Using One-Class SVM for outlier detection and comparing two supervised models, SVM and Random Forest, we obtain an accuracy of 95.44% for the One-Class model, and 88.79% and 99.65% for the SVM and Random Forest models, respectively. These results are comparable to those found in the existing literature and point to a promising path for studying sequence feature interactions in combination with Machine Learning models.
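As an illustration of the k-mer sequence features named in the abstract, the following is a minimal sketch, not the authors' implementation. The function name `kmer_frequencies`, the RNA alphabet `ACGU`, the T-to-U normalization, and the choice of relative frequencies are all assumptions made for this example; the record does not state how the paper computes its k-mer features.

```python
from collections import Counter
from itertools import product

# Assumption: sequences are RNA over the alphabet A, C, G, U.
ALPHABET = "ACGU"


def kmer_frequencies(sequence, k=3):
    """Return relative k-mer frequencies as a fixed-length feature vector.

    Hypothetical helper for illustration only. The vector has 4**k entries,
    ordered lexicographically over ALPHABET. Windows that contain any other
    symbol (for example N) are ignored.
    """
    # Normalize case and map DNA-style T to RNA-style U.
    seq = sequence.upper().replace("T", "U")
    vocabulary = ["".join(p) for p in product(ALPHABET, repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[kmer] for kmer in vocabulary)
    if total == 0:
        # Sequence shorter than k, or no valid window: return all zeros.
        return [0.0] * len(vocabulary)
    return [counts[kmer] / total for kmer in vocabulary]


# Example: 2-mer features for a short, made-up sequence.
features = kmer_frequencies("AUGGCUAUGG", k=2)
print(len(features))
```

Fixed-length vectors of this kind could then be passed to a classifier, for instance scikit-learn's `OneClassSVM`, `SVC`, or `RandomForestClassifier`; whether the paper used that library is not stated in this record.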
Start page
572
End page
581
Volume
13
Issue
4
Language
English
OECD Knowledge area
Information sciences
Oncology
Scopus EID
2-s2.0-85117284645
Source
Interdisciplinary Sciences – Computational Life Sciences
ISSN of the container
1913-2751
Sponsor(s)
This research is partially supported by South African National Research Foundation Grants (Nos. 114911 & 132797) and the Tertiary Education Support Programme (TESP) of South African ESKOM.
Sources of information:
Directorio de Producción Científica
Scopus