Title
Optimization of blast seed indexing in the alignment of DNA sequences with GPU using CUDA
Date Issued
01 October 2018
Access level
metadata only access
Resource Type
conference paper
Publisher(s)
Institute of Electrical and Electronics Engineers Inc.
Abstract
Several algorithms are used to align biological sequences such as DNA, RNA and proteins, chiefly the Basic Local Alignment Search Tool (BLAST). BLAST has two phases: a heuristic seed indexing phase, and an extension phase that compares sequences using the Smith-Waterman (SW) algorithm. Together they align a short 'query' sequence against a long reference sequence 'db' much faster than other alignment algorithms. This work proposes using a two-dimensional matrix instead of a sparse matrix as the hash table that stores the seed index, and using the GPU of a graphics card to optimize seeding. This reduces the processing time of the BLAST seed indexing phase by 11.24%. The GPU implementation with CUDA achieves better processing time than both the sequential implementation and a multi-CPU implementation using OpenMP threads. Our algorithm retrieves the seeds identical to a given key in O(1) time, and performance improves as the length of the hash key increases. The evaluation tests were run on a Core i7 laptop with 16 GB of RAM and a graphics card with 384 cores, using the C++ programming language and CUDA. Alignment tests were performed on real DNA sequences in FASTA format obtained from the National Center for Biotechnology Information (NCBI) and ENSEMBL, with reference sequences of up to 1.3 Gb, such as the complete genome of the chicken (Gallus gallus), which has 1 230 258 557 base pairs (bp). With a query sequence of 140 bp, this genome was indexed with a 5 bp key in 1074 milliseconds using the GPU.
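The abstract does not give the layout of the two-dimensional hash table, so the following is only a minimal sequential C++ sketch of one way such a seed index could look. It assumes a 2-bit encoding of A, C, G and T, one matrix row per possible key (4^k rows), and as many columns as the most frequent key needs. All names (`SeedIndex`, `baseCode`, `encodeKey`, `buildIndex`, `lookup`) are hypothetical, and neither the CUDA kernel nor the OpenMP variant evaluated in the paper is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical illustration only; this is not the paper's code.
struct SeedIndex {
    int k = 0;                         // key length in bases
    std::size_t cols = 0;              // columns per row = occurrences of the most frequent key
    std::vector<std::uint32_t> count;  // count[key] = number of valid entries in row `key`
    std::vector<std::uint32_t> pos;    // dense (4^k x cols) matrix of positions, row-major
};

// Map one base to 2 bits; returns -1 for anything other than A, C, G, T.
inline int baseCode(char c) {
    switch (c) {
        case 'A': case 'a': return 0;
        case 'C': case 'c': return 1;
        case 'G': case 'g': return 2;
        case 'T': case 't': return 3;
        default: return -1;
    }
}

// Encode s[start, start + k) as an integer in [0, 4^k).
// Returns false if the window contains an ambiguous base such as N.
inline bool encodeKey(const std::string& s, std::size_t start, int k, std::uint32_t& key) {
    key = 0;
    for (int i = 0; i < k; ++i) {
        const int code = baseCode(s[start + i]);
        if (code < 0) return false;
        key = (key << 2) | static_cast<std::uint32_t>(code);
    }
    return true;
}

// Build the index of every k-mer position in the reference sequence `db`.
SeedIndex buildIndex(const std::string& db, int k) {
    assert(k > 0 && k <= 12);  // assumed bound so the 4^k rows stay manageable
    SeedIndex idx;
    idx.k = k;
    const std::size_t rows = std::size_t{1} << (2 * k);
    idx.count.assign(rows, 0);
    if (db.size() < static_cast<std::size_t>(k)) return idx;
    const std::size_t last = db.size() - static_cast<std::size_t>(k);
    std::uint32_t key = 0;

    // Pass 1: count the occurrences of each key to size the matrix.
    for (std::size_t p = 0; p <= last; ++p)
        if (encodeKey(db, p, k, key)) ++idx.count[key];
    for (std::uint32_t c : idx.count)
        if (c > idx.cols) idx.cols = c;
    idx.pos.assign(rows * idx.cols, 0);

    // Pass 2: write each position into the next free column of its row.
    std::vector<std::uint32_t> filled(rows, 0);
    for (std::size_t p = 0; p <= last; ++p)
        if (encodeKey(db, p, k, key))
            idx.pos[key * idx.cols + filled[key]++] = static_cast<std::uint32_t>(p);
    return idx;
}

// Return the positions in db where `seed` occurs. Finding the row is a single
// index computation (O(1)); copying the hits is linear in their number.
std::vector<std::uint32_t> lookup(const SeedIndex& idx, const std::string& seed) {
    std::uint32_t key = 0;
    if (seed.size() != static_cast<std::size_t>(idx.k) || !encodeKey(seed, 0, idx.k, key))
        return {};
    const std::uint32_t* row = idx.pos.data() + key * idx.cols;
    return std::vector<std::uint32_t>(row, row + idx.count[key]);
}
```

For example, `lookup(buildIndex("ACGTACGTAC", 3), "ACG")` yields the positions 0 and 4. A dense matrix like this one trades memory for direct addressing: every row reserves as many columns as the most frequent key, so it can grow large on skewed genomes.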
Start page
527
End page
532
Language
Spanish
OECD Knowledge area
Bioinformatics
Scopus EID
2-s2.0-85071094840
Resource of which it is part
Proceedings - 2018 44th Latin American Computing Conference, CLEI 2018
ISBN of the container
9781728104379
Conference
44th Latin American Computing Conference, CLEI 2018
Sources of information: Directorio de Producción Científica Scopus