The pairwise problem with High Performance Computing Systems, contextualized as a key part to solve the Multiple Sequence Alignment problem

Montañola Lacort, Alberto

Author

Montañola Lacort, Alberto

Director

Roig Mateu, Concepció

Hernández Budé, Porfidio

Date of defense

2016-02-02

Pages

140 p.

Department/Institute

Universitat de Lleida. Departament d'Informàtica i Enginyeria Industrial

Abstract

Multiple sequence alignment (MSA) is a central challenge in bioinformatics and a key element for understanding how the genome works. The task is to align a set of sequences in optimal time while guaranteeing a given level of quality. Its high memory and processing requirements make it a high-performance computing problem. This thesis studies, compares, and presents several existing implementations. We contribute to improving the first steps of the MSA problem in several ways. To reduce computing time and memory usage, we adapt T-Coffee to run in parallel using lightweight threads. We then develop a parallel pairwise alignment method with an efficient mapping of sequences to nodes. Finally, we present a method for determining the minimum amount of system resources needed to solve a problem of a given size, so that the system can be configured for efficient use.

Keywords

Multiple sequence alignment; Distributed computing; Bioinformatics

Subjects

004 - Computer science

Knowledge Area

Computer architecture and technology

Documents

Taml1de1.pdf

5.039Mb

Rights

Access to the contents of this thesis is subject to acceptance of the terms of use established by the following Creative Commons license: http://creativecommons.org/licenses/by-nc-sa/3.0/es/

This item appears in the following Collection(s)

Departament d'Informàtica i Enginyeria Industrial
