Universitat de Barcelona. Departament de Bioquímica i Biologia Molecular (Farmàcia)
Macromolecular structure, and specifically its dynamics and flexibility, plays a crucial role in biological function. Intense efforts are being made to obtain experimental information about macromolecular flexibility. However, despite encouraging advances, we are far from achieving a complete description of the flexibility of a molecular system. Theoretical approaches are convenient alternatives. One of the most widely used theoretical techniques for obtaining dynamic information about structures is Molecular Dynamics (MD). Unfortunately, the practical use of MD has been severely limited by its computational cost and by the problems found in the automatic setup of simulations. An alternative to these methods is Coarse-Grained (CG) dynamics, which accepts a certain loss of accuracy and a significant reduction in structural resolution in exchange for greater computational efficiency. With CG algorithms, larger macromolecules and longer timescales can be simulated, reaching the mesoscopic scale. With the development of new and more efficient simulation engines and the availability of supercomputers and grid platforms (High Performance Computing, HPC), these methods are becoming increasingly popular. However, their use in large computational High Throughput (HT) studies requires complete automation of all the steps involved in generating the final trajectory and in its subsequent analysis, as well as an efficient storage system, given the huge amount of data generated by MD/CG methods. In this thesis, we have designed and implemented a set of bioinformatics tools to port Molecular and Coarse-Grained Dynamics to the HT regime. We have obtained a library of 1,595 protein MD simulations (MoDEL), containing a picture of macromolecular structure flexibility. This large library allowed us to perform HT studies such as the analysis of protein-solvent dynamics, with more than 16 million water molecules available.
Finally, all the bioinformatics tools developed in this thesis were made available through a set of graphical interfaces, implemented as web servers, to ease their use by non-expert users. The addition of pre-configured workflows, the integration of macromolecular flexibility analyses, and visualization options enhance the value of the final project. The set of web server applications designed and implemented in this thesis is publicly accessible to the scientific community, forming an integrated macromolecular flexibility portal, which can be reached directly at http://mmb.irbbarcelona.org/FlexPortal or through the Spanish National Institute of Bioinformatics (INB) portal at http://www.inab.org .
The three-dimensional structures of macromolecules, and in particular their dynamics and flexibility, are closely related to their biological function. Because studying the dynamic properties of macromolecules experimentally is extremely difficult, a set of theoretical techniques for simulating their motion has become popular. In recent years, large and rapid advances in both computing and theoretical studies of macromolecular flexibility have opened up the possibility of carrying out massive high-throughput studies. However, such studies require not only powerful algorithms and computational power, but also automation of the different steps involved in computing trajectories and in their subsequent analysis. Almost as important as the calculations themselves is a storage system that allows the enormous amount of data generated by a massive study to be both stored and queried efficiently. In this thesis, different systems for the high-throughput automation of molecular dynamics calculations, both atomistic and low-resolution, have been studied, designed and implemented, together with tools for their subsequent analysis. In addition, to bring these complex methodologies closer to non-expert users, we have implemented a set of graphical environments based on web servers which, directly or via the portal of the Spanish National Institute of Bioinformatics (INB), allow their use by a broad scientific community.
Molecular dynamics; Bioinformatics; High performance computing
577 - Biochemistry. Molecular biology. Biophysics
Experimental Sciences and Mathematics
Thesis carried out at the Institut de Recerca Biomèdica de Barcelona (IRBB)