Performance optimisation of biological pathway data storage, retrieval, analysis and its interactive visualisation

dc.contributor.advisor Marín García, Pablo
dc.contributor.advisor Arnau Llombart, Vicente
dc.contributor.author Fabregat Mundo, Antonio
dc.contributor.other Departament d'Informàtica es_ES
dc.date.accessioned 2018-07-17T08:26:38Z
dc.date.available 2018-07-18T04:45:07Z
dc.date.issued 2018 es_ES
dc.date.submitted 27-07-2018 es_ES
dc.identifier.uri http://hdl.handle.net/10550/67008
dc.description.abstract The aim of this research was to optimise the performance of the storage, retrieval, analysis and interactive visualisation of biomolecular pathway data. This was achieved by adopting new technologies and a variety of highly optimised data structures, algorithms and strategies across the different layers of the software. The first challenge was the creation of a long-lasting, large-scale web application for pathway navigation: the Pathway Browser. This tool had to aggregate different modules so that users could browse pathway content and use their own data to perform pathway analysis. Another challenge was the development of a high-performance pathway analysis tool able to analyse genome-wide datasets within seconds. Once developed, it was integrated into the Pathway Browser, allowing interactive exploration and analysis of high-throughput data. The Pathways Overview layout and widget were created to represent the complex parent-child relationships present in the hierarchical organisation of pathways. This module overlays analysis results in such a way that the user can easily distinguish the most significant areas of biology represented in their data. An existing force-directed layout algorithm was initially used for the graphical representation, but it did not achieve the expected results, so a custom radial layout algorithm was developed instead. A new version of the pathway Diagram Viewer was engineered to load and render 97% of the target diagrams in less than 1 second. Combining a multi-layer HTML5 Canvas strategy with a space partitioning data structure minimised CPU workload, enabling the introduction of new features that further enhance the user experience. On the server side, the work focused on the adoption of a graph database (Neo4j) and the creation of the new Content Service (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, enabled efficient access to the complex pathway data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. es_ES
dc.format.extent 253 p. es_ES
dc.language.iso en es_ES
dc.subject Biological pathway es_ES
dc.subject Pathway analysis es_ES
dc.subject Data visualisation es_ES
dc.subject Graph database es_ES
dc.subject Open source es_ES
dc.title Performance optimisation of biological pathway data storage, retrieval, analysis and its interactive visualisation es_ES
dc.type info:eu-repo/semantics/doctoralThesis es_ES
dc.subject.unesco UNESCO::CIENCIAS TECNOLÓGICAS es_ES
dc.embargo.terms 0 days es_ES
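
The abstract mentions that a space partitioning data structure reduced CPU workload in the Diagram Viewer, but it does not say which structure was used. The sketch below is only an illustration of the general idea, assuming a uniform grid: each diagram element's bounding box is registered in the grid cells it overlaps, so a pointer hit test inspects one cell instead of every element. The names Box and GridIndex, the cell size and the coordinates are hypothetical and are not taken from the thesis, and Python is used here purely for brevity.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of one diagram element (hypothetical type)."""
    name: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


class GridIndex:
    """Uniform-grid spatial index (an assumed stand-in for the thesis's structure)."""

    def __init__(self, cell_size: float = 100.0) -> None:
        self.cell_size = cell_size
        self._cells = defaultdict(list)  # (col, row) -> boxes overlapping that cell

    def _cell(self, px: float, py: float) -> tuple:
        # Floor division maps a coordinate to the index of its grid cell.
        return (int(px // self.cell_size), int(py // self.cell_size))

    def insert(self, box: Box) -> None:
        # Register the box in every cell that its bounding box overlaps.
        col0, row0 = self._cell(box.x, box.y)
        col1, row1 = self._cell(box.x + box.width, box.y + box.height)
        for col in range(col0, col1 + 1):
            for row in range(row0, row1 + 1):
                self._cells[(col, row)].append(box)

    def query(self, px: float, py: float) -> list:
        # Only the boxes in the pointer's cell are tested, not the whole diagram.
        candidates = self._cells.get(self._cell(px, py), [])
        return [box for box in candidates if box.contains(px, py)]


index = GridIndex(cell_size=50.0)
index.insert(Box("A", 10, 10, 100, 40))
index.insert(Box("B", 200, 200, 30, 30))
hovered = [box.name for box in index.query(60, 30)]
```

With this kind of index, the cost of a hover or click lookup depends on how many elements share a cell rather than on the total number of elements in the diagram, which is the property that makes such structures attractive for interactive canvases.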
