Bioinformatic approaches to study the metabolic effect on Gene Regulation
Cellular adaptation to changing environments is a critical mechanism for cell survival. Cells respond to external conditions primarily by modulating the molecular mechanisms that regulate gene expression or protein activity, which allows a rapid response to external metabolic changes. Metabolic sensing is therefore an important step in cell adaptation, and epigenetics is now considered the mechanism that connects metabolic shifts with gene regulation. Epigenetic marks allow cells to shape chromatin conformation, which in turn regulates gene expression. Consequently, the correct functioning of a cell's epigenetic program is critical for cellular adaptation to changing conditions. Different epigenetic modifiers rely on metabolite availability to modify the cell's epigenetic landscape. Recent studies point to the accumulation of key metabolites as the critical mechanism by which epigenetic modifiers modulate chromatin marks. This can be seen in circadian rhythms, where epigenetic changes mediate the cross-talk between metabolic oscillations and gene expression. Defects that disrupt this molecular regulation lead to diseases such as metabolic syndrome. The study of the metabolic control of the epigenome and transcriptome is an emerging field of research. Multiple studies have generated large, high-throughput datasets that measure gene expression, metabolites and histone modifications, among others, to study these interconnections. Although a wealth of literature is accumulating, the precise mechanisms of this multi-layered regulation have yet to be fully elucidated. Moreover, no consensus pathway describing these processes is yet available in any of the common biological pathway databases. One critical need in the field is the integrative analysis of existing molecular data to propose detailed regulatory models for the interplay between metabolism, chromatin state and transcription.
This thesis addresses the statistical integration of metabolomics and epigenetics measurements with gene expression. We approached this data analysis challenge using the Yeast Metabolic Cycle (YMC) as a model system. Gene expression in the YMC can be divided into three well-defined phases in which transcription is coordinated with histone modifications and metabolic oscillations. First, we analyzed the impact of histone modifications on gene expression, which led to the identification of the histone marks with the greatest impact on gene expression changes. Next, we created a comprehensive, multi-layered, multi-omics dataset for this system by obtaining metabolomics and ATAC-Seq data of the YMC and incorporating an existing nascent transcription (NET-seq) dataset. We then modeled the impact of chromatin conformation and metabolic changes on gene expression, and created a regulatory model for gene expression, epigenetics and metabolomics by applying PLS Path Modeling (PLS-PM), a multivariate strategy suitable for finding relationships across multiple high-dimensional datasets. To our knowledge, this is the first time that PLS-PM has been used for the modeling of molecular regulatory layers. We found that gene expression in the oxidative (OX) phase was mainly controlled by the H3K9ac histone mark and by ATP accumulation during this phase, suggesting INO80 ATP-dependent chromatin remodeling activity. We also found an enrichment of H3K18ac during the reductive charging (RC) phase, together with an accumulation of nicotinamide and its derivatives, suggesting that sirtuins may regulate H3K18ac levels in RC to activate the fatty acid oxidation response. Aspartate was also associated with epigenetic regulation in the RC phase, but the mechanisms by which this amino acid may control the epigenome remain unknown.
Finally, in this work, we have also created Padhoc, a computational pipeline that integrates existing published knowledge in emerging research fields, such as those studied in this thesis, to propose pathway models that can complement current pathway databases. Altogether, this thesis involves the generation of a multi-omics dataset that covers metabolic, epigenetic and gene expression information, and its integrative analysis using novel multivariate strategies that model the mechanistic coordination of these layers. Moreover, it includes a framework for the reconstruction of biological pathways. All in all, we have presented different strategies for studying the impact of metabolic changes on chromatin using computational biology approaches.