Study of the evolution of terminal differentiation programs using C. elegans serotonergic neurons as a model
1. Introduction

The complexity of animal morphology, from body plans to the smallest appendage, and the functions that different organs and cell types are able to perform, all arise from a set of tightly coordinated processes that together constitute development. Understanding these processes, i.e., understanding how a fertilized egg develops into a multicellular animal, has been acknowledged as one of the greatest challenges in biology. The aim of this thesis is to gain insight into the mechanisms controlling the final step of development, terminal differentiation, and into how these mechanisms change through evolution.

During differentiation, cells gradually restrict their potential to become different cell types. Usually, cells up to the step of gastrulation are regarded as pluripotent, which means that they have the potential to become any cell type in each of the three germ layers. Afterwards, cells traverse different stages of decreasing potency. The term ‘stem cell’ refers to undifferentiated cells that are capable of self-renewal. Tissue-specific stem cells are usually multipotent, such as haematopoietic stem cells, or unipotent, such as the myosatellite cells of muscle. Tissue-specific stem cells give rise to committed progenitors, generally regarded as uni- or oligopotent, which are capable of a finite number of divisions and give rise to immature unipotent cells, which eventually mature and become terminally differentiated cells of a given type. Each terminally differentiated cell type is defined by a particular combination of differentially expressed effector genes, which are responsible for its unique properties and behaviours. Therefore, the process of terminal differentiation consists of the coordinated expression of these specific combinations of genes.
In most cases, cells will never acquire additional functions or express new features once they have begun to express cell-type specific genes; they are either postmitotic, like neurons, or, if they retain mitotic capacity, they will only produce more of their own kind, hence the term ‘terminally differentiated’.

All the steps of cell differentiation are mediated by the spatial and temporal regulation of thousands of genes. This regulation is usually integrated through genomic elements called cis-regulatory modules (CRMs), which usually consist of clusters of short sequence motifs that are recognized by diverse DNA-binding proteins, such as transcription factors (TFs). TFs are able to recognize short (6-12 bp) and degenerate sequences, to which they bind in order to activate transcription. However, their binding is conditioned by nucleosome positioning, chromatin compaction and histone marks, such as H3K9me, a hallmark of heterochromatin. TFs, in turn, are largely responsible for the distribution of these features in the genome. There are different kinds of CRMs; typical examples include enhancers, which have a positive effect on transcription (at least part of the time); insulators, which restrict the range of enhancers; silencers; and tethering elements, which help to direct the activity of a remote enhancer towards a specific gene.

Based on analysis of the CRMs of effector genes, and on assessment of effector gene expression in different mutant backgrounds, it has been shown that neuron type-specific genes are usually co-regulated by a small group of transcription factors. These transcription factors bind directly to motifs in the cis-regulatory regions of effector genes, and are required for the initiation and maintenance of their expression. Their own initial activation is regulated in a complex way; they are thought to integrate inputs from lineage-dependent TFs as well as other spatio-temporal non-autonomous cues.
However, they usually maintain their expression via self-regulation throughout the life of the organism. Transcription factors acting in such a way have been termed terminal selectors. Terminal selectors act in combinations: the same TF can act as a terminal selector in different neuron types with very different sets of effector genes, but TF combinations are unique for each neuron subtype. The ability of a TF to activate different sets of genes in different cells is often attained through cooperative activation of gene expression.

In this thesis we study terminal differentiation of the C. elegans serotonergic system. In particular, we study changes in terminal differentiation programs that took place between C. elegans and another Caenorhabditis species, C. angaria, in VC4-5. In C. elegans and other species, these cells are not serotonergic, whereas in C. angaria they stain strongly for serotonin. In a second line of work, we study terminal differentiation of the HSN neuron: we assess how TF binding sites in gene promoters can predict cis-regulatory activity in the HSN neuron, and we assess molecular and regulatory homology between HSN and mouse raphe serotonergic neurons.

2. Objectives

This thesis has two main parts. In the first part, we aim to study the evolutionary changes that took place between C. elegans and C. angaria regarding the serotonergic phenotype of the VC4 and VC5 motor neurons. We seek to identify 1) which serotonin pathway genes are differentially regulated in VC4-5 between the two species (Objective 1.1), and 2) which changes in regulatory logic are responsible for these differential expression patterns (Objective 1.2). Our global purpose is to gain insight into the mechanisms of cell type evolution.

The second part is a computational study of the regulatory logic of the HSN motor neuron. In our laboratory it was found that a transcription factor collective of six TFs orchestrates HSN terminal differentiation.
Our objectives are, in the first place, to assess the prevalence of TF binding motifs for the six TFs in different groups of genes, according to their expression pattern (Objective 2.1). We expect that genes known to be expressed in the HSN neuron are more likely to bear motifs in their upstream region for all or most of the six TFs regulating HSN differentiation. In the second place, we aim to use the presence of these TF binding sites (TFBS) to make de novo predictions of gene expression, i.e., we want to test the extent to which having a set of motifs in the promoter region predicts expression in the HSN neuron (Objective 2.2). Finally, by combining this TFBS-finding approach with other methods, we want to test whether the HSN regulatory logic is conserved in mouse raphe serotonergic neurons (Objective 2.3).

3. Methodology and Results 1: H3K9me restrains serotonergic phenotype in C. elegans VC4-5 neurons

3.1 Serotonergic system diversity in the Caenorhabditis genus

We performed serotonin immunostaining in most species of the Caenorhabditis genus. We found that serotonergic identity in the head region is mostly conserved, although quantitative variation was found between species in the percentage of ADF, RIH and AIM neurons that were 5-HT positive. In the vulva region, we found higher variability. One species (C. monodelphis) lacked serotonergic HSNs, and a group of five species (C. angaria, C. castelli, C. quiockensis, C. sp. 8 and C. sp. 24) had serotonergic VC4 and VC5, and a variable number of other serotonergic VNC neurons. Since 5-HT staining in VC4-5 is not found outside the angaria group, this feature is a synapomorphy of this group.

3.2 Conservation of transcriptional regulatory logic of 5-HT pathway genes between C. elegans and C. angaria

Next, we aimed to understand what kind of regulatory changes had led to this divergent VC phenotype. In the nematode C.
elegans there are two classes of serotonergic neurons: those that synthesize 5-HT, such as NSM, ADF and HSN, which express the enzymes required for serotonin synthesis (tph-1, cat-4 and bas-1), and those that take up 5-HT from the environment, namely RIH and AIM, which express mod-5/SERT and cat-1/VMAT for this purpose. We built GFP reporters of the C. angaria serotonin pathway genes and injected them into C. elegans. Only cat-1 and mod-5 were expressed in VC4-5. We also injected tph-1, cat-1 and mod-5 reporters into C. angaria and, although the number of worms we were able to score was very low, the mod-5 reporter was the only one expressed in VC4-5. Altogether, these results suggest that C. angaria VC4 and VC5 have acquired 5-HT staining through changes in the mod-5 cis-regulatory regions.

3.3 Transcriptional regulation of cat-1/VMAT in C. elegans

As C. elegans VC4-5 express cat-1, and the C. angaria mod-5 reporter is also expressed in C. elegans VC4-5, we hypothesized that the regulation of cat-1 in C. elegans could be similar to the regulation of the fully serotonergic phenotype of C. angaria VC4-5 and, consequently, that studying the former could provide information about the latter. In order to find regulators of cat-1 transcriptional activation, we undertook a candidate approach: we selected all transcription factors, as well as some genes involved in relevant signalling pathways, that, according to WormBase, are expressed in, or signal to, VC4-5. To test for functionality, we performed RNA interference (RNAi) by feeding against them in an RNAi-sensitized rrf-3 mutant strain carrying the otIs221[cat-1::gfp] transgene. We found that cat-1 is regulated by the lin-3/EGF pathway in VC4-5. UNC-4, a paired-like homeodomain TF that is downstream of lin-3/EGF, is also required for cat-1 expression in VC4-5. We also found that the bHLH TF hlh-3 is required for cat-1 expression in VC4-5.
However, a thorough cis-regulatory analysis of the cat-1 minimal promoter (a 400 bp region that drives expression specifically in VC4-5) did not reveal specific TF binding motifs. Instead, most of the region seems to bear relevant regulatory information.

3.4 H3K9me restrains 5-HT reuptake in C. elegans VC4-5

It has been reported that mutants with defective histone 3 lysine 9 methylation (H3K9me) show ectopic expression in VC1-3 and VC6 of the unc-4 transcription factor, which is normally expressed in VC4 and VC5. Since unc-4 is required for cat-1 expression in C. elegans VC4-5, we decided to test whether H3K9 methylation is important to restrain 5-HT fate in VC4 and VC5 in C. elegans. To test this hypothesis we performed serotonin staining in strains carrying predicted null alleles of the histone methyltransferases met-1, met-2 and set-25; of hpl-2, the ortholog of mouse Heterochromatin Protein 1 (HP1); and of lin-61, a Malignant Brain Tumor (MBT) protein. When these mutants are grown at 15°C, they show a percentage of 5-HT positive VC4-5 comparable to that of wildtype worms (p-value > 0.05; chi-squared test and Bonferroni correction). However, when they are grown at 25°C, the percentage of 5-HT positive VC4-5 increases, on average, up to 71% in lin-61, 64% in hpl-2, 41% in met-2 and 19% in met-1 mutants. In contrast, set-25 mutants show no increase in 5-HT staining. While met-1 and met-2 are in charge of mono- and di-methylation of H3K9, set-25 is involved in tri-methylation, and it has been shown that tri-methylation levels correlate poorly with hpl-2 binding. Thus our results suggest that VC4 and VC5 serotonergic fate is actively repressed in C. elegans by mono- and di-methylation of H3K9. Interestingly, hpl-1; hpl-2 double mutants show a high percentage of 5-HT positive VC4-5 at 15°C. Next, we crossed hpl-2(tm1489) and met-2(4256) to mod-5 null alleles, to see whether the VC4-5 5-HT phenotype in H3K9me deficient mutants is also due to 5-HT reuptake.
5-HT staining showed a significant reduction when comparing hpl-2(tm1489) to mod-5(n822); hpl-2(tm1489) or mod-5(knu383); hpl-2(tm1489), or met-2(4256) to mod-5(n822); met-2(4256). Additionally, crossing lin-61, hpl-2 and met-2 mutants to mutants in which HSN neurons fail to synthesize serotonin, or to mutants in which synaptic vesicles cannot be released, also results in a loss of 5-HT staining in VC4-5. Altogether, these experiments show that, in H3K9 methylation deficient mutants, VC4-5 take up serotonin from the nearby HSN neurons in a mod-5/SERT dependent way. However, we could not detect mod-5 expression in VC4-5, despite using different kinds of reporters, including CRISPR-based knock-ins.

3.5 H3K9me is required cell-autonomously to repress 5-HT reuptake in C. elegans VC4-5

The chromatin compaction machinery components hpl-2, met-2 and lin-61 are broadly expressed and are rather pleiotropic. We therefore wanted to assess whether the effects observed in VC4 and VC5 in these mutants were due to a cell-autonomous or a non-cell-autonomous action of these factors. To this end, we performed rescue experiments in H3K9me deficient mutants with cell-specific promoters. We overexpressed HPL-2 isoforms a and c, LIN-61 isoform a and the single MET-2 isoform specifically in VCs or, as a control, in HSNs, and performed immunostaining. We found that, in all cases, overexpression in VC4-5 reduced 5-HT staining to levels similar to those of wildtype worms, whereas overexpression in HSN had no effect.

3.6 VC4-5 critical period for becoming serotonergic in H3K9me deficient background is L4

In order to find out at what developmental point the action of H3K9me is required to restrain VC4 and VC5 5-HT fate in C. elegans, we took advantage of the temperature-sensitive phenotype of H3K9me mutants. VC4 and VC5 become 5-HT positive in hpl-2, lin-61 and met-2 mutants when worms are grown at 25°C but not at 15°C.
Thus we performed a series of temperature shift experiments to determine the time window in which these factors are required. We kept synchronized populations of worms at 15°C or 25°C, and every 12 h we shifted (up or down) a part of the population. Additionally, some worms were kept at 25°C for only 24 h. We found that, regardless of the total amount of time spent at 25°C, only worms that had been at 25°C through the L4 stage showed 5-HT positive VC4-5. These experiments show that H3K9me during L4, the stage in which VC neurons mature, is critical for the repression of the 5-HT phenotype.

3.7 UNC-4 and HLH-3 are required for 5-HT staining in VC4 and VC5 in H3K9me deficient mutants

According to the terminal selector conceptual framework, most cell-specific properties in a cell subtype are expected to be regulated by the same set of transcription factors. Since we already knew that UNC-4 and HLH-3 regulate cat-1 expression in VC4 and VC5, we assessed whether they were required for the 5-HT phenotype observed in mutants deficient in heterochromatin formation. We crossed unc-4 and hlh-3 mutants to lin-61 mutants. In both double mutants, VC4 and VC5 lost their 5-HT staining compared to lin-61 mutants alone.

3.8 HSN laser ablation abolishes VC4-5 serotonin staining in C. angaria and C. elegans H3K9 chromatin mutants

In summary, our results so far indicate that: 1) VC4 and VC5 neurons in the C. angaria group have acquired the ability to stain for serotonin in a constitutive manner, and 2) in C. elegans, VC4 and VC5 serotonergic staining is restrained cell-autonomously by H3K9 methylation. Without this histone mark, VC4 and VC5 take up 5-HT from the nearby HSN by the action of mod-5/SERT. Thus our next aim was to study whether, as in C. elegans heterochromatin mutants, the VC4 and VC5 serotonergic phenotype in C. angaria depends on 5-HT reuptake. Expression of the C. angaria mod-5 reporter in C. elegans and C. angaria suggests that it does.
Since no mutant strains are available in this species, we decided to use an alternative strategy to test for 5-HT reuptake. If C. angaria VC4-5 take up 5-HT from the neighbouring HSN, then laser ablation of the HSNs should abolish VC4-5 5-HT staining. Therefore, we performed laser ablations of HSNs at the L1 stage, long before they differentiate. First, as a positive control, HSNs were laser ablated in C. elegans lin-61 mutants and worms were stained upon reaching adulthood. In these worms, VC4-5 lost their 5-HT staining. Similarly, in HSN-ablated C. angaria females, VC4 and VC5 also lost their serotonin staining, indicating that VC4 and VC5 acquire their staining by taking up serotonin released by HSNs.

3.9 Functional relevance of the VC4-5 serotonergic phenotype

Besides studying the molecular changes leading to the VC4 and VC5 serotonergic phenotype in the angaria group, we were interested in understanding the functional implications of this evolutionary change. VC4 and VC5, together with HSN neurons, are central in controlling egg-laying behaviour. This behaviour has been modelled as a three-state process in which serotonin released by the HSN neurons acts on vm1 and vm2 muscle cells through GPCRs as a facilitator of the egg-laying active state, in which egg-laying events are highly probable, and acetylcholine released by both HSN and VC neurons triggers individual egg-laying events. Additionally, it has been shown that VC4 and VC5 and the uv1 neurosecretory cells inhibit HSNs, and that HSNs activate VC4 and VC5. In a low osmolarity medium, both HSNs and VC4 and VC5 show a higher frequency of calcium spikes, and the egg-laying rate increases, whereas HSN and VC activity, and subsequently egg-laying, are inhibited in high osmolarity media, such as M9. If exogenous serotonin is added to the medium, however, the egg-laying rate increases, which indicates that 5-HT signalling from HSN induces egg laying.
We evaluated sensitivity to exogenous 5-HT in several Caenorhabditis species: some species in the elegans group, such as C. briggsae; some species of the Drosophilae group (C. sp. 2 and C. virilis), which are close to C. angaria but do not belong to the angaria subgroup and do not show serotonergic VCs; and all the angaria group species. We used a standard assay to evaluate egg-laying behaviour. Briefly, synchronized worms are grown until the adult stage (they are about 24 h into adulthood when assayed) and incubated for 1 hour in individual wells with M9 or with M9 plus 35 mM 5-HT hydrochloride, after which the number of eggs in each well is recorded. All the species without serotonergic VC4 and VC5 in the elegans or the Drosophilae groups showed an evident increase in the egg-laying rate upon treatment with exogenous 5-HT, with one exception. C. monodelphis does not have serotonergic HSN neurons, so its vulval muscles are expected not to respond to serotonin. This seems to be the case. On the other hand, treatment with exogenous 5-HT in species within the angaria group had either no effect (C. angaria RGD1 strain and C. sp. 8) or an inhibitory effect (C. angaria PS1010 strain, C. castelli, C. quiockensis), with the exception of C. sp. 24, in which 5-HT treatment also seemed to enhance egg-laying. However, this species is unusual in that it has not only 5-HT positive VC1-6 but also some extra VC-like and tail neurons. Overall, the VC serotonergic phenotype seems to be linked to differences in the functionality of the egg-laying circuit.

4.
Methodology and Results 2: Bioinformatics analysis of the cis-regulatory logic controlling HSN terminal differentiation

In order to identify the TF combination controlling HSN terminal differentiation, we followed a candidate approach, selecting six transcription factors that had been described to have an egg-laying defective phenotype (egl) linked to serotonin staining defects: the POU domain TF unc-86, the Spalt-type Zn finger TF sem-4, the bHLH domain TF hlh-3, the Insm-type Zn finger TF egl-46, the ETS transcription factor ast-1 and the GATA factor egl-18. Carla Lloret, a PhD student in the laboratory, characterized the expression of several effector genes in the respective mutant backgrounds for the six TF candidates. She found that all six TFs are required for correct terminal differentiation of HSN. In a complementary approach, Miren Maicas, a post-doctoral researcher in the laboratory, performed extensive cis-regulatory analysis of three serotonin pathway genes, cat-1, tph-1 and bas-1. Through site-directed mutagenesis, Miren showed that these six transcription factors bind directly to cis-regulatory modules (CRMs) controlling transcription of 5-HT pathway genes. The direct action of these six TFs indicates that they behave as a transcription factor collective to orchestrate HSN neuron differentiation. Based on our experimental data, our next aim was to analyze whether the HSN TF collective imposes a regulatory signature on HSN-expressed genes that can distinguish them from the rest of the genome.

4.1 The HSN signature is over-represented in HSN-expressed genes and can be used to predict gene expression in HSN neurons

Since the promoters of 5-HT pathway genes bear binding motifs for all or most members of the HSN TF collective, we hypothesized that, if the TF collective regulates transcription of most HSN-specific genes, clustered binding sites for these TFs would be more likely to occur within promoters and intronic regions of HSN-expressed genes than in the rest of the genome.
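One way to make "clustered binding sites" concrete is to ask whether some window of bounded length contains at least one putative site for every TF of the collective. The following is a minimal sketch of such a check, not the pipeline used in this thesis. It assumes that putative site positions have already been obtained from a separate motif scan; the function name, the input layout and the example coordinates are hypothetical, and motif length is ignored (only site start positions are compared).

```python
from typing import Dict, List, Optional, Sequence, Tuple


def find_signature_window(
    sites: Dict[str, List[int]],
    required_tfs: Sequence[str],
    max_span: int = 700,
) -> Optional[Tuple[int, int]]:
    """Return (first_site, last_site) for the first window in which every TF
    in required_tfs has at least one site and the two outermost site starts
    are at most max_span bp apart. Return None if no such window exists.

    `sites` maps a TF name to site start positions (hypothetical input,
    e.g. the output of a motif scan that is not part of this sketch).
    """
    counts = {tf: 0 for tf in required_tfs}  # sites per TF inside the window
    hits = sorted((pos, tf) for tf in counts for pos in sites.get(tf, []))
    covered = 0  # number of TFs with at least one site inside the window
    left = 0     # index of the leftmost hit still inside the window

    for right_pos, right_tf in hits:
        if counts[right_tf] == 0:
            covered += 1
        counts[right_tf] += 1

        # Drop hits from the left until the window is narrow enough again.
        while right_pos - hits[left][0] > max_span:
            left_tf = hits[left][1]
            counts[left_tf] -= 1
            if counts[left_tf] == 0:
                covered -= 1
            left += 1

        if covered == len(counts):
            return hits[left][0], right_pos
    return None


# The six TFs of the HSN collective named in the text.
HSN_TFS = ["unc-86", "sem-4", "hlh-3", "egl-46", "ast-1", "egl-18"]

# Made-up site coordinates, for illustration only.
example_sites = {
    "unc-86": [100, 2000],
    "sem-4": [250],
    "hlh-3": [400],
    "egl-46": [650],
    "ast-1": [700],
    "egl-18": [790],
}

if __name__ == "__main__":
    print(find_signature_window(example_sites, HSN_TFS))       # (100, 790)
    print(find_signature_window(example_sites, HSN_TFS, 500))  # None
```

The sliding window visits each putative site once, so the check stays linear in the number of sites after sorting, which matters when it is repeated over every promoter and intron in a genome.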
We performed a genome-wide search for the HSN regulatory signature, which we defined as the presence of at least one putative binding site for each of the six TFs in a region spanning 700 bp or less. We found that the HSN signature was more prevalent in genes known to be expressed in HSN. Moreover, taking conservation into account (presence of the signature in other Caenorhabditis species) increased the relative enrichment of HSN-expressed genes versus other genes. We wondered whether the HSN signature was enough to predict de novo expression in HSN. We selected promoter regions bearing the HSN signature from genes that had been reported to be expressed in neurons, but not in HSN. Of these regions, 37% drove expression in HSN, whereas none of the 10 control regions (without the signature) did. Therefore, the presence of the HSN signature has some predictive power regarding expression in HSN. However, it is often not enough; additional motifs, syntactic rules of motif arrangement, motif number or intrinsic DNA properties may also be required for expression in HSN.

4.2 Deep homology between HSN and mouse raphe serotonergic neurons

Once we had characterized the HSN TF collective and the prevalence of the HSN regulatory signature in the C. elegans genome, we aimed to assess whether the HSN regulatory logic was conserved in mammals. Indeed, we found that mouse orthologs of four out of the six TFs of the HSN collective were already known to be involved in mammalian serotonergic specification: the bHLH TF ASCL1, GATA2/3, INSM1 and the ETS TF PET1, which are orthologs of hlh-3, egl-18, egl-46 and ast-1, respectively. Additionally, Laura Chirivella, in our laboratory, studied the expression pattern of BRN2, a POU transcription factor orthologous to unc-86 that has been associated with serotonergic specification, and found that it is expressed in serotonergic neuron progenitors and postmitotic serotonergic neurons at embryonic stage E11.5, when serotonergic neurons are differentiating.
She also studied the expression pattern of SALL2, the closest mouse ortholog of sem-4, and found that it is also expressed in serotonergic neuron progenitors and postmitotic serotonergic neurons at E11.5. We hypothesized that, given that the specification of HSN and of mouse raphe 5-HT neurons seemed to be regulated by the same set of TFs, these two neuron types might share broad molecular similarities beyond the serotonin pathway genes. To test this hypothesis, we integrated the Hobert neuronal atlas with published RNA-seq data from mouse raphe 5-HT neurons and performed hierarchical clustering on the resulting cell type-gene expression matrix. Interestingly, the HSN neuron expression profile clustered with the raphe serotonergic nuclei with high bootstrap support. Considering the similarities between HSN and mouse raphe neurons regarding transcriptome and TF collective, we next aimed to analyze whether a similar 5-HT regulatory signature is present in the mouse genome. We found that mouse genomic regions predicted by the ENCODE consortium to act as enhancers in the hindbrain, which is where 5-HT neuron populations are found, are enriched in the serotonergic regulatory signature with respect to predicted enhancers from other, non-serotonergic tissues (p<0.05, Tukey’s HSD test after logistic regression). We conclude that, of all C. elegans neurons, HSN has the partial transcriptome that is molecularly closest to the transcriptome of mouse raphe serotonergic neurons, and that active hindbrain enhancers are enriched in the serotonergic signature compared to enhancers active in other tissues. Therefore, and also taking into account the experimental evidence summarized above, HSN neurons and mouse raphe 5-HT neurons share deep homology.

5. Conclusions

Serotonergic identity of VC4 and VC5 evolved once within the Caenorhabditis genus, in the lineage that gave rise to the angaria group, which comprises C. angaria, C. castelli, C. quiockensis, C. sp. 8 and C. sp. 24. C.
angaria mod-5/SERT promoter is able to drive reporter expression in both C. elegans and C. angaria VC4 and VC5. Therefore, VC4 and VC5 in C. angaria most likely take up serotonin rather than synthesize it. Changes in the mod-5/SERT promoter are responsible for changes in mod-5 expression between C. angaria and C. elegans. The UNC-4 paired-like homeodomain and HLH-3 bHLH TFs regulate cat-1 reporter expression in VC4 and VC5. C. elegans mutants for hpl-2, met-2 and lin-61, but not set-25, have an increased percentage of 5-HT positive VC4 and VC5, which is also due to 5-HT reuptake. The requirement of these genes for repression of the 5-HT phenotype is cell-autonomous. Since C. elegans H3K9me deficient mutants have a VC4 and VC5 phenotype similar to that of C. angaria, changes in the C. angaria mod-5 promoter could be related to changes in the recruitment of chromatin factors, rather than exclusively to changes in TFBS. The joint presence of predicted binding sites for the six TFs of the collective regulating HSN differentiation (the HSN signature) is more prevalent in genes known to be expressed in HSN than in the rest of the genome. Conservation in Caenorhabditis species increases the enrichment in HSN-expressed genes versus the rest of the genome. Additionally, the presence of the HSN signature can be used to predict reporter expression in HSN. The HSN neuron shows molecular homology with mouse raphe serotonergic neurons, but not with other mouse neuronal populations. Accordingly, the mouse ortholog of the HSN signature is slightly more prevalent in ENCODE-annotated enhancers of the mouse hindbrain than in enhancers active in other body regions.
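
As an illustration of the kind of comparison reported in section 3.4 (chi-squared test with Bonferroni correction), the sketch below compares the proportion of 5-HT positive cells in each mutant against wildtype. It is a hedged example, not the analysis code of this thesis: the function names and all counts are hypothetical, and it assumes a plain Pearson chi-squared test on a 2x2 table without continuity correction, which may differ from the exact variant that was used.

```python
import math
from typing import Dict, Tuple


def chi2_2x2(pos_a: int, neg_a: int, pos_b: int, neg_b: int) -> Tuple[float, float]:
    """Pearson chi-squared test (1 degree of freedom, no continuity
    correction) on the 2x2 table [[pos_a, neg_a], [pos_b, neg_b]].
    Returns (statistic, p_value)."""
    n = pos_a + neg_a + pos_b + neg_b
    row_a, row_b = pos_a + neg_a, pos_b + neg_b
    col_pos, col_neg = pos_a + pos_b, neg_a + neg_b
    if 0 in (row_a, row_b, col_pos, col_neg):
        raise ValueError("a row or column of the table sums to zero")
    stat = n * (pos_a * neg_b - neg_a * pos_b) ** 2 / (row_a * row_b * col_pos * col_neg)
    # For 1 degree of freedom the chi-squared survival function is
    # erfc(sqrt(x / 2)), so no statistics library is needed.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def bonferroni(p_values: Dict[str, float]) -> Dict[str, float]:
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return {name: min(1.0, p * m) for name, p in p_values.items()}


if __name__ == "__main__":
    # Hypothetical counts of (5-HT positive, 5-HT negative) VC4-5 cells.
    wildtype = (2, 98)
    mutants = {"mutant_A": (70, 30), "mutant_B": (4, 96)}

    raw = {name: chi2_2x2(*wildtype, *counts)[1] for name, counts in mutants.items()}
    for name, p_adj in bonferroni(raw).items():
        print(f"{name}: adjusted p = {p_adj:.3g}")
```

Each mutant is compared with wildtype separately, and the Bonferroni step then accounts for the number of such comparisons made within one experiment.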