Introduction

Every cell in the body carries the same genome, yet cells differ widely in form and function. Carefully orchestrated epigenetic mechanisms contribute to the cell- and tissue-specific gene expression that regulates cell differentiation, maintenance, and proliferation [1]. Three distinct epigenetic mechanisms, namely histone modifications, non-coding RNAs, and DNA methylation, work singly or jointly to regulate chromatin structure [1, 2]. These mechanisms modify downstream gene transcription and function in order to fine-tune biological processes at the cellular level. There is increasing evidence that 5-hydroxymethylcytosine (5-hmC), considered to be the “sixth base” [3], may also play a role in epigenetic regulation. Because epigenetic changes are sensitive to environmental stimuli, this review focuses on the role of DNA hydroxymethylation in gene regulation and on how it interacts with environmental factors in ways that may lead to disease development.

DNA methylation and hydroxymethylation are proposed as epigenetic modifications that regulate gene expression. The chemistry of DNA methylation and hydroxymethylation is summarized in Fig. 1. DNA methyltransferases (DNMTs) catalyze the covalent addition of a methyl group from S-adenosyl-methionine (SAM) to the C5 position of cytosine in CpG dinucleotides (Fig. 1), which cluster predominantly in CpG islands (CGIs), often located in gene promoters and the first intron and exon [4–7]. Initially, 5-methylcytosine (5-mC) is established by the de novo methyltransferases DNMT3A and DNMT3B [8], and during semi-conservative DNA replication, DNMT1 maintains methylation by using the hemi-methylated DNA template to regenerate the symmetrical methylation site on the new DNA strand [5]. 5-mC accounts for 1–4 % of all cytosines (or ~80 % of CpG dinucleotides) in the mammalian genome [6, 9•, 10]. CGI methylation occurs mainly in the promoter regions, where ~70 % of the CGIs are unmethylated except for those on the inactive X chromosome and some associated with imprinted genes [11]. DNA methylation in critical regulatory regions, such as gene promoters, demonstrates the influential role of 5-mC in gene repression. Methyl-CpG-binding proteins (MBPs) bind 5-mC with high affinity, and this interaction recruits chromatin-modifying enzymes to the promoter, leading to gene repression by sterically hindering the binding of transcription factors and the basal transcriptional machinery [5]. Although the mechanisms of DNA methylation are well defined, the opposing process of DNA demethylation is not yet fully understood. In early embryogenesis, rapid demethylation occurs during zygote cleavage and rapid methylation follows implantation; because this demethylation is independent of cell division, active DNA demethylation pathways are thought to be responsible [1, 4].
The discovery of ten-eleven translocation (TET) proteins generated considerable interest in 5-hmC-mediated DNA demethylation pathways and in the biological role of 5-hmC in development [12]. TET enzymes catalyze the oxidation of 5-mC to 5-hmC and the iterative oxidation of 5-hmC to 5-formylcytosine (5-fC) and 5-carboxycytosine (5-caC), derivatives believed to be intermediates in the DNA demethylation pathways [4, 6, 13] (Fig. 1).

Fig. 1

Chemistry of DNA methylation and hydroxymethylation. 5-Methylcytosine (5-mC) is produced by the transfer of a methyl group from S-adenosylmethionine (SAM) onto the 5-carbon of cytosine by DNA methyltransferases (DNMTs). Ten-eleven translocation (TET) proteins then catalyze the iterative oxidation of 5-mC to 5-hydroxymethylcytosine (5-hmC), 5-formylcytosine (5-fC), and 5-carboxycytosine (5-caC), with the required cofactors alpha-ketoglutarate (α-KG), iron (Fe2+), and oxygen. 5-hmC, 5-fC, or 5-caC could act as an intermediate in both passive and active DNA demethylation pathways involving DNA repair enzymes such as AID (activation-induced cytidine deaminase) or APOBEC.

TET Proteins

TET expression levels vary between cells and organs. TET1 and TET2 are highly expressed in mouse ES cells, while TET3 is seen predominantly in oocytes and one-cell zygotes [4]. TET1 and TET3 are expressed in several adult tissues but are most abundant in the brain, while TET2 is enriched in hematopoietic cells [3]. Few studies have addressed how TET proteins are targeted to specific genes in distinct cell types and developmental stages. Whether TET expression profiles correlate with 5-hmC formation is also complex and controversial. The brain (~0.80 %) and spinal cord (~0.45 %) are the most 5-hmC-enriched tissues [6, 14]. Studies have shown that global 5-hmC levels in human tissues are not associated with 5-mC content or TET gene expression [15]. However, additional studies are required to evaluate this finding in other tissue and cell types and in large-scale human population studies. All TET proteins have a catalytic domain that contains both a cysteine-rich region and 2-oxoglutarate-Fe(II) dioxygenase activity [6, 9•, 16]. In mammals, TET proteins (TET1, TET2, TET3) catalyze the transfer of a hydroxyl group to 5-mC [9•, 13, 16] to form 5-hmC/5-fC/5-caC. TET1 can oxidize both fully methylated and hemi-methylated DNA, and does not require CpG dinucleotides [4]. At the N-terminus, the CXXC domain (present in TET1 and TET3 only) is a DNA-binding domain with a high affinity for clustered unmethylated DNA [4, 16]. Subsequent genome-wide mapping in mouse and human embryonic stem cells determined that binding of the TET1 CXXC domain overlaps with the unmethylated DNA generally located in the CGIs of promoter regions [4], signaling to many chromatin-associated proteins. TET1 has been shown to activate gene transcription by reducing the 5-mC-mediated recruitment of the polycomb repressive complex 2 (PRC2) to chromatin [4].
In contrast, TET proteins have been shown to recruit the Sin3a (directly) and the Ezh2 complex (indirectly), to deacetylate and trimethylate histone H3, and repress target genes [4, 6]. To recap, TET proteins may both repress and activate gene transcription by the recruitment of chromatin remodeling enzymes through the CXXC domain and/or as the intermediate of gene demethylation through the catalytic 5-mC oxidation to 5-hmC/5-fC/5-caC.

Non-Enzymatic Regulation by 5-hmC

The substitution of 5-hmC for 5-mC can non-catalytically regulate gene transcription by interfering with DNA-protein interactions. Once established, these 5-hmC residues can influence gene regulation. 5-hmC has a large, polar hydroxymethyl group, which extends into the major groove of DNA, pushing the attached cytosine further away from the duplex and creating a polar cavity that increases solvation dynamics (the interaction of DNA and water molecules) [17]. The solvation effects of 5-hmC cause local geometric changes in intra-base pair fluctuations (shear, stretch, stagger, and buckle), which decrease rigidity around the local helical axis and increase the ability of the duplex to propeller twist and open [17]. 5-hmC generally functions to activate gene expression by destabilizing the DNA structure so that the transcriptional machinery can access the transcriptional start site [3]. In synthetic oligonucleotides, the replacement of 5-mC with 5-hmC diminishes the binding affinity of MeCP2 (methyl-CpG-binding protein 2), suggesting that 5-hmC residues are functionally equivalent to cytosine residues for this interaction [5]. Higher 5-hmC content can reduce methyl-CpG-binding domain (MBD) interactions, reorganizing the chromatin structure and reversing the repressive effects of 5-mC on transcriptional regulation. 5-hmC modifications are generally transcriptionally activating, but exceptions exist. In some contexts, 5-hmC residues have been demonstrated to interact with MBDs or methyl-CpG-binding proteins (MBPs) to recruit chromatin remodeling enzymes that maintain transcriptional repression. For example, in mouse embryonic stem cells (ESCs), MBD3 (methyl-CpG-binding domain protein 3) directly binds 5-hmC-modified DNA (but not 5-mC) and recruits NuRD (nucleosome remodeling and deacetylase) complexes to repress gene transcription [18].
In addition, evidence is emerging that 5-hmC might act as a new landmark that recognizes and recruits specific DNA-binding proteins [6, 16], which can further complicate the role of 5-hmC transcriptional regulation.

Enzymatic Regulation by 5-hmC

TET proteins catalyze the generation of 5-hmC residues that act as intermediates for DNA demethylation to affect gene transcription. Proposed mechanisms indicate that 5-hmC is a stable intermediate for both “passive” and “active” DNA demethylation [3] (Fig. 1). “Passive” (replication-dependent) DNA demethylation occurs when DNMT1 fails to maintain DNA methylation during replication, resulting in the gradual loss of DNA methylation at a specific CpG site [3, 5, 6, 15]. Less well defined are the mechanisms of “active” (replication-independent) DNA demethylation, in which 5-hmC is proposed to be converted to cytosine, either spontaneously or enzymatically, through DNA repair pathways [9•]. One possible mechanism of “active” DNA demethylation proceeds in three steps: deamination of 5-hmC to 5-hydroxymethyluracil (5-hmU) by activation-induced cytidine deaminase (AID)/APOBEC enzymes; recognition and removal of 5-hmU by DNA glycosylases such as SMUG1 (single-strand-selective monofunctional uracil-DNA glycosylase 1) and TDG (thymine DNA glycosylase), generating an abasic site; and repair of that site by the base excision repair (BER) machinery to restore an unmodified cytosine (Fig. 1) [4, 6]. Regardless of the route of TET-mediated demethylation, 5-hmC is an essential intermediate in the removal of 5-mC residues in the DNA demethylation pathway, which can drastically change global methylation patterns and biological processes. Further investigations are therefore needed to fully understand the DNA demethylation pathway.
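The replication-dependent ("passive") route described above can be illustrated with a simple dilution model. The following is a minimal sketch, not a model taken from the cited studies: it assumes that every replication pairs each parental strand with a newly synthesized, unmodified strand, and the function name and the `maintenance_efficiency` parameter are hypothetical and introduced only for illustration.

```python
def passive_dilution(initial_fraction, divisions, maintenance_efficiency=0.0):
    """Fraction of modified strands at a CpG site after each cell division.

    Assumption (illustrative only): each replication pairs every parental
    strand with a new, unmodified strand, and the mark is copied onto the
    new strand with probability `maintenance_efficiency` (0 = no
    maintenance, 1 = perfect maintenance).
    """
    if not 0.0 <= maintenance_efficiency <= 1.0:
        raise ValueError("maintenance_efficiency must be between 0 and 1")
    # Surviving fraction per division: old strands keep the mark (1/2 of
    # all strands), new strands gain it only when maintenance succeeds.
    per_division = (1.0 + maintenance_efficiency) / 2.0
    return [initial_fraction * per_division ** n for n in range(divisions + 1)]


if __name__ == "__main__":
    # No maintenance: the mark is halved at every division.
    print(passive_dilution(0.8, 3))
    # Perfect maintenance: the mark is preserved.
    print(passive_dilution(0.8, 3, maintenance_efficiency=1.0))
```

Under these assumptions, a mark that is not maintained is halved with every division, which is why rapidly dividing cells would be expected to lose it quickly.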

Recent studies have demonstrated that TET enzymatic activity is affected by many factors. Isocitrate dehydrogenase enzymes (IDH1 and IDH2) commonly undergo gain-of-function mutations in human gliomas (~75 %) and acute myeloid leukemia (~20 %); whereas the wild-type enzymes catalyze the production of α-ketoglutarate (α-KG), the mutant enzymes produce 2-hydroxyglutarate (2-HG) [19, 20]. 2-HG is a proposed oncometabolite that competitively inhibits α-KG-dependent dioxygenases, such as TET [20]. In TET1-overexpressing HEK293 cells, the coexpression of tumor-derived mutant IDH1 and IDH2 substantially decreased the 5-hmC content compared to the wild type [19]. In vitro, 2-HG was shown to inhibit TET activity, and clinically, glioma samples subjected to immunohistochemical (IHC) staining demonstrated that 5-hmC was significantly decreased in IDH1-mutated gliomas compared to the control, despite consistent 5-mC levels [19]. Oxidative stress has also been proposed to affect TET activity. Under high oxidative stress conditions, when homeostasis cannot be maintained, increased NAD+ levels in the mitochondria activate sirtuin NAD+-dependent deacetylases (Sirt) [20]. In mammals, increased Sirt3 results in the deacetylation of IDH2 and consequent activation of its metabolic activity [20]. Activated IDH2 therefore increases the conversion of isocitrate to α-KG, which may in turn increase TET enzymatic activity and potentially change global methylation patterns. Research continues to link epigenetic mechanisms to the sensing of environmental conditions, and their disruption can potentially disturb cellular and biological processes, resulting in disease development.
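The competitive inhibition of TET by 2-HG mentioned above can be sketched with the textbook Michaelis-Menten rate law for a competitive inhibitor, treating α-KG as the varied substrate. This is only an illustration: the function name is hypothetical, and the default `vmax`, `km`, and `ki` values are placeholders in arbitrary units, not measured constants for any TET enzyme.

```python
def tet_rate(akg, two_hg, vmax=1.0, km=1.0, ki=1.0):
    """Reaction rate under competitive inhibition (Michaelis-Menten form).

    akg    : alpha-KG concentration (substrate), arbitrary units
    two_hg : 2-HG concentration (competitive inhibitor), arbitrary units
    vmax, km, ki : placeholder kinetic constants (assumptions, not data)
    """
    if akg < 0 or two_hg < 0:
        raise ValueError("concentrations must be non-negative")
    if vmax <= 0 or km <= 0 or ki <= 0:
        raise ValueError("kinetic constants must be positive")
    # A competitive inhibitor raises the apparent Km but leaves Vmax unchanged.
    apparent_km = km * (1.0 + two_hg / ki)
    return vmax * akg / (apparent_km + akg)


if __name__ == "__main__":
    print(tet_rate(akg=1.0, two_hg=0.0))   # no inhibitor
    print(tet_rate(akg=1.0, two_hg=2.0))   # 2-HG present: lower rate
    print(tet_rate(akg=10.0, two_hg=2.0))  # more alpha-KG offsets inhibition
```

Because the inhibition is competitive, the model predicts that raising α-KG can partly overcome the effect of 2-HG, which is consistent with the opposing influences of mutant IDH and activated IDH2 described in the text.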

Measurement of DNA Hydroxymethylation

Existing tools for the detection of methylated DNA, such as bisulfite conversion and methylated-DNA-specific antibodies, cannot distinguish between 5-mC and 5-hmC, so it is critical to develop techniques that can differentiate between these two modifications. Currently, the most accessible techniques for hydroxymethylated DNA enrichment and detection rely on enzymatic and antibody-based approaches. 5-hmC quantitation by techniques such as polymerase chain reaction (PCR), high-performance liquid chromatography (HPLC), thin layer chromatography (TLC), or liquid chromatography-mass spectrometry (LC-MS) can detect 5-hmC at single CpG dinucleotides. We summarize several common techniques to reveal (1) global hydroxymethylation, (2) locus-specific hydroxymethylation, and (3) whole-genome profiling of hydroxymethylated DNA loci (Table 1). All the techniques described can be used on distinct tissues, cell types, or biological specimens.

Table 1 Common methods for 5-hmC detection

Quantitation for Global Hydroxymethylation

The content of 5-hmC across the genome can be measured using an enzyme-linked immunosorbent assay (ELISA), with capture and detection antibodies followed by colorimetric quantification. The capture antibody for 5-hmC has little or no cross-reactivity with either methylated or unmethylated cytosine. Other protocols examine the 5-hmC content across the genome using enzymatic digestion coupled with HPLC, TLC, or LC-MS. Input DNA is first glucosylated by 5-hmC glucosyltransferase (GT), which transfers a glucose moiety from uridine diphosphoglucose (UDPG) onto existing 5-hmC residues in the DNA [21, 22]. The difference in chemical structure (mass) between glucosylated 5-hmC and unglucosylated 5-mC or cytosine generates a distinct profile for 5-hmC in the chromatogram [23–27]. To examine the distribution of 5-hmC among different cell types within a tissue, immunohistochemistry or immunofluorescence staining with an antibody against 5-hmC is employed.

Specific Hydroxymethylated DNA/Loci/Gene DNA Quantitation

Current bisulfite sequencing technology cannot distinguish 5-mC from 5-hmC at specific CpG sites; therefore, real-time polymerase chain reaction (real-time PCR), coupled with glucosyl-5-hmC-sensitive restriction endonuclease (GSRE) digestion [28], is used to measure the percentage of 5-hmC at specific CpG sites. First, the hydroxyl group of 5-hmC is protected with a glucose moiety by glucosyltransferase prior to PCR. The DNA is then digested with a GSRE, such as TaqαI, which cuts sites carrying cytosine, 5-mC, or unglucosylated 5-hmC but is blocked by glucosylated 5-hmC. In the real-time PCR assay, the protected glucosylated template therefore amplifies with a smaller threshold cycle (Ct) than the cleaved unglucosylated template. The percentage of 5-hmC at specific CpG sites is quantified using the Ct difference between the glucosylated and unglucosylated samples. This assay is suited to PCR products shorter than 150 base pairs. For longer fragments, we suggest the newer TET1-assisted bisulfite sequencing. It involves the UDPG-mediated protection of 5-hmC and recombinant mouse TET1 (mTET1)-mediated oxidation of 5-mC to 5-caC. After the subsequent bisulfite treatment and PCR amplification, both cytosine and 5-caC (derived from 5-mC) read as thymine (T), whereas 5-hmC reads as cytosine (C). The treated genomic DNA is suitable for locus-specific sequencing [9•, 29, 30•].
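The read-out logic of the two sequencing approaches can be made concrete with a small sketch. The TET-assisted row of the table follows the description above; the conventional bisulfite row reflects the standard behavior in which both 5-mC and 5-hmC read as C. The function name and the read counts are hypothetical, and the estimate assumes complete conversion and protection, which real experiments only approximate.

```python
# Base read after sequencing, for each cytosine state and each treatment.
READOUT = {
    "bisulfite": {"C": "T", "5mC": "C", "5hmC": "C"},  # 5-mC and 5-hmC look alike
    "tab_seq":   {"C": "T", "5mC": "T", "5hmC": "C"},  # only 5-hmC still reads as C
}


def estimate_levels(bs_c_reads, bs_total, tab_c_reads, tab_total):
    """Estimate 5-hmC and 5-mC fractions at one cytosine from read counts.

    bs_*  : reads from conventional bisulfite sequencing
    tab_* : reads from TET-assisted bisulfite sequencing
    Assumes ideal chemistry (no incomplete conversion or protection).
    """
    if bs_total <= 0 or tab_total <= 0:
        raise ValueError("total read counts must be positive")
    modified = bs_c_reads / bs_total   # 5-mC + 5-hmC combined
    hmc = tab_c_reads / tab_total      # 5-hmC alone
    mc = max(modified - hmc, 0.0)      # clamp small negative values from sampling noise
    return {"5hmC": hmc, "5mC": mc}


if __name__ == "__main__":
    # Hypothetical site: 80/100 C reads after bisulfite, 20/100 after TAB-seq.
    print(estimate_levels(80, 100, 20, 100))
```

Subtracting the two C fractions is what allows 5-mC to be separated from 5-hmC, which neither data set can do on its own.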

Whole Genome DNA Quantitation

Researchers interested in profiling hydroxymethylated loci across the whole genome can use an anti-5-hmC antibody to immunoprecipitate and enrich 5-hmC-containing fragments from sonicated DNA. The enriched DNA products are then subjected to a methylation oligonucleotide array (hMeDIP-chip) or next generation sequencing (hMeDIP-Seq) [31, 32].

Environmental Exposure Linked to 5-hmC Perturbations

Cytosine modifications, such as 5-mC and 5-hmC, are epigenetic mechanisms that enable genes to respond to external environmental cues. Environmental exposure has been shown to disrupt epigenetic mechanisms and interfere with 5-mC and 5-hmC profiles in many diseases, as summarized in Tables 2 and 3. TET proteins are also susceptible to environmental perturbations. Therefore, researchers have been investigating 5-hmC content as a novel epigenetic marker to understand the link between epigenetic signatures and disease states.

Table 2 Examples of environmental exposures contributing to TET regulation and 5-hydroxymethylation changes
Table 3 Examples of diseases associated with changes in 5-hydroxymethylation or TET regulation

Ascorbic Acid

Vitamin C, or ascorbic acid (AA), stimulates the drastic erasure of 5-mC by promoting TET activity and 5-hmC-mediated DNA demethylation [33, 34, 35•]. TET enzymes require both Fe2+ and α-KG as co-factors [3, 9•], and AA is another co-factor required for full catalytic activity of TET proteins [34]. AA directly interacts with the catalytic domains of TET1 and TET2 to reduce Fe3+ to Fe2+ and enhance TET-mediated oxidation [33]. AA increases not only 5-hmC but also the iterative oxidation products 5-fC and 5-caC in mouse ESCs [33]. The reducing effects of AA are specific, since other antioxidants (glutathione, selenite, vitamin B1, vitamin E, L-carnitine, lipoic acid) do not induce the enzymatic activity of TET [35•]. Mouse ESCs treated with AA for 12 and 72 h displayed a progressive increase in 5-hmC and decrease in 5-mC at gene promoters, and this effect was not attributable to changes in the gene expression of TET or DNMT [35•]. In addition, the global induction of 5-hmC in response to AA treatment was reversible; 3 days after AA removal, both the 5-hmC and 5-mC profiles had returned to baseline levels. These findings confirm the dynamic interplay between 5-mC and 5-hmC in response to exogenous exposures, mediated by TET oxidation. These investigations also suggest the nutritional value of AA, especially during embryonic development, when drastic global changes in DNA methylation occur in the fetal genome.

Phenobarbital

Phenobarbital (PB), a barbiturate widely used to control seizures, is classified as a non-genotoxic carcinogen. In the mouse liver, PB perturbs 5-mC and 5-hmC profiles associated with tumor formation [14, 36]. Mice administered PB through drinking water displayed significant differential hydroxymethylation in juvenile (30–33 days of PB exposure) and mature (91 days of PB exposure) liver tissue. In mature liver tissue, PB induced proximal promoter enrichment of 5-hmC, coupled with a decrease in 5-mC, and increased the transcription of a collection of candidate genes with a potential role in liver tumorigenesis [36]. PB also provoked rapid and prolonged expression of cytochrome P450 2b10 (Cyp2b10), a member of the Cyp family of genes that is critical for xenobiotic metabolism in the liver [36]. Epigenetic induction of Cyp2b10 in the liver could therefore increase the rate at which PB metabolites, including carcinogens, are produced and thereby promote liver tumor formation. Wisp1, a Wnt signaling pathway gene, was overexpressed and associated with increased 5-hmC levels following PB exposure [14]. This is consistent with previous findings in both rodents and humans that Wisp1 expression increases during the progression of hepatocellular carcinoma. These findings exemplify the use of 5-hmC as a sensitive biosensor that can distinguish control from PB-exposed rodents.

Diethylstilbestrol

Neonatal exposure to diethylstilbestrol, an endocrine-disrupting chemical, resulted in the adult onset of uterine and vaginal cancer in women. Mice exposed postnatally to diethylstilbestrol showed a global reduction of 5-hmC in uterine tissue at 8 weeks of age compared to controls [37]. Furthermore, diethylstilbestrol reduced the expression of the TET1, TET2, Dnmt1, and Dnmt3a genes, which remained repressed in adult uterine tissue. The decreased 5-hmC profile observed in adult mice in response to diethylstilbestrol treatment could be due to impaired TET oxidation of 5-mC and/or declining 5-mC preservation due to DNMT reduction. These results suggest that neonatal exposure to endocrine-disrupting chemicals may result in aberrant 5-mC/5-hmC-mediated DNA methylation, altering gene expression patterns throughout adulthood and possibly increasing disease risk.

Hydroquinone

Hydroquinone, an abundant metabolite of benzene (a carcinogen found in car exhaust, industrial emissions, and cigarette smoke), caused an increase in TET1 activity and global 5-hmC in human embryonic kidney cell cultures [38]. Hydroquinone exposure increased the global 5-hmC content and decreased 5-mC levels through a reactive oxygen species (ROS)-associated mechanism, since the hydroquinone effects were abolished upon N-acetyl-cysteine (NAC) rescue. In addition, targeted siRNA-mediated knockdown of TET1 returned 5-hmC levels to baseline. This suggests that numerous environmental exposures that induce reactive oxygen species or oxidative stress may influence 5-hmC patterns and increase disease risk.

Environmental Allergens

Asthma is a significant public health concern, and its prevalence has increased dramatically in the past 30 years among both children and adults [49]. Since this increase cannot be explained by traditional genetic inheritance, much consideration has been given to lifestyle and environmental exposures as contributors to disease risk. House dust mite, a common environmental allergen, induces airway hyperresponsiveness and inflammation in mouse models through epigenetic alterations in the lung [39]. Exposure to dust mites resulted in global increases in both 5-mC and 5-hmC in the lung compared to saline controls. In addition, gene-specific methylation of Pde4d, Pom121l2, and Ncx3 has been linked to dysregulated gene expression and cell functions in the lungs. How 5-mC and 5-hmC mediate gene regulation in response to environmental allergens requires further investigation.

5-hmC in Disease Development

Cancer

Epigenetic dysregulation is well described in cancer pathogenesis. Using human lung, brain, kidney, liver, skin, and small intestine tissues, the levels of 5-hmC are remarkably depleted in malignant tissues compared to the benign tissues [40, 41, 50]. One explanation is that tumor formation involves rapid cell proliferation, which could lead to the passive loss of 5-hmC [50]. In support of this theory, proliferating cells stained with Ki67 did not express 5-hmC [3, 50]. However, more research is needed to understand the mechanisms linking the loss of 5-hmC in proliferating cells, which leads to increased cancer risks. If this theory is proved, genome-wide 5-hmC and 5-mC profiles in conjunction with Ki67 staining could be valuable biomarkers for cancer diagnosis and management.

Gliomas exhibit epigenetic changes in response to disease progression. Immunohistochemistry (IHC) detection of 5-hmC in formalin-fixed, paraffin-embedded, human glioblastomas revealed that 5-hmC decreased significantly with the progression of the tumor grade, regardless of fetal or adult origin [40]. Upon stratifying the 5-hmC IHC stains into high- and low-content tissues, 5-hmC reduction became a significant prognostic indicator of decreased life expectancy in fetal and adult glioblastomas. This indicates the critical role of 5-hmC in disease state progression and disease management.

In hepatocellular carcinoma, clinical and animal data indicate that 5-hmC may serve as a prognostic marker for disease outcomes. IHC detection of 5-hmC on hepatocellular carcinoma samples showed a drastic decrease in 5-hmC content in the tumor tissue compared to the benign normal liver tissue [41]. Subsequently, lower levels of 5-hmC were associated with decreased overall survival in hepatocellular carcinoma. Male rats exposed to diethylnitrosamine, a potent carcinogen for hepatocellular carcinoma, exhibited a gradual dose-dependent decrease in 5-hmC IHC detection up through 24 weeks of treatment and showed remarkably reduced 5-hmC in tumor versus non-tumor tissue [41]. These data further confirm the role of 5-hmC depletion as a biomarker of cancer development.

Melanomas are aggressive cancers that are capable of transforming from benign nevi into lethal metastases in a small surface area. Using tissue sections from patients in different stages of melanoma progression, IHC staining exhibited significant decreases in global 5-hmC levels depending on the degree of disease [42, 43], even though the 5-mC profiles among the samples were constant, regardless of the disease state. IHC staining of TET2 was robust in nevi samples and reduced in advanced melanomas [43]. Therefore, a reduction in TET2 and 5-hmC indicates the epigenetic hallmark representing the disease progression of melanoma. These findings also further implicate the dangers of UV exposure in melanoma disease pathogenesis [42].

Myeloid malignancies are cancers originating in blood stem cells in the bone marrow and include acute myeloid leukemia (AML) and myelodysplastic syndrome (MDS) [51]. In bone marrow samples from mixed lineage leukemia (MLL)-rearranged AML patients (TET1 is a fusion partner of the MLL gene), TET1 is overexpressed [44]. Small hairpin RNA (shRNA) against TET1 in mouse MLL-rearranged leukemic cells decreased 5-hmC in a dose-dependent manner and induced apoptosis [44]. In vivo, TET1-knockout mice exhibited similar results to TET1-shRNA models, with TET1 reduction correlating with significant decreases in 5-hmC compared to the controls [44]. These results suggest that TET1 plays a role in the pathogenesis of MLL-rearranged leukemia, which is marked by a global increase in 5-hmC and the transcriptional activation of many targets that promote cell proliferation and inhibit apoptosis, resulting in cell transformation and AML. In MDS, TET2 expression and 5-hmC appear to be necessary for the normal differentiation of hematopoietic stem cells [52, 53]. TET2 mutations in bone marrow mononuclear cells were detected in ~16 % of MDS patients, and the level of 5-hmC was significantly lower in MDS patients with mutant TET2 [45]. When patients were stratified by 5-hmC level, those with high 5-hmC levels had higher survival rates than those with low 5-hmC levels [45]. This suggests that the depletion of 5-hmC can be used as a diagnostic and prognostic tool for MDS patients.

Developmental Syndromes

In normal human tissues, average 5-hmC values vary depending on tissue type. Brain tissues have the highest 5-hmC content, up to 10- to 20-fold more than peripheral tissues and ESCs [15, 46]. Many neurodegenerative disorders are associated with aberrant methylation once the epigenetic signature has been impaired. Since neuronal cells in the adult brain do not divide mitotically, high levels of 5-hmC are maintained. High 5-hmC content indicates that active demethylation pathways may be upregulated [46, 54]. Under normal conditions in the central nervous system, the abundant MeCP2 protein has a higher affinity for 5-hmC residues than for 5-mC, facilitating active gene transcription [54]. In classical Rett syndrome, a neurodevelopmental disorder that primarily affects girls and results in the loss of language and motor skills, loss-of-function MeCP2 mutations disrupt neuronal maturation through the epigenetic dysregulation of MeCP2 function [55]. Given the abundant levels of 5-hmC in the central nervous system, decoding the role of 5-hmC in neuronal gene regulation and neurodevelopmental disorders is of growing interest among researchers.

Late-Onset Diseases

Huntington’s disease is a fatal genetic disorder characterized by chorea, and cognitive and psychiatric decline. In a mouse model of Huntington’s disease, 5-hmC was significantly reduced in the striatum and cortex of Huntington’s disease mouse brains compared to the control, and the IHC stain co-localized with NeuN, a marker of mature neurons [46]. Upon deep sequencing and genomic mapping, 436 striatal and 199 cortical genes were differentially hydroxymethylated [46]. Therefore, 5-hmC reduction in Huntington’s disease mouse brains further complements previous findings on aging and the decrease in 5-hmC in neurodegenerative disorders.

Alzheimer’s disease is the most common form of dementia among the aging population and is associated with the aberrant production and cleavage of amyloid-β (Aβ), which forms plaques, and with tau protein tangles. The etiology of Alzheimer’s disease is unknown, since disease onset occurs after decades of exposure at different stages of life. One factor investigated in Alzheimer’s disease mouse models is the role of stress during prenatal development, which has been shown to negatively affect cognition and hippocampal functioning, predominantly in females [47]. In female offspring that experienced prenatal stress, the dorsal and ventral subregions of the hippocampus showed increased 5-hmC compared to males and unstressed controls. This suggests that prenatal stress in utero can affect epigenetic mechanisms via 5-hydroxymethylation in ways that increase Alzheimer’s disease risk.

Increasing evidence implicates epigenetic modifications in neuropsychiatric disorders. In post-mortem brain cortex tissue from psychotic (bipolar disorder, schizophrenia) and depressed patients, TET1 mRNA and protein expression levels were increased in both patient groups compared to controls [48]. In another study, global 5-hmC levels were significantly increased in psychotic patients compared to controls [56]. In addition, decreased 5-hydroxymethylation was correlated with decreased glutamic acid decarboxylase 67 (GAD67) mRNA expression in psychotic subjects [48, 56]. Hence, 5-hmC can be used to investigate the pathogenesis of psychosis.

Conclusion and Future Perspective

Given the role of 5-hmC in the epigenetic regulation of cell functions, and its possible link to disease development, global 5-hmC or specific hydroxymethylated genes could be employed as environmental biosensors. 5-hmC, the “sixth base,” is as stable an epigenetic marker as 5-mC in the genome. The discovery of TET proteins, a breakthrough in understanding gene regulation by DNA hydroxymethylation, allows us to track the formation of 5-hmC via TET proteins in response to environmental exposure. As with other epigenetic studies, 5-hmC studies face several challenges. First, the exact mechanism by which TET proteins or 5-hmC regulate gene transcription is unclear. What other epigenetic factors modulate the TET proteins and process the 5-hmC-mediated transcription signal? Second, the distribution of 5-hmC varies among cell types and tissues. Are there cell type-specific 5-hmC signatures? Most 5-hmC studies focus on prenatal development, especially stem cell differentiation and lineage commitment, and there are limited studies on the biological roles of 5-hmC in adult tissues. If 5-hmC markers could be validated in various biological samples, the 5-hmC signature could be applied in clinical research, such as in the development of non-invasive diagnostic and prognostic approaches that test for epigenetic markers in body fluids (e.g., serum and urine). Third, is there a critical window for the establishment of 5-hmC marks under exposure to environmental factors? Few studies have followed the establishment of 5-hmC from the prenatal stage through to aging or disease. Such questions could be addressed first in rodent models, to understand the role of 5-hmC in disease development and to guide studies on human tissues.

With the advance of “omic” profiling techniques and bioinformatic database searches, 5-hmC signatures can be discovered at every stage of cell development or in response to any environmental exposure. Proper bioinformatics and statistical analyses on high-throughput epigenomic technologies, such as microarray or deep sequencing, are required because of the huge amount of data that these tests provide. A set of follow-up studies, in vitro or in vivo, are also needed to validate and characterize a new set of molecular participants in disease development. We expect to identify markers that could allow us to reset the disruptive hydroxymethylome in disease development. Additionally, techniques for detecting single CpG hydroxymethylation have been established and applied for detecting gene-specific hydroxymethylation. It helps to understand the epigenetic modulation of cell functions via modification at a specific gene promoter, leading to new designs for therapeutic drugs. In order to apply 5-hmC as a promising environmental biosensor, we must track epigenetic changes in different tissues and cell types, and the time course in a number of subjects. Applying 5-hmC studies to human population studies would speed up the understanding of the use of 5-hmC in predicting exposure and disease risk. However, methods of statistical analysis, such as data modeling and multiple testing, must be carefully chosen when we apply genetic and epigenetic analyses in environmental diseases. All in all, the studies of 5-hmC and its interactions with environmental exposure are exciting; however, many questions remain unanswered. 
It is believed that with the new basic knowledge of 5-hmC biology, advanced sequencing technology, and proper statistical and epidemiological strategies, we could expect the findings from 5-hmC studies to be applied for lifestyle recommendations and/or changes in clinical practices, which can lead to an improvement in disease management and, ultimately, in public health.
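The caution above about multiple testing can be illustrated with a short sketch. The Benjamini-Hochberg step-up procedure shown here is one widely used way to control the false discovery rate when many loci are tested at once; it is offered as an example of the kind of correction meant, not as the method used in any study cited in this review. The function name, the p-values, and the `alpha` level are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, in the same order as `p_values`, that is
    True where the null hypothesis is rejected at false discovery rate
    `alpha`.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values for four differentially hydroxymethylated loci.
    print(benjamini_hochberg([0.039, 0.001, 0.216, 0.008]))
```

In a genome-wide 5-hmC study, each tested locus would contribute one p-value, and only the loci flagged as rejected would be carried forward for validation.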