Skip to main content

Identification of key DNA methylation changes on fasting plasma glucose: a genome-wide DNA methylation analysis in Chinese monozygotic twins

Abstract

Background

Elevated fasting plasma glucose (FPG) levels can increase morbidity and mortality even when it is below the diagnostic threshold of type 2 diabetes mellitus (T2DM). We conducted a genome-wide DNA methylation analysis to detect DNA methylation (DNAm) variants potentially related to FPG in Chinese monozygotic twins.

Methods

Genome-wide DNA methylation profiling in whole blood of twins was performed using Reduced Representation Bisulfite Sequencing (RRBS), yielding 551,447 raw CpGs. Association between DNAm of single CpG and FPG was tested using a generalized estimation equation. Differentially methylated regions (DMRs) were identified using comb-P approach. ICE FALCON method was utilized to perform the causal inference. Candidate CpGs were quantified and validated using Sequenom MassARRAY platform in a community population. Weighted gene co-expression network analysis (WGCNA) was conducted using gene expression data from twins.

Results

The mean age of 52 twin pairs was 52 years (SD: 7). The relationship between DNAm of 142 CpGs and FPG reached the genome-wide significance level. Thirty-two DMRs within 24 genes were identified, including TLCD1, MRPS31P5, CASZ1, and CXADRP3. The causal relationship of top CpGs mapped to TLCD1, MZF1, PTPRN2, SLC6A18, ASTN2, IQCA1, GRIN1, and PDE2A genes with FPG were further identified using ICE FALCON method. Pathways potentially related to FPG were also identified, such as phospholipid-hydroperoxide glutathione peroxidase activity and mitogen-activated protein kinase p38 binding. Three CpGs mapped to SLC6A18 gene were validated in a community population, with a hypermethylated direction in diabetic patients. The expression levels of 18 genes (including SLC6A18 and TLCD1) were positively correlated with FPG levels.

Conclusions

We detect many DNAm variants that may be associated with FPG in whole blood, particularly the loci within SLC6A18 gene. Our findings provide important reference for the epigenetic regulation of elevated FPG levels and diabetes.

Introduction

Type 2 diabetes mellitus (T2DM) is a chronic metabolic disease with a high prevalence characterized by chronic hyperglycemia, which can cause serious complications, such as heart attack, blindness, and nerve and blood vessel damage. As an important indicator for T2DM diagnosis, elevated fasting plasma glucose (FPG) levels can increase morbidity and mortality even when it is below the diagnostic threshold [1].4

The FPG levels and T2DM may be influenced by a combination of genetic factors and environmental exposure with being mediated by epigenetic modification [2]. At present, the magnitude of genetic sources of variance in FPG and T2DM has been extensively explored. The heritability of FPG has been reported to range from 24.90% to 67.66% [3,4,5]. Additionally, some genome-wide association studies (GWASs) have reported the genetic variants responsible for susceptibility to elevated FPG levels and T2DM, such as the genetic loci in/near SPATS2L, SLC26A11, and JAZF1 [2, 3]. However, the genetic variants identified in previous GWASs just could explain less than 20% of the estimated heritability for T2DM [6] and thus could only partially contribute to the pathogenesis of this disease.

In recent years, increasing evidence has supported the significant role of epigenetic mechanisms with altered gene expression in the increased susceptibility to diseases. DNA methylation (DNAm) is an important aspect of epigenetic research, and several existing epigenome-wide association studies (EWASs) have investigated the relationship of DNAm with T2DM and glycemic traits [7,8,9,10,11,12,13,14,15,16,17,18]. Although some cytosine phosphate guanines (CpGs) and genes have been reported, replicated CpGs and genes are limited. For example, Walaszczyk et al. previously identified 52 T2DM-related CpGs in peripheral blood, however, only 5 CpGs located at LOXL2, TXNIP, SLC1A5, SREBF1 and ABCG1 were replicated after strict multiple corrections [11]. Thus, more EWASs on T2DM or glycemic traits are needed for further replication and validation. In addition, given that most of the reported associations are from cross-sectional and case–control designs, the causal nature of the relationship, that is, if DNAm exerts a causal effect on FPG or vice versa, is unknown. Therefore, it is necessary to further explore the causal relationship between DNAm and FPG.

The limited detection and replication of CpGs and genes among EWASs may be due to the use of unrelated individuals as controls in traditional cross-sectional [9, 13, 14] or case–control studies [7, 8, 10,11,12, 17]. Although common factors such as age, sex and race were fully considered, the confounding effect from different genetic backgrounds was not well controlled [19]. Nowadays, monozygotic twins with the same genetic background have been proved to be ideal samples for EWASs [20]. Especially for the modest and highly heritable traits or diseases, the use of monozygotic twins in EWAS can improve statistical power by perfectly controlling for the effect of different genetic background [21]. Nevertheless, currently only a few studies have investigated the effect of epigenetics on T2DM or glycemic traits in monozygotic twins [15, 18]. Furthermore, the use of monozygotic twins also makes causal inference possible in association studies of epigenetics based on cross-sectional design [22]. To our knowledge, no studies have yet performed causal inference analysis between them.

The Chinese population may have different DNAm variants compared to other ethnic groups owing to the different genetic makeup and environmental exposure. However, to data, the EWASs on T2DM or glycemic traits performed in Chinese population were limited, particularly in twins [23]. Herein, we conducted this EWAS to explore the potential CpGs, genes and biological pathways potentially related to FPG in Chinese monozygotic twins, and further estimated the causation between DNAm variants and FPG. Candidate CpGs were further validated in a community population. Finally, we integrated the differentially methylated results with gene expression data in twins.

Methods

The primary materials and methods used in this study were in accordance with those of our previously published studies [24,25,26,27,28,29].

Participants

Monozygotic twin samples were collected through the Qingdao twin registry, and details of study recruitment have been previously described [30]. We excluded participants who were pregnant or breastfeeding, had used hypoglycemic drugs or insulin, and did not complete a questionnaire or physical examination. The participants had unqualified blood samples, such as blood collection vessel ruptured, or the concentration or total amount of extracted DNA could not meet the experimental requirements, were further dropped. Moreover, incomplete twin pairs that either of the twins lacked blood sample or relevant information were also excluded. Considering the advantage of trait or disease-discordant monozygotic twin design, the particularity of monozygotic twin samples, and our experience in previous research [24,25,26,27,28,29], the twins with intra-pair FPG difference ≥ 0.1 mmol/L was chosen. A total of 52 complete monozygotic twin pairs were included in the methylation analysis, and a subsample of 12 pairs were randomly selected for gene expression analysis. The median of intra-pair absolute difference of FPG (|Δ(FPG)|) in all twins was 0.48 mmol/L (95% range: 0.13–1.80).

FPG levels were determined using a semiautomatic analyzer. Sex, ABO blood type, and 16 multiple short tandem repeat DNA markers were used to identify zygosity. Informed consent was obtained from all the participants. Ethical approval was obtained, and the study was conducted in accordance with the Declaration of Helsinki.

Reduced representation bisulfite sequencing (RRBS) analysis and data preprocessing

The DNA extracted from whole blood was sent to a corporation (Biomarker Biological Technology, Beijing, China) for RRBS analysis. Briefly, genomic DNA was treated with MspI to generate short fragments containing CpG dinucleotides. Then, end-repair, dA-tailing, and purification processes were performed to obtain CpG-rich DNA fragments. The obtained DNA fragments were bisulfite-converted, amplified by polymerase chain reaction (PCR), and sequenced using Illumina HiSeq X Ten. The raw methylation data covered 551,447 CpGs across the genome of each individual.

The pipeline recommended by Bismark was adopted to preprocess raw data [31]. Sequencing data were aligned to the Genome Reference Consortium Human Build 37 using Bowtie2 [32]. The processed coverage outputs were then inputted to R package BiSeq to generate a smooth methylation level [33]. To reduce bias, we limited the coverage to 90% quantile and further removed CpGs with an average methylation β-value < 0.05 or more than ten missing observations. After quality control, a total of 252,564 CpGs remained for subsequent analyses. The methylation β-value was converted to M-value with M = log2(β/(1- β)) for further analysis [34].

Cell-type composition estimation

Considering that DNAm data was measured in whole blood, distinct methylation profiles of different cell types may give rise to false discoveries [35]. The ReFACTor method was introduced to attenuate the effects of distinct cell components on DNAm [36]. Specifically, ReFACTor selected methylation sites that provided important information on cell composition for principal component analysis (PCA). The top five components of PCA were used to structure the underlying substitutes of cell type composition to adjust the heterogeneity of cell types.

Construction, sequence, and quality control of RNA library

The detailed experimental process has been described in our previous study [25]. Briefly, mRNA was extracted from whole blood using TRIzol reagent, and its concentration, purity, and integrity were rigorously measured. After purification, fragment size selection, and PCR enrichment, the qualified mRNA was used to construct the RNA-seq library. The RNA-seq library was then sequenced to obtain sequencing data using the Illumina HiSeq 2500 and was validated by real-time quantitative PCR (RT-qPCR). The TopHat2 was used to map the sequencing data to the human genome [37]. The FPKM value was used to detect gene expression levels using Cufflinks software [38].

Statistical analysis

Epigenome-wide association analysis

The association between DNAm M-value at a single CpG and FPG was tested by applying generalized estimation equation (GEE) approach through geeglm function in R-package geepack, adjusting for age, sex, diastolic blood pressure (DBP) and top five cell-type composition. Furthermore, in order to address the paired structure of twin data, we included a vector which identified the clusters of twins within a pair into the GEE model. To take multiple testing into account, we calculated the false discovery rate (FDR) [39] and defined genome-wide significance as FDR < 0.05. The identified genomic CpGs (P < 0.05) were annotated to the nearest genes using R-package biomaRt [40].

Detecting differentially methylated regions (DMRs)

The DMRs associated with FPG were evaluated using Comb-P approach that could calculate auto-correlation, combine adjacent P-values, performe false discovery adjustment, find regions of enrichment, and assign significance to those regions successively [41]. The Stouffer-Liptak-Kechris (slk) corrected P-value < 0.05 was used to detect significantly enriched DMRs.

Causal inference analysis

For the identified top 30 CpGs, the causal relationship with FPG was estimated by Inference about Causation through Examination of Familial Confounding (ICE FALCON) method in twins [22]. In this method, ‘familial’ meant both genetic and shared environmental factors in twins, which was essential for making explicit causal inference. The GEE model was applied for parameter estimation with correction for twin pairing. Estimations of βself, βco-twin, as well as β′self, and β′co-twin were calculated, where βself was the estimation of the overall correlation including the causal proportion and family confounding proportion, βco-twin estimated only the family confounding proportion of the correlation, and β′self and β′co-twin was the estimation of the full model. If |βco-twin–β′co-twin| was similar to |βself–β′self|, then the association was due to family confounding; and if |βco-twin–β′co-twin| was much larger than |βself–β′self| (ratio > 1.5), then it indicated a causal effect [42].

Ontology enrichment analysis

In order to further explore the biological function of CpGs identified in EWAS, we submitted 20,925 CpGs (P-value < 0.05) to the Genomic Regions Enrichment of Annotations Tool (GREAT) online to analyze ontology enrichments [43]. Annotation was based on human GRCh37, and the default “basal plus extension” association rule was used. Statistical significance was defined as FDR < 0.05.

Sensitivity analysis

In order to evaluate the robustness of study findings, we performed a sensitivity analysis by further adjusting for smoking status (now or ever smoking versus never smoking) and drinking status (now or ever drinking versus never drinking) in the original GEE model in epigenome-wide association analysis. Subsequently, the DMRs were also explored, and causal inference analysis was also performed. We also performed another sensitivity analysis by removing the outliers for DNAm of each top CpG and then testing the association between DNAm of each CpG and FPG again.

Power of epigenome-wide association analysis

We have published a computer simulation study on the power of EWAS using twin design [21]. According to this study, for one trait/disease with a heritability of 0.6, the sample size required for the statistical power to exceed 80% in the twin design ranged from 22 to 63 pairs when the correlation between environmental factors and DNA methylation ranged from 0.8 to 0.1, which is an immense improvement over the ordinary case–control design. The heritability of FPG was about 0.68 in Chinese twins [3]. Hence, our study, based on 52 twin pairs, would get a statistical power of about 80%.

Quantitative methylation analysis of SLC6A18

Considering the results of top DNAm signals identified in EWAS, the biological function of genes, the causal relationship with FPG, the correlation of gene expression level with FPG, and the primers designed results, we selected the SLC6A18 gene to validate in the community population. In the case–control study, we randomly recruited 72 diabetic cases and 170 healthy controls from the community, with no restrictions or criteria on the selection of the controls. The patients were defined as those with a fasting FPG level ≥ 7.0 mmol/L, taking hypoglycemic drugs, or using insulin. Participants with a history of hypertension, obesity, cancer, stroke, cardiovascular disease, or hepatitis were excluded. The participants were interviewed, and blood samples were collected and stored at -80C for DNA methylation analysis.

We designed primers for SLC6A18 gene to cover the region with the most CpGs (P-value < 0.05) in EWAS. The mass spectra of the cleavage products were collected using MALDI-TOF mass spectrometry based on the MassARRAY System (Bio Miao Biological Technology, Beijing, China), and the methylation ratio of the spectra was generated using MassARRAY EpiTYPER software (Agena Bioscience, San Diego, California, USA). The DNAm of CpGs and characteristics between the two independent groups was compared using Wilcoxon rank sum test or t test. A binary logistic regression model was applied to evaluate the association between CpGs and T2DM while adjusting for total cholesterol (CHOL), triglyceride (TG), and low-density lipoprotein cholesterol (LDLC). Statistical significance was set at P-value < 0.05.

Weighted gene co-expression network analysis (WGCNA)

In order to investigate whether the genes where the top CpGs and DMRs were mapped in methylation analysis were also differentially expressed, we further performed a weighted gene co-expression network analysis (WGCNA) using the gene expression data of twins. The R-package WGCNA is a comprehensive function that can perform weighted correlation network analysis [44, 45]. Briefly, first, a weighted adjacency matrix was established. Then, we constructed a topological overlap matrix (TOM) and used it as an input for conducting hierarchical clustering analysis. A dynamic tree-cutting algorithm was used to detect the gene modules. The module eigengenes (MEs) of the modules were correlated with FPG levels.

Furthermore, in order to find the important biological function on FPG and diabetes, we also performed functional enrichment analysis for the genes clustered in modules related to FPG and tried to find the common enrichment terms between methylation analysis and gene expression analysis. The BIOCARTA, KEGG, and REACTOME pathway enrichment analysis and GO enrichment analysis were performed using DAVID tool for genes clustered in important modules [46]. A modified Fisher’s exact P-value < 0.05 was treated as the cut-off standard for significant enrichments.

Results

Epigenome-wide association analysis

In this study, 52 twin pairs (including 27 male pairs) with a mean age of 52 years (SD: 7) were included. The median FPG level was 5.44 mmol/L (95% range: 3.76–7.4). Most clinical indicators showed a statistically significant correlation in twin pairs (Additional file 1: Table S1), suggesting the co-twin design beneficial. However, the correlation of DBP (r = 0.207, P = 0.141) showed no statistical significance, hence we treated DBP as a covariate in the subsequent GEE model.

The Manhattan plot is shown in Additional file 2: Fig. S1. The association between DNAm of 142 CpGs and FPG reached the genome-wide significance level (FDR < 0.05). The top 30 CpGs are shown in Table 1. The strongest association (β = 2.49, FDR = 1.80 × 10–5) was determined for the CpG (chr12:105,478,501 bp) in ALDH1L2. These top CpGs were located in/near 17 genes, such as TLCD1, SYNPO, MZF1, PTPRN2, SLC6A18, ASTN2, and IQCA1.

Table 1 The results of DNA methylation of top 30 CpGs with fasting plasma glucose.

Differentially methylated regions (DMRs) analysis

As shown Fig. 1 and Table 2, a total of 32 DMRs were identified for FPG. The methylation level of 18 DMRs (1, 2, 4–10, 12, 15–17, 19, 20, 25, 28, and 30) at/near TLCD1, MRPS31P5, MRPL23, AK126380, PTPRN2, CSNK1E, GON4L, PRDM16, AK056657, LOC440434, TUBB8B, and FAM175B was positively associated with FPG level, and 12 DMRs (3, 11, 13, 14, 21–24, 26, 27, 29, and 31) at/near CXADRP3, ZNF516, CASZ1, PKP3, SLC25A3P1, PCDH7, MEOX2, MTHFSD, MIPOL1, C067347, and ZNF578 was negatively associated with FPG level, respectively. However, the association between the methylation level of two DMRs (18 and 32) and FPG level was difficult to determine. Four DMRs covered several of the top CpGs as depicted in Table 1, with DMR25 (FAM175B) covering one CpG, DMR7 (PTPRN2) and DMR18 (SYNPO) covering two CpGs, and DMR1 (TLCD1) covering four CpGs, respectively.

Fig. 1
figure 1

The methylation patterns for the identified differentially methylated regions (DMRs). The horizontal axis shows the chromosome positions with the black point indicating each CpG, and the vertical axis shows the coefficient for the association between each CpG and fasting plasma glucose (FPG). Black line indicates the methylation pattern for each DMR. BP, base pair; chr, chromosome. NA, not available

Table 2 The results of annotation to significant differentially methylated regions

Causal inference analysis

The causal inference results of the top CpGs are depicted in Table 3. Briefly, the DNAm of seven CpGs located at/near three genes (SLC6A18, IQCA1, and PDE2A) was bidirectionally associated with FPG, that was, when FPG changed DNAm changed, and vice versa. The DNAm changes at eight CpGs (MZF1, PTPRN2, GRIN1, and SLC6A18) caused FPG changes, and FPG changes caused DNAm changes at five other CpGs (TLCD1 and ASTN2). The causal relationship between the remaining CpGs and FPG was not statistically significant.

Table 3 The results of causal inference analysis of top CpGs with fasting plasma glucose

Ontology enrichments analysis

Important pathways potentially related to FPG and diabetes were found, including phospholipid-hydroperoxide glutathione peroxidase activity, mitogen-activated protein kinase p38 binding, positive regulation of insulin receptor signaling pathway, cell fate commitment, notch signaling pathway, and biosynthesis of neurotransmitters (Additional file 3: Table S2).

Sensitivity analysis

Additionally, we performed a sensitivity analysis by further adjusting for smoking status and drinking status in the original GEE model in epigenome-wide association analysis. The smoking and drinking status of intra-pair twins were almost consistent in twin samples. The numbers of twins with both smoking, both non-smoking, and inconsistent smoking status were 19, 28, and 5, respectively. The numbers of twins with both drinking, both non-drinking, and inconsistent drinking status were 17, 32, and 3, respectively. As shown in Additional file 4: Table S3, Additional file 5: Table S4, and Additional file 6: Table S5, the results of epigenome-wide association analysis, DMRs analysis, as well as causal inference analysis were almost consistent with those before sensitivity analysis, indicating our findings robust. Moreover, when we removed the outliers for DNAm of each top CpG in another sensitivity analysis, the association between DNAm of each CpG and FPG remained nearly constant.

Quantitative methylation analysis of SLC6A18

A total of 72 diabetic cases and 170 healthy controls from the community were included to validate the CpGs mapped to SLC6A18 gene identified in EWAS. As shown in Additional file 7: Table S6, people with diabetes had higher levels of CHOL, higher levels of TG, and lower levels of LDLC than people without diabetes. Of the CpGs identified (P-value < 0.05) mapped to SLC6A18 in EWAS, three were quantified using the Sequenom MassARRAY platform in a community population. As shown in Additional file 8: Table S7, these three CpGs were hypermethylated in T2DM patients, showing a same direction as in EWAS. Particularly, one CpG (chr5:1,233,066) was also regarded as the top DNAm signal shown in Table 1.

WGCNA

Twelve twin pairs were included in gene expression analysis, with a mean age of 54 years (SD: 6) and a median FPG level of 5.60 mmol/L (95% range: 4.31–7.90). As shown in Additional file 9: Fig. S2, the Darkolivegreen module (including 721 genes) was positively correlated with FPG levels (r = 0.61, P-value = 0.001).

Among the genes where the top CpGs and DMRs were mapped in methylation analysis, 18 genes were also clustered in the Darkolivegreen module of WGCNA (Additional file 10: Table S8), including ALDH1L2, TLCD1, RNF126, SYNPO, MZF1, PTPRN2, PPP2R2A, SLC6A18, ASTN2, LAMC3, IQCA1, FAM175B, GRIN1, PDE2A, MRPL23, CASZ1, GON4L, and CSNK1E. Moreover, some common enrichment terms were also identified, such as extracellular matrix structural constituent, dopamine binding, dopamine neurotransmitter receptor activity, regulation of biosynthetic process, neuron fate specification, and C21-steroid hormone biosynthetic process (Additional file 11: Table S9).

Discussion

Based on monozygotic twin samples, we identified some CpGs, DMRs, and pathways potentially associated with FPG. Three CpGs mapped to SLC6A18 gene were also validated in a community population. Common genes and enrichment terms between the DNA methylation and gene expression analyses were identified. In addition, causal relationship between DNAm of several CpGs and FPG was identified. What’ more, the results of sensitivity analysis indicated our findings robust.

Some genes where the top CpGs and DMRs were mapped may influence FPG levels or diabetes, such as MZFI, PTPRN2, ASTN2, GRIN1, SLC6A18, and PDE2A. The MZFI gene binds and transactivates L-selectin promoter, which has been proven to be related to disease entities, including T2DM [47]. While comparing the mouse data with T2DM patients data, altered DNAm of 105 genes was correlated with incident T2DM, among which PTPRN2 showed a stronger predictive potential [8]. The association between genetic variants of ASTN2 gene and T2DM has been determined [48, 49]. It was reported that SNP rs6293 in GRIN1 gene could affect eating behavior in T2DM [50]. The protein encoded by SLC6A18 gene is a member of the SLC6 family, which acts as a transporter for small molecules, including alpha-D-glucose. The SLC6A18 gene is involved in the proximal tubule transport pathway, and insulin might participate in renal glucose handling by acting on the proximal tubules [51]. In addition, the SLC6A18 gene is also involved in the NRF2 pathway, and its role in mastering antioxidants in diabetic dysfunction has been clearly elucidated [52]. Early and specific upregulation of cardiac PDE2A gene expression was noted in diabetic cardiomyopathy [53]. Other genes currently have unknown roles in FPG levels or diabetes, and they may be potential candidates for further exploration and validation.

Additionally, when we integrated the DNA methylation data with gene expression data, we found that the gene expression levels of several genes where the top CpGs and DMRs were mapped in methylation analysis was positively correlated with FPG levels. It is worth noting that the roles of PTPRN2, ASTN2, GRIN1, SLC6A18, and PDE2A in influencing FPG levels or diabetes had previously been suggested as mentioned above. We speculated that the DNA methylation variants in these genes might influence FPG levels by regulating the corresponding gene expression levels. Moreover, we also found some common enrichment terms between methylation analysis and gene expression analysis, such as extracellular matrix structural constituent [54], dopamine binding and dopamine neurotransmitter receptor activity [55, 56], and C21-steroid hormone biosynthetic process [57], which might play important roles in the elevated FPG levels and diabetes and might serve as important targets to be further researched.

As additional replication, we compared previously reported significant results in a series of EWASs with ours. Due to the different methods to assess methylation profiles, we mainly compared the results among studies based on the genes where the significant CpGs were mapped. As shown in Additional file 12: Table S10, three differentially methylated genes including C7orf50, PTPRN2, and CASZ1 could be replicated. The association between methylated levels of C7orf50 gene and T2DM or glycemic traits has previously been reported in Koreans [7], Saharan African [9], Britons [10], Americans [16], and Chinese [18], as well as in a meta-analysis of five prospective European cohorts [58]. Ouni et al. found that PTPRN2 gene showed a stronger predictive potential for T2DM [8]. Abnormal DNAm levels at the promoter of CASZ1 gene in the placental may lead to metabolic diseases, including T2DM [59]. All of these indicated that our results are credible.

Three strengths in our study can be noted. First, our study was the implementation of a trait-discordant monozygotic twin model, which is proven to be a powerful tool for EWAS [20, 21]. supporting the relevance of our results. Second, causal relationships between the DNAm of some top CpGs and FPG was identified. Third, considering the differences in genetic background and environmental exposure, the underlying pathogenic process of diabetes in Chinese population may be partly referred to in our study.

However, several limitations of this study cannot be ignored. First, compared to the traditional case–control design, the sample size of our study was relatively limited because of the challenges of recruiting and identifying high-quality monozygotic twins. However, according to a study on the power of EWAS using a twin design [21], our study based on 52 twin pairs would obtain a statistical power of about 80%. Moreover, the CpGs mapped to SLC6A18 gene were successfully validated in a community population. Second, this study was based on peripheral venous blood samples from twins. The development of diabetes is disrupted by multiple biological mechanisms in different organs of the body, including the pancreas, skeletal muscle, and adipose tissue [60]. The EWASs conducted in these tissues may provide a comprehensive understanding of the etiology of diabetes. However, such tissue samples cannot be obtained on a large scale. Given the relative availability and potential value of peripheral venous blood samples in population methylation studies, most EWASs are now performed using whole blood [61]. Third, given that the eligible twins were limited, we did not perform the analysis by sex.

Conclusions

Multiple methylated CpGs, DMRs, crucial genes (particularly SLC6A18), and biological pathways potentially related to FPG were identified. Our findings provide reference for the epigenetic regulation of elevated FPG levels and diabetes.

Availability of data and materials

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Abbreviations

CpG:

Cytosine phosphate guanine

DBP:

Diastolic blood pressure

DMR:

Differentially methylated region

EWAS:

Epigenome-wide association study

FDR:

False discovery rate

FPG:

Fasting plasma glucose

GREAT:

Genomic regions enrichment of annotations tool

GWAS:

Genome-wide association study

GEE:

Generalized estimation equation

PCA:

Principal component analysis

RRBS:

Reduced representation bisulfite sequencing

SNPs:

Single nucleotide polymorphisms

T2DM:

Type 2diabetes mellitus

WGCNA:

Weighted gene co-expression network analysis

References

  1. Song X, Wang W, Li Z, Zhang D. Association between serum copper and serum lipids in adults. Ann Nutr Metab. 2018;73(4):282–9.

    Article  CAS  PubMed  Google Scholar 

  2. Kwak SH, Park KS. Recent progress in genetic and epigenetic research on type 2 diabetes. Exp Mol Med. 2016;48(3): e220.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  3. Wang W, Zhang C, Liu H, Xu C, Duan H, Tian X, et al. Heritability and genome-wide association analyses of fasting plasma glucose in Chinese adult twins. BMC Genomics. 2020;21(1):491.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  4. Simonis-Bik AM, Eekhoff EM, Diamant M, Boomsma DI, Heine RJ, Dekker JM, et al. The heritability of HbA1c and fasting blood glucose in different measurement settings. Twin Res Hum Genet. 2008;11(6):597–602.

    Article  PubMed  Google Scholar 

  5. Santos RL, Zillikens MC, Rivadeneira FR, Pols HA, Oostra BA, van Duijn CM, et al. Heritability of fasting glucose levels in a young genetically isolated population. Diabetologia. 2006;49(4):667–72.

    Article  CAS  PubMed  Google Scholar 

  6. Davegårdh C, García-Calzón S, Bacos K, Ling C. DNA methylation in the pathogenesis of type 2 diabetes in humans. Mol Metab. 2018;14:12–25.

    Article  PubMed  PubMed Central  Google Scholar 

  7. Kim H, Bae JH, Park KS, Sung J, Kwak SH. DNA methylation changes associated with type 2 diabetes and diabetic kidney disease in an East Asian population. J Clin Endocrinol Metab. 2021;106(10):e3837–51.

    Article  PubMed  Google Scholar 

  8. Ouni M, Saussenthaler S, Eichelmann F, Jähnert M, Stadion M, Wittenbecher C, et al. Epigenetic changes in islets of langerhans preceding the onset of diabetes. Diabetes. 2020;69(11):2503–17.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  9. Meeks KAC, Henneman P, Venema A, Addo J, Bahendeka S, Burr T, et al. Epigenome-wide association study in whole blood on type 2 diabetes among sub-Saharan African individuals: findings from the RODAM study. Int J Epidemiol. 2019;48(1):58–70.

    Article  PubMed  Google Scholar 

  10. Cardona A, Day FR, Perry JRB, Loh M, Chu AY, Lehne B, et al. Epigenome-wide association study of incident type 2 diabetes in a British population: EPIC-Norfolk study. Diabetes. 2019;68(12):2315–26.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  11. Walaszczyk E, Luijten M, Spijkerman AMW, Bonder MJ, Lutgers HL, Snieder H, et al. DNA methylation markers associated with type 2 diabetes, fasting glucose and HbA(1c) levels: a systematic review and replication in a case-control sample of the Lifelines study. Diabetologia. 2018;61(2):354–68.

    Article  CAS  PubMed  Google Scholar 

  12. Soriano-Tárraga C, Jiménez-Conde J, Giralt-Steinhauer E, Mola-Caminal M, Vivanco-Hidalgo RM, Ois A, et al. Epigenome-wide association study identifies TXNIP gene associated with type 2 diabetes mellitus and sustained hyperglycemia. Hum Mol Genet. 2016;25(3):609–19.

    Article  PubMed  Google Scholar 

  13. Kriebel J, Herder C, Rathmann W, Wahl S, Kunze S, Molnos S, et al. Association between DNA methylation in whole blood and measures of glucose metabolism: KORA F4 study. PLoS ONE. 2016;11(3): e0152314.

    Article  PubMed  PubMed Central  Google Scholar 

  14. Florath I, Butterbach K, Heiss J, Bewerunge-Hudler M, Zhang Y, Schöttker B, et al. Type 2 diabetes and leucocyte DNA methylation: an epigenome-wide association study in over 1500 older adults. Diabetologia. 2016;59(1):130–8.

    Article  CAS  PubMed  Google Scholar 

  15. Al Muftah WA, Al-Shafai M, Zaghlool SB, Visconti A, Tsai PC, Kumar P, et al. Epigenetic associations of type 2 diabetes and BMI in an Arab population. Clin Epigenetics. 2016;8:13.

    Article  PubMed  PubMed Central  Google Scholar 

  16. Kulkarni H, Kos MZ, Neary J, Dyer TD, Kent JW Jr, Göring HH, et al. Novel epigenetic determinants of type 2 diabetes in Mexican-American families. Hum Mol Genet. 2015;24(18):5330–44.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  17. Chambers JC, Loh M, Lehne B, Drong A, Kriebel J, Motta V, et al. Epigenome-wide association of DNA methylation markers in peripheral blood from Indian Asians and Europeans with incident type 2 diabetes: a nested case-control study. Lancet Diabetes Endocrinol. 2015;3(7):526–34.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  18. Wang Z, Peng H, Gao W, Cao W, Lv J, Yu C, et al. Blood DNA methylation markers associated with type 2 diabetes, fasting glucose, and HbA1c levels: an epigenome-wide association study in 316 adult twin pairs. Genomics. 2021;113(6):4206–13.

    Article  CAS  PubMed  Google Scholar 

  19. Ling C, Rönn T. Epigenetics in human obesity and type 2 diabetes. Cell Metab. 2019;29(5):1028–44.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  20. Tan Q, Christiansen L, von Bornemann HJ, Christensen K. Twin methodology in epigenetic studies. J Exp Biol. 2015;218(Pt 1):134–9.

    Article  PubMed  Google Scholar 

  21. Li W, Christiansen L, Hjelmborg J, Baumbach J, Tan Q. On the power of epigenome-wide association studies using a disease-discordant twin design. Bioinformatics. 2018;34(23):4073–8.

    Article  CAS  PubMed  Google Scholar 

  22. Li S, Wong EM, Bui M, Nguyen TL, Joo JE, Stone J, et al. Inference about causation between body mass index and DNA methylation in blood from a twin family study. Int J Obes (Lond). 2019;43(2):243–52.

    Article  CAS  PubMed  Google Scholar 

  23. Wang ZN, Gao WJ, Wang BQ, Cao WH, Lv J, Yu CQ, et al. Correlation between fasting plasma glucose, HbA1c and DNA methylation in adult twins. Beijing Da Xue Xue Bao Yi Xue Ban. 2020;52(3):425–31.

    CAS  PubMed  Google Scholar 

  24. Wang W, Li W, Jiang W, Lin H, Wu Y, Wen Y, et al. Genome-wide DNA methylation analysis of cognitive function in middle and old-aged Chinese monozygotic twins. J Psychiatr Res. 2021;136:571–80.

    Article  PubMed  Google Scholar 

  25. Wang W, Jiang W, Hou L, Duan H, Wu Y, Xu C, et al. Weighted gene co-expression network analysis of expression data of monozygotic twins identifies specific modules and hub genes related to BMI. BMC Genomics. 2017;18(1):872.

    Article  PubMed  PubMed Central  Google Scholar 

  26. Wang W, Li W, Wu Y, Tian X, Duan H, Li S, et al. Genome-wide DNA methylation and gene expression analyses in monozygotic twins identify potential biomarkers of depression. Transl Psychiatry. 2021;11(1):416.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  27. Wang W, Yao J, Li W, Wu Y, Duan H, Xu C, et al. Epigenome-wide association study in Chinese monozygotic twins identifies DNA methylation loci associated with blood pressure. Clin Epigenetics. 2023;15(1):38.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  28. Wang W, Li W, Duan H, Xu C, Tian X, Li S, et al. Mediation by DNA methylation on the association of BMI and serum uric acid in Chinese monozygotic twins. Gene. 2023;850: 146957.

    Article  CAS  PubMed  Google Scholar 

  29. Wang T, Wang W, Li W, Duan H, Xu C, Tian X, et al. Genome-wide DNA methylation analysis of pulmonary function in middle and old-aged Chinese monozygotic twins. Respir Res. 2021;22(1):300.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  30. Xu C, Zhang D, Tian X, Wu Y, Pang Z, Li S, et al. Genetic and environmental basis in phenotype correlation between physical function and cognition in aging Chinese twins. Twin Res Hum Genet. 2017;20(1):60–5.

    Article  PubMed  Google Scholar 

  31. Krueger F, Andrews SR. Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics. 2011;27(11):1571–2.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  32. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9(4):357–9.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  33. Hebestreit K, Dugas M, Klein HU. Detection of significantly differentially methylated regions in targeted bisulfite sequencing data. Bioinformatics. 2013;29(13):1647–53.

    Article  CAS  PubMed  Google Scholar 

  34. Du P, Zhang X, Huang CC, Jafari N, Kibbe WA, Hou L, et al. Comparison of Beta-value and M-value methods for quantifying methylation levels by microarray analysis. BMC Bioinformatics. 2010;11:587.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  35. Jaffe AE, Irizarry RA. Accounting for cellular heterogeneity is critical in epigenome-wide association studies. Genome Biol. 2014;15(2):R31.

    Article  PubMed  PubMed Central  Google Scholar 

  36. Rahmani E, Zaitlen N, Baran Y, Eng C, Hu D, Galanter J, et al. Sparse PCA corrects for cell type heterogeneity in epigenome-wide association studies. Nat Methods. 2016;13(5):443–5.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  37. Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 2013;14(4):R36.

    Article  PubMed  PubMed Central  Google Scholar 

  38. Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010;28(5):511–5.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  39. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Series B Stat Methodol. 1995;57(1):289–300.

    Google Scholar 

  40. Durinck S, Spellman PT, Birney E, Huber W. Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt. Nat Protoc. 2009;4(8):1184–91.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  41. Pedersen BS, Schwartz DA, Yang IV, Kechris KJ. Comb-p: software for combining, analyzing, grouping and correcting spatially correlated P-values. Bioinformatics. 2012;28(22):2986–8.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  42. Li W, Baumbach J, Larsen MJ, Mohammadnejad A, Lund J, Christensen K, et al. Differential long noncoding RNA profiling of BMI in twins. Epigenomics. 2020;12(17):1531–41.

    Article  CAS  PubMed  Google Scholar 

  43. McLean CY, Bristor D, Hiller M, Clarke SL, Schaar BT, Lowe CB, et al. GREAT improves functional interpretation of cis-regulatory regions. Nat Biotechnol. 2010;28(5):495–501.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  44. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008;9:559.

    Article  PubMed  PubMed Central  Google Scholar 

  45. Langfelder P, Horvath S. Fast R functions for robust correlations and hierarchical clustering. J Stat Softw. 2012;46(11): i11.

    Article  PubMed  PubMed Central  Google Scholar 

  46. da Huang W, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009;4(1):44–57.

    Article  CAS  PubMed  Google Scholar 

  47. Lou Y, Lu X, Dang X. FOXO1 Up-regulates human L-selectin expression through binding to a consensus FOXO1 motif. Gene Regul Syst Bio. 2012;6:139–49.

    CAS  PubMed  PubMed Central  Google Scholar 

  48. Goldsworthy M, Bai Y, Li CM, Ge H, Lamas E, Hilton H, et al. Haploinsufficiency of the insulin receptor in the presence of a splice-site mutation in Ppp2r2a results in a novel digenic mouse model of type 2 diabetes. Diabetes. 2016;65(5):1434–46.

    Article  CAS  PubMed  Google Scholar 

  49. Harbuwono DS, Tahapary DL, Tarigan TJE, Yunir E. New proposed cut-off of waist circumference for central obesity as risk factor for diabetes mellitus: Evidence from the Indonesian Basic National Health Survey. PLoS ONE. 2020;15(11): e0242417.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  50. Kochetova OV, Avzaletdinova DS, Korytina GF, Morugova TV, Mustafina OE. The association between eating behavior and polymorphisms in GRIN2B, GRIK3, GRIA1 and GRIN1 genes in people with type 2 diabetes mellitus. Mol Biol Rep. 2020;47(3):2035–46.

    Article  CAS  PubMed  Google Scholar 

  51. Pereira-Moreira R, Muscelli E. Effect of insulin on proximal tubules handling of glucose: a systematic review. J Diabetes Res. 2020;2020:8492467.

    Article  PubMed  PubMed Central  Google Scholar 

  52. David JA, Rifkin WJ, Rabbani PS, Ceradini DJ. The Nrf2/Keap1/ARE pathway and oxidative stress as a therapeutic target in type II diabetes mellitus. J Diabetes Res. 2017;2017:4826724.

    Article  PubMed  PubMed Central  Google Scholar 

  53. Hanna R, Nour-Eldine W, Saliba Y, Dagher-Hamalian C, Hachem P, Abou-Khalil P, et al. Cardiac phosphodiesterases are differentially increased in diabetic cardiomyopathy. Life Sci. 2021;283: 119857.

    Article  CAS  PubMed  Google Scholar 

  54. Bogdani M, Korpos E, Simeonovic CJ, Parish CR, Sorokin L, Wight TN. Extracellular matrix components in the pathogenesis of type 1 diabetes. Curr Diab Rep. 2014;14(12):552.

    Article  PubMed  PubMed Central  Google Scholar 

  55. Wei H, Zapata RC, Lopez-Valencia M, Aslanoglou D, Farino ZJ, Benner V, et al. Dopamine D(2) receptor signaling modulates pancreatic beta cell circadian rhythms. Psychoneuroendocrinology. 2020;113: 104551.

    Article  CAS  PubMed  Google Scholar 

  56. Sakano D, Choi S, Kataoka M, Shiraki N, Uesugi M, Kume K, et al. Dopamine D2 receptor-mediated regulation of pancreatic β cell mass. Stem Cell Reports. 2016;7(1):95–109.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  57. Hwang JL, Weiss RE. Steroid-induced diabetes: a clinical and molecular approach to understanding and treatment. Diabetes Metab Res Rev. 2014;30(2):96–102.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  58. Fraszczyk E, Spijkerman AMW, Zhang Y, Brandmaier S, Day FR, Zhou L, et al. Epigenome-wide association study of incident type 2 diabetes: a meta-analysis of five prospective European cohorts. Diabetologia. 2022;65(5):763–76.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  59. Chen PY, Chu A, Liao WW, Rubbi L, Janzen C, Hsu FM, et al. Prenatal growth patterns and birthweight are associated with differential DNA methylation and gene expression of cardiometabolic risk genes in human placentas: a discovery-based approach. Reprod Sci. 2018;25(4):523–39.

    Article  CAS  PubMed  Google Scholar 

  60. Kahn SE, Cooper ME, Del Prato S. Pathophysiology and treatment of type 2 diabetes: perspectives on the past, present, and future. Lancet. 2014;383(9922):1068–83.

    Article  CAS  PubMed  Google Scholar 

  61. Houseman EA, Kim S, Kelsey KT, Wiencke JK. DNA Methylation in Whole Blood: Uses and Challenges. Curr Environ Health Rep. 2015;2(2):145–54.

    Article  CAS  PubMed  Google Scholar 

Download references

Acknowledgements

Not applicable.

Funding

This work was supported by the Natural Science Foundation of Shandong Province [Grant number: ZR2020QH304], the National Natural Science Foundation of China [Grant number: 31741063], the China Postdoctoral Science Foundation [Grant number: 2020M682129], and the Qingdao Science and Technology Fund [Grant number: 21-1-4-rkjk-1-nsh].

Author information

Authors and Affiliations

Authors

Contributions

WW conceived and designed the research and analyzed the data; WY interpreted the data and wrote the paper; HD, XT and CX collected the participants information and acquired the data; QT and SL analyzed the data and interpreted the results; DZ conceived and designed the research. All authors were involved in drafting and revising the manuscript.

Corresponding author

Correspondence to Weijing Wang.

Ethics declarations

Ethics approval and consent to participate

The authors state that all participants have provided written informed consent for participating in the study, and this study was approved by the local ethics committee at Qingdao CDC, Qingdao, China.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Table S1

. Basic characteristics of the participants.

Additional file 2: Figure S1

. Circular Manhattan plot for epigenome-wide association study on fasting plasma glucose. The numbers of chromosome and the -log10 of P-values for statistical significance are shown. Dots represent the observed CpGs.

Additional file 3: Table S2

. The top GREAT ontology enrichments potentially related to fasting plasma glucose using binomial test.

Additional file 4

: Table S3. The results of DNA methylation of top 30 CpGs with fasting plasma glucose in epigenome-wide association analysis in sensitivity analysis that further adjusting for smoking status and drinking status in GEE model.

Additional file 5

: Table S4. The results of annotation to significant differentially methylated regions for fasting plasma glucose in sensitivity analysis.

Additional file 6

: Table S5. The results of causal inference analysis of top CpGs with fasting plasma glucose in sensitivity analysis.

Additional file 7

: Table S6. Basic characteristics of participants from the community in the quantitative methylation analysis of SLC6A18 gene

Additional file 8

: Table S7. Validation analysis results for the CpGs mapped to SLC6A18 gene.

Additional file 9: Figure S2

. Relationships between consensus module eigengenes and external trait. Each row in the table corresponds to a consensus module, and each column to a trait. Numbers in the table report the correlations of the corresponding module eigengenes and trait with the P-values printed below the correlations in parentheses. The table is color coded by correlation according to the shade of color legend. FPG, fasting plasma glucose.

Additional file 10

: Table S8. The common genes between DNA methylation analysis and gene expression analysis.

Additional file 11

: Table S9. The common biological enrichment terms between DNA methylation analysis and gene expression analysis.

Additional file 12: Table S10

. Comparison between our results and other previously reported fasting plasma glucose-associated differentially methylated genes where significant CpGs were mapped.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Wang, W., Yao, W., Tan, Q. et al. Identification of key DNA methylation changes on fasting plasma glucose: a genome-wide DNA methylation analysis in Chinese monozygotic twins. Diabetol Metab Syndr 15, 159 (2023). https://doi.org/10.1186/s13098-023-01136-4

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/s13098-023-01136-4

Keywords