- Research
- Open access
Identification of metabolic reprogramming-related genes as potential diagnostic biomarkers for diabetic nephropathy based on bioinformatics
Diabetology & Metabolic Syndrome volume 16, Article number: 287 (2024)
Abstract
Background
Diabetic nephropathy (DN) is a serious complication of diabetes mellitus, marked by progressive renal damage. Recent evidence indicates that metabolic reprogramming is crucial to DN pathogenesis, yet its underlying mechanisms are not well understood. This study aimed to examine how metabolic reprogramming-related genes (MRRGs) are differentially expressed and to explore their potential mechanisms in the development of DN.
Methods
We analyzed the datasets GSE30528 and GSE96804 from the Gene Expression Omnibus (GEO), comprising 50 DN samples and 33 controls. MRRGs were sourced from GeneCards and PubMed. Data preprocessing included batch effect correction using the R package sva, followed by normalization and differential expression analysis with limma (|logFC|> 0.5, adj.p < 0.05). Functional enrichment analyses (GO, KEGG, GSEA) were performed using clusterProfiler. Protein–protein interaction (PPI) networks were constructed via STRING, identifying hub genes through CytoHubba. Regulatory networks (mRNA-TF, mRNA-miRNA) were derived from ChIPBase and StarBase. Validation of hub genes and ROC analysis assessed diagnostic performance. ssGSEA quantified immune cell infiltration.
Results
Our analysis identified 708 differentially expressed genes (DEGs), including 119 metabolic reprogramming-related DEGs (MRRDEGs). Enrichment analyses revealed significant roles for MRRDEGs in processes such as wound healing and pathways like MAPK signaling. The PPI network identified nine hub genes: FN1, CD44, KDR, EGF, HSPG2, HGF, FGF9, IGF1, and ALB, which exhibited moderate diagnostic accuracy (AUC 0.7 to 0.9). Notably, FN1 and CD44 showed significant association with renal fibrosis and could serve as potential biomarkers for early diagnosis and therapeutic targets in DN. Immune infiltration analysis showed notable differences in immune cell composition between DN and control samples.
Conclusion
This study identifies hub genes such as FN1 and CD44, with potential diagnostic value in DN. It also reveals immune cell infiltration differences between DN patients and controls, offering insights into disease progression and potential therapeutic targets.
Introduction
Diabetic nephropathy (DN) is a serious microvascular complication of diabetes mellitus. It affects around 35% of diabetic patients and is the leading cause of end-stage renal disease (ESRD) worldwide [1]. The incidence of DN has been increasing alongside the global rise in diabetes prevalence, creating significant challenges for healthcare systems [2]. Current therapeutic strategies mainly focus on glycemic control, blood pressure management, and using medications like ACE inhibitors or angiotensin receptor blockers (ARBs) [3, 4]. Although there have been advancements in glycemic and blood pressure control, current therapeutic approaches often do not stop the progression of DN [5]. Addressing this challenge demands exploration of novel therapeutic strategies. In recent years, researchers have actively explored various new treatments, including glucose stabilizers, renal protective agents, and targeted therapies against inflammation and fibrosis [6]. For instance, SGLT2 inhibitors have demonstrated efficacy not only in glycemic control but also potentially in reducing renal tubular glucose reabsorption, alleviating renal burden, and improving renal function [7]. Nevertheless, these interventions still have limitations in delaying disease progression and are unable to completely reverse the course of DN. The majority of patients inevitably progress to ESRD. Thus, urgent research is needed to identify new therapeutic targets, biomarkers, and strategies to effectively tackle the complexities of DN and enhance patient outcomes.
Metabolic reprogramming has emerged as a key factor in the pathophysiology of DN, contributing to disease progression through altered glucose, lipid, and amino acid metabolism. In DN, this reprogramming drives renal fibrosis, inflammation, and oxidative stress—hallmarks of the disease [8, 9]. Elevated levels of metabolic biomarkers such as irisin and visfatin are correlated with DN severity, suggesting a direct link between metabolic dysregulation and renal injury. These markers not only indicate disease progression but also have potential as prognostic tools for monitoring DN [10]. Importantly, targeting metabolic pathways has proven effective in other diseases, such as cancer and cardiovascular disorders, where modulating metabolism has halted progression and improved outcomes [11, 12]. This raises the possibility that similar therapeutic strategies targeting metabolic reprogramming could be applied to DN. However, the exact mechanisms underlying these metabolic changes in DN remain poorly understood, and the potential for metabolic-based therapies remains underexplored. This study aims to explore the role of metabolic reprogramming-related genes (MRRGs) in DN and investigate their contribution to disease progression. By identifying new biomarkers and therapeutic targets, we hope to advance the management of diabetic nephropathy.
We aimed to identify key genes and pathways involved in DN progression by integrating data from the GEO database and conducting bioinformatics analyses. Specifically, we hypothesized that metabolic reprogramming-related differentially expressed genes (MRRDEGs) play a central role in DN progression, influencing key processes such as renal fibrosis, inflammation, and oxidative stress. We focused on identifying these MRRDEGs and their associated signaling pathways, and on investigating how they contribute to the pathophysiology of DN. Additionally, we examined the regulatory networks of transcription factors (TFs) and microRNAs (miRNAs) to better understand the mechanisms regulating the Hub Genes. We expect our findings to enhance understanding of the molecular mechanisms underlying DN and to identify potential biomarkers and therapeutic targets. This could lead to the development of new diagnostic tools and targeted therapies, ultimately improving the management and prognosis for patients with DN.
Materials and methods
Data download
The DN datasets GSE30528 [13] and GSE96804 [14] were downloaded from the GEO database [15] using the R package GEOquery [16] (Version 2.70.0). Both datasets include samples from Homo sapiens derived from kidney tissue, as shown in Table 1. GSE30528, which uses the GPL571 platform, contains 9 DN samples and 13 control samples. GSE96804, which uses the GPL17586 platform, includes 41 DN samples and 20 control samples. All samples were included in this study. Despite the use of different platforms, both datasets contain a large number of DN samples and are derived from the same species, ensuring biological comparability between the samples.
MRRGs were collected from the GeneCards database [17] (https://www.genecards.org/) using "Metabolic Reprogramming" as a search term, limited to "Protein Coding" genes with a Relevance Score > 3, yielding 1694 unique genes. Additionally, a PubMed search for "Metabolic Reprogramming" identified 5 MRRGs documented in published literature [18]. After merging and removing duplicates, a total of 1695 MRRGs were obtained. Specific details are provided in Table S1.
The R package sva [19] (Version 3.50.0) was used to correct batch effects between datasets GSE30528 and GSE96804 with the ComBat method, creating a unified GEO dataset (Combined Dataset). ComBat adjusts for known batch labels (here, the dataset of origin) using an empirical Bayes framework and has been widely applied in the literature, demonstrating strong adaptability and effectiveness, particularly when handling large genomic datasets. Subsequently, the Combined Dataset was standardized using the R package limma [20] (Version 3.58.1), including probe annotation and normalization. Principal Component Analysis (PCA) [21] and boxplots were generated from the expression matrix before and after batch effect removal to validate the efficacy of batch correction.
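The idea behind batch correction can be illustrated with a much simpler adjustment than the one used here. The sketch below is not ComBat and does not call sva; it only re-centers one gene's values so that every batch shares the pooled mean. The function name and the toy expression values are hypothetical.

```python
from statistics import mean

def center_by_batch(values, batches):
    """Shift each batch so its mean equals the pooled mean (simplified stand-in, not ComBat)."""
    grand_mean = mean(values)
    adjusted = [0.0] * len(values)
    for label in set(batches):
        idx = [i for i, b in enumerate(batches) if b == label]
        batch_mean = mean(values[i] for i in idx)
        for i in idx:
            # Remove the batch-specific offset, keep within-batch variation.
            adjusted[i] = values[i] - batch_mean + grand_mean
    return adjusted

# Hypothetical expression values of one gene in six samples from two datasets.
expr = [5.0, 5.2, 4.8, 9.0, 9.3, 8.7]
batch = ["GSE30528"] * 3 + ["GSE96804"] * 3
adjusted = center_by_batch(expr, batch)
print([round(v, 2) for v in adjusted])
```

Plain centering like this would also erase genuine biological differences whenever the DN/Control ratio differs between batches, which is one reason a model-based method is preferred in practice.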
Differential expression analysis of metabolic reprogramming-related genes
We categorized the samples from the Combined Dataset into two groups: DN and Control. We performed differential expression analysis between these groups using the R package limma. Genes with |logFC|> 0.5 and adj.p < 0.05 were considered as DEGs. We defined up-regulated DEGs as those with logFC > 0.5 and adj.p < 0.05. In contrast, down-regulated DEGs were those with logFC < -0.5 and adj.p < 0.05. The results of the differential expression analysis were visualized using volcano plots generated with the R package ggplot2 (Version 3.4.4).
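The thresholding step can be sketched as follows. This is a minimal illustration, assuming that logFC values and raw p values are already available (limma's moderated statistics are not reproduced here); the gene labels and numbers are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def classify(log_fc, adj_p, fc_cut=0.5, p_cut=0.05):
    """Label a gene as up-regulated, down-regulated, or not significant."""
    if adj_p < p_cut and log_fc > fc_cut:
        return "up"
    if adj_p < p_cut and log_fc < -fc_cut:
        return "down"
    return "ns"

# Hypothetical genes, fold changes, and raw p values.
genes = ["G1", "G2", "G3", "G4"]
log_fc = [1.2, -0.9, 0.3, 0.8]
p_raw = [0.001, 0.01, 0.02, 0.5]
p_adj = bh_adjust(p_raw)
calls = {g: classify(fc, p) for g, fc, p in zip(genes, log_fc, p_adj)}
print(calls)
```

In this toy input, G3 passes the adjusted p value cutoff but not the fold-change cutoff, and G4 passes the fold-change cutoff but not the p value cutoff, so both are labeled "ns".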
To identify MRRDEGs associated with DN, we performed an intersection analysis. DEGs were intersected with MRRGs. A Venn diagram was generated to visualize the overlap, identifying MRRDEGs. Heatmaps illustrating the expression patterns of MRRDEGs were constructed using the R package pheatmap (Version 1.0.12).
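The intersection itself is a plain set operation. In the sketch below, FN1 and CD44 are taken from the results of this study, while GENE_A, GENE_B, and GENE_C are hypothetical placeholders added only to populate the non-overlapping regions of the Venn diagram.

```python
# Toy gene lists; only FN1 and CD44 come from the study, the rest are placeholders.
degs = {"FN1", "CD44", "GENE_A", "GENE_B"}
mrrgs = {"FN1", "CD44", "GENE_C"}

mrrdegs = sorted(degs & mrrgs)   # genes in both lists
only_degs = sorted(degs - mrrgs) # DEGs without a metabolic reprogramming annotation
only_mrrgs = sorted(mrrgs - degs)  # MRRGs that are not differentially expressed

print(mrrdegs, len(only_degs), len(only_mrrgs))
```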
Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis
GO analysis [22] is a widely used method for functional enrichment studies, which includes Biological Process (BP), Cellular Component (CC), and Molecular Function (MF). KEGG [23] is a widely used database storing information on genomes, biological pathways, diseases, and drugs. We used the R package clusterProfiler [24] (Version 4.10.0) to conduct GO and KEGG enrichment analysis of MRRDEGs. Enrichment significance was determined using an adj.p < 0.05 and a false discovery rate (FDR, q value) < 0.25, corrected using the Benjamini-Hochberg (BH) method.
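Over-representation analyses of this kind are commonly based on a hypergeometric test, which asks how likely it is to see at least the observed overlap between a gene list and a term by chance. The sketch below shows that calculation for a single term under this assumption; it is not the clusterProfiler implementation, and the counts are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(universe, term_size, query_size, overlap):
    """P(X >= overlap) when query_size genes are drawn from a universe
    in which term_size genes carry the annotation."""
    upper = min(term_size, query_size)
    total = comb(universe, query_size)
    tail = sum(
        comb(term_size, k) * comb(universe - term_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 20 background genes, 5 annotated to the term,
# 4 genes in the query list, 3 of which are annotated.
p_value = hypergeom_enrichment_p(universe=20, term_size=5, query_size=4, overlap=3)
print(round(p_value, 4))
```

With many terms tested at once, the resulting p values would then be adjusted, for example with the BH method named above.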
Gene set enrichment analysis (GSEA)
GSEA [25] evaluates how pre-defined gene sets are distributed within a ranked gene list associated with specific phenotypes. This analysis helps determine the contribution of these gene sets to the phenotypes.
In this study, genes from the Combined Dataset were first ranked based on logFC. Subsequently, the R package clusterProfiler was utilized to perform GSEA on all genes within the Combined Dataset. A seed of 2020 ensured reproducibility. Gene sets were limited to 10–500 genes to avoid statistical noise from small sets and to prevent overly broad sets from diluting signals. These thresholds are standard in GSEA analyses. We used the c2.cp.all.v2022.1.Hs.symbols.gmt gene sets from the Molecular Signatures Database (MSigDB) [26] (https://www.gsea-msigdb.org/gsea/msigdb), which cover pathways relevant to diabetic nephropathy (DN), including metabolism, inflammation, fibrosis, and oxidative stress. Specific pathways like "HIF1 signaling," "TGF-beta signaling," and "MAPK signaling" are implicated in DN progression. Enrichment significance was assessed with adj.p < 0.05 and FDR < 0.25, corrected using the BH method.
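The core of GSEA is a running-sum statistic over the ranked gene list. The sketch below computes that enrichment score for one gene set under the usual weighted formulation; it omits the permutation-based significance testing and the normalization to NES, and the gene labels and scores are hypothetical.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score for one gene set.

    ranked_genes and scores must already be sorted by score, highest first.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_total = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    if hit_total == 0 or n_miss == 0:
        return 0.0  # the score is undefined without both hits and misses
    running, best = 0.0, 0.0
    for s, h in zip(scores, hits):
        if h:
            running += abs(s) ** weight / hit_total  # step up at a set member
        else:
            running -= 1.0 / n_miss                  # step down otherwise
        if abs(running) > abs(best):
            best = running                           # keep the largest deviation
    return best

# Hypothetical genes ranked by logFC.
ranked = ["A", "B", "C", "D", "E", "F"]
logfc = [2.0, 1.5, 0.5, -0.5, -1.0, -2.0]
es_top = enrichment_score(ranked, logfc, {"A", "B"})
es_bottom = enrichment_score(ranked, logfc, {"E", "F"})
print(round(es_top, 3), round(es_bottom, 3))
```

A set concentrated at the top of the ranking gets a positive score and a set concentrated at the bottom gets a negative one, which is what the sign of the NES reflects.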
Protein–protein interaction (PPI) network and hub gene selection
PPI networks underlie many biological processes, including signal transduction, gene expression regulation, energy and metabolic processes, and cell cycle control. The STRING database [27] (https://string-db.org/) was used in this study to investigate known and predicted protein–protein interactions. Specifically, a PPI network of the MRRDEGs was constructed using a minimum required interaction score greater than 0.900 (highest confidence level). MRRDEGs were selected because of their differential expression linked to metabolic reprogramming. To identify Hub Genes within the PPI network, Cytoscape [28] software was employed with its CytoHubba [29] plugin using five algorithms: Maximal Clique Centrality (MCC), Degree, Maximum Neighborhood Component (MNC), Edge Percolated Component (EPC), and Closeness [30]. These algorithms were chosen because each evaluates a different aspect of gene connectivity within the PPI network: MCC is particularly useful for identifying core genes with high centrality, Degree detects genes with the most connections, MNC identifies key genes based on their local connectivity, EPC highlights genes involved in signal transduction, and Closeness measures the influence of genes on the overall network. Combining them was intended to enhance the accuracy and reliability of hub gene identification while minimizing biases inherent in any single method. For each algorithm, scores were computed for the MRRDEGs in the PPI network and the top 20 MRRDEGs were selected. The genes shared by all five top-20 lists, illustrated with a Venn diagram, were defined as Hub Genes.
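The selection logic (score every node with several centrality measures, take the top-ranked nodes for each, and intersect the lists) can be sketched on a toy network. Only Degree and one common variant of Closeness (sum of inverse shortest-path distances) are shown; MCC, MNC, and EPC are omitted, CytoHubba is not called, and the node labels and edges are hypothetical rather than taken from STRING.

```python
from collections import deque

def build_adjacency(edges):
    """Undirected adjacency sets from a list of (node, node) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def degree_scores(adj):
    return {node: len(neighbors) for node, neighbors in adj.items()}

def closeness_scores(adj):
    """Sum of 1/distance to every reachable node, using breadth-first search."""
    scores = {}
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        scores[start] = sum(1.0 / d for d in dist.values() if d > 0)
    return scores

def top_k(scores, k):
    """The k best-scoring nodes; ties are broken alphabetically."""
    ranked = sorted(scores, key=lambda n: (-scores[n], n))
    return set(ranked[:k])

# Hypothetical five-protein network.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P4", "P5")]
adj = build_adjacency(edges)
deg = degree_scores(adj)
clo = closeness_scores(adj)
hubs = top_k(deg, 2) & top_k(clo, 2)
print(sorted(hubs))
```

Requiring agreement across measures keeps only nodes that rank highly under every definition of importance, at the cost of a shorter list.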
Construction of regulatory networks
Transcription factors (TFs) control gene expression at the transcriptional level by interacting with Hub Genes. TFs were identified from the ChIPBase database [31] (http://rna.sysu.edu.cn/chipbase/) based on the criterion that the sum of "Number of samples found (upstream)" and "Number of samples found (downstream)" exceeds 12. We analyzed the regulatory impact of TFs on Hub Genes and used Cytoscape software to visualize the mRNA-TF regulatory network. Additionally, miRNAs play pivotal roles in biological development and evolution by regulating diverse target genes, with individual target genes often subject to regulation by multiple miRNAs. To investigate the relationship between Hub Genes and miRNAs, miRNAs associated with Hub Genes were obtained from the StarBase v3.0 database [32] (https://starbase.sysu.edu.cn/), specifically retaining those with a "pancancerNum > 6" criterion. Cytoscape software was employed to visualize the mRNA-miRNA regulatory network (mRNA-miRNA Regulatory Network).
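The two filtering rules can be expressed as simple predicates over interaction records. The sketch below assumes the records have already been exported from the databases; the tuple layout, TF and miRNA labels, and counts are all hypothetical and do not reflect the actual ChIPBase or StarBase file formats.

```python
# Hypothetical TF records: (TF, hub gene, samples found upstream, samples found downstream).
tf_records = [
    ("TF_A", "FN1", 10, 5),
    ("TF_B", "FN1", 4, 3),
    ("TF_C", "CD44", 7, 6),
]
# Hypothetical miRNA records: (miRNA, hub gene, pancancerNum).
mirna_records = [
    ("miR_x", "FN1", 9),
    ("miR_y", "CD44", 6),
    ("miR_z", "CD44", 12),
]

# Keep TF-gene pairs whose upstream + downstream sample count exceeds 12.
tf_edges = [(tf, gene) for tf, gene, up, down in tf_records if up + down > 12]
# Keep miRNA-gene pairs with pancancerNum > 6.
mirna_edges = [(mir, gene) for mir, gene, n in mirna_records if n > 6]

print(tf_edges)
print(mirna_edges)
```

The resulting edge lists are the kind of input that a network viewer such as Cytoscape can import.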
Validation of differential expression and ROC curve analysis
To investigate expression differences of Hub Genes between DN and Control in the Combined Dataset, we generated group comparison plots based on their expression levels. ROC curves for Hub Genes were constructed using the R package pROC (Version 1.18.5), and we calculated the area under the curve (AUC) to assess their diagnostic performance in predicting DN.
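For a single gene, the AUC equals the probability that a randomly chosen DN sample has a higher expression value than a randomly chosen control, with ties counted as one half. The sketch below computes it directly from that definition rather than through pROC; the expression values are hypothetical.

```python
def auc(case_values, control_values):
    """AUC as the fraction of (case, control) pairs in which the case is higher."""
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))

# Hypothetical expression of one hub gene.
dn = [7.1, 6.8, 7.5, 6.2]
control = [6.0, 6.5, 5.9]
auc_value = auc(dn, control)
print(round(auc_value, 3))
```

For a gene that is down-regulated in DN this quantity falls below 0.5, and the diagnostic performance is then read as one minus the value.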
Immune infiltration analysis
This study employed Single-Sample Gene-Set Enrichment Analysis (ssGSEA) [33] to quantify the relative abundance of immune cell infiltration. Initially, various human immune cell subtypes such as Activated CD8 + T cell, Activated dendritic cell, Gamma-delta T cell, Natural killer cell, and Regulatory T cell (Treg) were annotated. ssGSEA was then used to compute enrichment scores for each sample, representing the relative abundance of immune cell infiltration. This process resulted in an immune cell infiltration matrix for the Combined Dataset. We generated group comparison plots using the ggplot2 R package to illustrate the expression differences of immune cells between the Control and DN groups in the Combined Dataset. Significant immune cell types showing differential expression between the two groups were selected for further analysis. Spearman correlation analysis was then conducted to assess correlations among immune cells, and the results were visualized using the pheatmap R package to generate correlation heatmaps. Additionally, Spearman correlation analysis was performed to evaluate the correlation between Hub Genes and immune cells, and correlation bubble plots were created using the ggplot2 R package to visualize these results.
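The Spearman coefficient used in these correlation analyses is the Pearson correlation of the ranks, with tied values sharing their average rank. The sketch below is a minimal illustration of that definition; the ssGSEA scoring itself is not reproduced, and the input vectors are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical hub gene expression and immune cell scores in five samples.
gene_expr = [1.0, 2.0, 3.0, 4.0, 5.0]
cell_score = [1.0, 4.0, 9.0, 16.0, 25.0]
rho = spearman(gene_expr, cell_score)
print(round(rho, 3))
```

Because only ranks enter the calculation, any monotone relationship gives a coefficient of 1 or -1 even when it is not linear, as in the toy input.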
Statistical analysis
We conducted all data processing and analysis using R software (Version 4.3.0). Comparisons between two groups of continuous variables were assessed using independent Student's t-tests for normally distributed variables. For non-normally distributed variables, we used Wilcoxon Rank Sum tests (Mann–Whitney U tests). Comparisons involving three or more groups utilized Kruskal–Wallis tests. Spearman correlation analysis was employed to calculate correlation coefficients between different molecules. All statistical p-values are two-tailed unless specified otherwise, with significance set at p < 0.05.
Results
Technology roadmap
The technology roadmap of our study is presented in Fig. 1.
Flow chart for the comprehensive analysis of MRRDEGs. DN Diabetic nephropathy, GSEA Gene Set Enrichment Analysis, DEGs Differentially Expressed Genes, MRRGs Metabolic Reprogramming Related Genes, MRRDEGs Metabolic Reprogramming-Related Differentially Expressed Genes, GO Gene Ontology, KEGG Kyoto Encyclopedia of Genes and Genomes, PPI Protein–Protein Interaction, ROC Receiver Operating Characteristic, TF Transcription Factor, ssGSEA single-sample Gene-Set Enrichment Analysis
Merging of diabetic kidney disease datasets
The distribution boxplots (Fig. 2A, B) and PCA results (Fig. 2C, D) indicate that batch effects between the two DN datasets were largely eliminated after correction. The corrected samples showed consistent expression distributions across datasets, which supports the reliability of the downstream analysis.
Batch effects removal of GSE30528 and GSE96804. A Distribution boxplots of combined dataset before normalization. B Distribution boxplots of combined dataset after normalization. C PCA plot of combined dataset before normalization. D PCA plot of Combined Dataset after normalization. In the figures, the DN dataset GSE30528 is represented in orange, while GSE96804 is shown in green. PCA principal component analysis
Differentially expressed genes related to metabolic reprogramming associated with diabetic nephropathy
The differential expression analysis revealed significant differences in gene expression profiles between the DN and Control groups. A total of 708 DEGs in the Combined Dataset met the thresholds of |logFC| > 0.5 and adj.p < 0.05. Of these, 330 genes were up-regulated (logFC > 0.5 and adj.p < 0.05), while 378 genes were down-regulated (logFC < -0.5 and adj.p < 0.05); more information is available in Table S5. A volcano plot was drawn from the differential expression results (Fig. 3A). A Venn diagram (Fig. 3B) was constructed to visualize the intersection of DEGs and MRRGs. We identified a total of 119 MRRDEGs, with detailed information provided in Table S2. Based on the intersection results, we analyzed the expression differences of the top 20 MRRDEGs between the DN and Control groups. These top 20 genes were selected primarily by fold change. We then used the R package pheatmap to generate a heatmap of their expression (Fig. 3C).
Differential gene expression analysis. A Volcano plot depicting DEGs analysis between DN and Control in Combined Dataset. B Venn diagram illustrating the overlap between DEGs and MRRGs. C Heatmap displaying the top 20 MRRDEGs. In the heatmap, orange color indicates control samples, grey denotes DN samples, red represents high expression, blue indicates lower expression. DN diabetic nephropathy, DEGs differentially expressed genes, MRRGs metabolic reprogramming related genes, MRRDEGs metabolic reprogramming—related differentially expressed genes
Gene ontology (GO) and pathway (KEGG) enrichment analysis
The GO and KEGG enrichment analyses of the 119 MRRDEGs indicated significant enrichment in areas such as wound healing, reproductive structure development, and metabolic reprogramming in DN, as summarized in Table 2. The BP that were enriched include reproductive system development, muscle tissue development, and responses to reactive oxygen species. The CC highlighted include the platelet alpha granule lumen, platelet alpha granules, collagen-containing extracellular matrix, secretory granule lumen, and cytoplasmic vesicle lumen. The MF identified encompass growth factor activity, receptor ligand activity, signaling receptor activator activity, antioxidant activity, and heparin binding. Additionally, the KEGG analysis revealed notable enrichment in several pathways, including the MAPK signaling pathway, AGE-RAGE signaling in diabetic complications, focal adhesion, proteoglycans in cancer, and the degradation of valine, leucine, and isoleucine. Visualization of the GO and KEGG pathway enrichment analysis results is depicted in histogram form (Fig. 4A).
GO and KEGG Enrichment Analysis for MRRDEGs. A Bar graphs depicting GO and KEGG enrichment analysis results of MRRDEGs. The x-axis represents GO terms and KEGG terms. B–D Network diagram illustrating GO enrichment analysis of BP, CC and MF. E Network diagram illustrating KEGG pathway enrichment analysis for MRRDEGs. The screening criteria for GO and KEGG enrichment analysis were adj.p < 0.05 and FDR < 0.25, and the p value correction method was BH. The orange nodes represent items, the green nodes represent molecules, and the lines represent the relationship between items and molecules. MRRDEGs metabolic reprogramming—related differentially expressed genes, GO gene ontology, KEGG Kyoto Encyclopedia of Genes and Genomes, BP biological process, CC cellular component, MF molecular function, FDR false discovery rate, BH Benjamini-Hochberg
Network diagrams of the BP, CC, MF, and KEGG pathway terms were drawn from the GO and KEGG enrichment results (Fig. 4B–E). Lines connect each term to its annotated molecules, and larger nodes indicate terms that contain more molecules.
Gene set enrichment analysis (GSEA) for diabetic nephropathy
GSEA aims to identify enriched biological processes associated with gene expression and to outline the affected cellular components and involved molecular functions (Fig. 5A). Detailed outcomes are presented in Table 3. The results indicated that all genes in the Combined Dataset were significantly enriched in the Inflammatory Response Pathway (Fig. 5B), P130cas Linkage to MAPK Signaling for Integrins (Fig. 5C), Quercetin and NF-κB Ap1 Induced Apoptosis (Fig. 5D), Fatty Acid Metabolism (Fig. 5E), and other biologically relevant functions and signaling pathways.
GSEA for combined dataset. A The bubble plot illustrated the GSEA results for four biological functions in the Combined Dataset. B–E GSEA demonstrated significant enrichment of all genes in the Inflammatory Response Pathway (B), P130cas Linkage to MAPK Signaling for Integrins (C), Quercetin and NF-κB Ap1 Induced Apoptosis (D), and Fatty Acid Metabolism (E). Bubble size represented the number of enriched genes, while bubble color indicated the NES; warmer colors denoted higher NES values (red) and cooler colors denoted lower NES values (blue). GSEA criteria include adj.p < 0.05 and FDR < 0.25, with p value adjusted using the BH method. GSEA Gene Set Enrichment Analysis, FDR false discovery rate, BH Benjamini-Hochberg
Protein–protein interaction network construction and hub gene screening
The PPI network of the 119 MRRDEGs was first constructed using the STRING database and visualized with Cytoscape software (Fig. 6A). The network contained interactions among 66 MRRDEGs. The CytoHubba plugin within Cytoscape was then used to score these 66 MRRDEGs with five algorithms and to rank them by score. For each algorithm, the top 20 MRRDEGs were drawn as a PPI subnetwork: Closeness (Fig. 6B), Degree (Fig. 6C), EPC (Fig. 6D), MCC (Fig. 6E), and MNC (Fig. 6F). In these networks, circle colors range from red to yellow, indicating scores from high to low. Finally, the intersection of the genes identified by the five algorithms was determined using a Venn diagram (Fig. 6G). This analysis revealed nine Hub Genes: FN1, CD44, KDR, EGF, HSPG2, HGF, FGF9, IGF1, and ALB.
PPI network and Hub Genes analysis. A The PPI network of MRRDEGs, computed using the STRING database. B–F PPI networks of the top 20 MRRDEGs identified by each of five algorithms from the CytoHubba plugin: Closeness (B), Degree (C), EPC (D), MCC (E), and MNC (F). G Venn diagram depicting the intersection of the top 20 MRRDEGs identified by the above five algorithms of the CytoHubba plugin. PPI Protein–Protein Interaction, MRRDEGs Metabolic Reprogramming-Related Differentially Expressed Genes, EPC Edge Percolated Component, MCC Maximal Clique Centrality, MNC Maximum Neighborhood Component
Construction of regulatory networks
We first used the StarBase v3.0 database to identify microRNAs associated with Hub Genes and constructed the mRNA-miRNA regulatory network, which was visualized with Cytoscape software (Fig. 7A). This network included 2 Hub Genes and 15 microRNAs; additional details are provided in Table S3.
Next, we used the ChIPBase database to identify Hub Genes and their associated transcription factors (TFs) to construct the mRNA-TF regulatory network, visualized with Cytoscape software (Fig. 7B). This network includes five Hub Genes and 20 TFs; more information is available in Table S4.
Differential expression validation and ROC curve analysis
Figure 8A presents a comparative analysis of the expression levels of Hub Genes between the DN and Control groups in the Combined Dataset. The results indicated statistically significant differences (p value < 0.001) in the expression levels of all nine Hub Genes: FN1, CD44, KDR, EGF, HSPG2, HGF, FGF9, IGF1, and ALB. Next, ROC curves based on the expression levels of the Hub Genes were generated using the R package pROC on the Combined Dataset (Fig. 8B–D). The ROC curves showed moderate accuracy (0.7 < AUC < 0.9) in distinguishing the DN samples from the Control group for each of the nine Hub Genes.
Differential expression validation and ROC curve analysis. A Group comparison plots of Hub Genes in DN and Control from the Combined Dataset. B ROC curves for Hub Genes FN1, CD44, KDR in the Combined Dataset. C ROC curves for Hub Genes EGF, HSPG2, HGF in the Combined Dataset. D ROC curves for Hub Genes FGF9, IGF1, ALB in the Combined Dataset. *** indicates p value < 0.001, indicating statistical significance. In the comparison plots, orange indicates Control samples, while gray indicates DN samples. AUC ranges from 0.7 to 0.9 indicate moderate accuracy. DN diabetic nephropathy, ROC Receiver Operating Characteristic, AUC Area Under the Curve, TPR True Positive Rate, FPR False Positive Rate
Analysis of immune infiltration in diabetic nephropathy
Using the expression matrix from the Combined Dataset, we applied the ssGSEA algorithm to compute immune infiltration abundances for 28 types of immune cells. Infiltration abundance differed significantly (p value < 0.05) between DN and Control samples for 15 immune cell types (Fig. 9A): Activated B cells, CD56 bright natural killer cells, Central memory CD4 + T cells, Effector memory CD4 + T cells, Effector memory CD8 + T cells, Immature B cells, Immature dendritic cells, Macrophages, Memory B cells, Natural killer cells, Natural killer T cells, Neutrophils, Regulatory T cells (Treg), T follicular helper cells (Tfh), and Type 1 T helper cells. Subsequently, a correlation heatmap (Fig. 9B) was used to illustrate the relationships among the infiltration abundances of these 15 immune cell types in the Combined Dataset. The results highlighted a significant positive correlation (r value = 0.8, p value < 0.05) between Regulatory T cells and Immature B cells. Finally, we created correlation bubble plots (Fig. 9C) to show the relationships between Hub Genes and the abundances of immune cell infiltration. The bubble plot analysis revealed strong correlations for most immune cell types; the strongest positive correlation was between FN1 and Natural killer cells (r value = 0.88, p value < 0.05).
Immune Infiltration Analysis by ssGSEA Algorithm. A Group comparison plots of immune cell infiltration in samples from DN and Control groups within the Combined Dataset. B Heatmap depicting the correlation of immune cell infiltration abundances within the Combined Dataset. C Bubble plot illustrating the correlation between Hub Genes and immune cell infiltration abundances within the Combined Dataset. ns, p value ≥ 0.05, not statistically significant; *, p value < 0.05, statistically significant; **, p value < 0.01, highly statistically significant; ***, p value < 0.001, very highly statistically significant. In the group comparison plots, orange indicates Control samples, while grey indicates DN samples. The absolute values of correlation coefficients (r values) indicate relationship strength: values below 0.3 suggest negligible correlation, 0.3–0.5 suggest weak correlation, 0.5–0.8 suggest moderate correlation, and values above 0.8 suggest strong correlation. In the correlation heatmap, red denotes positive correlation, while blue denotes negative correlation. The intensity of colors reflects the magnitude of correlation strength. ssGSEA single-sample Gene-Set Enrichment Analysis, DN Diabetic nephropathy
Discussion
DN is a serious complication of diabetes mellitus that significantly affects patients' health and quality of life. It is the leading cause of ESRD worldwide, contributing to increased morbidity and mortality [1]. Current treatments for diabetic nephropathy focus mainly on two approaches: intensive insulin therapy for glycemic control and RAAS blockade (using ACE inhibitors or ARBs) to reduce renal damage caused by glomerular hypertension and hyperfiltration [4]. Despite these efforts, many patients progress to ESRD, highlighting the limited effectiveness of current treatments. Intensive glycemic control carries the risk of hypoglycemia and requires strict adherence, while RAAS inhibitors may cause hyperkalemia and are contraindicated in certain populations. Consequently, alternative or adjunctive therapies are needed. New therapeutic targets show promise for improving treatment effectiveness and reducing side effects. These include understanding the molecular mechanisms of glomerular injury, identifying biomarkers for disease progression, and exploring innovative approaches such as targeted drug delivery and gene editing. Addressing these research gaps is crucial to improving clinical outcomes and quality of life for DN patients amidst its growing global burden.
We combined two GEO datasets (GSE30528 and GSE96804) after removing batch effects, which improved the reliability of the merged data. We identified 708 DEGs associated with DN, including 119 MRRDEGs. Enrichment analysis showed that these genes are involved in important processes and pathways, including wound healing, muscle tissue development, and MAPK signaling. Additionally, the PPI network identified nine hub genes, highlighting their regulatory roles in the pathogenesis of DN. Immune infiltration analysis revealed significant changes in 15 immune cell types in DN samples, suggesting that immune dysregulation plays an important role in the progression of the disease. These findings provide a comprehensive molecular framework for understanding DN and point to potential therapeutic targets.
The DEGs identified in our study provide insights into the molecular mechanisms underlying DN. Notably, DEGs may exhibit tissue- or cell-type specificity, which is crucial for understanding their roles in DN pathophysiology. For instance, genes such as CASP3 and PTEN, which are involved in apoptosis and signaling, may show distinct expression patterns in renal versus extrarenal tissues, suggesting regulatory mechanisms tailored to the renal microenvironment [34, 35]. This specificity highlights the importance of context in gene expression studies and calls for further investigation into where these DEGs are located in cells and what their functional implications are. Furthermore, the balance between upregulated and downregulated genes may reveal biological interactions that contribute to DN progression. For example, upregulation of inflammatory mediators such as IL33 and CCL2, alongside downregulation of protective factors such as HSPG2, may indicate a shift towards a pro-inflammatory state that exacerbates renal injury [36,37,38]. Understanding these interactions offers insights into homeostatic dysregulation in DN and can guide therapeutic strategies to restore renal balance. Lastly, our study introduces DN-related candidates such as ANGPT2 and HIF1A, which have been less studied in the context of DN. They broaden the discussion of DN pathogenesis and may point to new therapeutic targets and biomarkers for improved clinical outcomes. In sum, comprehensive functional studies are needed to elucidate the roles of these DEGs in DN.
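To make the selection of DEGs and MRRDEGs concrete, the following is a minimal Python sketch of the filtering logic, using the thresholds reported in the Methods (|logFC| > 0.5, adj.p < 0.05). It is not the pipeline used in this study, which relied on the R package limma. The gene names, values, and the function `select_degs` are hypothetical placeholders chosen only for illustration.

```python
# Hypothetical differential-expression results: gene -> (logFC, adjusted p-value).
# All names and numbers are made up for illustration; they are not study data.
results = {
    "GENE_A": (1.20, 0.001),
    "GENE_B": (-0.80, 0.020),
    "GENE_C": (0.30, 0.001),   # fails the |logFC| threshold
    "GENE_D": (0.90, 0.200),   # fails the adjusted p-value threshold
    "GENE_E": (-1.50, 0.004),
}

# Hypothetical list of metabolic reprogramming-related genes (MRRGs).
mrrg_list = {"GENE_A", "GENE_C", "GENE_E", "GENE_Z"}


def select_degs(results, lfc_cutoff=0.5, padj_cutoff=0.05):
    """Return (upregulated, downregulated) gene sets passing both thresholds."""
    up, down = set(), set()
    for gene, (logfc, padj) in results.items():
        if abs(logfc) > lfc_cutoff and padj < padj_cutoff:
            (up if logfc > 0 else down).add(gene)
    return up, down


up, down = select_degs(results)
degs = up | down
# MRRDEGs are the DEGs that also appear in the MRRG list.
mrrdegs = degs & mrrg_list
print(sorted(up), sorted(down), sorted(mrrdegs))
```

Keeping the up- and downregulated sets separate makes it easier to examine how genes with opposite directions of change relate to each other.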
The MAPK signaling pathway plays a pivotal role in cellular responses to various stimuli, including stress and growth factors, and is implicated in DN pathogenesis. This pathway consists of a cascade of phosphorylation events that affect gene expression, cell growth, and programmed cell death. In DN, MAPK activation promotes renal fibrosis and inflammation, which worsens kidney function [39, 40]. The enrichment of MRRDEGs in this pathway suggests that they play a critical role in mediating renal cellular responses in DN. Targeting the MAPK pathway is a promising therapeutic strategy that could slow DN progression by modulating inflammation and fibrosis [40, 41]. The AGE-RAGE signaling pathway, which was significant in our analysis, mediates the effects of advanced glycation end-products (AGEs) in diabetic complications. The AGE-RAGE interaction triggers inflammatory responses that exacerbate renal injury in DN [42]. The presence of MRRDEGs in this pathway indicates their involvement in the inflammatory milieu of DN. Understanding the roles of MRRDEGs in AGE-RAGE signaling may identify potential intervention targets, leading to new therapeutic options for managing DN. Furthermore, enriched pathways such as focal adhesion and proteoglycans in cancer highlight the potential interactions between signaling networks in DN. Focal adhesion signaling maintains cellular architecture and mediates responses to the altered extracellular matrix in diabetes [43]. The interplay between MRRDEGs in the focal adhesion and AGE-RAGE pathways suggests that they may act synergistically in renal damage. Investigating these interactions could clarify the multifaceted nature of DN and reveal therapeutic strategies that target multiple pathways simultaneously, enhancing treatment efficacy.
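Pathway over-representation of the kind discussed above is commonly assessed with a hypergeometric test, which is the test clusterProfiler applies in its over-representation analysis. The sketch below shows that calculation in plain Python. Only the query size of 119 (the number of MRRDEGs) comes from this study; the background size, pathway size, and overlap are hypothetical numbers, and `hypergeom_enrichment_p` is an illustrative name.

```python
from math import comb


def hypergeom_enrichment_p(universe, pathway_size, query_size, overlap):
    """Upper-tail hypergeometric p-value, P(X >= overlap).

    universe     -- number of background genes
    pathway_size -- background genes annotated to the pathway
    query_size   -- genes in the query list (e.g. the MRRDEGs)
    overlap      -- query genes that fall in the pathway
    """
    upper = min(pathway_size, query_size)
    total = comb(universe, query_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe - pathway_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 20000 background genes, a 300-gene pathway,
# 119 query genes, 10 of which belong to the pathway.
p_value = hypergeom_enrichment_p(20000, 300, 119, 10)
print(f"enrichment p-value: {p_value:.3g}")
```

In practice the raw p-values from many pathways would still need multiple-testing correction before interpretation.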
The identification of the nine hub genes (FN1, CD44, KDR, EGF, HSPG2, HGF, FGF9, IGF1, and ALB) has implications for clinical practice, particularly for the early diagnosis and management of DN. These genes have been implicated in biological processes such as cell proliferation, migration, and angiogenesis, which are critical in the pathophysiology of DN. The identification of FN1 as a key hub gene highlights its likely role in the development of DN. Fibronectin 1 (FN1) is a glycoprotein essential for cell adhesion, migration, and tissue repair. Previous studies have demonstrated that FN1 is significantly upregulated in various diabetic conditions, contributing to fibrosis and inflammation in DN [44]. In our study, FN1 was upregulated and involved in the AGE-RAGE signaling pathway in diabetic complications. High levels of FN1 have been associated with fibrosis and renal function decline, and FN1 may also influence the recruitment of immune cells, thereby further affecting diabetes-related diseases [45]. Understanding the role of FN1 in this pathway could therefore be crucial for elucidating the pathogenesis of DN. Furthermore, the association of FN1 with renal fibrosis underscores its relevance to DN pathophysiology and is consistent with the idea that targeting FN1 may offer therapeutic avenues for mitigating renal damage in diabetic patients [46]. Future studies should investigate FN1 alongside the other hub genes to improve diagnostic precision and optimize patient management.
CD44, another pivotal hub gene, facilitates cell interactions and extracellular matrix organization. Research indicates that CD44 is upregulated under hyperglycemia, promoting inflammatory responses and cellular senescence [47]. Our results align with these findings, as CD44 was significantly differentially expressed in our dataset. Investigating the interactions of CD44 with other hub genes may reveal synergistic diagnostic benefits, which would need validation in larger cohorts to establish clinical relevance. Collectively, these hub genes are strong candidates for further research as potential biomarkers and therapeutic targets in chronic kidney disease. Integrating them with existing biomarkers could enhance diagnostic accuracy and patient stratification. For example, combining the expression profiles of these hub genes with traditional markers such as albuminuria and serum creatinine levels may provide a more comprehensive assessment of renal function and disease progression. This multi-biomarker approach could facilitate earlier intervention, ultimately improving patient outcomes [48].
Our analysis of immune cell infiltration in DN has yielded important insights, especially regarding regulatory T cells (Tregs), immature B cells, and natural killer (NK) cells. Tregs, recognized for their immunosuppressive roles in maintaining immune homeostasis and mitigating excessive inflammation, correlated positively with immature B cells (r = 0.8, p < 0.05) in our study. This correlation suggests that Tregs may affect the activation and differentiation of immature B cells, which in turn shapes immune responses in DN. Previous studies underscore the therapeutic potential of enhancing Treg function to manage DN effectively [49]. Immature B cells, for their part, are essential for the development of adaptive immunity, and their increased presence in DN samples may suggest ongoing immune responses or disrupted B cell maturation. The established roles of immature B cells in autoimmune conditions suggest that they may contribute to a pathological immune profile in DN [50]. Investigating the mechanisms underlying their accumulation could reveal new therapeutic targets for restoring immune balance in DN. Additionally, the strong positive correlation between FN1 and NK cells (r = 0.88, p < 0.05) highlights a potential role for FN1 in regulating NK cell activity. FN1, known for its influence on cell adhesion and migration, may enhance NK cell cytotoxic functions through interactions with these cells. Targeting FN1 could thus represent a promising avenue for boosting NK cell-mediated immunity in DN, offering novel therapeutic strategies. Together, these insights into the immune landscape of DN deepen our understanding and pave the way for future research aimed at developing targeted immunotherapies.
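The correlations reported above relate per-sample quantities, such as a hub gene's expression and an immune cell infiltration score. As an illustration, a Pearson coefficient can be computed as in the sketch below. The text does not state which correlation method was used, so the choice of Pearson here is an assumption, and the sample values and the name `pearson_r` are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two sequences of equal length (at least 3 values)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical per-sample values (not study data): expression of one gene and
# the ssGSEA infiltration score of one immune cell type in the same samples.
gene_expression = [5.1, 6.3, 7.0, 7.8, 8.4, 9.1]
infiltration_score = [0.21, 0.25, 0.24, 0.31, 0.33, 0.38]
print(round(pearson_r(gene_expression, infiltration_score), 3))
```

A rank-based coefficient such as Spearman's would be the usual alternative when the relationship is monotonic but not linear.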
Although this study provides meaningful insights into genes associated with diabetic nephropathy (DN), there are several limitations that should be considered. First, the ROC analysis used in this study may be prone to overfitting. Future studies should aim to validate the findings using larger, independent cohorts to assess the robustness and external validity of the predictive model. Second, the choice of datasets could introduce potential bias, particularly given the differences in sample characteristics across various databases. To address this limitation, future research could incorporate more homogeneous datasets or utilize cross-validation techniques to mitigate the impact of dataset heterogeneity. Moreover, the reliance on bioinformatics analyses without experimental validation may affect the robustness of the conclusions. Therefore, future studies should integrate experimental approaches to further confirm our findings and explore their biological significance.
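One simple way to probe the overfitting concern raised above is to fix every data-driven choice on one cohort and compute the AUC on another. The sketch below illustrates the idea for a single marker: the direction of the marker (higher or lower in DN) is chosen on training data only, and the AUC is then reported on held-out data. This is an illustrative approach rather than the analysis performed in this study, and all function names and values are hypothetical.

```python
def auc(case_scores, control_scores):
    """AUC as the probability that a random case outscores a random control.

    Ties count as half a win (equivalent to the Mann-Whitney U statistic
    divided by the number of case-control pairs).
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def held_out_auc(train_cases, train_controls, test_cases, test_controls):
    """Pick the marker direction on training data, then score held-out data."""
    # If the marker is lower in cases on the training set, flip its direction.
    flip = auc(train_cases, train_controls) < 0.5
    test_auc = auc(test_cases, test_controls)
    return 1.0 - test_auc if flip else test_auc


# Hypothetical expression values of one marker in two independent cohorts.
train_dn, train_ctrl = [7.9, 8.4, 8.8, 9.3], [6.1, 6.8, 7.2, 8.0]
test_dn, test_ctrl = [7.5, 8.1, 9.0], [6.5, 7.7, 8.3]
print(held_out_auc(train_dn, train_ctrl, test_dn, test_ctrl))
```

Because the held-out cohort plays no part in choosing the direction, its AUC is a less optimistic estimate of diagnostic performance than one computed on the data used for discovery.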
Conclusion
In conclusion, this study identified 708 DEGs and 119 MRRDEGs by integrating two GEO datasets and removing batch effects. Notably, we identified nine hub genes (FN1, CD44, KDR, EGF, HSPG2, HGF, FGF9, IGF1, and ALB) that are significantly associated with DN. These findings enhance our understanding of the molecular mechanisms underlying DN. They also pave the way for future experimental validation and the exploration of new therapeutic strategies targeting these key genes.
Availability of data and materials
No datasets were generated or analysed during the current study.
References
Shan S, Luo Z, Yao L, Zhou J, Wu J, Jiang D, Ying J, Cao J, Zhou L, Li S, et al. Cross-country inequalities in disease burden and care quality of chronic kidney disease due to type 2 diabetes mellitus, 1990–2021: findings from the global burden of disease study 2021. Diabetes Obes Metab. 2024;26(12):5950–9.
Kim K, Crook J, Lu CC, Nyman H, Sarker J, Nelson R, LaFleur J. Healthcare costs across diabetic kidney disease stages: a veterans affairs study. Kidney Med. 2024;6(9): 100873.
The Diabetes Control and Complications (DCCT) Research Group. Effect of intensive therapy on the development and progression of diabetic nephropathy in the Diabetes Control and Complications Trial. Kidney Int. 1995;47(6):1703–20.
Samsu N. Diabetic nephropathy: challenges in pathogenesis, diagnosis, and treatment. Biomed Res Int. 2021;2021:1497449.
Ren X, Kang N, Yu X, Li X, Tang Y, Wu J. Prevalence and association of diabetic nephropathy in newly diagnosed Chinese patients with diabetes in the Hebei province: a single-center case-control study. Medicine (Baltimore). 2023;102(11): e32911.
Ghose S, Satariano M, Korada S, Cahill T, Shah R, Raina R. Advancements in diabetic kidney disease management: integrating innovative therapies and targeted drug development. Am J Physiol Endocrinol Metab. 2024;326(6):E791–E806.
Neuen BL, Heerspink HJL, Vart P, Claggett BL, Fletcher RA, Arnott C, de Oliveira CJ, Falster MO, Pearson SA, Mahaffey KW, et al. Estimated lifetime cardiovascular, kidney, and mortality benefits of combination treatment with SGLT2 inhibitors, GLP-1 receptor agonists, and nonsteroidal MRA compared with conventional care in patients with type 2 diabetes and albuminuria. Circulation. 2024;149(6):450–62.
Fang T, Zhang Q, Wang Z, Liu JP. Bidirectional association between depression and diabetic nephropathy by meta-analysis. PLoS ONE. 2022;17(12): e0278489.
Li S, Chen J, Zhou W, Liu Y, Zhang D, Yang Q, Feng Y, Cha C, Li L, He G, et al. To develop biomarkers for diabetic nephropathy based on genes related to fibrosis and propionate metabolism and their functional validation. J Diabetes Res. 2024;2024:9066326.
Mageswari R, Sridhar MG, Nandeesha H, Parameshwaran S, Vinod KV. Irisin and visfatin predicts severity of diabetic nephropathy. Indian J Clin Biochem. 2019;34(3):342–6.
Dias AS, Almeida CR, Helguero L, Duarte IF. Antitumoral activity and metabolic signatures of dichloroacetate, 6-aminonicotinamide and etomoxir in breast-tumor-educated macrophages. J Proteome Res. 2024.
Liu G, Dou J, Zheng D, Zhang J, Wang M, Li W, Wen J, Lu J, Ji L, He Y. Association between abnormal glycemic phenotypes and microvascular complications of type 2 diabetes mellitus outpatients in China. Diabetes Metab Syndr Obes. 2020;13:4651–9.
Woroniecka KI, Park AS, Mohtat D, Thomas DB, Pullman JM, Susztak K. Transcriptome analysis of human diabetic kidney disease. Diabetes. 2011;60(9):2354–69.
Shi JS, Qiu DD, Le WB, Wang H, Li S, Lu YH, Jiang S. Identification of transcription regulatory relationships in diabetic nephropathy. Chin Med J (Engl). 2018;131(23):2886–90.
Barrett T, Wilhite SE, Ledoux P, Evangelista C, Kim IF, Tomashevsky M, Marshall KA, Phillippy KH, Sherman PM, Holko M, et al. NCBI GEO: archive for functional genomics data sets—update. Nucleic Acids Res. 2013;41(Database issue):D991-995.
Davis S, Meltzer PS. GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor. Bioinformatics. 2007;23(14):1846–7.
Stelzer G, Rosen N, Plaschkes I, Zimmerman S, Twik M, Fishilevich S, Stein TI, Nudel R, Lieder I, Mazor Y, et al. The GeneCards suite: from gene data mining to disease genome sequence analyses. Curr Protoc Bioinf. 2016;54:1.
Honkoop H, de Bakker DE, Aharonov A, Kruse F, Shakked A, Nguyen PD, de Heus C, Garric L, Muraro MJ, Shoffner A, et al. Single-cell analysis uncovers that metabolic reprogramming by ErbB2 signaling is essential for cardiomyocyte proliferation in the regenerating heart. Elife. 2019;8.
Leek JT, Johnson WE, Parker HS, Jaffe AE, Storey JD. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics. 2012;28(6):882–3.
Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43(7): e47.
Ben Salem K, Ben Abdelaziz A. Principal component analysis (PCA). Tunis Med. 2021;99(4):383–9.
Mi H, Muruganujan A, Ebert D, Huang X, Thomas PD. PANTHER version 14: more genomes, a new PANTHER GO-slim and improvements in enrichment analysis tools. Nucleic Acids Res. 2019;47(D1):D419–D426.
Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28(1):27–30.
Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16(5):284–7.
Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA. 2005;102(43):15545–50.
Liberzon A, Subramanian A, Pinchback R, Thorvaldsdóttir H, Tamayo P, Mesirov JP. Molecular signatures database (MSigDB) 3.0. Bioinformatics. 2011;27(12):1739–40.
Szklarczyk D, Gable AL, Lyon D, Junge A, Wyder S, Huerta-Cepas J, Simonovic M, Doncheva NT, Morris JH, Bork P, et al. STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 2019;47(D1):D607–D613.
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13(11):2498–504.
Chin CH, Chen SH, Wu HH, Ho CW, Ko MT, Lin CY. cytoHubba: identifying hub objects and sub-networks from complex interactome. BMC Syst Biol. 2014;8(Suppl 4):S11.
Yang X, Li Y, Lv R, Qian H, Chen X, Yang CF. Study on the multitarget mechanism and key active ingredients of herba siegesbeckiae and volatile oil against rheumatoid arthritis based on network pharmacology. Evid Based Complement Alternat Med. 2019;2019:8957245.
Zhou KR, Liu S, Sun WJ, Zheng LL, Zhou H, Yang JH, Qu LH. ChIPBase v2.0: decoding transcriptional regulatory networks of non-coding RNAs and protein-coding genes from ChIP-seq data. Nucleic Acids Res. 2017;45(D1):D43–D50.
Li JH, Liu S, Zhou H, Qu LH, Yang JH. starBase v2.0: decoding miRNA-ceRNA, miRNA–ncRNA and protein–RNA interaction networks from large-scale CLIP-Seq data. Nucleic Acids Res. 2014;42(1):D92-97.
Xiao B, Liu L, Li A, Xiang C, Wang P, Li H, Xiao T. Identification and verification of immune-related gene prognostic signature based on ssGSEA for osteosarcoma. Front Oncol. 2020;10: 607622.
Si Y, Zhu Y, Liu J, Liu S, Cai X, Gu Y, Li H, Pan F, Wang W, Shangguan J, et al. Exploring the mechanism of cardiorenal protection with finerenone based on network pharmacology. Cardiorenal Med. 2024;14(1):334–49.
Wang H, Wang Y, Wang X, Huang H, Bao J, Zhong W, Li A. PTEN alleviates maladaptive repair of renal tubular epithelial cells by restoring CHMP2A-mediated phagosome closure. Cell Death Dis. 2021;12(12):1087.
Chen WY, Chang YJ, Su CH, Tsai TH, Chen SD, Hsing CH, Yang JL. Upregulation of Interleukin-33 in obstructive renal injury. Biochem Biophys Res Commun. 2016;473(4):1026–32.
Tesch GH. MCP-1/CCL2: a new diagnostic marker and therapeutic target for progressive renal injury in diabetic nephropathy. Am J Physiol Renal Physiol. 2008;294(4):F697-701.
Lord MS, Tang F, Rnjak-Kovacina J, Smith JGW, Melrose J, Whitelock JM. The multifaceted roles of perlecan in fibrosis. Matrix Biol. 2018;68–69:150–66.
Wang Y, Song S, Qiu D, Wu G, Zheng R, Zhao L, Shi Y, Duan H. Effects of MiR-23b/MAPK on renal fibrosis in rats with diabetic nephropathy. Minerva Med. 2021.
Han X, Wei J, Zheng R, Tu Y, Wang M, Chen L, Xu Z, Zheng L, Zheng C, Shi Q, et al. Macrophage SHP2 deficiency alleviates diabetic nephropathy via suppression of MAPK/NF-κB-dependent inflammation. Diabetes. 2024;73(5):780–96.
Han J, Pang X, Zhang Y, Peng Z, Shi X, Xing Y. Hirudin protects against kidney damage in streptozotocin-induced diabetic nephropathy rats by inhibiting inflammation via P38 MAPK/NF-κB pathway. Drug Des Devel Ther. 2020;14:3223–34.
Sanajou D, Ghorbani Haghjo A, Argani H, Aslani S. AGE-RAGE axis blockade in diabetic nephropathy: current status and future directions. Eur J Pharmacol. 2018;833:158–64.
Ge D, Luo T, Sun Y, Liu M, Lyu Y, Yin W, Li R, Zhang Y, Yue H, Liu N. Natural diterpenoid EKO activates deubiqutinase ATXN3 to preserve vascular endothelial integrity and alleviate diabetic retinopathy through c-fos/focal adhesion axis. Int J Biol Macromol. 2024;260(Pt 2): 129341.
Tian L, Yu Q, Zhang L, Zhang J. Accelerated fibrosis progression of diabetic nephropathy from high uric acid's activation of the ROS/NLRP3/SHP2 pathway in renal tubular epithelial cells under high glucose conditions. Altern Ther Health Med. 2024.
Dou F, Liu Q, Lv S, Xu Q, Wang X, Liu S, Liu G. FN1 and TGFBI are key biomarkers of macrophage immune injury in diabetic kidney disease. Medicine (Baltimore). 2023;102(45): e35794.
Leo CH, Ou JLM, Ong ES, Qin CX, Ritchie RH, Parry LJ, Ng HH. Relaxin elicits renoprotective actions accompanied by increasing bile acid levels in streptozotocin-induced diabetic mice. Biomed Pharmacother. 2023;162: 114578.
Diwan B, Yadav R, Goyal R, Sharma R. Sustained exposure to high glucose induces differential expression of cellular senescence markers in murine macrophages but impairs immunosurveillance response to senescent cells secretome. Biogerontology. 2024;25(4):627–47.
Sun L, Wu Y, Sinha SK, Nicholas SB, Zou LX. Performance of multi-biomarker panels based on urinary N-terminal osteopontin for prediction of diabetic kidney disease in patients with diabetes mellitus. Eur J Intern Med. 2023;118:140–2.
Wang D, Zhang Q, Dong W, Ren S, Wang X, Su C, Lin X, Zheng Z, Xue Y. SGLT2 knockdown restores the Th17/Treg balance and suppresses diabetic nephropathy in db/db mice by regulating SGK1 via Na(+). Mol Cell Endocrinol. 2024;584: 112156.
Shimizu C, Kawamoto H, Yamashita M, Kimura M, Kondou E, Kaneko Y, Okada S, Tokuhisa T, Yokoyama M, Taniguchi M, et al. Progression of T cell lineage restriction in the earliest subpopulation of murine adult thymus visualized by the expression of lck proximal promoter activity. Int Immunol. 2001;13(1):105–17.
Acknowledgements
Not applicable.
Funding
None.
Author information
Contributions
HC and ZJL designed the research study. XXS performed the research. YL and CD analyzed the data. HC wrote the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
The studies involving human participants were reviewed and approved by the original studies.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Chen, H., Su, X., Li, Y. et al. Identification of metabolic reprogramming-related genes as potential diagnostic biomarkers for diabetic nephropathy based on bioinformatics. Diabetol Metab Syndr 16, 287 (2024). https://doi.org/10.1186/s13098-024-01531-5
DOI: https://doi.org/10.1186/s13098-024-01531-5