Introduction of the DiaGene study: clinical characteristics, pathophysiology and determinants of vascular complications of type 2 diabetes

Background Type 2 diabetes is a major healthcare problem. Glucose-, lipid-, and blood pressure-lowering strategies decrease the risk of micro- and macrovascular complications. However, a substantial residual risk remains. To unravel the etiology of type 2 diabetes and its complications, large-scale, well-phenotyped studies with prospective follow-up are needed. This is the goal of the DiaGene study. In this manuscript, we describe the design and baseline characteristics of the study. Methods The DiaGene study is a multi-centre, prospective, extensively phenotyped type 2 diabetes cohort study with concurrent inclusion of diabetes-free individuals at baseline as controls in the city of Eindhoven, The Netherlands. We collected anthropometry, laboratory measurements, DNA material, and detailed information on medication usage, family history, lifestyle and past medical history. Furthermore, we assessed the prevalence and incidence of retinopathy, nephropathy, neuropathy, and diabetic feet in cases. Using logistic regression models, we analyzed the association of 11 well known genetic risk variants with type 2 diabetes in our study. Results In total, 1886 patients with type 2 diabetes and 854 controls were included. Cases had worse anthropometric and metabolic profiles than controls. Patients in outpatient clinics had higher prevalence of macrovascular (41.9% vs. 34.8%; P = 0.002) and microvascular disease (63.8% vs. 20.7%) compared to patients from primary care. With the exception of the genetic variant in KCNJ11, all type 2 diabetes susceptibility variants had higher allele frequencies in subjects with type 2 diabetes than in controls. Conclusions In our study population, considerable rates of macrovascular and microvascular complications are present despite treatment. These prevalence rates are comparable to other type 2 diabetes populations. 
While genome-wide analyses are still being planned, we show that 11 well-known type 2 diabetes genetic risk variants (in TCF7L2, PPARG-P12A, KCNJ11, FTO, IGF2BP2, DUSP9, CENTD2, THADA, HHEX, CDKAL1, KCNQ1) have associations similar to those reported in the literature. This study is well-suited for multiple omics analyses to further elucidate disease pathophysiology. Our overall goal is to increase the understanding of the underlying mechanisms of type 2 diabetes and its complications for developing new prediction, prevention, and treatment strategies. Electronic supplementary material The online version of this article (doi:10.1186/s13098-017-0245-x) contains supplementary material, which is available to authorized users.


Background
Type 2 diabetes mellitus (T2DM) is a complex metabolic disease characterized by overweight, insulin resistance and beta-cell dysfunction [1][2][3]. Because of ageing and the rising prevalence of obesity, the incidence and prevalence of T2DM are increasing [4][5][6][7]. T2DM accounts for a large proportion of present and future health care expenditure in Western societies [5,7,8]. People affected by T2DM have an increased risk of cardiovascular events [9][10][11][12][13], and a poor prognosis after these events [14,15]. In addition, T2DM gives rise to microvascular complications such as retinopathy, nephropathy and neuropathy [16][17][18][19]. We have collected a new large cohort of individuals with and without T2DM with prospective follow-up in the Netherlands: the DiaGene study.
The care for T2DM in the Netherlands is organized in primary care by general practitioners and at hospital-based outpatient clinics by medical specialists. This systematic care is based on local and international treatment guidelines aiming to reduce morbidity and mortality through optimal treatment of hyperglycemia and associated metabolic complications, such as dyslipidemia, vascular dysfunction and high blood pressure [20,21]. Treatment of these components has been proven to reduce the risk of cardiovascular morbidity and mortality in T2DM [22][23][24][25][26][27][28][29][30]. However, a substantial residual risk remains. Improving knowledge on genetic, biochemical and environmental (lifestyle and anthropometric) determinants of T2DM and its micro- and macrovascular complications can have large implications for prevention, treatment and prognosis of T2DM [2,23,31]. Through high-throughput sequencing, about 80 common genetic variants associated with T2DM have been discovered [31,32]. These common variants only explain 5-10% of the overall predisposition to T2DM [33]. There clearly is a need to expand these analyses to additional populations.
In this paper, we present the DiaGene study, a new, multicenter T2DM cohort study collected in the Netherlands in both primary and secondary care. The main purpose of the DiaGene study is to analyse genetic, biochemical and environmental determinants of T2DM and its complications. Here we describe the characteristics of our population, the prevalence of complications and future perspectives.

Study design
The DiaGene study is a multicenter cohort study that was coordinated by the vascular section of internal medicine of the Erasmus Medical Center and the Diabetes subunit of the Máxima Medical Center, and collected in the city of Eindhoven, The Netherlands. Eindhoven is a medium-sized city with 170,668 adult (>21 years) inhabitants in 2011. Both hospitals in Eindhoven participated in the DiaGene study: Catharina Hospital and Máxima Medical Center. In addition, the local Primary Care Diagnostic Centre participated. Hence, virtually all diabetes patients in Eindhoven were approached for inclusion through this population-based approach. Between 2006 and 2011, physicians at all three centers included a total of 2065 patients with T2DM. Of these, 179 patients were excluded from analysis. Reasons for exclusion were: no diabetes (n = 1), type 1 diabetes (n = 30), maturity-onset diabetes of the young (n = 4), latent auto-immune diabetes in adults (n = 3), double inclusion (n = 77), post-pancreatitis diabetes (n = 3), refusal during study period (n = 2) and missing written informed consent (n = 59), resulting in a total of 1886 patients in the study population (Additional file 1).
The control group consisted of two groups: (1) subjects recruited via advertisement in local newspapers, and (2) subjects who were included through invitation of friends and self-reported unrelated family members of participating patients. The inclusion criterion for controls was an age of 55 years or older. Exclusion criteria were the presence of any kind of diabetes, use of metformin or Cushing's disease. Subjects who were approached had at least 7 days of decision time to fully reflect on research goals and methods using physician-provided information, before giving their written informed consent. Eventually, 904 diabetes-free subjects participated as controls. Of these, 50 were excluded from all analyses based on missing written informed consent (n = 14), double inclusion (n = 17), and suspected or confirmed diagnosis of diabetes (n = 19), resulting in a total of 854 controls included in the final population.
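The participant flow described above can be checked with a few lines of arithmetic. The sketch below simply tallies the exclusion counts quoted in the text; the dictionary and function names are illustrative and not part of the study's own tooling.

```python
# Exclusion counts exactly as reported in the Study design section.
CASE_EXCLUSIONS = {
    "no diabetes": 1,
    "type 1 diabetes": 30,
    "maturity-onset diabetes of the young": 4,
    "latent auto-immune diabetes in adults": 3,
    "double inclusion": 77,
    "post-pancreatitis diabetes": 3,
    "refusal during study period": 2,
    "missing written informed consent": 59,
}
CONTROL_EXCLUSIONS = {
    "missing written informed consent": 14,
    "double inclusion": 17,
    "suspected or confirmed diabetes": 19,
}


def remaining(enrolled, exclusions):
    """Participants left after subtracting every exclusion category."""
    return enrolled - sum(exclusions.values())


cases = remaining(2065, CASE_EXCLUSIONS)       # patients with T2DM
controls = remaining(904, CONTROL_EXCLUSIONS)  # diabetes-free subjects
print(cases, controls)
```

The totals agree with the 1886 cases and 854 controls reported for the final population.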

Definition of T2DM
Information on the diagnosis of T2DM was retrieved from the patient's medical records. In accordance with American Diabetes Association and World Health Organization guidelines [34,35], diabetes was defined as a fasting plasma glucose ≥7.0 mmol/L and/or a non-fasting plasma glucose level ≥11.1 mmol/L measured at two or more separate time points, treatment with oral glucose-lowering medication or insulin, and/or the diagnosis of T2DM as registered by a medical specialist. Persons with the diagnosis of type 1 diabetes (as derived from medical records and patient questionnaires) or other types of diabetes mellitus were excluded from the study. Control subjects with fasting glucose ≥7.0 mmol/L or glycated hemoglobin (HbA1c) ≥47.5 mmol/mol were excluded. Information on T2DM status was checked by two investigators. If they did not reach consensus, the participant's treating physician was consulted.
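The case definition and the control exclusion rule can be written down as a small decision function. This is a minimal sketch, not the study's actual procedure: it assumes that every glucose reading comes from a separate time point, that two or more elevated readings satisfy the glucose criterion, and that the record and function names (`GlucoseReading`, `meets_diabetes_definition`, `exclude_control`) are hypothetical. It does not cover the separate step of excluding type 1 and other diabetes types.

```python
from dataclasses import dataclass
from typing import Sequence

FASTING_THRESHOLD = 7.0         # mmol/L, from the text
NONFASTING_THRESHOLD = 11.1     # mmol/L, from the text
CONTROL_HBA1C_THRESHOLD = 47.5  # mmol/mol, from the text


@dataclass
class GlucoseReading:
    value_mmol_l: float
    fasting: bool


def is_elevated(reading: GlucoseReading) -> bool:
    """Apply the fasting or non-fasting threshold, whichever fits the sample."""
    limit = FASTING_THRESHOLD if reading.fasting else NONFASTING_THRESHOLD
    return reading.value_mmol_l >= limit


def meets_diabetes_definition(
    readings: Sequence[GlucoseReading],
    on_glucose_lowering_treatment: bool,
    specialist_diagnosis: bool,
) -> bool:
    """True if any of the three criteria in the definition is met.

    Assumption: each reading is a separate time point, so two elevated
    readings are enough for the glucose criterion.
    """
    elevated = sum(1 for r in readings if is_elevated(r))
    return elevated >= 2 or on_glucose_lowering_treatment or specialist_diagnosis


def exclude_control(fasting_glucose_mmol_l: float, hba1c_mmol_mol: float) -> bool:
    """Control exclusion rule: elevated fasting glucose or elevated HbA1c."""
    return (
        fasting_glucose_mmol_l >= FASTING_THRESHOLD
        or hba1c_mmol_mol >= CONTROL_HBA1C_THRESHOLD
    )
```

In the study itself, classification additionally went through review by two investigators, which a rule like this cannot replace.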

Medical and family history
Each participant filled out an extensive questionnaire on their medical history (history of diabetes, metabolic disease, vascular disease, medication use and intoxications) and ethnicity of their parents (Additional file 2). We classified a participant to be Caucasian if both parents were reported to be Caucasian. Furthermore, the participant's family history regarding diabetes and cardiovascular disease and medication usage was recorded through the questionnaire.

Sample collection
A 20 cc ethylenediaminetetraacetic acid (EDTA) fasting blood sample was taken from all participants. Samples were centrifuged (3000 rpm; 1800 g for 15 min at 4 °C). Directly after centrifugation, the plasma and the buffy coat were separated and stored (at −80 °C) for DNA analysis and future measurements.

Diabetes and complications of diabetes
Data on body mass index (BMI) (kg/m²) and blood pressure (mmHg) were extracted from medical records at inclusion. Similarly, laboratory results were extracted around the time of inclusion and contained fasting glucose, glycated hemoglobin (HbA1c), total cholesterol, low-density lipoprotein cholesterol (LDL-cholesterol), high-density lipoprotein cholesterol (HDL-cholesterol), triglycerides, creatinine and urinary albumin/creatinine ratio. The majority of measurements were collected within 6 months prior to or after the actual date of inclusion. To estimate kidney function, the estimated glomerular filtration rate was calculated with the Modification of Diet in Renal Disease formula. Information on the presence of cardiovascular disease in the patients treated in the hospital-based outpatient clinics was retrieved from their medical records. Cardiovascular disease comprised myocardial infarction, percutaneous coronary intervention/coronary arterial bypass graft (PCI/CABG), cerebrovascular accident, transient ischemic attack and peripheral arterial disease. PCI/CABG was defined as any invasive intervention to treat coronary arterial disease (PCI, CABG). Peripheral arterial disease was defined as an ankle-brachial index below 0.80 or below 0.90 with typical complaints, any intervention to treat peripheral arterial disease (supervised exercise training, stenting, bypass and percutaneous transluminal angioplasty), or the self-reported presence of intermittent claudication. Information on cardiovascular disease in patients from primary care and diabetes-free controls was based on self-reporting. Microvascular complications were subdivided into retinopathy, nephropathy and neuropathy. Diabetic foot was additionally assessed. Retinopathy was scored according to the report of an ophthalmologist as absent or present and classified as non-proliferative, proliferative, or retinopathy treated with photocoagulation or intravitreal injections. 
Neuropathy was established by a podotherapist, neurologist or the patient's treating physician. Nephropathy was defined as present when microalbuminuria [albumin/creatinine ratio (ACR) ≥2.5 for men or ≥3.5 for women] was present at two of three consecutive measurements, or when high microalbuminuria or macroalbuminuria was present at one measurement (ACR ≥12.5 for men or ≥17.5 for women). Diabetic foot was established by a podotherapist or physician according to the SIMM's classification [36]. All information on laboratory data, macrovascular, and microvascular events in case and control subjects at baseline that was retrieved from medical records was separately checked by two investigators. When they did not reach consensus, the participant's physician was consulted.
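The kidney-related definitions above lend themselves to a short worked sketch. Two assumptions are made here and are not confirmed by the text: the eGFR function uses the IDMS-traceable four-variable MDRD equation (coefficient 175, creatinine converted from µmol/L to mg/dL by dividing by 88.4), since the MDRD version is not stated; and "two of three consecutive measurements" is read as any window of three consecutive ACR values containing at least two values above the sex-specific threshold. Function names are illustrative.

```python
UMOL_L_PER_MG_DL = 88.4  # creatinine unit conversion factor

MICRO_ACR = {"male": 2.5, "female": 3.5}    # thresholds quoted in the text
HIGH_ACR = {"male": 12.5, "female": 17.5}   # thresholds quoted in the text


def egfr_mdrd(creatinine_umol_l, age_years, female, black=False):
    """Estimated GFR in mL/min/1.73 m^2.

    Assumed version: IDMS-traceable four-variable MDRD equation.
    """
    scr_mg_dl = creatinine_umol_l / UMOL_L_PER_MG_DL
    egfr = 175.0 * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr


def has_nephropathy(acr_values, sex):
    """Apply the ACR rule to measurements listed in chronological order.

    sex must be "male" or "female".
    """
    # One measurement in the high range is sufficient.
    if any(v >= HIGH_ACR[sex] for v in acr_values):
        return True
    # Otherwise require two raised values within three consecutive measurements.
    flags = [v >= MICRO_ACR[sex] for v in acr_values]
    for start in range(max(1, len(flags) - 2)):
        if sum(flags[start:start + 3]) >= 2:
            return True
    return False
```

For example, `has_nephropathy([3.0, 1.0, 2.8], "male")` is true because two of the three values reach 2.5, whereas the same series for a woman stays below the 3.5 threshold.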

Follow-up data
Currently, we are finalizing the first collection of prospective follow-up in our study population. This encompasses all anthropometric and laboratory measurements and data on metabolic, microvascular and macrovascular complications of T2DM and enables us to perform prospective analyses.

Statistical analysis
Continuous variables are expressed as median with interquartile range unless otherwise specified. Comparisons between groups were performed with Mann-Whitney U tests for continuous and χ² tests for categorical data. Deviation from the Hardy-Weinberg equilibrium was assessed by χ² testing. Associations of the genotypes with T2DM were tested using logistic regression models. We have calculated interaction effects of odds ratios for T2DM to compare our results with previous genetic studies according to the method of Altman et al. [43]. All models were adjusted for age and sex. Additionally, models were adjusted for center of inclusion as a categorical covariate. Cases and controls of non-Caucasian ethnicity were excluded from the genetic analyses. P values smaller than 0.05 were considered to be statistically significant. Statistical analysis was performed with SPSS software version 22.0 (SPSS, Chicago, IL, USA).
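The comparison of odds ratios between studies can be illustrated as follows. This sketch assumes that the method referred to is the standard comparison of two independent estimates on the log scale (difference of log odds ratios divided by its pooled standard error), with standard errors recovered from the reported 95% confidence intervals. It is written in Python for illustration only, whereas the study used SPSS; the function names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(OR) recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def compare_odds_ratios(or_a, ci_a, or_b, ci_b):
    """Test whether two independent odds ratios differ.

    Returns the ratio of odds ratios, its 95% CI, the z statistic and the
    two-sided P value from the normal distribution.
    """
    diff = math.log(or_a) - math.log(or_b)
    se = math.sqrt(se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2)
    z = diff / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    ratio_ci = (math.exp(diff - 1.96 * se), math.exp(diff + 1.96 * se))
    return math.exp(diff), ratio_ci, z, p_value


# Hypothetical inputs, chosen only to show the call; not study results.
ratio, ratio_ci, z, p = compare_odds_ratios(1.20, (1.00, 1.44), 1.10, (1.05, 1.15))
print(f"ratio of ORs = {ratio:.2f}, z = {z:.2f}, P = {p:.3f}")
```

A non-significant P value here means the two estimates are statistically compatible, which is how "similar associations compared to literature" is assessed.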

General characteristics
The most relevant general characteristics of the cohort are displayed in Table 1. [...] 3.6 (0.9) mmol/L; P < 0.001]. A larger proportion of cases had reduced estimated glomerular filtration rate (19.7% vs. 4.7%, P < 0.001) and prevalent macrovascular disease (38.0% vs. 8.3%, P < 0.001) compared to diabetes-free controls. More cases had a first-degree relative with T2DM compared to controls (64.4% vs. 33.3%, P < 0.001). More baseline characteristics can be found in Table 1.
Patients from the outpatient clinics had a higher prevalence of macrovascular (41.9% vs. 34.8%; P = 0.002) and microvascular disease (63.8% vs. 20.7%) compared to patients with T2DM from primary care. We could not retrieve reliable data on neuropathy or diabetic foot in the primary care population. More patients from the outpatient clinic had a first-degree relative with T2DM compared to controls (64.4% vs. 33.3%, P < 0.001).
Table 3 shows the associations of 11 well-established genetic T2DM variants in our study population. Hardy-Weinberg equilibrium was met for all variants. With the exception of the variant in KCNJ11, all T2DM susceptibility variants had higher allele frequencies in cases with T2DM than in controls. TCF7L2 showed the highest odds ratio for prevalent T2DM [OR 1.37 (95% CI 1.17, 1.60); P < 0.001]. These results were unaffected by additional correction for center of inclusion. After calculation of interaction effects, the associations of all genetic variants except for KCNJ11 did not significantly differ from the large-scale meta-analyses of Morris et al. [44].
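A Hardy-Weinberg check of the kind reported above can be sketched as a one-degree-of-freedom χ² test on genotype counts. The genotype counts in the example are hypothetical and serve only to show the calculation; the function name is illustrative.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 degree of freedom) for Hardy-Weinberg equilibrium.

    Takes observed counts of the three genotypes of a biallelic variant and
    returns the chi-square statistic and its P value.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p in (0.0, 1.0):
        raise ValueError("variant is monomorphic; the test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts, not taken from the study.
chi2, p_value = hwe_chi_square(380, 480, 140)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A variant is usually taken to meet Hardy-Weinberg equilibrium when this P value is not below the chosen significance threshold.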

Discussion
In this manuscript, we present the baseline characteristics and future perspectives of the DiaGene study, a new multi-centre cohort study with prospective follow-up on biochemical and genetic determinants of T2DM and its complications. We show that the population represents both primary and secondary care and that, despite treatment, considerable rates of macrovascular and microvascular complications are present. To further elucidate determinants of T2DM and its complications, multi-layer omics and prospective analyses will be of great value. Our study offers excellent opportunities to perform these analyses.
In the Netherlands, primary care practices are led by general practitioners, who are easily accessible and offer essential family medicine. Outpatient clinics of hospitals provide specialized care and require referral by the general practitioner for reimbursement by insurance companies. Therefore, complex and more severely affected patients will be referred to the hospital-based outpatient clinics. This is reflected in the higher prevalence of micro- and macrovascular complications at the outpatient clinics in our population.
The risk of microvascular disease can be reduced substantially by glycemic control and general measures to prevent cardiovascular disease such as lifestyle, blood pressure and lipid optimization [22,23,25]. Rates of microvascular disease in our study at baseline were 17.3, 23.0 and 31% for retinopathy, nephropathy and neuropathy, respectively. This prevalence of retinopathy in T2DM is comparable to a report from the Dutch National Institute for Public Health and the Environment [45] and in line with a worldwide meta-analysis for diabetes with a duration of less than 10 years [46], but higher than in a screening study for T2DM from the Netherlands [47]. In the latter study, the duration of T2DM was short and this probably explains the difference. For nephropathy, our rate is slightly lower than in the United Kingdom Prospective Diabetes Study (25%), also probably because of shorter follow-up [16]. Our primary care population appeared to have lower rates of nephropathy compared to studies on prevalent diabetes and newly diagnosed diabetes in patients of general practitioners in the Netherlands [47,48], although the single urinary measurement-based prevalence rates in the latter could be an explanation for this discrepancy. The percentage of patients with T2DM and neuropathy in our population (31%) is lower compared to a prospective study (50%) with 25 years of follow-up from diagnosis [19] and comparable to a cross-sectional study on peripheral neuropathy in the United Kingdom [49].
Table 1 General baseline characteristics of participants with and without T2DM
Table 2 General baseline characteristics of participants with and without T2DM
The risk of macrovascular disease in T2DM can be successfully reduced by applying lifestyle interventions, lipid-lowering therapies and antihypertensive treatment. The relationship with glycemic control is more complex. Even though glycemic control is epidemiologically strongly related to cardiovascular disease in T2DM, interventions applying strict glycemic control were unsuccessful [22,50] or even showed adverse effects [51]. Macrovascular disease rates in our population with T2DM are comparable to previous reports in the Netherlands [45,52], but lower than in an interview-based study in diabetes patients in the USA [53]. Our population is on average 5 years older than the patients in this American study, and also contains a significant proportion of patients from outpatient clinics with more advanced disease.
T2DM and its complications are multifactorial in their pathophysiology. Genetics, epigenetics, biological mechanisms and environmental factors are probably interacting at multiple levels. Therefore a pathway-based approach in well-defined cohorts is needed, supported by full use of information technology. High-throughput research has been mainly focused on genome-wide genetic associations. This has elucidated interesting associations. Yet the results only explain disease susceptibility to a small extent [31,33]. We are planning to perform genome-wide association analysis in the near future. The quality control of this future genomic work will include analyses of the genetic-based ethnic background to definitively determine population sub-structures. Here, we restricted our analyses of well-known genetic T2DM risk variants to the sub-group of self-reported Caucasians. These DNA polymorphisms showed similar associations in our mainly Caucasian population as in previous extensive meta-analyses: most genetic variants had the same direction of association as reported earlier, and for TCF7L2, THADA, KCNQ1 and CDKAL1 this was significant [32,44,54]. KCNJ11 and CENTD2 showed a slight but not statistically significant association in the opposite direction to what has previously been described, with estimates close to 1 and confidence intervals embracing the estimates from literature [32]. Except for KCNJ11, all genetic variants had non-significant interaction effects for odds ratios of T2DM risk variants compared to the latest meta-analysis [44]. The significant difference for KCNJ11 can be an effect of population-specific variance, differences in environmental factors, age or interactions of these factors with the genetic variant [44].
Table 3 Allele frequencies, odds ratios and 95% confidence intervals of genetic variants and risk of T2D in DiaGene, original discovery studies and most recent meta-analysis of genome-wide-association studies
To study the aetiology of T2DM and its complications we need well-phenotyped cohorts with prospective follow-up. Our population has these characteristics. We therefore plan to analyse several omics layers for their associations with T2DM and its complications. We are currently measuring total N-glycomics with matrix-assisted laser desorption-ionization time-of-flight (MALDI-TOF) and matrix-assisted laser desorption-ionization Fourier transform ion cyclotron resonance (MALDI-FTICR) mass spectrometry [55], and IgG-glycomics with ultra-performance liquid chromatography [56]. In the near future, we aim to include lipidomics, with a focus on lipoprotein(a), metabolomics, and proteomics. We also plan to perform genomics using the Illumina chip, for Mendelian randomization and multilayer interaction analyses. The overall goal is to elucidate new pathophysiological pathways for prediction, prevention and treatment of T2DM.
Although we have performed our study with precision, we need to consider a number of limitations. A large majority of our population is of self-reported Caucasian ethnicity, which limits extending conclusions from our analyses to non-Caucasian populations. However, it also makes our analyses less vulnerable to genetic population stratification bias. Self-reported Caucasian mono-ethnicity in two generations results in a very limited risk of misclassifying genetic admixtures [57,58]. In addition, a small proportion of diabetes-free subjects were recruited by asking T2DM subjects to invite unrelated family members and friends. Hence, absence of family ties was self-reported, with a small possibility of hidden relatedness. In the near future, we will perform genome-wide association analysis, which will allow us to perform formal quality control and accurately account for hidden relatedness and genetic population stratification bias [59]. Another limitation of this study was our inability to retrieve information on neuropathy in the primary care setting. Conclusions on neuropathy are therefore restricted to the secondary care setting. We have made extensive efforts to optimise the reliability of our data by having two independent investigators collect the data and reach consensus. Nevertheless, we did have to rely on common clinical practice and adequate record keeping in primary and secondary care. For macrovascular events in primary care we had to rely on self-reported data. For validation, we have therefore checked self-reported myocardial infarction data from hospital-based participants and found that in only 6.0% of participants with self-reported myocardial infarction this diagnosis was not confirmed in hospital data. These events have therefore been scored as missing. Underestimation of the incidence of myocardial infarction based on hospital discharge data has however been described before [60]. 
Although the questionnaire on lifestyle, medication, clinical events and family history was straightforward and easy to use, it is not an externally validated questionnaire. Finally, our preliminary genetic results had approximately 10% missing values. We are currently collecting additional samples from the participants whose DNA was not available at the time of the current genetic analysis to improve our genetic analysis. A further strength of our study is the meticulous hands-on medical file review for each patient by two separate physicians, which produced high-quality data that enable us to research both T2DM itself and its complications in great detail. Currently, we are finalizing the first collection of prospective follow-up data on all T2DM complications. The prospective cohort setting, with concurrent inclusion of diabetes-free individuals at baseline, will allow us to perform cross-sectional and prospective end-point analyses to study the aetiology and progression of type 2 diabetes and its complications.

Conclusion
In conclusion, this manuscript describes the design and baseline characteristics of the DiaGene study, a large multi-centre prospective follow-up cohort study on environmental, biochemical and genetic risk factors of T2DM and related vascular complications. By studying both clinical and complex biochemical parameters with a current focus on glycomics, genomics and lipidomics, the DiaGene study aims to contribute to the pathophysiological understanding of T2DM and all its vascular complications in a prospective case-control setting.