The association of body mass index with the risk of type 2 diabetes: a case–control study nested in an electronic health records system in the United States

Objectives: Obesity is a known risk factor for type 2 diabetes (T2D). We conducted a case–control study to assess the association between body mass index (BMI) and the risk of being diagnosed with T2D in the United States.
Methods: We selected adults (≥ 18 years old) who were diagnosed with T2D (defined by ICD-9-CM diagnosis codes or use of anti-diabetic medications) between January 2004 and October 2011 (“cases”) from an electronic health records database provided by an integrated health system in the Middle Atlantic region. Twice as many individuals enrolled in the health system without a T2D diagnosis during the study period (“controls”) were selected and matched on age, sex, history of cardiac comorbidities or hyperinflammatory state (defined by C-reactive protein and erythrocyte sedimentation rate), and use of psychiatric or beta blocker medications. BMI was measured during the year prior to the first observed T2D diagnosis (for cases) or a randomly assigned date (for controls); individuals with no BMI measure or BMI < 18.5 kg/m² were excluded. We assessed the impact of increased BMI (overweight: 25–29.9 kg/m²; Obesity Class I: 30–34.9 kg/m²; Obesity Class II: 35–39.9 kg/m²; Obesity Class III: ≥40 kg/m²), relative to normal BMI (18.5–24.9 kg/m²), on a T2D diagnosis using odds ratios (OR) and relative risks (RR) estimated from multiple logistic regression results.
Results: We included 12,179 cases (mean age: 55, 43% male) and 25,177 controls (mean age: 56, 45% male). We found a positive association between BMI and the risk of a T2D diagnosis. The strength of this association increased with BMI category (RR [95% confidence interval]: overweight, 1.5 [1.4–1.6]; Obesity Class I, 2.5 [2.3–2.6]; Obesity Class II, 3.6 [3.4–3.8]; Obesity Class III, 5.1 [4.7–5.5]).
Conclusions: BMI is strongly and independently associated with the risk of being diagnosed with T2D. The incremental association of BMI category with the risk of T2D is stronger for people with a higher BMI relative to people with a lower BMI.


Background
Diabetes, the most common metabolic disorder, is associated with substantial disease burden, including increased mortality risk and significant long-term morbidity [1][2][3][4]. Diabetes was diagnosed in 22.3 million people (7% of the population) in the United States (US) in 2012 and was responsible for $176 billion of direct medical costs and $69 billion of indirect (lost productivity) costs [5]. Type 2 diabetes mellitus (T2D) comprises about 90%-95% of all diabetes cases [6,7], and its prevalence has been steadily increasing [8]. Obesity, classified as body mass index (BMI) ≥30 kg/m², is a known predictor of T2D and has become a major public health problem in the US [9], affecting over one-third (35.7%) of the population [10]. It costs about $190.2 billion (in 2005 dollars) annually to treat obesity for the non-institutionalized US adult population, which accounts for almost 21% of US healthcare expenditures [11]. Healthcare costs attributable to obesity and overweight in the US are projected to reach $860.7 billion by 2030 [12].
The impact of BMI classification, including overweight and various grades of obesity, on the risk of T2D in real-world practice is a well-investigated topic [9,13]. However, the evidence for the US is dated: most studies rely on data from before 2005 and do not reflect recent changes in the obesity 'epidemic' [14][15][16][17][18][19][20][21]. We designed this case-control study to obtain more recent evidence of the association between BMI and the risk of being diagnosed with T2D in the US.

Data and Study Sample
Data were obtained from the MedMining® database, which contains electronic health records from the Geisinger Health System. The Geisinger Health System, which serves more than 4 million individuals in the state of Pennsylvania, is an integrated health system with an 880+ multi-specialty physician group practice, 5 hospital campuses, 72 primary and specialty clinic sites, and a health plan. Individuals' health records, which have been kept in electronic form at Geisinger Health System since 1996, contain information on demographic characteristics (age, sex, and race/ethnicity), encounter details from inpatient, outpatient, and office-based settings (such as ICD-9-CM diagnosis codes and CPT-4 procedure codes), medication orders, lab findings, and actual costs incurred by the Geisinger Health System for those encounters. This dataset has been widely used for health economic evaluations in real-world settings [22][23][24][25][26].
Cases were selected if their first diagnosis of T2D (defined by ICD-9-CM diagnosis codes 250.x0 or 250.x2 or by an anti-diabetic medication order, whichever came first) while in the Geisinger Health System database was observed between January 2004 and October 2011 (study period). Events and measurements were anchored by each individual's "index date," which was defined as the date of their incident, or first observed, T2D diagnosis in the MedMining database.
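The code-based part of the case definition can be illustrated with a short sketch. It assumes that diagnosis codes are stored as strings such as "250.02"; the function name and return labels are hypothetical and are not taken from the study's SAS programs. It relies on the ICD-9-CM convention that the fifth digit of a 250 code separates type 2 or unspecified diabetes (0, 2) from type 1 diabetes (1, 3).

```python
import re

# ICD-9-CM diabetes codes have the form 250.xy; the fifth digit y encodes the type.
_DIABETES_CODE = re.compile(r"^250\.(\d)(\d)$")


def classify_diabetes_code(code: str) -> str:
    """Return 't2d', 't1d', or 'other' for a single ICD-9-CM code string."""
    match = _DIABETES_CODE.match(code.strip())
    if match is None:
        return "other"  # not a five-digit 250.xy code
    fifth_digit = match.group(2)
    if fifth_digit in ("0", "2"):
        return "t2d"  # 250.x0 or 250.x2
    if fifth_digit in ("1", "3"):
        return "t1d"  # 250.x1 or 250.x3
    return "other"
```

A medication-based rule (an anti-diabetic medication order) would have to be applied alongside this check, since the case definition takes whichever event came first.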
We used ICD-9-CM diagnosis codes and laboratory values to measure the history of any cardiac comorbidities (lipid abnormalities, coronary heart disease, acute myocardial infarction, angina pectoris, or hypertension), hyperinflammatory state (defined by clinical biomarkers of C-reactive protein and erythrocyte sedimentation rate), psychiatric medication use (anticonvulsants or antipsychotic medication), and beta-blocker medication use during each case individual's 12-month pre-index period.
We created a group of potential controls by randomly selecting two individuals with no history of diabetes (defined by ICD-9-CM diagnosis codes 250.xx or use of any anti-diabetic medication) during the study period for each case. Cases and potential controls were matched on age group (< 65 or ≥ 65), sex, and history of any cardiac comorbidities, hyperinflammatory state, psychiatric medication use, and beta-blocker medication use. Since the potential controls never received a T2D diagnosis, we assigned them a random index date between the beginning and end of the study period. Individuals (both cases and potential controls) were further required to be alive as of October 31, 2011, be ≥18 years old at index date, and satisfy all of the selection criteria listed in the Appendix. Baseline BMI was defined as the last value observed during the 12-month pre-index period for both cases and controls, and was classified according to the World Health Organization's definition of BMI: normal (18.5-24.9 kg/m²), overweight (25-29.9 kg/m²), Obesity Class I (30-34.9 kg/m²), Obesity Class II (35-39.9 kg/m²), and Obesity Class III (≥40 kg/m²).
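The baseline BMI rule and the BMI categories can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the function names are hypothetical, the pre-index window is taken to be the 365 days before (and excluding) the index date, and the upper bound of each category is treated as "less than the next cut-off" so that values such as 24.95 are not left unclassified.

```python
from datetime import date, timedelta


def baseline_bmi(measurements, index_date, window_days=365):
    """Return the last BMI recorded in the pre-index window, or None.

    `measurements` is a list of (date, bmi) pairs; the window bounds are an assumption.
    """
    window_start = index_date - timedelta(days=window_days)
    in_window = [(d, v) for d, v in measurements if window_start <= d < index_date]
    if not in_window:
        return None  # no valid BMI: the individual would be excluded
    return max(in_window, key=lambda pair: pair[0])[1]


def bmi_category(bmi):
    """Map a BMI value (kg/m^2) to the categories used in the study."""
    if bmi < 18.5:
        return "underweight"  # excluded from the analytic sample
    if bmi < 25.0:
        return "normal"
    if bmi < 30.0:
        return "overweight"
    if bmi < 35.0:
        return "Obesity Class I"
    if bmi < 40.0:
        return "Obesity Class II"
    return "Obesity Class III"


# Made-up measurements for one hypothetical individual.
history = [(date(2009, 1, 1), 31.0), (date(2010, 3, 1), 33.2), (date(2010, 9, 1), 36.0)]
value = baseline_bmi(history, date(2010, 6, 1))
```

Here only the March 2010 measurement falls inside the window, so it is the one that would be classified.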

Analysis
We compared individual demographic and clinical characteristics, including baseline BMI, between cases and controls and assessed the statistical significance (p < 0.05) of the differences between groups using the Student's t-test for continuous variables and the chi-square test for categorical variables.
We assessed the impact of baseline BMI on the risk of T2D diagnosis via the odds ratios (ORs) estimated from an unconditional multiple logistic regression model that adjusted for other covariates such as the index year, smoking status, employment status, payer status, Geisinger Health Plan coverage, history of depression, use of selected medications (to treat depression/anxiety, obesity, hyperlipidemia, and hypertension) and medical burden during the 12-month pre-index period. Medical burden was measured by any use of outpatient, inpatient, or emergency services, as well as the logarithm of the total annual encounter costs adjusted to 2011 dollars using the US Consumer-Price Index [27]. Age group was also included to address the fact that the age variable used in the matching procedure was categorical (≥ 65 years or not). In addition to the adjusted ORs, we estimated the adjusted relative risk of T2D diagnosis for each BMI category (with normal BMI as the comparator) using the method of recycled predictions [28]; 95% confidence intervals (CIs) around the relative risks were estimated by the 2.5 and 97.5 percentiles of 1,000 bootstrap replications [29]. Because we sampled cases and controls independent of exposure status (i.e., BMI), which is consistent with the case-cohort sampling approach, these relative risk estimates are applicable to the whole Geisinger Health System patient population [30]. Data were compiled and analyzed using SAS (version 9.2, SAS Institute Inc., Cary, NC).
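The recycled-predictions and percentile-bootstrap steps can be sketched in a few lines. The study itself used SAS; the Python below is only an illustration, and every name and number in it (the functions, the coefficients, the covariate rows) is hypothetical. Recycled predictions set every individual's BMI category to one level, average the predicted risks, repeat for the reference level, and take the ratio.

```python
import math
import random


def expit(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def mean_predicted_risk(rows, intercept, bmi_coefs, covariate_coefs, category):
    """Average predicted risk if every individual were assigned to `category`."""
    total = 0.0
    for row in rows:
        # The reference BMI level has no entry in bmi_coefs, i.e. coefficient 0.
        linear = intercept + bmi_coefs.get(category, 0.0)
        linear += sum(b * x for b, x in zip(covariate_coefs, row))
        total += expit(linear)
    return total / len(rows)


def recycled_relative_risk(rows, intercept, bmi_coefs, covariate_coefs,
                           category, reference="normal"):
    """Ratio of average predicted risks: `category` versus `reference`."""
    exposed = mean_predicted_risk(rows, intercept, bmi_coefs, covariate_coefs, category)
    baseline = mean_predicted_risk(rows, intercept, bmi_coefs, covariate_coefs, reference)
    return exposed / baseline


def bootstrap_percentile_ci(data, statistic, n_reps=1000, seed=0):
    """Approximate 2.5th and 97.5th percentiles of `statistic` over resamples."""
    rng = random.Random(seed)
    estimates = sorted(
        statistic([rng.choice(data) for _ in data]) for _ in range(n_reps)
    )
    return estimates[int(0.025 * (n_reps - 1))], estimates[int(0.975 * (n_reps - 1))]


# Hypothetical inputs: one binary covariate and made-up coefficients.
rows = [[0.0], [1.0], [1.0], [0.0], [1.0]]
bmi_coefs = {"overweight": 0.5, "Obesity Class III": 2.4}


def rr_class_iii(sample):
    # A full analysis would refit the logistic model on each resample;
    # the coefficients are held fixed here only to keep the sketch short.
    return recycled_relative_risk(sample, -1.0, bmi_coefs, [0.3], "Obesity Class III")


rr = rr_class_iii(rows)
ci = bootstrap_percentile_ci(rows, rr_class_iii)
```

Because the relative risk depends on the baseline risk, the same odds ratio gives different relative risks in different populations: with a baseline risk of 0.5, for instance, an odds ratio of 3 corresponds to a relative risk of only 1.5.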

Results
We identified 25,241 individuals who experienced their incident T2D diagnosis between January 2004 and October 2011, and 50,482 matched control individuals with no history of diabetes during the same time frame. The final sample included 37,356 individuals who had a baseline BMI ≥18.5 kg/m² (12,179 cases and 25,177 controls) after we applied the sample selection criteria (Figure 1). Table 1 displays the individuals' demographic and clinical characteristics. Compared with control individuals, the case individuals had higher baseline BMI values (mean ± standard deviation: 35.4 ± 8.5 kg/m² vs. 29.4 ± 6.3 kg/m², p < 0.01). Cases were more likely to be younger, male, and to have higher healthcare resource use as measured by costs during the 12-month pre-index period than controls. Cases were also more likely to have experienced comorbidities related to diabetes and/or obesity and used medications related to diabetes or obesity during the 12-month pre-index period than controls.
As shown in the first 4 rows of Table 2 and in the upper panel of Figure 2, compared with individuals with a normal BMI, individuals who were overweight or obese were more likely to be diagnosed with T2D (OR [95%CI]: ranging from 1.6 [1.5-1.8] for overweight adults to 11.6 [10.5-12.8] for adults in Obesity Class III, all p-values < 0.01). The relative risks displayed a similar pattern: the relative risk was 1.5 (95%CI: 1.4-1.6) for overweight adults, 2.5 (2.3-2.6) for adults in Obesity Class I, 3.6 (3.4-3.8) for adults in Obesity Class II, and 5.1 (4.7-5.5) for adults in Obesity Class III (lower panel of Figure 2). Furthermore, we found that the change in the magnitude of the ORs from one BMI category to the next was larger for individuals in higher BMI categories than individuals in lower BMI categories, as illustrated by the increasing slope of the lines connecting the ORs and, to a lesser degree, the lines connecting the relative risks in Figure 2. These patterns of ORs and relative risks imply that individuals in higher BMI categories were increasingly more likely to be diagnosed with T2D than individuals in lower BMI categories (p < 0.05).
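The widening steps between adjacent BMI categories can be checked directly from the reported relative risks. The short calculation below uses only the point estimates given above, with the normal-BMI reference fixed at 1.0 by definition; the function name is hypothetical.

```python
# Reported adjusted relative risks, with normal BMI as the reference (RR = 1.0).
relative_risks = [
    ("normal", 1.0),
    ("overweight", 1.5),
    ("Obesity Class I", 2.5),
    ("Obesity Class II", 3.6),
    ("Obesity Class III", 5.1),
]


def stepwise_increments(points):
    """Difference in relative risk between each category and the one below it."""
    return [round(b - a, 1) for (_, a), (_, b) in zip(points, points[1:])]


increments = stepwise_increments(relative_risks)  # [0.5, 1.0, 1.1, 1.5]
```

Each step is at least as large as the one before it, which is the increasing slope described for the lower panel of Figure 2. This arithmetic does not account for the confidence intervals around each estimate.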
Other individual characteristics, aside from BMI, were also significantly associated with the risk of being diagnosed with T2D. Individuals who were 45-64 years old (compared with 18-44 years old), who were black or of other race (compared with white), or who had ever smoked (compared with never) had an increased risk of T2D diagnosis. In addition, individuals who experienced comorbidities (any cardiac comorbidities, hyperinflammatory state, or depression) or who used medications (psychiatric drugs, antidepressants or anxiolytics, antihyperlipidemia drugs, and antihypertensives) were more likely to have been diagnosed with T2D than those who did not, as were individuals with more medical costs in the pre-index period. However, the ORs of the individual characteristics (except for BMI) were not the focus of this study, since these characteristics were included in the regression only to adjust the estimated association between BMI and T2D.

Discussion
After adjusting for a number of characteristics associated with the risk of T2D, we found that, compared with normal BMI, overweight and obesity were statistically significantly associated with the risk of being diagnosed with T2D among individuals without any other prior evidence of T2D. We further found that the risk of a T2D  [20]. According to the Nurses' Health Study, the adjusted relative risk of T2D associated with each 5-unit increment in BMI ranged from 1.55 (95% CI: 1.36-1.77) to 2.36 (95% CI: 1.83-3.04) among women, depending on the participants' race/ethnicity, in the 1980-2000 prospective cohort [15]; and the overall relative risk of non-insulin-dependent T2D among women with BMI ≥ 29.9 kg/m² relative to women with BMI ≤ 20.1 kg/m² in the 1986-1994 cohort was 11.2 (95% CI: 7.9-15.9) [14]. Although regional data were used, the current study covered more recent years.
Moreover, BMI values were clinically measured in the current study, compared with BMI calculated from self-reported height and weight in those earlier studies. Self-reported weight and height considerably underestimate the individuals' measured BMI [31,32] and may thus have weakened the association between obesity and risk of T2D and/or biased the estimated results. This may explain the lower ORs associated with BMI levels in the Mokdad et al. study, compared to the current study.
Our results are also consistent with the studies that investigated the association between BMI and risk of T2D among individuals with pre-diabetes. The Diabetes Prevention Program (DPP) is a large randomized clinical trial that ran from 1996 to 2001 (average follow-up: 2.8 years) and that enrolled individuals at higher risk for T2D (all subjects had impaired glucose tolerance at baseline) [33]. The incidence of T2D was 58% lower (95% CI: 48-66%) among subjects who were assigned to the lifestyle-modification program (with a goal of at least a 7% weight loss of the baseline body weight) than those in the placebo group [33]. Additionally, weight loss among subjects in the lifestyle-modification program was significantly and independently associated with reductions in blood glucose from pre-diabetic to normal levels [34]. Weight loss was also associated with long-term benefit in a follow-up study of the DPP, which found that the 10-year cumulative incidence of T2D among participants in the lifestyle-modification program was lower compared with those treated with metformin or in the placebo group [35].
The risk of T2D associated with each BMI level was estimated, adjusting for other covariates. To assess the impact of the other covariates, we estimated an unadjusted logistic regression model (with BMI level as the only covariate); the resulting ORs were 1.9 (95%CI: 1.7-2.0) for overweight, 3.7 (3.4-4.0) for Obesity Class I, 6.7 (6.2-7.4) for Obesity Class II, and 12.7 (11.6-14.0) for Obesity Class III (all p-values < 0.01). The unadjusted ORs were slightly higher than the adjusted ORs. This implies that some factors, such as age, are associated with both increased BMI and increased risk of T2D, but the impact of these factors on the association between BMI and risk of T2D is limited.
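When BMI category is the only covariate, the odds ratio from a logistic regression for one category versus the reference equals the cross-product ratio of the corresponding 2×2 table. The sketch below shows that calculation with a Wald confidence interval. The function name and the counts in the example are hypothetical; the study's cell counts are not given in the text.

```python
import math


def odds_ratio_with_ci(exposed_cases, exposed_controls,
                       reference_cases, reference_controls, z=1.96):
    """Unadjusted odds ratio and Wald 95% CI from a 2x2 table of counts."""
    odds_ratio = (exposed_cases * reference_controls) / (exposed_controls * reference_cases)
    # Standard error of the log odds ratio: square root of the summed reciprocal counts.
    se_log_or = math.sqrt(
        1 / exposed_cases + 1 / exposed_controls
        + 1 / reference_cases + 1 / reference_controls
    )
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se_log_or), math.exp(log_or + z * se_log_or)


# Made-up counts for illustration only.
or_estimate, ci_low, ci_high = odds_ratio_with_ci(30, 10, 20, 20)
```

Comparing such unadjusted estimates with the adjusted ORs shows how much of the crude association is attributable to the other covariates.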
Besides the association between BMI and risk of T2D, our study also revealed other interesting findings. For example, the BMI value considered in our study was the last one observed during the 12 months before the index date, which tended to be closer to the index date among case individuals than control individuals (9.0 vs. 80.3 days, p < 0.01). This implies that individuals (and/or their physicians) may have started monitoring their BMI (and probably other clinical biomarkers) more frequently when there were indications that they may be at risk of developing T2D.
The risk of developing T2D for individuals who were overweight or obese was about 1.5-5 times higher than for individuals with normal BMI, as estimated in our study. This demonstrates the importance of continuous weight management, which not only can reduce the disease burden of obesity but also may prevent further progression to T2D. Weight management is particularly important for people with severe obesity, who were disproportionally at higher risk of developing T2D than individuals with less severe obesity. Physicians should regularly monitor the weight of their patients with obesity.
Our results should be interpreted in light of the study's limitations. First and foremost, the use of a retrospective cohort design prevented us from understanding the causal effect of BMI on the risk of developing T2D. Second, although cases and controls were matched on broad demographic and clinical characteristics (except for BMI measures) while selecting them from the Geisinger Health System database, the distribution of demographic and baseline clinical characteristics between cases and controls in our final analytic sample was not balanced. We addressed this imbalance by adjusting for a detailed list of confounding factors, but the potential for unmeasured, and unadjusted, confounding in baseline characteristics may have remained. Another limitation is that Geisinger Health System data cannot capture health services provided outside of the system. Without available enrollment data, we considered individuals to have continuously received care in the Geisinger Health System during the 12-month pre-index period if they had activity recorded 365 or more days before the index date. This approach is unlikely, we feel, to introduce substantial bias since there is no evidence that the likelihood for individuals to seek care outside of the Geisinger system was correlated with T2D or obesity (personal communication, Christopher Still and Thomas Graf). A similar approach has been used in previous retrospective electronic health records database studies [36,37]. Significant clinical and/or body size differences between people with and without BMI measures may also serve as another potential source of bias. Finally, the findings are based on data from a single integrated health system caring for individuals in Pennsylvania and may not be generalizable to larger populations and to other regions in the US.

Conclusion
Using a large cohort of individuals with detailed electronic health records, we were able to show that not only is BMI strongly and independently associated with the risk of being diagnosed with T2D, but also that the magnitude of this positive association is larger for higher BMI values. Further research on the association between BMI and the risk of developing T2D should include the time to the incident T2D diagnosis and, if data are available, account for individuals' pre-diabetic status and the timing and duration of obesity.

Appendix. Sample Selection Criteria
Case and control individuals needed to satisfy the following criteria:
During their 12-month pre-index period they had to have:
• Continuously received care in the Geisinger Health System (defined by having encounters that occurred 365 or more days before the index date)
• Not be pregnant
• Not had bariatric surgery
• At least one valid BMI measurement
Cases and controls were excluded if they had:
• Any BMI measurement <18.5 kg/m² during the 12-month pre-index period
• Any evidence of type 1 diabetes, during the study period, defined by
  ▪ ICD-9-CM diagnosis codes of 250.x1 or 250.x3
  ▪ A medication order for insulin without a diabetes diagnosis code