All patients included in our study were recruited between 2007 and 2011 from the Lithuanian High Cardiovascular Risk (LitHiR) primary prevention programme. This long-term programme has focused on employable-aged women (aged 50–65) and men (aged 40–55) without overt cardiovascular disease. Cardiovascular disease was defined as angina pectoris, known coronary stenosis, myocardial infarction, coronary artery bypass grafting, percutaneous coronary intervention, transient ischemic attack or stroke, and peripheral artery disease. As part of the programme, a two-level approach involving primary healthcare institutions (PHCIs) and specialized cardiovascular prevention units (CVPUs) was applied. Five secondary-level institutions with CVPUs participated in the LitHiR programme across Lithuania, including the Vilnius University Hospital Santariškių Klinikos. Participants of the first level of the programme were recruited in three ways. The first group consisted of people registered at PHCIs and invited by general practitioners to participate in the programme. The second group consisted of people who visited PHCIs for reasons other than cardiovascular problems. The third group included people who found out about the programme via local mass media. All participants had to match the programme criteria. After cardiovascular risk evaluation at the PHCI level, subjects for whom high cardiovascular risk was established were referred to the CVPUs (secondary level) for additional examination and a treatment plan. High cardiovascular risk was defined as having one or more of the following conditions: 1) a Systematic Coronary Risk Evaluation (SCORE) risk assessment of over 11, 2) diabetes, 3) metabolic syndrome, 4) positive family history of cardiovascular disease and/or 5) severe dyslipidemia.
A total of 385 of the 420 PHCIs in Lithuania (91.6%) took part in this programme. From 2006 to 2010, 266,391 patients were examined overall. Out of those patients, our cohort includes 2891 patients [1072 (37%) men and 1819 (63%) women] who were diagnosed with metabolic syndrome (MetS) and referred to the CVPU at the Vilnius University Hospital Santariškių Klinikos for additional assessment, risk stratification, and setting up of a prevention plan.
We carried out follow-up calls between January 2011 and August 2011 for 650 out of the 2891 subjects with MetS initially referred to the CVPU in Vilnius University Hospital Santariškių Klinikos. In making these follow-up calls, preference was given to subjects who were examined earlier in the programme. After we excluded four subjects whose follow-up periods were less than two years, the median follow-up period was 3.3 years. We also excluded 117 participants who already had diabetes at the baseline examination and 4 participants with missing information on their diabetic status. As a result, the final study cohort consisted of 525 individuals, with 187 (36%) men and 338 (64%) women.
The study was approved by the Local Ethics Committee of the Vilnius University Hospital Santariškių Klinikos.
Diagnosis of MetS
We diagnosed patients as having MetS if they met three or more of the revised National Cholesterol Education Program Adult Treatment Panel III (NCEP ATPIII) criteria [15
Waist circumference ≥ 102 cm in men, ≥ 88 cm in women
Triglycerides ≥ 1.7 mmol/L
High-density lipoprotein cholesterol < 1.03 mmol/L in men, < 1.29 mmol/L in women
Blood pressure (BP) ≥ 130/85 mmHg
Fasting plasma glucose (FPG) ≥ 5.6 mmol/L
We calculated the MetS score as the sum of MetS components present.
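The diagnostic rule and score above can be illustrated with a minimal Python sketch. The `Patient` class and its field names are hypothetical, introduced only for this illustration. The sketch assumes that the blood pressure criterion is met when either the systolic or the diastolic threshold is reached, and it covers only the thresholds listed above (any treatment-based criteria are not modeled).

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical record layout; field names are illustrative only.
    is_male: bool
    waist_cm: float
    triglycerides_mmol_l: float
    hdl_mmol_l: float
    systolic_mmhg: float
    diastolic_mmhg: float
    fpg_mmol_l: float


def mets_components(p: Patient) -> dict:
    """Return which of the five listed MetS components are present."""
    waist_cutoff = 102.0 if p.is_male else 88.0
    hdl_cutoff = 1.03 if p.is_male else 1.29
    return {
        "waist": p.waist_cm >= waist_cutoff,
        "triglycerides": p.triglycerides_mmol_l >= 1.7,
        "low_hdl": p.hdl_mmol_l < hdl_cutoff,
        # Assumption: either systolic or diastolic elevation counts.
        "blood_pressure": p.systolic_mmhg >= 130 or p.diastolic_mmhg >= 85,
        "glucose": p.fpg_mmol_l >= 5.6,
    }


def mets_score(p: Patient) -> int:
    """MetS score: the number of components present (0 to 5)."""
    return sum(mets_components(p).values())


def has_mets(p: Patient) -> bool:
    """MetS is diagnosed when three or more components are present."""
    return mets_score(p) >= 3
```

For example, a man with a waist circumference of 104 cm, triglycerides of 1.8 mmol/L, and FPG of 5.7 mmol/L, but normal HDL cholesterol and blood pressure, has a score of 3 and meets the definition.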
All participants in our study underwent a baseline examination, which included gathering information on their medical history, physical examination, risk profile and lifestyle assessment, evaluation of cardiovascular (CV) family history, 12-lead electrocardiogram (ECG), laboratory blood tests, and non-invasive assessment of arterial markers of subclinical atherosclerosis. Weight, height, and waist circumference were measured with the subject wearing light clothing and without shoes. BMI was calculated as weight in kilograms divided by the square of height in meters. Blood pressure was measured after the patient had rested for at least five minutes, using an oscillometric semiautomatic device (Schiller Argus VCM) with a standard bladder (12–13 cm long and 35 cm wide), validated against a standard mercury sphygmomanometer. We took at least one measurement on each arm, and additional measurements if the first two differed significantly. The higher value was taken as the reference; if more than two measurements were taken, the average of the two highest values was used. Assessment of arterial stiffness was carried out by applanation tonometry (Sphygmocor v.7.01, AtCor Medical).
Information about smoking and drug use was collected by a questionnaire. Current smoking was recorded if the subject smoked at least one cigarette a day. Positive CV family history was recorded if first-degree relatives of the patient had any CV events at a young age (men ≤ 45 years, women ≤ 55 years old).
Laboratory tests and assessment of glucose metabolism
Venous blood samples were collected after patients completed a 12-hour fast. Serum cholesterol [17, 18], triglycerides [19, 20], and plasma glucose concentrations were determined enzymatically. High-density lipoprotein cholesterol was analyzed by the Accelerator Selective Detergent method (Architect ci8200; Abbott Laboratories, Abbott Park, Illinois, USA). Low-density lipoprotein cholesterol was calculated with the Friedewald formula. High-sensitivity serum C-reactive protein (hs-CRP) was analyzed by a latex turbidimetric immunoassay kit (Architect ci8200; Abbott Laboratories, Abbott Park, Illinois, USA). Multigent HbA1c was determined by turbidimetric microparticle immunoinhibition assay (Architect ci8200; Abbott Laboratories, Abbott Park, Illinois, USA). Plasma fasting and oral glucose tolerance test (OGTT) insulin were measured by chemiluminescent microparticle immunoassay (CMIA) (Architect ci8200; Abbott Laboratories, Abbott Park, Illinois, USA). A standard 75-g OGTT was carried out after patients completed a 12-hour overnight fast. Plasma glucose and insulin concentrations were measured at 0 and 120 minutes. The examination protocol allowed the omission of OGTT, HbA1c, and fasting insulin tests for patients with FPG < 5.6 mmol/L.
We classified the subjects into categories of glucose tolerance using the WHO criteria. Normal glucose tolerance (NGT) was defined by fasting glucose <6.1 mmol/l and 2-h OGTT glucose <7.8 mmol/l. Impaired fasting glucose was defined by fasting glucose ≥6.1 mmol/l and <7.0 mmol/l and 2-h OGTT glucose <7.8 mmol/l. Impaired glucose tolerance was defined by fasting glucose <7.0 mmol/l and 2-h OGTT glucose between 7.8 and 11.0 mmol/l inclusive. Diabetes was defined by fasting glucose ≥7.0 mmol/l and/or 2-h OGTT glucose ≥11.1 mmol/l.
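The classification rules above can be written as a short decision function. The following Python sketch is illustrative only; the function name and the category labels are hypothetical, and it assumes glucose values in mmol/l reported to one decimal place, so that no value falls between 11.0 and 11.1.

```python
def classify_glucose_tolerance(fasting: float, two_hour: float) -> str:
    """Classify glucose tolerance from fasting and 2-h OGTT glucose (mmol/l).

    Checks run from the most to the least severe category, so each
    later branch can rely on the earlier thresholds not being met.
    """
    if fasting >= 7.0 or two_hour >= 11.1:
        return "diabetes"
    if two_hour >= 7.8:
        # Fasting < 7.0 and 2-h glucose between 7.8 and 11.0 inclusive.
        return "IGT"
    if fasting >= 6.1:
        # Fasting between 6.1 and < 7.0, 2-h glucose < 7.8.
        return "IFG"
    return "NGT"
```

Ordering the checks by severity keeps the categories mutually exclusive without repeating the upper bounds in every condition.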
Insulin resistance indices
In this study, we considered four surrogate indices for the assessment of insulin resistance (IR) or insulin sensitivity. The Homeostasis Model Assessment insulin resistance (HOMA-IR) index was calculated as fasting insulin [μU/ml] × FPG [mmol/l] / 22.5. The quantitative insulin-sensitivity check index (QUICKI) was calculated as 1/[log(fasting insulin [μU/ml]) + log(FPG [mg/dl])]. The Cederholm insulin sensitivity index (ISI), which represents peripheral insulin sensitivity, was calculated as ISICederholm = [75,000 + (G0 − G120) × 1.15 × 180 × 0.19 × weight] / [120 × Gmean × log(Imean)], where weight is body weight (kg), G0 and G120 are plasma glucose (mmol/l) concentrations at 0 and 120 minutes, and Gmean and Imean are the mean glucose (mmol/l) and insulin (mU/l) values calculated from values at 0 and 120 minutes. Finally, the Matsuda insulin sensitivity index, which reflects a composite estimate of hepatic and muscle insulin sensitivity, was calculated as ISIMatsuda = 10,000 / sqrt(G0 × I0 × G120 × I120) [26, 27], where G0, G120 and I0, I120 are the plasma glucose (mg/dl) and plasma insulin (μU/ml) concentrations, respectively, at 0 and 120 minutes.
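As an illustration, the four indices can be computed as in the following Python sketch. Several points are assumptions rather than statements from the text: logarithms are taken to base 10, glucose is converted from mmol/l to mg/dl with a factor of 18, and the Cederholm index is read in its usual published grouping, with everything after the division sign in the denominator. Function and parameter names are hypothetical.

```python
import math

MMOL_TO_MGDL = 18.0  # Assumed conversion factor for glucose.


def homa_ir(fasting_insulin: float, fpg_mmol: float) -> float:
    """HOMA-IR: fasting insulin (uU/ml) x FPG (mmol/l) / 22.5."""
    return fasting_insulin * fpg_mmol / 22.5


def quicki(fasting_insulin: float, fpg_mmol: float) -> float:
    """QUICKI: 1 / [log10(insulin, uU/ml) + log10(FPG, mg/dl)]."""
    fpg_mgdl = fpg_mmol * MMOL_TO_MGDL
    return 1.0 / (math.log10(fasting_insulin) + math.log10(fpg_mgdl))


def isi_cederholm(weight_kg, g0, g120, i0, i120):
    """Cederholm ISI; glucose in mmol/l, insulin in mU/l.

    Assumptions: base-10 logarithm; denominator is 120 x Gmean x log(Imean).
    """
    g_mean = (g0 + g120) / 2.0
    i_mean = (i0 + i120) / 2.0
    numerator = 75000.0 + (g0 - g120) * 1.15 * 180.0 * 0.19 * weight_kg
    return numerator / (120.0 * g_mean * math.log10(i_mean))


def isi_matsuda(g0_mmol, g120_mmol, i0, i120):
    """Matsuda ISI as given in the text, from 0- and 120-minute values only."""
    g0 = g0_mmol * MMOL_TO_MGDL
    g120 = g120_mmol * MMOL_TO_MGDL
    return 10000.0 / math.sqrt(g0 * i0 * g120 * i120)
```

With a fasting insulin of 10 μU/ml and an FPG of 5.0 mmol/l, `homa_ir(10, 5.0)` gives 50/22.5, or about 2.22.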
We conducted descriptive statistics on the study cohort at the baseline; we calculated the mean and standard deviation (SD) for the continuous variables and the frequency and proportion for the categorical variables. The investigated set of variables included: age, gender, smoking status (never, former, current), BMI, waist circumference, weight, FPG, HbA1c, fasting plasma insulin, OGTT glucose, OGTT insulin, serum triglycerides, total cholesterol, HDL cholesterol, LDL cholesterol, lipid treatment (1=yes, 0=no), hs-CRP, aortic and radial pulse wave velocity (aPWV, rPWV), aortic augmentation index adjusted for heart rate 75 beats per minute (AIx@75), mean arterial pressure (MAP), MetS score, HOMA-IR, QUICKI, ISIMatsuda, and ISICederholm.
We measured the association between each variable and the development of T2DM by calculating gender-adjusted odds ratios (ORs). We initially included the gender variable in any set of predictors tested. We investigated the dependency between variables and their cumulative contribution to the prediction based on their combined logistic regression model. P values are based on two-sided tests with a cutoff for statistical significance of 0.05. To address the inherent problem of multiple hypothesis testing, we applied the Bonferroni correction, multiplying each P value by the number of independent tests.
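The Bonferroni adjustment described above amounts to one multiplication per test. A minimal Python sketch follows; the function name is hypothetical, and capping the adjusted value at 1 is a common convention assumed here, not something stated in the text.

```python
def bonferroni_adjust(p_values):
    """Multiply each P value by the number of tests, capped at 1.0."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag tests whose adjusted P value is below the cutoff."""
    return [p_adj < alpha for p_adj in bonferroni_adjust(p_values)]
```

With three tests and raw P values of 0.01, 0.2, and 0.5, the adjusted values are 0.03, 0.6, and 1.0, so only the first remains below the 0.05 cutoff.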
We performed all tests on complete data; that is, excluding those patients with data missing for the relevant variables. We used Little’s missing completely at random (MCAR) test to identify systematic differences between the missing values and the observed values. A significant P value in Little’s MCAR test, indicating the existence of such systematic differences, means that it is plausible that data are missing at random (MAR), but not completely at random (MCAR). In these cases, since restricting analyses to complete cases can introduce bias, we validated the results using multiple imputation [29, 30]. We used the fully conditional specification imputation method, as implemented in the SPSS MULTIPLE IMPUTATION command, to generate 20 complete datasets. We then pooled the results of the analyses across the imputed datasets using Rubin’s Rules [30, 32].
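Pooling by Rubin’s Rules can be sketched in a few lines of Python. This is an illustration of the general rule, not the SPSS implementation; the function name is hypothetical, and it assumes that each imputed dataset yields one point estimate (for odds ratios, typically the log-OR) with its standard error.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool per-imputation estimates and standard errors by Rubin's Rules.

    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    # Within-imputation variance: average of the squared standard errors.
    within = mean([se ** 2 for se in std_errors])
    # Between-imputation variance: sample variance of the estimates.
    between = variance(estimates)
    total = within + (1.0 + 1.0 / m) * between
    return pooled_estimate, math.sqrt(total)
```

The term (1 + 1/m) inflates the between-imputation variance to account for the finite number of imputations (m = 20 in this study).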
In a separate analysis, we considered the tested variables using a stepwise algorithm that automatically selected variables for a multivariate logistic regression model. This method used the Bayesian Information Criterion (BIC), which assesses model fit based on a log-likelihood function. The model with the lowest BIC value is preferred. We took a “forward” approach, starting with a model initialized with the gender variable, adding at each step the one variable that maximally reduced the BIC statistic, and terminating when the BIC statistic stopped decreasing. We estimated the accuracy of the predictive models using leave-one-out cross-validation; that is, each subject in turn was used as a validation set, while the remaining subjects were used to generate the model. We assessed the predictive discrimination of the model using the receiver-operating characteristic (ROC) curve of the scores of all subjects by plotting the sensitivity against the corresponding false positive rate. We used the area under the ROC curve, calculated by the trapezoidal rule, to measure how well a model predicts the development of T2DM. The model generation involved a preliminary step of data imputation for missing values using mean values. We also ran an alternative analysis using K-nearest-neighbors data imputation, which yielded similar results; only the mean imputation results are presented.
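The area under the ROC curve by the trapezoidal rule can be sketched as follows. This Python illustration is not the MATLAB code used in the study; the function name is hypothetical, and it assumes that `labels` holds 1 for subjects who developed T2DM and 0 otherwise, and that `scores` holds each subject's held-out prediction from leave-one-out cross-validation.

```python
def roc_auc_trapezoid(labels, scores):
    """Area under the ROC curve, integrated with the trapezoidal rule."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("Both outcome classes are required.")

    # Sweep the decision threshold from the highest score downwards.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    tp = fp = 0
    prev_fpr = prev_tpr = 0.0
    auc = 0.0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Subjects with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr = fp / n_neg  # false positive rate
        tpr = tp / n_pos  # sensitivity
        # Trapezoid between the previous and the current ROC point.
        auc += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
        prev_fpr, prev_tpr = fpr, tpr
    return auc
```

Grouping tied scores before adding a trapezoid means that a model giving every subject the same score obtains an area of 0.5, as expected for a non-informative predictor.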
All statistical and modeling analyses were performed using MATLAB 7.13 (R2011b) and SPSS Statistics 19.0.0.