Sample number | ML models | Refs. |
---|---|---|
Early diagnosis and prediction of diabetes | ||
T2DM | ||
15,005 subjects with age ≥ 3 | XGBoost, DNN, and RF | [63] |
1512 subjects | LR, RF, Naive Bayes (NB), SVM, XGBT, ANN, K-nearest neighbor (KNN), DT, Xception, ResNet50, DenseNet121, VGG16, VGG19, InceptionV3, and a stacking model of non-invasive variables and the ResNet50 model | [53] |
530 participants: 272 were diabetic patients and 258 were non-diabetic patients | Deep autoencoder learning algorithm with CNN networks and deep radial basis function neural network (RBFNN) classifier | [52] |
217 participants with diabetes, prediabetes and normal conditions | SVM, K-nearest neighbors, RF, XGBoost, hybrid feature selection-XGBoost | [91] |
2371 T1-weighted whole-body MRI data sets | DenseNet architecture | [54] |
8454 subjects over five years of follow-up | XGBoost, SVM, LR, RF, and ensemble algorithms | [64] |
16,429 men and non-pregnant women ≥ 20 years of age | ANN, LR, and RF models | [55] |
453,487 T2DM patients | Reverse engineering and forward simulation (REFS) | [124] |
82 obese women (40 non-diabetic and 42 diabetic) | Separability-correlation measure (SCM) and ANN | [57] |
13,309 Canadian patients | GBM and LR | [92] |
Kaggle diabetes dataset | RF | [58] |
1492 healthy individuals | SVM | [59] |
10 patients | LR, CNN, Multi-Layer Perceptrons (MLPs), and ensembling methods | [60] |
4870 subjects (2955 females and 1915 males) | Bayes classifier and LR | [79] |
768 individuals, 500 healthy and 268 with T2DM (UCI Machine Learning Repository: Pima Indians diabetes data set) | AIRS2 and MAIRS2 | [61] |
Pima Indian women | DT and LR | [65] |
2970 youth aged 12–19 years (NHANES dataset) | LR, LogitBoost, and decision tree | [66] |
746 subjects | SVM, XGBoost, RF, and their combinations | [62] |
GDM | ||
22,242 singleton pregnancies (3182 women developed GDM) | RF, logistic, decision tree, XGB, GDBT, LGB, AdaBoost, Vote, logistic regression with RCS and stepwise logistic regression | [75] |
490 pregnant women, 215 with GDM and 275 controls | SVM and light gradient boosting machine (lightGBM) | [76] |
588,622 pregnancies from 368,351 women | Gradient-boosting machine model constructed by decision-tree base-learners | [74] |
4378 cases | CSHM, BN, LR, CHAID tree, SVM, and NN | [77] |
152 women | AIRS | [80] |
4771 pregnant women in early gestation | Multivariate Bayesian logistic regression using Markov Chain Monte Carlo simulation algorithm | [81] |
All types of Diabetes | ||
2001 cases with diabetes (Kaggle dataset) | Filter-based DT (ID3) algorithm for feature selection and hold-out, K-fold, and LOSO for classification | [83] |
852,454 individuals with pre-diabetes | LightGBM | [86] |
1050 curves of glucose concentration of type 1 and type 2 diabetics | Double-Class AdaBoost | [87] |
268 females and 500 controls | Gaussian process (GP)-based classification approach | [88] |
5301 African Americans | RF | [89] |
268 females and 500 controls | Fuzzy c-means (FCM)- on adaptive network-based fuzzy inference system (ANFIS) | [90] |
Pima Indian Diabetes Dataset and Biostat Diabetes Dataset | RLEFRBS | [95] |
Prediction of blood glucose (BG) | ||
OhioT1DM dataset: six participants with T1D between 40 and 60 years old | SVM, extended tree classifier (ETC), and random forest classifier (RFC) | [121] |
IDIAB, OhioT1DM dataset, and T1DMS datasets | Fully convolutional neural network | [101] |
225 T1DM patients with 315,000 h of CGM data | Linear extrapolation, NNs, last observation carried forward, ensemble methods using LSBoost and bagging, one with error-weights, and one without error-weights | [122] |
Blood glucose concentration values over 180 h in diabetic patients, with CGM readings every 5 min | Multi-scale blood glucose prediction model (VMD-KELM-AdaBoost) | [102] |
OhioT1DM dataset | Autoregression with ARX model, ML-based regression models, and DL models including a TCN and a vanilla LSTM Network | [104] |
10 adult T1DM subjects, which were generated using the UVA/Padova T1D | Multi-layer convolutional recurrent neural network (CRNN) architecture | [105] |
OhioT1DM dataset | LSTM-based deep RNN | [107] |
104 people who had experienced at least one hypoglycemia alert value during a three-day CGM session | SVM using radial basis or linear functions, RF, LR, and K-nearest neighbor | [116] |
10 T1DM patients with continuous glucose monitoring system data points | A combination of AR, SVR, and ELM | [109] |
10 T1DM adults studied during 12 weeks | SVM and MLP | [119] |
10,000 users with more than 1 million nights of CGM data | RF | [120] |
26 participants | LSTM-NN-TF-DTW model | [143] |
8501 eligible participants | LASSO regression and RF | [130] |
124 CGM traces collected over 10 days | Autoregressive, autoregressive moving average, and autoregressive integrated moving average (ARIMA) models, and nonlinear machine-learning procedures (SVR, feed-forward neural network (fNN), regression random forest, and LSTM-NN) | [110] |
Six subjects suffering from T1DM aged between 23 and 52 (average 39 ± 10) | Jump Neural Network | [154] |
124 people (22,804 valid nights of data) with T1D | SVR | [117] |
463 people with T1DM | Linear discriminant analysis | [118] |
154 observations of in-clinic aerobic exercise in 43 adults with T1DM | Decision tree and Random forest | [126] |
16 children with T1DM | Extreme learning machine (ELM)-based neural network | [127] |
A training set with 8 patients (320 data points) and a testing set with 8 patients (269 data points) | ELM-trained feed-forward neural network | [128] |
24,331 adults | Bayesian scoring algorithm | [78] |
Ten male subjects with T1DM | Pattern classification algorithm | [125] |
27,050 adult individuals with no prior diagnosis of T2DM | XGBoost, RF, Glmnet, and LightGBM | [112] |
6.8 million data points | Combination of GBD and SVR | [131] |
10 patients using 70 mg/dL and 54 mg/dL as thresholds according to the consensus for Level 1 and Level 2 hypoglycemia | Developed SVM | [129] |
The health data associated with 18,691 ICU stays and 14,742 critical care patients (MIMIC-III database) | GBT | [132] |
29,601 entries from 47 different patients | SVR, WNN, KNN, RFR, GPR, ANN, and RR | [134] |
54,978 inpatients who had a minimum of 4 BG measurements and took a minimum of 1 U of insulin during hospitalization | RF classification, multivariable logistic regression, stochastic gradient boosting (SGB), and naive Bayes | [114] |
25 T1DM patients | RF | [135] |
OhioT1DM dataset | LR, vanilla LSTM, and BiLSTM | [137] |
Detection of blood glucose | ||
12 healthy subjects | Back-propagation neural network (BPNN) and multivariate polynomial regression | [151] |
540 patients with T2DM | Nonlinear and linear predictive algorithms | [152] |
2787 consecutive participants | Combination of elastic network with RF, SVM, and back-propagation artificial neural network (BP-ANN) algorithms as well as LR | [155] |
1772 paired data varying from 65 to 492 mg/dL and from 80 to 352 mg/dL | AdaBoost | [156] |
15 patients with T1DM under free-living conditions | RReliefF, RF, Gaussian, SVR | [157] |
EMR of 127 patients for the first 72 h of ICU care who, upon admission to the ICU, had a diagnosis of T1DM (N = 8), T2DM (N = 97), or a glucose value > 150 mg/dL (N = 22) | GBT | [133] |
Insulin resistance predicting models | ||
8842 Korean participants | LR, XGBoost, random forest, and ANN | [159] |
1344 samples | HOMA-IR model | [160] |
2433 T2DM patients | MIL-Boost | [161] |
968 patients not affected by T2DM (FIMMG_obs dataset) | TyG-er | [162] |
315 T1DM patients | MARSplines and ANN | [163] |
Determination of the start of treatment and its effect | ||
13,904 individuals with diabetes | LASSO | [164] |
100 virtual adult subjects | LASSO and MLR | [166] |
100 virtual subjects | GBT and RF | [165] |
87 patients | Reinforcement learning | [168] |
Two studies of similar design that enrolled patients who were treatment-naïve (study 1, n = 677) or receiving background metformin (study 2, n = 686) | RF and classification tree algorithms | [169] |
12,147 commercially-insured adults and Medicare Advantage beneficiaries with prediabetes or diabetes | RL both with and without regularization and/or stepwise feature selection, tree-based models, SVM, multivariate adaptive regression splines, and flexible discriminants | [170] |
1270 patients with T2DM | Weighted SVM | [171] |
100 virtual adults | Neural networks | [167] |
3029 patients | Logistic ML algorithm | [172] |
Risk assessment of Diabetes | ||
25,186 patients | Regularized and weighted RSF | [177] |
273,678 patients | DeepSurv and RSF | [178] |
11,000 persons | RF classifier | [179] |
15,928 Chinese adults without diabetes at baseline (DRYAD) | XGBoost | [180] |
1,832,270 cases of type 2 diabetes | Gradient boosting decision tree algorithm and LightGBM | [181] |
6025 participants | Naive Bayes approaches and LR | [182] |
40,124 patients from the GIANTT database | Ridge logistic regression, logistic regression with backward selection, LASSO LR, elastic net logistic regression, and RF | [183] |
36,652 eligible participants from the Henan Rural Cohort Study | Classification and regression tree (CART), RF, GBM, LR, SVM, and ANN | [184] |
997 subjects with CT scans and contextual EMR scores | Deep neural network | [185] |
17,658 in-patients with diabetes who underwent 32,758 admissions | LR and XGBoost | [186] |
10,464 diabetic patients | LR | [187] |
34 patients | Aggregation method | [188] |
1647 obese, hypertensive patients | KNN and RF | [189] |
800 T2DM patients | BN, ANN, CRT, CHAID, discriminant analysis, QUEST, and ensemble models | [190] |
112 patients over a range of 90 days | LR and RF | [191] |
Dietary and insulin dose modifications | ||
23 adults with newly diagnosed T2DM | Algorithm-based personalized postprandial-targeting | [194] |
100 adults under different realistic scenarios lasting three simulated months | Reinforcement learning | [195] |
Diabetes management | ||
12 subjects with T1DM | Linear discriminant analysis, ensemble learning, Gaussian process regression, KNN, SVM, decision trees, and deep neural networks with LSTM | [199] |
16,848 inpatients receiving subcutaneous insulin who achieved target blood glucose control of 100–180 mg/dL on a calendar day | A combination of RF, regularized regression, and GBT | [200] |
110 pediatric patients with T1DM | RF and quantile regression forest | [202] |
68,274 samples collected from 1119 subjects | Deep learning | [204] |
116 subjects | SVM | [205] |
D1NAMO dataset contains data for nine patients with T1DM | RNN-LSTM | [207] |
70 participants with T1DM | K-means clustering | [208] |
100 subjects over a two-month scenario | XBM | [209] |
250 24-h CGM plots | SVR and multilayer perceptrons | [210] |
15 patients with T1DM | SVR | [211] |
3 real subjects | Multiple boundaries and domain-based, density-based, reconstruction-based, and unsupervised models | [212] |