Authors: David D. Williams, Sarina Dass, Jon Bass, Susana Patton, Sanjeev Mehta, Ryan McDonough, Colin Mullaney, Leonard D’Avolio, Mark Clements
Preventing dangerous and costly episodes of diabetic ketoacidosis (DKA) is a goal of diabetes care, but clinicians lack tools to predict DKA events. We compared the performance of a recurrent neural network (RNN) model with that of a logistic regression (LR) model in predicting hospital admission for DKA within the next 180 days.
We developed an RNN model using training (n=1453) and testing (n=1530) datasets restricted to youth <18 years old who had had type 1 diabetes (T1D) for >30 days. In this sample, 6% (90/1530) of youth were admitted for DKA over 180 days. The model considered >500 features per quarter, drawn from discrete medical record data and free-text clinical documents. For comparison, we developed an LR model using a testing dataset derived from the same sample (n=1388). The LR model included 10 evidence-based variables identified through critical review of the literature: number of previous DKA episodes (p<0.001), most recent A1c (p<0.001), age, gender, race, ethnicity, duration of T1D, days since most recent A1c, multiple daily injections vs. pump therapy, and public vs. private insurance.
We used each model's output to estimate the probability of DKA admission. The RNN yielded an area under the curve (AUC) of 0.77, while LR yielded an AUC of 0.79. For both models, we generated rank-ordered lists of the 90 youth with the highest probability of DKA admission and measured performance at different rank thresholds within those lists. For ranks 1-5, the RNN yielded precision (Pr)=100% (all 5 youth on the list experienced a DKA admission) and LR yielded Pr=80%. For ranks 1-10, the RNN yielded Pr=100% and LR yielded Pr=70%. For ranks 1-25, the RNN yielded Pr=48% and LR yielded Pr=44%.
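The two kinds of metric reported above, precision within the top k ranks and AUC, can be illustrated with a minimal sketch. It assumes each youth has a predicted probability and a binary admission outcome. The function names and the six-patient toy data are hypothetical and are not taken from the study, and AUC is assumed here to mean the area under the ROC curve, computed by pairwise comparison.

```python
from typing import Sequence


def precision_at_k(probs: Sequence[float], admitted: Sequence[int], k: int) -> float:
    """Fraction of the k highest-risk patients who were actually admitted."""
    if not 0 < k <= len(probs):
        raise ValueError("k must be between 1 and the number of patients")
    # Rank patient indices by predicted probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return sum(admitted[i] for i in ranked[:k]) / k


def auc(probs: Sequence[float], admitted: Sequence[int]) -> float:
    """Probability that a random admitted patient is scored above a random
    non-admitted patient (ties count as half)."""
    pos = [p for p, y in zip(probs, admitted) if y == 1]
    neg = [p for p, y in zip(probs, admitted) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one admitted and one non-admitted patient")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: predicted probabilities and observed outcomes
# (1 = admitted for DKA within the window, 0 = not admitted).
toy_probs = [0.91, 0.85, 0.40, 0.77, 0.10, 0.62]
toy_admitted = [1, 1, 0, 0, 0, 1]

print(precision_at_k(toy_probs, toy_admitted, 2))
print(precision_at_k(toy_probs, toy_admitted, 4))
print(auc(toy_probs, toy_admitted))
```

Precision at k looks only at the top of the list, whereas AUC summarizes ranking quality across all patients, so a model can lead on one metric and trail on the other.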
In terms of precision, an RNN model performs favorably relative to LR in predicting future DKA admission for a rank-ordered list of youth. Clinicians can use patient lists rank-ordered by DKA risk to identify youth with T1D for intensive intervention.