Imputation-Enhanced Prediction of Septic Shock in ICU Patients

Sepsis and septic shock are potentially fatal complications that frequently occur in intensive care unit (ICU) patients. The ability to predict which patients are at risk for sepsis and septic shock is therefore crucial to limiting the effects of these complications. Potential indicators of sepsis risk are scattered across a wide range of clinical measurements, including high-temporal-resolution physiological waveforms, X-rays, and gene expression levels, which makes prediction a non-trivial problem. As a result, previous work on sepsis prediction has used very small, carefully curated datasets with limited applicability. Recently, however, a large, rich ICU dataset called MIMIC-II has been made publicly available, providing an opportunity for more extensive modeling of this problem. Such a large dataset, however, inevitably comes with a substantially higher amount of missing data. In this paper, we investigate how different imputation methods can overcome the handicap of missing information while leveraging such a large dataset. Our results show that imputation approaches in conjunction with predictive modeling lead to a moderate boost in the accuracy of sepsis risk prediction and a large improvement in the prediction of septic shock, even when one is restricted to using only non-invasive measurements. Our models can be applied to any ICU patient and lead to a generalized approach for predicting sepsis-related complications.