Exploring Preprocessing Techniques for Prediction of Risk of Readmission for Congestive Heart Failure Patients

Congestive Heart Failure (CHF) is one of the leading causes of hospitalization, and studies show that many of these admissions are readmissions within a short window of time. Identifying CHF patients who are at greater risk of hospitalization can guide the implementation of appropriate plans to prevent these readmissions. Developing predictive models for such disease-related risk of readmission is extremely challenging in healthcare informatics. It requires integrating socio-demographic factors, health conditions, disease parameters, hospital care quality parameters, and a variety of variables specific to healthcare providers, which makes the task immensely complex. This work, in collaboration with experts from Multicare Health Systems (MHS), describes a soon-to-be-deployed prototype that predicts the risk of readmission within 30 days of discharge for CHF patients at MHS. We focus on the data extraction and data preprocessing steps that improve prediction outcomes, including feature selection, missing value imputation, and data balancing. We perform comprehensive empirical evaluations using the real-world healthcare data set provided by MHS. Our empirical evaluation demonstrates that our approach outperforms one of the closest competing previous results.
Workshop on Data Mining for Healthcare (DMH), Chicago, IL
S.-C. Chin, N. Meadem, N. Verbiest, K. Zolfaghar, J. Agarwal, and S. Basu Roy