JISE




Journal of Information Science and Engineering, Vol. 37 No. 5, pp. 1025-1038


A Comparative Study of Machine Learning Models for Predicting Length of Stay in Hospitals


RACHDA NAILA MEKHALDI¹, PATRICE CAULIER¹, SONDES CHAABANE¹,
ABDELAHAD CHRAIBI² AND SYLVAIN PIECHOWIAK¹
¹Laboratory of Industrial and Human Automation Control
Mechanical Engineering and Computer Science
Polytechnic University of Hauts-de-France
Valenciennes CEDEX 9, 59313 France
E-mail: {rachdanaila.mekhaldi; patrice.caulier; sondes.chaabane; sylvain.piechowiak}@uphf.fr

²Alicante Company
Seclin, 59113 France
E-mail: abdelahad.chraibi@alicante.fr


Interest in accurately predicting the Length of Stay (LoS) in a hospital setting has grown in recent years. Estimating the LoS on a patient's admission helps hospitals with planning, controlling costs, and providing better services. In this paper, we treat LoS prediction as a regression problem for which we implement and compare different Machine Learning (ML) algorithms. Multiple Linear Regression (MLR), Support Vector Machines (SVM), Random Forests (RF), and Gradient Boosting Model (GBM) are implemented using an open-source dataset. The methodological process involves a preprocessing step combining data transformation, data standardization, and categorical data encoding. Moreover, the Synthetic Minority Over-Sampling Technique for Regression (SMOTER) is applied to handle unbalanced data. Then, the ML algorithms are employed, with a hyperparameter tuning phase to obtain the optimal parameter values. Finally, the Mean Absolute Error (MAE), R-squared (R²), and Adjusted R-squared (Adjusted R²) metrics are used to evaluate the tuned models.


Keywords: length of stay in hospitals, data preprocessing, machine learning, unbalanced data, parameters tuning
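
The three evaluation metrics named in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration using the standard textbook definitions, not the authors' implementation. The function names, the sample values, and the `n_features` parameter (the number of predictors used by the model) are assumptions introduced here for illustration.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted LoS."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Share of the variance in the observed LoS explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(
    y_true: Sequence[float], y_pred: Sequence[float], n_features: int
) -> float:
    """R-squared penalized for the number of predictors in the model."""
    n = len(y_true)
    r2 = r_squared(y_true, y_pred)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)


# Hypothetical lengths of stay in days (illustrative values, not from the paper).
observed = [2.0, 4.0, 6.0, 8.0, 10.0]
predicted = [3.0, 4.0, 5.0, 8.0, 11.0]

mae = mean_absolute_error(observed, predicted)
r2 = r_squared(observed, predicted)
adj_r2 = adjusted_r_squared(observed, predicted, n_features=2)
print(f"MAE={mae:.3f}  R2={r2:.3f}  Adjusted R2={adj_r2:.3f}")
```

Adjusted R² is never larger than R² when the model has at least one predictor, so it is the more conservative figure when comparing models that use different numbers of input features.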
