Scientific reports
9 Oct 2024 Predicting SARS-CoV-2 infection among hemodialysis patients using deep neural network methodsCOVID-19 has a higher rate of morbidity and mortality among dialysis patients than the general population. Identifying infected patients early with the support of predictive models helps dialysis centers implement concerted procedures (e.g., temperature screenings, universal masking, isolation treatments) to control the spread of SARS-CoV-2 and mitigate outbreaks. We collect data from multiple sources, including demographics, clinical, treatment, laboratory, vaccination, socioeconomic status, and COVID-19 surveillance. Previous early prediction models, such as logistic regression, SVM, and XGBoost, require sophisticated feature engineering and need improved prediction performance. We create deep learning models, including Recurrent Neural Networks (RNN) and Convolutional Neural Networks (CNN), to predict SARS-CoV-2 infections during incubation. Our study shows deep learning models with minimal feature engineering can identify those infected patients more accurately than previously built models. Our Long Short-Term Memory (LSTM) model consistently performed well, with an AUC exceeding 0.80, peaking at 0.91 in August 2021. The CNN model also demonstrated strong results with an AUC above 0.75. Both models outperformed previous best XGBoost models by over 0.10 in AUC. Prediction accuracy declined as the pandemic evolved, dropping to approximately 0.75 between September 2021 and January 2022. Maintaining a 20% false positive rate, our LSTM and CNN models identified 66% and 64% of positive cases among patients, significantly outperforming XGBoost models at 42%. We also identify key features for dialysis patients by calculating the gradient of the output with respect to the input features. By closely monitoring these factors, dialysis patients can receive earlier diagnoses and care, leading to less severe outcomes. Our research highlights the effectiveness of deep neural networks in analyzing longitudinal data, especially in predicting COVID-19 infections during the crucial incubation period. These deep network approaches surpass traditional methods relying on aggregated variable means, significantly improving the accurate identification of SARS-CoV-2 infections.