Using SHAP and LIME to Explain Machine Learning Models Predicting Comorbid Depression and Stroke From Daily Dietary Nutrient Intake in a US Population-Based Study

By: Hongwei Liu, Minghui Wu, Peng Wei et al.
Date: 2026

This population-based study used data from the US National Health and Nutrition Examination Survey (NHANES, 2005–2018) to investigate how daily dietary nutrient intake relates to the co-occurrence of depression and stroke in adults aged 50 years and older. Machine learning models were developed to predict this comorbidity from 46 dietary components together with demographic and clinical factors. Weighted quantile sum regression did not identify a statistically significant overall nutrient mixture effect, but the machine learning models, in particular a Random Forest, showed high discriminative performance (AUC = 0.945). Explainable artificial intelligence techniques, including SHAP and LIME, gave consistent nutrient-specific signals, identifying vitamin B1, vitamin B12, vitamin C, zinc, caffeine, and alcohol-related intake as influential predictors of comorbidity risk. The findings indicate that integrating dietary data with interpretable machine learning can make risk prediction more transparent and help generate hypotheses for future longitudinal and intervention studies aimed at reducing neurovascular and mental health comorbidities.
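
To make concrete what a SHAP-style attribution reports for a single prediction, the following is a minimal sketch of exact Shapley values computed by brute force over all feature coalitions. It is not the study's pipeline: the scoring function `toy_risk`, its coefficients, the feature names, and the intake values are all hypothetical stand-ins chosen for illustration, and features outside a coalition are simply set to a reference value, which is one common simplification. The SHAP library uses faster, model-specific algorithms to obtain the same kind of quantity.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions of predict(x) relative to predict(baseline).

    Features outside a coalition are held at their baseline value.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                # Coalition members take the individual's value; the rest stay at baseline.
                without = {f: (x[f] if f in subset else baseline[f]) for f in names}
                with_feature = dict(without)
                with_feature[name] = x[name]
                total += weight * (predict(with_feature) - predict(without))
        phi[name] = total
    return phi


def toy_risk(row):
    """Hypothetical risk score (NOT the study's Random Forest)."""
    score = 0.30
    score -= 0.05 * row["vitamin_b1_mg"]
    score -= 0.01 * row["zinc_mg"]
    # Hypothetical interaction: caffeine only matters when zinc intake is low.
    if row["zinc_mg"] < 8:
        score += 0.002 * row["caffeine_mg"]
    return score


# Hypothetical reference intake and one hypothetical individual.
reference = {"vitamin_b1_mg": 1.5, "zinc_mg": 10.0, "caffeine_mg": 150.0}
person = {"vitamin_b1_mg": 0.8, "zinc_mg": 6.0, "caffeine_mg": 300.0}

phi = shapley_values(toy_risk, person, reference)
for feature, contribution in phi.items():
    print(f"{feature}: {contribution:+.4f}")
print("sum of contributions:", sum(phi.values()))
print("prediction difference:", toy_risk(person) - toy_risk(reference))
```

The per-feature contributions sum to the difference between the individual's score and the reference score, which is the property that lets SHAP plots decompose a single prediction into nutrient-level effects. LIME takes a different route, fitting a simple local surrogate model around the individual, so agreement between the two methods, as reported here, is a useful consistency check rather than a guarantee.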