Machine learning-based interpretation and visualization of nonlinear interactions in prostate cancer survival

Introduction

A short introduction to SHAP values:

When applied to the prediction of overall survival in cancer patients, machine learning models take in an input of predictive variables and output a prediction of mortality risk. Until recently, the ability to interpret how these predictions are made was very limited due to complex modeling of nonlinear relationships and interaction effects.

SHAP values, described by Lundberg et al., represent a unified framework to interpret machine learning predictions. Essentially, a SHAP value represents how much weight was placed on a variable in the risk prediction of a given patient, with positive or negative values indicating the direction of effect.

For example, a 80-year-old patient with a positive SHAP value of 0.4 for age indicates that this patient’s age contributed to prediction of greater mortality for this patient relative to other patients in our cohort, when controlling for the other variables in the model. A different 40-year-old patient might have a negative SHAP value of -0.2 for age, indicating that this patient’s young age predicted for a better prognosis.

The easiest way to illustrate is with a visualization as shown in the next section.

Visualizing nonlinear relationships between risk factors and survival

We first trained a gradient-boosted tree model using the open-source xgboost library and generated SHAP values based on the model predictions. Data processing steps are omitted for purposes of brevity

In [ ]:
import xgboost as xgb
import shap

gbt = xgb.XGBRegressor(objective='survival:cox',
                       n_estimators=400,
                       learning_rate=0.01,
                       max_depth=4,
                       n_jobs=4,
                       silent=True,
                       subsample=0.8,
                       colsample_bytree=1)

gbt.fit(X_train, y_train_xgb, verbose=True)

shap_values = shap.TreeExplainer(gbt).shap_values(X_train)
shap_interaction_values = shap.TreeExplainer(gbt).shap_interaction_values(X_train)

We then investigated consistency with clinical intuition in visualizing nonlinear relationships. We examine the relationship between age and mortality risk in prostate cancer.

Intuitively, we know that this relationship should be nonlinear. A 40-year-old patient and a 50-year-old patient, controlling for other factors such as PSA and Gleason score, should be relatively similar in terms of mortality risk. In contrast, a 70-year-old and an 80-year-old should have substantially different mortality risk. This relationship is well captured by the SHAP value graph:

In [2]:
shap.dependence_plot("AGE", shap_values, X_train, display_features=X_display)
Out[2]:

Visualizing the same relationship using conventional linear models such as Cox proportional hazards regression is difficult. The typical approach would be to arbitrarily define bins (such as age 41-50, age 51-60, etc.) and calculate relative hazard ratios. However, this approach is cumbersome to implement, difficult to effectively visualize, and the results are generally not robust to changes in the bin thresholds. SHAP values provide a much more elegant approach.

SHAP values can also help supplement clinical intuition for risk stratifying patients. For example, percent positive cores of 50% is considered the cut-off for unfavorable versus favorable intermediate-risk prostate cancer. A clinician may wonder: is there a steep change between a patient with 47% positive cores versus 53% positives cores?

In [4]:
fig = shap.dependence_plot("PPC", 
                           shap_values[X_train['cGleason'].isin([6, 7])], 
                           X_train[X_train['cGleason'].isin([6, 7])], 
                           interaction_index="cGleason", 
                           display_features=X_display[X_train['cGleason'].isin([6, 7])], 
                           show=False)

plt.savefig('shapppc.png')
Out[4]:

Using SHAP values we can see that the relationship is fairly linear. Although there does appear to be an inflection point near 50%, the average SHAP value for a patient with 53% positive cores is fairly similar to the average SHAP value of a patient with 47% positive cores.

Modeling nonlinear interaction effects between PSA and Gleason score

We next apply SHAP interaction values, which isolate the impact of interaction effects on predictions, to visualize nonlinear interactions between variables. We use the recently published example of low Prostate-specific Antigen (PSA), High-grade prostate cancer, which has been identified as a unique entity with particularly poor prognosis (Mahal et al. Eur Urol. 2018).

By plotting the SHAP interaction values, we can rapidly visualize that patients with low PSA and Gleason 9-10 have high SHAP values (higher mortality) with a rapidly sloping U-shaped distribution. The same interaction effect is not visualized for Gleason 7-8.

In [5]:
shap.dependence_plot(("PSA", "cGleason"), 
                     shap_interaction_values[X_train['cGleason'] != 6], 
                     X_train[X_train['cGleason'] != 6], 
                     display_features=X_display[X_train['cGleason'] != 6], 
                     interaction_index="cGleason", 
                     show=False)
Out[5]:

Modeling nonlinear interaction effects between percent positive cores and Gleason score

The same application of SHAP interaction values can reveal new insights. For example, percent positive cores are generally used to risk stratify patients in intermediate-risk prostate cancer (with Gleason 6-7 disease). However, plotting the SHAP interaction values demonstrates that an even more pronounced interaction exists between Gleason score and percent positive cores in patients with Gleason 8+ disease.

In [6]:
shap.dependence_plot(("PPC", "cGleason"), 
                     shap_interaction_values, 
                     X_train, 
                     display_features=X_display, 
                     interaction_index="cGleason", 
                     show=False)
Out[6]:

Subsequent confirmatory analysis using linear models supported this finding: using Kaplan-Meier estimates, 5-year OS was 87.7% in Gleason 8+ patients with PPC < 50% versus 77.2% in patients with PPC≥50% (p < 0.001), compared to 89.1% versus 86.0% in Gleason 7 patients (p < 0.001). The Cox interaction term between PPC≥50% and Gleason score 8+ was highly significant (p < 0.001).

Future studies might investigate this interaction further and identify subsets of high-risk prostate cancer patients for which therapy could be intensified or de-intensified.