Model development pipeline and top proteomic machine learning model features. (A) Outline of the pipeline used to predict myeloma by integrating proteomic and clinical data from the UK Biobank (UKB). Starting with 2920 potential proteomic predictors, a tree-based XGBoost algorithm, combined with SHAP values, was employed to rank and identify the top 10 predictors. These were then used to develop a proteomics Cox model. Clinical predictors, including age, sex, symptoms, and hematologic parameters, were used to develop a clinical Cox model. Finally, the top proteomic and clinical predictors were combined to create a combined Cox model. All models were evaluated on the test data set, with performance assessed using the C index and the time-dependent area under the receiver operating characteristic curve. This pipeline demonstrates how advanced machine learning can be combined with traditional modelling to enhance the prediction of myeloma. (B) A bar plot of the mean absolute SHAP values for the top 10 features. In the context of a model with a Cox-loss function, a SHAP value represents the marginal contribution of a feature to the log-relative hazard (ie, risk score) from baseline for an individual. This panel summarizes, for each feature, the average magnitude of its individual contributions to the model’s predictions. Features are ranked so that higher values indicate greater importance in influencing the model’s output, thereby showing which proteomic markers are most critical in ranking myeloma hazard. (C) A scatterplot (beeswarm plot) in which each dot represents an individual data point in the data set. The points are distributed horizontally along the x-axis according to their SHAP value. Where there is a high density of similar SHAP values, points are stacked vertically. The color of the dots reflects the feature value, with red indicating high feature values and blue indicating low feature values.
The plot provides a granular view of how each feature contributes to the prediction at an individual level. It shows the distribution of SHAP values for each feature, revealing how consistently (or inconsistently) a feature affects the model’s output across different data points. A wide range of SHAP values for a feature indicates a strong but varied impact on the model’s predictions, whereas a narrow range suggests a more uniform influence.
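The two summaries described in panels B and C can be illustrated with a minimal sketch. It assumes a small matrix of SHAP values (rows are individuals, columns are features); the feature names and all numbers are invented for illustration and are not taken from the study. The helper names `mean_abs_shap` and `shap_range` are hypothetical, not part of any library.

```python
# Toy SHAP matrix: rows are individuals, columns are features.
# All feature names and values are invented for illustration only.
feature_names = ["protein_a", "protein_b", "protein_c"]
shap_values = [
    [0.80, -0.10, 0.05],
    [-0.60, 0.20, 0.04],
    [0.70, -0.15, 0.06],
    [-0.50, 0.05, 0.05],
]


def mean_abs_shap(matrix):
    """Panel B summary: average |SHAP| per feature across individuals."""
    n_rows = len(matrix)
    return [sum(abs(v) for v in col) / n_rows for col in zip(*matrix)]


def shap_range(matrix):
    """Panel C spread: max minus min SHAP value per feature."""
    return [max(col) - min(col) for col in zip(*matrix)]


importance = dict(zip(feature_names, mean_abs_shap(shap_values)))
spread = dict(zip(feature_names, shap_range(shap_values)))

# Rank features from most to least important, as in the bar plot.
ranking = sorted(importance, key=importance.get, reverse=True)

for name in ranking:
    print(f"{name}: mean|SHAP|={importance[name]:.3f}, range={spread[name]:.2f}")
```

In this toy example, `protein_a` has both the largest mean absolute SHAP value and the widest range (a strong but varied effect), whereas `protein_c` has a narrow range (a small, uniform effect).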