• A novel CHARM was developed and validated in older patients to predict NRM and survival after allogeneic transplants.

• CHARM performed better than HCT-comorbidity index per DCA for NRM.

Abstract

Allogeneic hematopoietic cell transplantation (allo-HCT) is potentially curative for older adults with hematologic malignancies. Concerns about nonrelapse mortality (NRM) in older adults limit allo-HCT utilization. We executed a prospective, observational study BMT-CTN 1704 (Blood and Marrow Transplant Clinical Trials Network) enrolling allo-HCT recipients aged ≥60 years from 49 centers in the United States. We analyzed associations between 13 measurements of older adult health and NRM within 1 year to construct a comprehensive health assessment risk model (primary-CHARM) using a multivariate Fine-Gray model and grouped penalized variable selection. Two machine learning (ML) models (Cox and pseudo-value boosting) were also explored. Models’ performances were compared using area under the curve (AUC), with bootstrap and cross-validation sampling to correct for optimism, decision curve analysis (DCA), calibration, and Brier scores. Among 1105 patients with median age of 67 (range, 60-82) years who received allo-HCT, NRM was 14.4% and overall survival (OS) 71.7% at 1 year. Factors statistically selected for inclusion in primary-CHARM were higher comorbidity burden, lower albumin, higher C-reactive protein, older age, higher weight-loss percentage, lower patient-reported performance score, and cognitive impairment. Primary-CHARM scores were independently associated with higher NRM (hazard ratio [HR], 2.72; P < .0001) and worse OS (HR, 2.09; P < .0001). Bootstrap bias–corrected AUC for primary-CHARM was 0.591. In comparisons of primary-CHARM with the HCT-comorbidity index and 2 ML-CHARM models, calibration, Brier score, and DCA favored primary-CHARM. Primary-CHARM, with mostly simple and readily available parameters, risk stratifies older adults for allo-HCT. Adopting primary-CHARM in practice may promote broader use of HCT by quantifying risk and enhance the design of strategies to improve outcomes. This trial was registered at www.ClinicalTrials.gov as #NCT03992352.

Hematologic malignancies, such as acute myeloid leukemia, are more frequent among adults 60 years of age or older.1 Outcomes of these older adults are generally worse than those of younger patients.2 Allogeneic hematopoietic cell transplantation (allo-HCT) provides a potential cure for hematologic malignancies,3 with continued evidence of improving allo-HCT outcomes in older adults.4 Yet, only a small fraction of older adults with hematologic malignancies are offered allo-HCT,5 indicating uncertainty about the benefit of allo-HCT in this population.6,7 Older age is one of the largest barriers to referral for allo-HCT.8 One method to address this uncertainty is by optimizing methods of prognostic evaluation for nonrelapse mortality (NRM).

Before this study, an HCT-specific comorbidity index (HCT-CI) was widely used to risk stratify patients.9 The 2014 Blood and Marrow Transplant Clinical Trials Network (BMT-CTN) State of the Science Symposium highlighted that the optimal care for older allo-HCT recipients should include a comprehensive prognostic assessment that considers other potentially important patient-specific risk factors.10 Parameters included in geriatric assessments have been found to be linked to HCT outcomes,6,11 and such assessments are, in general, strongly recommended by international geriatric associations for older adults with cancer.12 Specifically, physical function, cognition, and gait speed are suggested to affect mortality.6,13-15 Serum laboratory biomarkers such as C-reactive protein (CRP) and albumin are also independently associated with NRM.16,17 

Here, we report the results of the first large prospective national study, BMT-CTN 1704, executed to build and validate a novel comprehensive health assessment risk model (CHARM) from relevant prognostic parameters. The aim is to improve prediction of NRM and other outcomes among older recipients of allo-HCT that could enhance patient counseling and improve outcomes.

We followed the Enhancing the QUAlity and Transparency of Health Research (EQUATOR) reporting guidelines18 that use the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) reporting criteria for observational studies.19 Likewise, we followed the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) guidelines.20

Study design, setting, and participants

This was a multicenter (n = 49), prospective, observational clinical trial (ClinicalTrials.gov identifier: NCT03992352) among potential recipients of allo-HCT, who were ≥60 years of age.

Primary objective

To determine the set of assessments and biomarkers that could together constitute a robust and valid composite health risk model for accurate personalized estimation of NRM. We report on this outcome here.

Secondary objectives

To determine the association of CHARM with differences in overall survival, frailty-free survival, disability, skilled-facility admission, quality of life and acute and chronic graft-versus-host disease (GVHD), serious organ toxicities, and survival after acute GVHD. We only report on overall survival here.

The trial was approved by the National Marrow Donor Program institutional review board. All participants provided written informed consent.

Inclusion criteria were patients (1) aged ≥60 years; (2) who had a diagnosis of a hematologic malignancy; (3) eligible for allo-HCT per institutional standards; (4) able to speak and read English, Spanish, and/or Mandarin; and (5) willing and able to provide informed consent. Exclusion criterion was previous HCT.

There were no racial/ethnic/gender expectations for this study. Furthermore, there were no restrictions for choice of conditioning regimen, stem cell source, GVHD prophylaxis regimen, or donor type.

Additional details are provided in the supplemental Material and supplemental Tables 1-3.

Variables, data sources/measurement, quantitative and categorical variables

Primary end point

The primary end point was 1-year NRM. NRM was defined as death without relapse or progression of the primary hematologic malignancy. Relapse or progression was a competing risk.
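The competing-risk definition above can be illustrated with a nonparametric cumulative incidence estimate. The following is a minimal sketch of the Aalen-Johansen estimator, not the study's analysis code; the cause coding (0 = censored, 1 = NRM, 2 = relapse/progression), the function name, and the toy follow-up data are assumptions made for illustration.

```python
def cumulative_incidence(times, causes, horizon, cause=1):
    """Aalen-Johansen cumulative incidence of `cause` by `horizon`.

    causes: 0 = censored, 1 = NRM, 2 = relapse/progression
    (this coding is an assumption for the sketch).
    """
    data = sorted(zip(times, causes))
    at_risk = len(data)
    event_free = 1.0  # probability of no event of any type before the current time
    cif = 0.0
    i = 0
    while i < len(data) and data[i][0] <= horizon:
        t = data[i][0]
        tied = [c for u, c in data if u == t]  # everyone leaving the risk set at t
        d_cause = sum(1 for c in tied if c == cause)
        d_any = sum(1 for c in tied if c != 0)
        # Only patients still event-free can experience the event of interest.
        cif += event_free * d_cause / at_risk
        event_free *= 1.0 - d_any / at_risk
        at_risk -= len(tied)
        i += len(tied)
    return cif


# Hypothetical follow-up in months for 8 patients (not study data).
times = [2, 3, 5, 6, 8, 12, 12, 12]
causes = [1, 2, 0, 1, 2, 0, 0, 0]
nrm_1y = cumulative_incidence(times, causes, horizon=12, cause=1)
relapse_1y = cumulative_incidence(times, causes, horizon=12, cause=2)
print(round(nrm_1y, 3), round(relapse_1y, 3))
```

Treating relapse as a competing risk rather than as censoring matters because a patient who relapses is no longer at risk of NRM as defined here; a Kaplan-Meier estimate that censored relapses would overstate the NRM incidence.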

Secondary end point

The secondary end point was 1-year OS, with the event for this end point defined as death from any cause, and the time to the event was the time interval between date of transplant and death, with patients censored at last follow-up or 1 year, whichever was first.

Exposure

Exposure was receiving allo-HCT regardless of type of conditioning regimen, donor, donor-recipient HLA matching, stem cell source, or disease status. All patients receiving allo-HCT were considered evaluable participants for analyses.

The variables considered for CHARM and studies supporting the rationale of their use

The variables are the following: patient age at allo-HCT21-24; HCT-CI score25,26; serum albumin level16,17; CRP17; cognition level per the Montreal Cognitive Assessment (MoCA)6,27,28; percentage of weight loss over the preceding year29,30; scoring on patient-reported Karnofsky performance status (KPS)31-33; 4-meter gait speed6,34-36; Patient-Reported Outcomes Measurement Information System (PROMIS) physical function scale13,37,38; Instrumental Activities of Daily Living36,39; number of falls in the preceding 6 months40; scoring on PROMIS Depression scale6,36,37,41,42; and number of prescribed medications.40,43 Information about all CHARM variables was collected within the 2 to 3 weeks before start of conditioning for allo-HCT. Additional details, including handling of missing data, are provided in the supplemental Material.

Before allo-HCT, a survey assessed the treating HCT physicians’ estimates of their respective patients’ chances of 1-year survival, inspired by a previously used questionnaire.44 

Statistical methods

Study size and sample calculation

Study size was based on targeting an events-per-variable ratio (EPV) of 12, considering an EPV of 10 to 15 as guidance for building prediction models with time-to-event outcomes.45 We projected the EPV of 12 using an NRM rate of 22% at 1 year based on historical data reported to the Center for International Blood and Marrow Transplant Research (CIBMTR) for patients aged ≥60 years from 2012 to 2016 receiving allo-HCT. Planning 16 variables in the NRM model (13 health variables and 3 adjustment variables) required a sample size of 880 subjects to meet the EPV target. With this sample size, a binary predictor with a frequency of 0.25 would have at least 80% power to detect a hazard ratio (HR) of 1.75. To account for potential dropout of patients before receiving allo-HCT (estimated at ∼20%), the sample size was inflated to target 1100 subjects with complete data on each CHARM variable. Because of the significant amount of missing data (especially CRP) among the first 1100 patients, a decision was made to continue accrual for an additional 126 patients. Additional details are provided in the supplemental Material.
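The sample-size reasoning above reduces to simple arithmetic, sketched here. The function and argument names are illustrative; the inputs (16 variables, EPV of 12, 22% projected NRM, about 20% dropout) are the values stated in the text.

```python
import math


def required_sample_size(n_variables, epv, event_rate, dropout_rate):
    """Return (events needed, transplanted patients needed, enrollment target)."""
    events_needed = n_variables * epv
    # Patients needed so that the projected event rate yields enough events.
    transplanted = math.ceil(events_needed / event_rate)
    # Inflate enrollment for patients who never reach transplant.
    enrolled = math.ceil(transplanted / (1.0 - dropout_rate))
    return events_needed, transplanted, enrolled


events, transplanted, enrolled = required_sample_size(
    n_variables=16, epv=12, event_rate=0.22, dropout_rate=0.20
)
print(events, transplanted, enrolled)  # 192 873 1092
```

The exact arithmetic gives 192 events, 873 transplanted patients, and 1092 enrolled patients, which is consistent with the rounded targets of 880 and 1100 reported above (880 / 0.8 = 1100).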

Primary outcome

We fit a multivariate Fine-Gray model for the subdistribution hazard of NRM to build the primary-CHARM model. A penalized variable selection strategy with a smoothly clipped absolute deviation penalty was used to identify potential variables to retain in the model.46 This analysis was the primary analytic approach per protocol, and the resulting model is referred to as primary-CHARM. The 13 covariates were analyzed as continuous covariates whenever possible with linear and quadratic terms. In sensitivity analysis, the model was adjusted for donor type and HLA matching, donor/recipient cytomegalovirus status, and intensity of conditioning regimen.47 Generalized cross-validation was used to choose the best lambda penalty parameter. The primary-CHARM score was constructed as a linear predictor based on the log subdistribution HRs. The predictive performance of the model was summarized using the time-dependent area under the receiver operating characteristic (ROC) curve48 for prediction of NRM at 1 year, as implemented in the R package timeROC, with patients experiencing competing risks considered as “controls.” Bias-corrected measures of predictive model performance49 using the bootstrap50 and cross-validation resampling methods were used to correct for optimism or overestimation of the performance metrics because the primary-CHARM model was trained on the full data set. To describe clinical effects, primary-CHARM scores and outcomes were displayed by tertiles.
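For readers unfamiliar with the smoothly clipped absolute deviation (SCAD) penalty, the sketch below evaluates the standard penalty function for a single coefficient. The shape parameter a = 3.7 is the conventional default from the SCAD literature and is an assumption here; the text does not state the value used, and the grouped version applied in the study acts on groups of coefficients rather than on one coefficient at a time.

```python
def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty for one coefficient (a = 3.7 is an assumed default)."""
    b = abs(beta)
    if b <= lam:
        # Small coefficients are penalized like the lasso.
        return lam * b
    if b <= a * lam:
        # Quadratic transition: the penalty rate tapers off.
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    # Large coefficients receive a constant penalty, so they are not shrunk further.
    return lam * lam * (a + 1) / 2


for beta in (0.5, 1.0, 2.0, 3.7, 10.0):
    print(beta, round(scad_penalty(beta, lam=1.0), 4))
```

The flat tail is the design point: unlike the lasso, SCAD stops penalizing coefficients once they are clearly nonzero, which reduces bias in the retained effects while still shrinking weak ones to zero.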

Other considerations

We performed exploratory machine learning (ML) modeling using boosting applied to pseudovalues.51 This approach directly models the cumulative incidence of NRM at 1 year. We also fitted a Cox boosting ML model, which models the cause-specific hazard. The pseudovalue boosting ML model can provide the predicted probabilities required for model performance metrics. These predicted probabilities were not directly available for the Cox boosting approach without also modeling the cause-specific hazard of relapse.

Assessment of model performance included, other than comparisons of area under the curve (AUC), calibration measures, slope, and intercept of the calibration plot of 1-year NRM. Apparent performance is calculated on the training sample and is subject to bias. Bootstrap bias correction with 200 bootstrap samples was the primary approach to correct for this bias, whereas cross-validation (10-fold, with 200 replications) was also performed to confirm the bias-corrected results. The ideal slope is 1 and the ideal intercept is 0.
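The bootstrap bias correction described above follows the usual optimism-correction pattern: refit the model on each bootstrap sample, compare its performance on that sample with its performance on the original data, and subtract the average difference from the apparent performance. The sketch below illustrates the pattern under strong simplifying assumptions: a binary outcome without censoring or competing risks, a plain (not time-dependent) AUC, and a toy "model" that merely picks the single best predictor. All names and the simulated data are hypothetical.

```python
import random


def auc(scores, labels):
    """Probability that an event case outranks a non-event case (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_best_predictor(rows, labels):
    """Toy model: keep the single column (and sign) with the highest apparent AUC."""
    best = None
    for j in range(len(rows[0])):
        a = auc([r[j] for r in rows], labels)
        sign, val = (1, a) if a >= 0.5 else (-1, 1 - a)
        if best is None or val > best[0]:
            best = (val, j, sign)
    _, j, sign = best
    return lambda r: sign * r[j]


def optimism_corrected_auc(rows, labels, n_boot=200, seed=0):
    """Return (apparent AUC, bootstrap optimism-corrected AUC)."""
    rng = random.Random(seed)
    model = fit_best_predictor(rows, labels)
    apparent = auc([model(r) for r in rows], labels)
    n, optimism, used = len(rows), 0.0, 0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        b_rows = [rows[i] for i in idx]
        b_labels = [labels[i] for i in idx]
        if len(set(b_labels)) < 2:
            continue  # AUC is undefined without both outcomes
        m = fit_best_predictor(b_rows, b_labels)
        boot_auc = auc([m(r) for r in b_rows], b_labels)  # performance where fitted
        orig_auc = auc([m(r) for r in rows], labels)      # performance on original data
        optimism += boot_auc - orig_auc
        used += 1
    return apparent, apparent - optimism / used


# Simulated pure-noise predictors, so the true AUC is 0.5 (not study data).
rng = random.Random(1)
rows = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(60)]
labels = [1 if rng.random() < 0.3 else 0 for _ in range(60)]
apparent, corrected = optimism_corrected_auc(rows, labels)
print(round(apparent, 3), round(corrected, 3))
```

With noise predictors, the apparent AUC will typically sit above 0.5 purely because of the selection step, and the corrected value is pulled back toward 0.5; this is the same mechanism that separates apparent from bias-corrected AUCs in a real model.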

We also performed overall fit measures, the Brier score, and the scaled Brier score (R2) for prediction of 1-year NRM.52,53 A lower Brier score indicates less difference between the observed and predicted outcomes and is considered favorable. The Brier score R2 indicates improvement from a model with no covariates, with higher percentage as favorable. The Cox boosting model was not included because it only models the cause-specific hazard for NRM and does not directly provide predictions of cumulative incidence to use in assessing overall fit.
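As a worked illustration of these two measures, the sketch below computes the Brier score and its scaled version for a binary outcome. It deliberately ignores censoring and competing risks, which the study's analysis of 1-year NRM would have to handle; the function names and the four example predictions are hypothetical.

```python
def brier_score(pred, obs):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for p, y in zip(pred, obs)) / len(obs)


def scaled_brier(pred, obs):
    """1 - Brier / Brier_null, where the null model predicts the prevalence for everyone."""
    prevalence = sum(obs) / len(obs)
    null = brier_score([prevalence] * len(obs), obs)
    return 1.0 - brier_score(pred, obs) / null


# Hypothetical predictions for 4 patients, 1 of whom had the event.
obs = [1, 0, 0, 0]
pred = [0.7, 0.2, 0.1, 0.2]
print(round(brier_score(pred, obs), 4))   # 0.045
print(round(scaled_brier(pred, obs), 4))  # 0.76
```

Here the null model has a Brier score of 0.1875, so a model scoring 0.045 removes 76% of the null prediction error.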

In addition, a decision curve analysis (DCA) was performed to evaluate the potential clinical benefits of using the CHARM vs other models to identify patients at high risk for NRM. DCA54-60 evaluates the value of a predictive model when making clinical decisions. The following 3 strategies were compared: selecting all patients for intervention or allo-HCT (ie, treating all), selecting no patients (ie, treating none), and selecting patients based on the models (Figure 1). A high-performing model will demonstrate higher values of the net benefit across the targeted range for decision cutpoints in terms of predicted NRM incidence.
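The net benefit plotted in a decision curve can be written out explicitly. The sketch below uses the standard definition for a binary outcome, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability; it does not account for censoring or competing risks, and the function names and example data are hypothetical.

```python
def net_benefit(pred, obs, threshold):
    """Net benefit of flagging patients whose predicted risk is >= threshold."""
    n = len(obs)
    tp = sum(1 for p, y in zip(pred, obs) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(pred, obs) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(obs, threshold):
    """Net benefit when every patient is flagged as high risk."""
    prevalence = sum(obs) / len(obs)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# "Treat none" has a net benefit of 0 by definition.
obs = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
pred = [0.6, 0.1, 0.3, 0.05, 0.4, 0.1, 0.2, 0.05, 0.1, 0.1]
for pt in (0.1, 0.2, 0.3):
    print(pt, round(net_benefit(pred, obs, pt), 3),
          round(net_benefit_treat_all(obs, pt), 3))
```

A model is useful at a given threshold only if its net benefit exceeds both the "treat all" value and 0; a decision curve simply repeats this comparison across the range of clinically plausible thresholds.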

Figure 1.

DCA plot of net benefit vs threshold probability for NRM. All plots are bias-corrected using cross-validation to generate predicted probabilities. CHARM exhibited higher net benefit than a “treat all” or “treat none” approach, and higher net benefit than the HCT-CI approach, across a wide range of threshold probabilities for NRM.

To further understand the possibility of better AUC for 1-year NRM by the ML-CHARM including all 13 health covariates, we evaluated a Shapley value plot to summarize the contribution and rank of individual factors in the ML-CHARM on NRM.

Shapley values measure the contribution of a particular variable (eg, a particular CHARM variable) to the prediction for each patient. In this case, they are being applied to the prediction from the Cox boost ML model. Each patient has a Shapley value for each CHARM variable. The summary plot reveals the variability in Shapley values across patients from the most important variable at the top to the least important variable at the bottom. The scale of the Shapley value represents change in log-hazard predictions with vs without a particular variable included in the model.
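To make the idea concrete, the sketch below computes exact Shapley values for one patient by averaging each variable's marginal contribution over all orderings of the variables. It is an illustration only: the three-variable linear log-hazard predictor, its coefficients, and the patient and reference values are hypothetical, and "removing" a variable is represented by setting it to a reference value, which is one common convention rather than necessarily the one used in the study.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    names = list(x)
    phi = {k: 0.0 for k in names}
    orders = list(permutations(names))
    for order in orders:
        current = dict(baseline)  # start with every variable at its reference value
        prev = predict(current)
        for k in order:
            current[k] = x[k]     # switch variable k to the patient's value
            new = predict(current)
            phi[k] += new - prev
            prev = new
    return {k: v / len(orders) for k, v in phi.items()}


def toy_log_hazard(v):
    """Hypothetical linear predictor; coefficients are made up for illustration."""
    return 0.5 * v["hct_ci"] - 0.8 * v["albumin"] + 0.3 * v["crp"]


patient = {"hct_ci": 4, "albumin": 3.0, "crp": 2.0}
reference = {"hct_ci": 2, "albumin": 4.0, "crp": 1.0}
phi = shapley_values(toy_log_hazard, patient, reference)
print({k: round(v, 3) for k, v in phi.items()})
```

The values sum to the difference between the patient's prediction and the reference prediction, which is what allows a summary plot to rank variables by the spread of their contributions across patients. Exact enumeration grows factorially with the number of variables, so practical tools approximate it.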

Secondary outcome

OS was analyzed using Cox proportional hazards regression. Stepwise variable selection was used to select demographic and baseline characteristics for risk adjustment, before adding primary-CHARM scores to the model. OS by primary-CHARM tertiles was summarized using Kaplan-Meier estimates.

Handling of missing data and loss to follow-up

Supplemental Table 6 describes the completeness of variables used in these analyses. Censored data methods were used to handle missingness on NRM and OS due to loss to follow-up. Multiple imputations were used to address missingness in CHARM variables using the R package “mice.”61 We implemented group variable selection with the R package “gcrrp” to perform consistent variable selection for the NRM model across multiply imputed data sets.62 Once the variables are selected, we refit a regular Fine-Gray model for NRM to each imputed data set and applied Rubin’s rule for inference. Similarly, Cox regression for OS on the multiply imputed data sets is implemented after selecting adjustment variables, and the results are combined using Rubin’s rule.
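Rubin's rule for combining results across imputed data sets can be sketched as follows: the pooled estimate is the mean of the per-data-set estimates, and the total variance adds the average within-imputation variance to the between-imputation variance inflated by (1 + 1/m). The function name and the three example log-HR estimates are hypothetical.

```python
import math


def pool_rubin(estimates, variances):
    """Pool m estimates and their variances; return (pooled estimate, pooled SE)."""
    m = len(estimates)
    q_bar = sum(estimates) / m
    within = sum(variances) / m
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, math.sqrt(total)


# Hypothetical log-HR estimates from 3 imputed data sets (not study results).
log_hr, se = pool_rubin([0.9, 1.0, 1.1], [0.04, 0.04, 0.04])
print(round(log_hr, 3), round(se, 4), round(math.exp(log_hr), 3))
```

The pooled standard error is larger than any single-data-set standard error whenever the estimates disagree across imputations, which is how the uncertainty due to missing data enters the final inference.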

Participant characteristics and exposure

The study enrolled a total of 1226 patients, of whom 121 dropped out mostly due to not proceeding to allo-HCT (n = 74, 61%). The CONSORT (Consolidated Standards of Reporting Trials) diagram is found in Figure 2. There were no significant differences in age, sex, race/ethnicity, or primary disease between the 2 groups (supplemental Table 4). Among 74 patients who did not proceed to allo-HCT, disease relapse was the most common reason (41%). Table 1 reveals the distribution of the baseline CHARM variables for the primary analysis population of 1105 allo-HCT recipients. Supplemental Table 5 shows baseline characteristics of the study cohort (n = 1105) and supplemental Table 6 shows completeness of CHARM baseline variables. Supplemental Figure 1 reveals graphical description of actual vs projected enrollment per each quarter for years 2019 to 2022. The median age was 67 years (range, 60-82). Most participants received reduced-intensity regimens (68%); 13% received high-dose conditioning regimens and 19% received nonmyeloablative regimens. Additional details are provided in the supplemental Material.

Figure 2.

CONSORT diagram describing study flow.

Primary outcome

NRM at 1 year was 14.4% (95% confidence interval [CI], 12.4-16.5). Estimates were 12.5% (95% CI, 9.9-15.3) for years 2019 to 2020 and 16.6% (95% CI, 13.5-19.9) for years 2021 to 2022. Supplemental Table 7 presents comparisons of univariate outcomes among study participants vs a control group of older patients who received allo-HCT during the same time and were reported to CIBMTR. Supplemental Table 8 presents the primary causes of death.

CHARM variables

Supplemental Figure 2 reveals histograms of baseline CHARM variables. The primary-CHARM model for NRM (Table 2) included higher values for comorbidity burden, CRP, weight loss, and age and lower values for albumin, patient-reported performance score, and cognitive score; each of which had independent associations with the risk for NRM.

Coefficients for each of the 7 variables are provided, and after adjusting for clinical factors, the model coefficients were largely unchanged (Table 2). Patients in the low, intermediate, and high CHARM score tertiles had NRM rates at 1 year of 8.1% (95% CI, 5.6-11.1), 12.1% (95% CI, 9.1-15.7), and 23.3% (95% CI, 19.0-27.7), respectively (Figure 3).

Figure 3.

Primary-CHARM tertiles stratifying for NRM.

Primary-CHARM apparent AUC was 0.627. Cross-validation bias–corrected AUC for the CHARM model was 0.592 vs 0.580 for the HCT-CI (supplemental Table 9). In comparison, ML-CHARM1 and ML-CHARM2 models had bias-corrected AUCs of 0.577 and 0.606, respectively. To better elucidate the impact of variables other than the HCT-CI, an apparent AUC for a CHARM model not including the HCT-CI was 0.607 (compared with 0.627 for primary-CHARM).

Several approaches (calibration slope, intercept, and Brier score R2 analyses) were used to compare the performances of Primary-CHARM, ML-CHARM1, ML-CHARM2, and HCT-CI to predict NRM (supplemental Figure 3; supplemental Tables 10 and 11). Overall, they indicate better performance of primary-CHARM compared with the HCT-CI and similar performance to the 2 ML-CHARM models.

Shapley value plots (supplemental Figure 4) ranked the variables in descending order of contribution as follows: HCT-CI, CRP, serum albumin, walk speed, age, KPS, PROMIS Depression, MoCA, PROMIS Physical Function, percent of weight loss, number of prescribed medications, Instrumental Activities of Daily Living, and number of falls.

DCA (Figure 1; supplemental Figure 5) revealed that primary-CHARM exhibited higher net benefit (ie, correct identification of patients who would experience NRM by 12 months) compared with a “treat all” or “treat none” approach across a wide range of threshold probabilities for NRM. This higher net benefit was evident compared with those based on HCT-CI alone (Figure 1) and comparable to those per the pseudo-value ML-CHARM (supplemental Figure 5).

Subgroup analyses (supplemental Table 12; supplemental Figure 6) confirmed associations between primary-CHARM scores and risks of NRM among different subgroups, such as among patients who received post-transplant cyclophosphamide (HR, 3.522; 95% CI, 2.353-5.270) or other regimens (HR, 2.701; 95% CI, 1.807-4.036) for GVHD prophylaxis; those aged <70 years old (HR, 2.679; 95% CI, 1.951-3.680) or ≥70 years old (HR, 4.353; 95% CI, 2.424-7.819); and those with low-intermediate disease risk index (DRI) (HR, 3.140; 95% CI, 2.181-4.520) or high-very high DRI (HR, 2.703; 95% CI, 1.349-5.419).

Secondary outcome (1-year OS)

A total of 313 patients died within 1 year after allo-HCT with an average 1-year OS rate of 71.7% (95% CI, 68.2-75.1). Primary-CHARM scores stratified Kaplan-Meier plots of 1-year OS to 81.2%, 73.8%, and 59.6% for low-, intermediate-, and high-risk tertiles, respectively (Figure 4A), but did not for relapse (Figure 4B). In a multivariate Cox model, primary-CHARM (HR, 2.09; P < .0001) and DRI high/very high (HR, 1.73; P = .0025) were the only factors independently associated with OS (supplemental Table 13). In an additional multivariate Cox model analysis of OS, physician estimate of OS was forced in the model (Table 3). Primary-CHARM scores (HR, 2.06; P < .0001) and DRI high/very high (HR, 1.537; P = .0025) were again the only factors independently associated with OS, whereas physician estimate of OS was not (overall P = .056) (Table 3). Physician estimates of OS failed to stratify outcomes except for the highest risk group comprising only 2% of patients (supplemental Figure 7).

Figure 4.

Stratification of outcomes per primary CHARM tertiles. Primary-CHARM tertiles stratifying for (A) overall survival and (B) relapse.

This US-based, multisite, prospective, observational, longitudinal study investigated the prognostic impact of 13 different health variables on the risk of NRM among patients aged ≥60 years who received allo-HCT. We were able to build and internally validate a novel, comprehensive prognostic measure of NRM risk for this population. Age, HCT-CI, albumin, CRP, weight loss, patient-rated KPS, and cognition by MoCA were the 7 variables that constituted the primary-CHARM model. The first 6 of these are readily available in the clinic. Administering MoCA requires additional effort (10 minutes); however, screening for cognitive impairment not only holds value but is advised by the national guidelines for older adult cancer care.63 It is encouraging that, apart from disease risk assessed by the refined DRI, the primary-CHARM score was the only prognostic factor for OS in a multivariate analysis. This met our study framework to quantify patient vulnerability to gauge NRM separate from disease risk tools. An online primary-CHARM calculator is available at the CIBMTR website (at https://cibmtr.org/CIBMTR/OffNav/DevSandbox/CHARM-Risk-NRM-Calculator), which provides risks of 1-year NRM based on the primary-CHARM score.

Our results reflect the following important points: (1) given the high completion rates across 49 centers, comprehensive health assessment merging patient-reported and objective parameters is feasible (supplemental Table 6); (2) appropriate prognostic assessment of HCT outcomes is better served by multidimensional tools, not a single health domain64; (3) health assessment measures outperform subjective physician prognostication,64 justifying the effort required to collect this information in the clinic; (4) nutritional and inflammatory biomarkers, captured by weight loss, serum albumin, and CRP, influence outcomes65; (5) impaired cognition, as a feature of physiologic aging, can adversely affect transplant outcomes; (6) the HCT-CI performed well in this older patient population in the modern era, being selected by the Fine-Gray model and Shapley values to be one of the most important predictors of NRM, contradicting results from retrospective analyses66; and, last, (7) older age and patient-reported KPS are prognostic of NRM even after considering multiple other objective measures. Finally, we fitted 2 exploratory ML models to check how primary-CHARM performs against more flexible approaches; its comparable performance provides confidence in the completeness of our model development approach.

Our study has the advantage of describing data from a large number (n = 1105) of patients treated at many (n = 49) transplant centers across the United States with broad eligibility criteria, which increases the generalizability of results. It includes a comprehensive assessment of geriatric syndromes, comorbidities, and readily available biomarkers to optimize chances to capture relevant outcome predictors. The identification of older adults with low NRM risks (ie, lowest tertile with 8.1% NRM and 81.2% OS at 1 year) should strongly encourage consideration of HCT with the appropriate disease indication, decreasing the current bias against offering allo-HCT for older patients. This will increase the chance of cure in the patient population most frequently affected with hematologic malignancies. However, identification of a group of patients with the highest tertile of primary-CHARM scores, associated with NRM rates of 23.3% and OS of 59.6% at 1 year, sets the stage for future trials exploring novel approaches to reduce morbidity and mortality of HCT. Of note, most CHARM variables are meant to reflect patients’ health status within the immediate 2 to 3 weeks before the date of transplant. This is thought to be the most adequate timeframe to evaluate patients for transplant eligibility while their primary cancer is under appropriate control or stability.

The current 1-year NRM in this study is the lowest ever reported for older (60-82 years old) recipients of HCT at 14.4%, and 1-year OS is the highest at 71.7%. This reinforces the improvement in transplant outcomes revealed previously.4,67 Reasons include enhancements in supportive care, better methods of preventing GVHD,68 and better tolerated induction therapies and conditioning regimens.69 Our findings are applicable to modern-day HCT practices nationwide with 21% patients receiving an HLA-haploidentical donor graft and 39% receiving post-transplant cyclophosphamide for GVHD prophylaxis.

The aim of this study was to create a model, designed and validated in a group of older recipients of allogeneic HCT, that can be used by transplant physicians to counsel patients about risks and benefits of HCT. Future trials could incorporate modifications based on CHARM score related to conditioning regimens or GVHD prophylaxis. We recognize that the model was not developed to decide whether patients should receive an allogeneic HCT or not. Such a model would need a different study design where data should be collected soon after achieving remission of primary cancer and at a time that precedes the decision to refer a patient or not for allogeneic HCT. Nonetheless, investigating the value of CHARM scores earlier in the patients’ treatment history, ideally months before potential HCT, would be of tremendous future interest.

We acknowledge that the cross-validation bias–corrected AUC of 0.591 for primary-CHARM was modest and that there is room for further improvement in predictive accuracy. We likewise found modest AUC for HCT-CI and more sophisticated ML-CHARM models. The AUC as a measure of discriminative capacity of a model has its own limitations.70 Reassuringly, all assessments currently recommended by the TRIPOD guidelines (Brier scores and DCA)57-59 revealed enhanced predictive performance by primary-CHARM compared with the current standard, the HCT-CI. Furthermore, the primary-CHARM strongly predicted OS, and its calibration was good (supplemental Figure 3; supplemental Table 10). In addition, a CHARM model without the HCT-CI maintained an apparent AUC (ie, without bias correction) of 0.607. Our results also suggest that primary-CHARM has at least comparable performance to 2 exploratory ML-CHARM models. Of note, the bootstrap bias–corrected AUC was superior for ML-CHARM2 probably because the apparent AUC was unusually high due to overfitting from the ML model. DCA has emerged as a recent technique to compare model performance. DCA suggests that primary-CHARM performs better than HCT-CI alone through a higher net benefit in identifying patients at high risk for NRM, compared with “treat all” or “treat none” approaches, for a wide range of acceptable thresholds of 1-year NRM.

We recognize study limitations. Given the large patient sample needed to design the model and the declining NRM (and thus fewer events), a parallel external validation cohort was impractical and likely inefficient. However, the bootstrap-corrected internal validation and cross-validation analyses are frequently accepted approaches in epidemiology,71 and both were robust with multiple sensitivity analyses, thereby confirming the value of CHARM.71 Furthermore, unlike biomarker or tumor marker discovery studies, where an external validation sample might be necessary,72 primary-CHARM as a model of established health factors follows TRIPOD guidelines that support conducting internal validation of the model in the population in which it is intended to be used.20 Adoption of CHARM in practice should enable future validation, similar to real-world validation of comorbidity by HCT-CI.26 The contribution of patients from under-represented minority groups was modest in this study despite broad eligibility and support for 3 languages; however, the distribution of race and ethnicity was similar to that of the general US population of allo-HCT recipients aged ≥60 years when compared with data from the CIBMTR. This suggests that diminished access to HCT in general may be the main contributing factor for the low numbers of minorities enrolled in the trial. Studying access barriers to clinical trials is an important future goal of the BMT-CTN. Only then can we know how well a model such as CHARM performs to capture risks across different races, ethnicities, and languages. Finally, the study was done in US centers. Whether CHARM applies equally well in other countries should be tested.

In summary, in a first-of-its-kind study, we were able to prospectively design and validate an easily implemented composite health risk assessment model inclusive of geriatric assessment, biomarkers, patient-reported outcomes, and comorbidities to better risk stratify older recipients of allo-HCT. The CHARM should improve decision-making and selection of the best transplant strategy by weighing risks vs benefits, allow calibration of data across trials and institutions, and ensure that appropriate older patients are not excluded from curative-intent allo-HCT. Intervention trials focusing on comorbidities, nutritional deficiencies, inflammation, and impaired cognition could further improve transplant outcomes. Future efforts to improve prognostic capacity may require artificial-intelligence modeling integrating broader clinical, sociodemographic, and/or biologic data.

Support for this study was provided by grants U10HL069294 and U24HL138660 to the Blood and Marrow Transplant Clinical Trials Network from the National Heart, Lung, and Blood Institute and the National Cancer Institute.

The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Contribution: A.S.A. and M.L.S. initiated the conception of the study; A.S.A., B.L., M.L.S., M.M.H., and W.S. contributed to study design; all authors contributed to acquisition of data; A.B., A.S.A., B.L., M.L.S., M.M.H., and N.G. contributed to data analysis; A.B., A.S.A., B.L., J.M.M., M.L.S., M.M.H., N.G., and W.W. contributed to interpretation of data; M.L.S. drafted the article; A.S.A. contributed significantly to article drafting; A.S.A., B.L., J.M., M.M.H., N.G., R.O., S.M.D., S.R.M., S.A.W., W.S., and V.R.B. contributed to critical revision of important intellectual content; all authors provided final approval of the article; and A.S.A., B.L., and M.L.S. agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the article are appropriately investigated and resolved.

Conflict-of-interest disclosure: M.L.S. reports receiving consultancy and receiving honoraria from Jazz Pharmaceuticals for giving educational talks and receiving research funding from BlueNote. W.W. reports receiving research support from Pfizer and Genentech; having equity in and providing consulting to Koneksa Health; and providing consulting for Teladoc Health, Quantum Health, and American Society of Hematology Research Collaborative. A.M. reports receiving grant support from Novartis. P.H.I. reports receiving research support from Janssen. V.R.B. reports participating in the Safety Monitoring Committee for Protagonist; serving as an Associate Editor for the journal Current Problems in Cancer; serving as a contributor for BMJ Best Practice; providing consultancy for Imugene, Sanofi, and Taiho; receiving research support from MEI Pharma, Actinium Pharmaceutical, Sanofi U.S. Services, AbbVie, Pfizer, Incyte, Jazz, and National Marrow Donor Program; and receiving drug support (institutional) from Chimerix for a trial. R.O. reports receiving research support from Cellectis and providing consulting for Servier and Riger. J.M. reports receiving research support from Gilead, Atara, CRISPR, Precision Biosciences, Scripps Research Institute, VOR Bio, and Affimed. S.A.W. reports serving on the speaker bureau for Sobi. A.S.A. reports having an advisory role for AstraZeneca and Magenta Therapeutics and providing consulting to AbbVie. The remaining authors declare no competing financial interests.

Correspondence: Mohamed L. Sorror, Clinical Research Division, Fred Hutchinson Cancer Center, 1100 Fairview Ave North, Seattle, WA 98109-1024; email: msorror@fredhutch.org.

Author notes

This is not an interventional clinical trial. There is not a specific data-sharing plan for the study because the requirements by the International Committee of Medical Journal Editors are not applicable. However, all Blood and Marrow Transplant Clinical Trials Network trial data are deposited in the BioLINCC within 30 days of publication of the primary manuscript and made publicly available according to BioLINCC processes.

The full-text version of this article contains a data supplement.