© 2003 AlphaMed Press
Baseline and Early Treatment Factors are not Clinically Useful for Predicting Individual Response to Erythropoietin in Anemic Cancer Patients
a John Radcliffe Hospital, Headington, Oxford, UK; b Medical Affairs, Ortho-Biotech, High Wycombe, UK; c Goulds, Staverton, Totnes, Devon, UK
Correspondence: Timothy J. Littlewood, M.D., John Radcliffe Hospital, Headington, OX3 9DU Oxford, UK. Telephone: 44-0-1865-220364; Fax: 44-0-1865-221778; e-mail: Tim.Littlewood{at}btinternet.com
Recombinant human erythropoietin (rHuEPO) is an effective treatment for anemia in patients with cancer, and recent studies show that over two-thirds of patients can be expected to respond with a large increase (>2 g/dl) in hemoglobin concentration. However, it would be helpful to identify likely responders and nonresponders before initiating treatment. Previous studies have suggested that high pretreatment endogenous erythropoietin levels are associated with a lower response to erythropoietin, especially in certain patient groups, such as patients with hematological malignancies, nonchemotherapy patients, or patients with myelodysplastic syndrome. Various algorithms have therefore been developed to predict patient response to rHuEPO using baseline serum erythropoietin levels and other baseline factors. We performed an analysis of data pooled from four randomized clinical trials of 604 patients with nonmyeloid malignancies, examining the clinical usefulness of pretreatment and early treatment factors for predicting response to erythropoietin. The analysis confirms several other reports that the most predictive models combined pretreatment and early treatment factors, including change in hemoglobin at 4 weeks, but even these models did not increase sensitivity above 85% (total response in unselected patients was 68.1%), while specificity remained poor. We conclude that clinically useful prediction of response to erythropoietin is not possible using baseline or early response variables because of poor sensitivity and specificity of prediction compared with generally accepted clinical tests.
Key Words. Erythropoietin • Erythropoiesis • Anemia • Cancer • Hemoglobin
Anemia occurs frequently in cancer patients, either as a result of the disease or its therapy, and can have a major impact on quality of life, which often is underestimated by clinicians [1]. Anemia may also reduce the effectiveness of anticancer therapy [2]. Recombinant human erythropoietin (rHuEPO; epoetin alfa) is an effective treatment for anemia in many cancer patients, but response rates are variable [3, 4]. Individual studies have shown response rates of 32%-85%, and overall, about 70% of patients will have an increase in hemoglobin (Hb) concentration of >2 g/dl [3, 5, 6]. In order to direct the potential benefit of rHuEPO treatment to those most likely to have the greatest response, it would be useful to be able to identify nonresponders before the start of treatment or within the first few weeks of treatment. In cancer patients, the serum EPO level is often not appropriately elevated in the presence of anemia [7], and previous studies have suggested that poor response to rHuEPO in cancer patients may be related to high pretreatment serum EPO concentrations [8-10]. This has led to the development of a number of predictive algorithms [11-13]. Unfortunately, the applicability of the findings is limited by small sample sizes and/or analysis of data from specific tumor types. In several studies, pretreatment and early treatment factors have been examined for their usefulness in predicting a response to EPO, often in combination with pretreatment serum EPO concentrations. Several authors have suggested that initial reticulocyte and Hb response to a short course of EPO treatment (2 to 4 weeks) is the most accurate predictor of eventual response [4, 6, 14, 15]. Various measures of iron status have also been put forward as potential predictors of response.
To our knowledge, no study has yet verified these proposed algorithms in large studies enrolling a heterogeneous group of cancer patients, the type of population that would most likely be encountered in standard practice. The use of epoetin alfa has now been reported in several large, published randomized trials of patients with a range of malignancies, and a substantial dataset is available to reassess the relationship between treatment response and baseline and early response variables that may predict response. We have analyzed data pooled from four studies to determine the relationship, if any, between a large number of pre- and early treatment factors and response to epoetin alfa treatment [3, 16, 17] (Abstract/Ortho-Biotech data on file for the remaining study).
Study Population
Data were pooled from the four pivotal, randomized, double-blind, placebo-controlled multicenter studies used to register epoetin alfa in the U.S. and Europe. The results of all but one of these studies are reported individually elsewhere [3, 16, 17]. A total of 604 patients, treated with 150 IU/kg rHuEPO administered subcutaneously three times per week, were included in these studies. Of these, 561 patients were clinically evaluable and were included in the actual pooled analysis. In all but one study, the dose was increased to 300 IU/kg after 4 weeks if response was unsatisfactory (usually defined as an increase in Hb concentration <1 g/dl); in the remaining study, the dose was not increased but could be titrated down to achieve a hematocrit of 38%-40%. The pooled population comprised patients aged 18-92 years (median, 62 years): 361 (64%) were women; 292 (52%) had solid tumors; 217 (39%) had bone marrow involvement; and 190 (34%) were transfusion dependent at baseline. The most common types of cancer were breast cancer (23%), myeloma (20%), and lymphoma (16%). Treatment efficacy was defined by abrogation of transfusion requirements or change in Hb concentration (or hematocrit in the U.S. study).
Measurement of Pre- and Early Treatment Tests
Statistical Analysis
The sensitivity and specificity of the baseline and early response factors were examined for patients who had the values available in the dataset. Standard definitions of sensitivity and specificity were used throughout the study. Sensitivity was defined as the number of test positives divided by the total number of true positives. Specificity was defined as the number of test negatives divided by the total number of true negatives. For clarity and to aid clinical interpretation, specificities are generally presented inversely as the response rate (false-negative rate) for patients not meeting the designated criterion or criteria. Youden's Index for the tests having the best sensitivities was calculated as follows: if α is the probability of a false positive and β is the probability of a false negative, then Youden's Index = 1 − (α + β), with higher values indicating a better test [18]. Variables were selected for inclusion in the analysis if they had been present in other studies that addressed EPO response or measured a value associated with hematological status, iron status, or chemotherapy treatment. Cut-off values were selected to match values that had previously been used in the literature, and in some cases, multiple cut-off values were examined. Single-factor sensitivities and specificities were examined for all of the above baseline and early response variables and all cut-off values. Two-factor analyses were evaluated that combined the factors shown to be most predictive from the logistic regression and one-factor analyses. For completeness, three-factor analyses were also evaluated that looked at the predictive value of the most favorable factors from the one- and two-factor analyses. The effect of prior chemotherapy was addressed by looking separately at the predictive ability of several factors in patients who had and had not received prior chemotherapy.
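The quantities defined in this section can be illustrated with a short Python sketch. It is not the analysis code used in the study: the class name `TwoByTwo`, the function names, and the patient counts are all hypothetical, chosen only to show how sensitivity, specificity, Youden's Index, and the response rate among test-negative patients relate to one cross-tabulation of test result against observed response.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from cross-tabulating a binary test against observed response."""
    tp: int  # test positive, responded
    fp: int  # test positive, did not respond
    fn: int  # test negative, responded
    tn: int  # test negative, did not respond


def sensitivity(t: TwoByTwo) -> float:
    # Proportion of responders who had a positive test.
    return t.tp / (t.tp + t.fn)


def specificity(t: TwoByTwo) -> float:
    # Proportion of nonresponders who had a negative test.
    return t.tn / (t.tn + t.fp)


def youden_index(t: TwoByTwo) -> float:
    # alpha: probability of a false positive; beta: probability of a false negative.
    alpha = t.fp / (t.fp + t.tn)
    beta = t.fn / (t.fn + t.tp)
    return 1 - (alpha + beta)


def response_rate_in_test_negatives(t: TwoByTwo) -> float:
    # Share of patients NOT meeting the criterion who responded anyway.
    return t.fn / (t.fn + t.tn)


# Hypothetical counts, for illustration only (not data from the pooled trials).
example = TwoByTwo(tp=80, fp=20, fn=20, tn=30)
print(f"sensitivity: {sensitivity(example):.2f}")
print(f"specificity: {specificity(example):.2f}")
print(f"Youden's Index: {youden_index(example):.2f}")
print(f"response rate in test negatives: {response_rate_in_test_negatives(example):.2f}")
```

With these definitions, Youden's Index equals sensitivity plus specificity minus 1, so a test that performs no better than chance scores 0 and a perfect test scores 1.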
Overall Response
Data suitable for evaluation were available for 561 of the 604 patients in the combined database who received rHuEPO in the four randomized clinical trials.
Overall, 68.1% of patients (382) responded to rHuEPO treatment according to hematological parameters (increase in Hb
Logistic Regression
Single-Factor Analysis
Baseline Factors
More importantly, the specificities of all of the baseline single-factor analyses were poor. Among patients with EPO levels >100 mU/ml, 55% still responded to rHuEPO by 8 weeks. Even at extremely high baseline EPO levels (>500 mU/ml), 42% (5/12) ultimately responded, revealing a poor ability of the factors to accurately predict nonresponders regardless of the pretreatment serum EPO level. Youden's Index for baseline serum EPO testing using a cut-off of 100 mU/ml was 64.8%, compared with a Youden's Index of 68.1% for treatment of all patients without any testing.
Early Response Variables
Two-Factor Analysis
Only the results for baseline combined with 4-week analyses are shown. (Models using baseline combined with 2-week data were similar and are not shown.) When baseline factors were combined with 4-week factors, the most sensitive two-factor tests (87.5%-90.3%) included Hb at 4 weeks regardless of the baseline factor. Serum EPO 100 mU/ml and ferritin 400 ng/ml were present in the two-factor model, consistent with the logistic regression analyses. Looking more closely at each of these analyses, it appears that Hb at 4 weeks dominates each analysis, with the baseline factor contributing relatively little. In each of these two-factor tests, an increase in Hb from baseline of >1 g/dl at 4 weeks was associated with the highest response, regardless of whether the baseline factor was above or below the cut-off value. The sensitivities for the remaining two-factor tests were always in the 70%-80% range, and essentially no better than the single-factor sensitivities. The specificity of the two-factor tests was only slightly higher than for the best single factors, with the two-factor tests including Hb at 4 weeks having a specificity of 53.5%-56.4% for patients having neither factor, meaning that 43.6%-46.5% of patients not meeting either criterion would still respond. The two additional groups of patients having "mixed" test results had response rates between the two extremes.
Three-Factor Analysis
The overall response rate for patients included in the three-factor tests was 66.1%-68.3%, comparable with the response rate for the entire population. For three-factor baseline tests, ferritin <400 ng/ml enhanced the sensitivity of a baseline serum EPO level <100 mU/ml from 74% to 78%. Looking at the three-factor analyses at 2 weeks and 4 weeks, Hb change was still the most important factor, and three-factor analyses did not appear to have any higher sensitivity or specificity than Hb change alone or two-factor analyses. Evaluation of three separate tests also produces a total of eight possible testing outcomes. The number of patients available for each test outcome possibility became quite small in a number of cases, making interpretation of multifactor analyses more difficult than one- or two-factor tests.
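The growth in test outcomes described above (four possible result patterns for two binary tests, eight for three) can be sketched in Python as follows. The function names and the patient records are hypothetical and serve only as illustration. The sketch shows why per-pattern patient counts shrink quickly: a fixed population is divided among 2^n patterns.

```python
from collections import defaultdict
from itertools import product


def outcome_patterns(n_tests):
    """All possible result patterns for n binary (met / not met) tests."""
    return list(product((True, False), repeat=n_tests))


def response_by_pattern(patients):
    """Tally responders per test-result pattern.

    patients: iterable of (pattern, responded) pairs, where pattern is a
    tuple of booleans (one per test) and responded is a boolean.
    Returns {pattern: (responders, total, response_rate)}.
    """
    tallies = defaultdict(lambda: [0, 0])
    for pattern, responded in patients:
        tallies[pattern][0] += int(responded)
        tallies[pattern][1] += 1
    return {p: (r, n, r / n) for p, (r, n) in tallies.items()}


# Hypothetical records, for illustration only (not data from the pooled trials).
patients = [
    ((True, True, True), True),
    ((True, True, True), True),
    ((True, False, True), False),
    ((False, False, False), True),
    ((False, False, False), False),
]
summary = response_by_pattern(patients)

print(len(outcome_patterns(2)), len(outcome_patterns(3)))
for pattern, (responders, total, rate) in summary.items():
    print(pattern, responders, total, f"{rate:.2f}")
```

Patterns with no patients do not appear in `summary` at all, and patterns with only a handful of patients give response rates too unstable to interpret, which is the difficulty noted for the three-factor tests.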
Effect of Prior Chemotherapy
The aim of these analyses was to investigate the clinical utility of using the simple criterion of measuring pretreatment and early-treatment factors to predict response to rHuEPO treatment. While logistic regression models for the data from this large analysis population confirmed that there is a statistical relationship between several of these factors and ultimate response (defined as a rise in Hb of 2 g/dl), it appears that none of the analyses resulted in tests that would be clinically useful in a large, heterogeneous group of anemic cancer patients. Our analysis reveals that statistical relationships do not always result in clinically useful information and such relationships must be further investigated using simple measures like sensitivity and specificity. Very telling is the fact that Youden's Index for even the best predictive factors was less than the rate of response using no testing at all. Other authors have also expressed doubts about the utility of measuring pretreatment factors as the basis for response prediction, including baseline EPO serum measurement [19, 20].
Some studies have, however, suggested that prediction of response is possible. Ludwig et al. [11] stated, based on a study of 80 patients, that "unresponsiveness of the patient is very likely, (predictive power, 93%)" where EPO level is
We have also illustrated the difficulty posed by multiple test outcomes with multifactor models. The more test factors that are used together, the more patients will have conflicting test results that make application of any algorithm very difficult. In most of the two-factor models, roughly half the patients fell into this gray area where test result interpretation is difficult. Most publications in this area, however, have ignored patients with conflicting test results, preferring to concentrate on patients who either had all or none of the positive factors for prediction.
Another set of issues that has largely been ignored but should be the subject of further study is the precision of Hb testing and whether Hb response is the best measure of response for all patients. If an increase of 2.0 g/dl in Hb is used to define response, then errors in the measurement of Hb of just a few tenths of a point may have a profound impact on the outcome of testing. Adding to this is the fact that for a patient experiencing a rapid decline in Hb at the time of initiation of rHuEPO treatment, an increase in Hb of "only" 1.0 g/dl might be considered a good response for that patient. The study by Abels et al. showed that transfusion frequency was highly correlated with decline in Hb in the first two cycles of chemotherapy, suggesting that patients not meeting the standard definition of Hb response do benefit from a prevention of a decline in Hb [16]. Based on both of these issues, the use of rigid tests and cut-offs to define response and the need for continued treatment may be ignoring the more important clinical issues that are specific to individual patients.
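The sensitivity of a fixed 2.0 g/dl cut-off to small measurement errors can be illustrated with a minimal Python sketch. The Hb values, the error bound of 0.2 g/dl, the function names, and the choice to count a rise exactly equal to the threshold as a response are all assumptions made for the example, not values or rules taken from the studies.

```python
def is_responder(hb_baseline, hb_followup, threshold=2.0):
    """Classify response from two Hb measurements (g/dl).

    Assumption: a rise equal to the threshold counts as a response.
    """
    return (hb_followup - hb_baseline) >= threshold


def classification_range(true_baseline, true_followup, max_error, threshold=2.0):
    """Worst-case and best-case classification when each Hb measurement
    may be off by up to max_error g/dl in either direction."""
    lowest_rise = (true_followup - max_error) - (true_baseline + max_error)
    highest_rise = (true_followup + max_error) - (true_baseline - max_error)
    return lowest_rise >= threshold, highest_rise >= threshold


# Hypothetical patient: true Hb rises from 9.0 to 11.2 g/dl (a 2.2 g/dl increase).
print(is_responder(9.0, 11.2))                # classified with exact measurements
print(classification_range(9.0, 11.2, 0.2))   # (worst case, best case) with +/-0.2 g/dl error
```

In this hypothetical case, a patient whose true rise exceeds the cut-off could be labeled a nonresponder if both measurements err by 0.2 g/dl in the unfavorable direction, because the two errors add when the difference is taken.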
It is possible that further investigation may yet uncover groups of patients for whom prediction of response to rHuEPO is still possible. For example, it is still possible that chemotherapy exposure itself may alter the prognostic value of some of the tests, as there is evidence that EPO levels increase after chemotherapy [21-23]. This observation, acting in certain populations, may explain Abels' observation that pretreatment EPO concentrations were related to rHuEPO response only in those patients who were not receiving chemotherapy [16]. The results of the present study suggest, however, that many factors that have been offered as predictive in the context of one population or study may not be generalizable to other cancer populations. This appears to be true regarding pretreatment chemotherapy, as we found no meaningful association between pretreatment chemotherapy and the results of predictive factor analyses. As a result, more study in this area is required before predictive testing is used on patients in an actual clinical setting. It could also be argued that better algorithms might exist for certain tumor types but not others. Ludwig et al. [12] examined this question but found that disease-related factors such as tumor type, bone marrow involvement, and type of chemotherapy did not influence rHuEPO response. Other large studies also suggest that the response rates of different tumor types to rHuEPO are fairly constant [3, 5, 6, 17]. Cazzola et al. restricted their analysis to patients with multiple myeloma and non-Hodgkin's lymphoma and were able to define a "ratio of observed to predicted logarithm of baseline serum erythropoietin level (O/P ratio)" that appears to be more predictive than we found in our analysis, but their study was actually designed as a dose-response study and used daily dosing of rHuEPO, a dosing regimen that is not commonly used in practice or studied in any other trials [24].
In addition, only 55 patients were exposed to therapeutic doses of rHuEPO. Despite the fact that the O/P ratio was their most predictive factor, only 70% of patients meeting the criteria responded (compared with 61% overall), while 27% who did not meet the criteria still responded. This rate of sensitivity does not vary much from those presented in this and other studies and was not much greater than their baseline response rate. In short, these response criteria did not appear to be clinically useful when put into practice despite the fact that they were statistically the "best" criteria for response. For any predictive algorithm to be clinically useful, it should accurately predict those patients with a high probability of responding, but must also provide a high degree of certainty that those denied treatment would not have responded. Although some of the predictive factors in this study allow somewhat improved selection of responders, none appear to be useful to exclude the likelihood of response. One strategy, presented recently at a meeting of the American Society of Clinical Oncology (2002) separately by Patton et al. and Glaspy et al., is using higher initial doses of EPO in an attempt to increase the rate of response and identify nonresponders more quickly, but more detailed data are needed in order to compare this approach with standard dosing regimes.
Based on previous studies and our own analysis of over 500 patients, we conclude that a variety of common baseline and early response variables are not useful for predicting the response to rHuEPO since they do not markedly improve sensitivity while having very poor specificity. None of the tests approached the 80%-90% levels of sensitivity and specificity that are generally regarded as appropriate for clinically useful predictive tests or screening techniques. The best models contained some early evaluation of response at either 2 or 4 weeks, but even these models, while somewhat more sensitive, suffered from poor specificity, and upwards of 40% of patients still went on to respond to treatment. It is perhaps open to debate what the sensitivity and specificity cut-offs should be for clinically useful tests in different circumstances, but a 40% or greater false-negative rate would appear to be unacceptable. For the time being, it appears that no simple test can yet substitute for clinical judgment in managing anemia in cancer patients. Although, ideally, the response to all medical treatments could be better predicted in advance, a response rate of 70% in this patient population is quite high compared with the response rate to other treatments. Use of a predictive algorithm with a >50% false-negative rate would potentially deny effective treatment to too many patients, and further study in this area is needed before such tests become useful.
This meta-analysis and the original studies were funded by the R.W. Johnson Pharmaceutical Research Institute. We thank Elizabeth Wager for editorial assistance. M.Z. and C.P. are employees of Ortho-Biotech. T.L. is a paid consultant for Ortho-Biotech and receives honoraria from Roche and Amgen. A.P. is a consultant statistician for Ortho-Biotech.