Breast Cancer
Department of Internal Medicine, Breast Oncology Program, University of Michigan Comprehensive Cancer Center, Ann Arbor, Michigan, USA
Key Words. Tumor marker • Her-2/neu • Breast cancer • Estrogen receptor • Prognostic factor • Predictive factor
Correspondence: Daniel F. Hayes, M.D., University of Michigan Comprehensive Cancer Center, 1500 East Medical Center Drive, Ann Arbor, Michigan 48109, USA. Telephone: 734-615-6725; Fax: 734-615-3947; e-mail: hayesdf{at}umich.edu
Received February 22, 2006; accepted for publication April 11, 2006.
Access and take the CME test online and receive 1 AMA PRA category 1 credit at CME.TheOncologist.com
LEARNING OBJECTIVES
After completing this course, the reader will be able to:
| ABSTRACT |
|---|
Recommendations for routine use of tumor markers in breast cancer have been conservative. Although several studies have been reported, few are of sufficiently high level of evidence to permit solid conclusions. Three key issues in tumor marker evaluation are utility, magnitude, and reliability. Poorly conceived study designs cloud the issue of how the marker might be used. Reliance on p-values rather than the size of the differences in outcome between patients who are positive and those who are negative for the factor obscures the importance. Technical issues result in poor reproducibility and interpretability of assays. Analytical issues lead to poorly defined cutoff values for marker levels. Poor patient selection leads to difficulty interpreting results because of confounders such as differences in treatment regimens. This review focuses on these issues, with an emphasis on currently accepted tumor markers. Finally, new tumor marker reporting recommendations are discussed, the adoption of which may lead to improved design and publication of tumor marker studies in the future.
| INTRODUCTION |
|---|
For patients with early-stage breast cancer, it would be helpful to identify which patients will relapse without adjuvant systemic therapy, so that only patients who stand to benefit are exposed to its inherent toxicities. Metastatic disease is generally treated with palliative rather than curative intent. In this setting, identification of those patients with rapidly progressive disease permits selection of more rapidly acting but perhaps more toxic therapy.
During the past few decades, with the explosion of molecular technology and understanding of the biology of breast cancer, numerous studies have been performed to identify prognostic and predictive factors in breast cancer, with mixed success. Multiple expert panels have convened to analyze available data in order to establish guidelines for the use of tumor markers, but their recommendations have been very conservative [4, 5]. In this review, we address the pitfalls that have led to difficulties establishing tumor markers for routine clinical use, with a specific focus on tumor markers in breast cancer.
| AMERICAN SOCIETY OF CLINICAL ONCOLOGY GUIDELINES |
|---|
Therefore, despite the large number of research studies evaluating the prognostic and predictive ability of numerous tumor markers in breast cancer, the ASCO panel recommended few for routine use in clinical practice. Why were these recommendations so conservative? In the succeeding sections of this paper, we outline the multiple factors that underlie this conservative approach.
| WHEN IS A TUMOR MARKER USEFUL (USE)? |
|---|
| HOW USEFUL IS THE TUMOR MARKER (MAGNITUDE)? |
|---|
For example, a breast cancer patient with disease in the lymph nodes at the time of diagnosis is two to three times more likely to have a breast cancer event (local recurrence or distant metastasis) than a patient without lymph node involvement, regardless of treatment. Since lymph node status has classically been used to make clinical decisions, we have arbitrarily designated it as a "strong" prognostic factor, using it as the gold standard to set the criteria for consideration of other, putative markers [16]. A strong prognostic factor is depicted by Factor 1 in Figure 1A. Alternatively, untreated patients with ER-positive breast cancer have only slightly better outcomes than those with ER-negative disease, and therefore we designated ER status as a weak prognostic factor, as portrayed by Factor 2 in Figure 1A [17, 18]. In a previous publication, we have suggested hazard ratios of <1.5, 1.5–2, and >2 to distinguish weak, moderate, and strong prognostic factors, respectively, for breast cancer [16]. Such arbitrary designations would need to be established for other uses, as appropriate.
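The hazard ratio classes proposed above can be written as a small lookup. The following is only an illustrative sketch: the function name is hypothetical, and the handling of ratios below 1 (inverting them so that both directions of effect are judged on one scale) is an assumption that the text does not specify.

```python
def classify_prognostic_strength(hazard_ratio: float) -> str:
    """Map a hazard ratio onto the weak/moderate/strong classes (<1.5, 1.5-2, >2)."""
    if hazard_ratio <= 0:
        raise ValueError("hazard ratio must be positive")
    # Assumption (not from the article): a ratio below 1 is inverted so that
    # a favorable and an unfavorable factor are compared on the same scale.
    hr = max(hazard_ratio, 1.0 / hazard_ratio)
    if hr < 1.5:
        return "weak"
    if hr <= 2.0:
        return "moderate"
    return "strong"


if __name__ == "__main__":
    # Hypothetical hazard ratios, chosen only to exercise each class.
    for hr in (1.2, 1.8, 2.5):
        print(hr, classify_prognostic_strength(hr))
```

The thresholds are the arbitrary ones suggested for breast cancer; other diseases or uses would need their own values.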
Predictive factors can also be classified as weak (Factor 1), moderate, or strong (Factor 2), depending on their ability to predict response to, and therefore benefit from, a given therapy, as illustrated in Figure 1B. One measure to permit comparison of the relative strengths has been designated the "relative predictive value" (RPV), the ratio of the likelihood that a factor-positive patient will respond to treatment to the likelihood that a factor-negative patient will respond to treatment. As with prognostic factors, we have proposed arbitrary classes of prediction factors for breast cancer therapies based on what has been accepted by consensus, in this case ER [19]. Adjuvant tamoxifen therapy has been shown to decrease recurrence rates for ER-positive patients by 40%–50%, whereas ER-negative patients obtain minimal, if any, benefit from hormonal therapy [20]. Therefore, the RPV is >8. Similarly, the majority of patients with ER-positive MBC have a clinical response to hormonal therapy, whereas patients with ER-negative disease do not respond [21, 22]. ER status is therefore a strong predictive factor for response to hormonal therapy. In this framework, we have arbitrarily proposed that, in breast cancer, weak, moderate, and strong predictive factors correspond to RPVs of 1–2, 2–4, and >4, respectively [15]. It is important to understand the clinical implications of the relative strength of a prognostic and/or predictive factor when integrating this information into routine practice, and to determine if the data support its use in a specific clinical situation.
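As a minimal sketch of the RPV definition given above, the ratio and its classes might be computed as follows. The function names and the response rates in the usage lines are hypothetical. Two choices are assumptions rather than part of the proposal: the boundary value 2 is assigned to "moderate", and an RPV of 1 or less is labeled "not predictive".

```python
def relative_predictive_value(response_if_positive: float,
                              response_if_negative: float) -> float:
    """Ratio of the response likelihood in factor-positive vs. factor-negative patients."""
    if not 0.0 <= response_if_positive <= 1.0:
        raise ValueError("response_if_positive must be a probability")
    if not 0.0 < response_if_negative <= 1.0:
        raise ValueError("response_if_negative must be a nonzero probability")
    return response_if_positive / response_if_negative


def classify_predictive_strength(rpv: float) -> str:
    """Map an RPV onto the weak (1-2), moderate (2-4), strong (>4) classes."""
    if rpv <= 1.0:
        return "not predictive"  # assumption: no advantage for factor-positive patients
    if rpv < 2.0:
        return "weak"
    if rpv <= 4.0:
        return "moderate"        # assumption: the shared boundary 2 counts as moderate
    return "strong"


if __name__ == "__main__":
    # Hypothetical response rates, for illustration only.
    rpv = relative_predictive_value(0.6, 0.1)
    print(round(rpv, 2), classify_predictive_strength(rpv))
```

When factor-negative patients essentially never respond, the ratio is undefined, which is why the sketch rejects a zero denominator instead of returning infinity.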
| HOW RELIABLE IS THE TUMOR MARKER (PRECISION AND ACCURACY)? |
|---|
In addition to determining when to use a tumor marker and the magnitude of its effect, it is important to ensure that the technical aspects of the marker are reliable and reproducible and that the study design and conduct are appropriate to test the marker for a clinical use of interest. Several problems with tumor marker studies, including technical, analytical, and trial design issues, have limited the introduction of new prognostic and predictive factors into routine clinical practice [11].
What Technical Factors Influence Measurement of Markers?
From a technical standpoint, difficulties arise because of poor sensitivity and/or specificity of the assay for the analyte, poorly reproducible assays, and differences between assays that use different reagents for measurement of the same marker [11]. Even for the two most commonly used and accepted tumor markers, ER expression and Her-2/neu overexpression, standard methodologies have not yet been established [4, 5]. Two primary technical considerations are critical when measuring a tumor marker. The first is which type of assay should be used. The second is the reproducibility of the chosen assay, from both a technical and an analytical perspective.
For example, Her-2/neu status can be determined by measures of protein expression (by immunohistochemistry [IHC], Western blotting, or enzyme-linked immunosorbent assay), measures of RNA expression (by Northern blotting or reverse transcriptase-polymerase chain reaction [RT-PCR]), and/or measures of DNA amplification (by fluorescence or chromogenic in situ hybridization [FISH and CISH, respectively]). Furthermore, even within these categories, different reagents (e.g., different antibodies in IHC assays) are used in different tests. The results are not interchangeable, either within or between classes of assays, and therefore researchers must decide which methodology they will employ. Once that decision is made, researchers must then decide how to perform the assay. For example, when assessing Her-2/neu overexpression by IHC, technical issues such as antibody concentration and antigen retrieval methods may cause unacceptably high false-positive or false-negative rates.
In one study, IHC and FISH resulted in only a 65% agreement for Her-2/neu status [23]. In a different study, results obtained from local laboratories were compared with those from a central laboratory for two Her-2/neu assays, the HercepTest™ IHC assay (Dako North America, Inc., Carpinteria, CA) and the FISH assay, with 79% concordance for HercepTest™ and 85% concordance for FISH [24]. Therefore, for the same test at multiple laboratories, and for different tests for the same marker, there is a significant degree of discordance for two commonly used tests for the evaluation of Her-2/neu status.
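Concordance figures such as those quoted above are simple proportions of samples on which two readings agree. The sketch below shows that calculation on hypothetical paired results. It also computes Cohen's kappa, a chance-corrected agreement statistic that is not used in the studies cited here and is included only as a common companion measure. All names and data are illustrative.

```python
from typing import Sequence


def percent_agreement(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Fraction of paired samples on which two assays (or laboratories) agree."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("need two non-empty result lists of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def cohens_kappa(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(a)
    observed = percent_agreement(a, b)
    pos_a = sum(a) / n
    pos_b = sum(b) / n
    expected = pos_a * pos_b + (1 - pos_a) * (1 - pos_b)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both readings are constant")
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical Her-2/neu calls (True = positive) for ten tumors.
    local = [True, True, True, True, False, False, False, False, False, False]
    central = [True, True, True, False, False, False, False, False, False, True]
    print(percent_agreement(local, central), cohens_kappa(local, central))
```

Raw agreement can look reassuring when most samples are negative, which is one reason a chance-corrected statistic is often reported alongside it.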
The stakes are high. Recently reported data suggest that adjuvant trastuzumab decreases recurrence rates by 50%. However, up to 5% of patients who receive trastuzumab develop cardiac dysfunction, and the cost of 1 year of therapy may exceed $100,000. Therefore, it is essential that Her-2/neu, the target for trastuzumab, be assayed accurately and precisely for every tissue sample. Expert panels are now being convened to establish guidelines for the conduct and interpretation of common tumor marker assays, including ER and Her-2/neu. These guidelines should lead to standardization of the assays, which should allow for more reliable results both for routine clinical practice and for use of these assays in clinical trials.
What Analytical Issues Are Important to Consider?
Assay Interpretation
Determination of assay results can also vary, even for a single type of assay. For example, with visual assays such as IHC for ER and Her-2/neu, intra- and interobserver variability leads to differences in interpretation [25, 26]. Some attempts have been made to standardize interpretation, such as development of the so-called "Allred score" for semiquantitation of ER expression [27], but these have not been universally adopted. Automated and semiautomated systems appear to be highly accurate and are likely to be more reproducible. Examples of automated systems include the ChromaVision ACIS® system (ChromaVision Medical Systems, Inc., San Juan Capistrano, CA) for ER expression measurement, the CellSearch™ assay (Veridex, LLC, Warren, NJ) for detection of circulating tumor cells [28], and the Oncotype DX™ assay (Genomic Health, Inc., Redwood City, CA), a new prognostic tool for patients with hormone receptor (HR)-positive, lymph node-negative breast cancer [29].
Cutoff Point Determination
Regardless of the assay, one has to select some value or level that distinguishes positive from negative results. However, there is no consensus regarding correct methods to establish cutoff points, and different studies of the same prognostic or predictive factor can have widely varying "optimal" cutoff points [30].
Cutoff points may be defined using either arbitrary or data-derived methods (Table 4). One approach is to consider any value greater than two standard deviations above the mean for normal subjects to be positive. Cutoff points can also represent arbitrary values within affected patients; for example, one might decide that 10%, 50%, or 90% of affected patients will be classified as "positive." Others have defined cutoff points based on technical factors, such as the limit of detection of the assay [21]. Finally, the cutoff point for a new assay can be defined by comparing it with an older assay [27].
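The first approach described above, a cutoff set two standard deviations above the mean of normal subjects, can be sketched as follows. The function names and marker values are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not say which estimator is intended.

```python
import statistics
from typing import Sequence


def cutoff_from_normals(normal_values: Sequence[float], n_sd: float = 2.0) -> float:
    """Cutoff = mean + n_sd * SD of marker levels measured in normal subjects."""
    if len(normal_values) < 2:
        raise ValueError("need at least two normal subjects to estimate an SD")
    mean = statistics.mean(normal_values)
    sd = statistics.stdev(normal_values)  # assumption: sample standard deviation
    return mean + n_sd * sd


def is_positive(marker_value: float, cutoff: float) -> bool:
    """A result counts as positive only when it exceeds the cutoff."""
    return marker_value > cutoff


if __name__ == "__main__":
    # Hypothetical marker levels in normal subjects (arbitrary units).
    normals = [10.0, 12.0, 14.0, 16.0, 18.0]
    cutoff = cutoff_from_normals(normals)
    print(round(cutoff, 2), is_positive(21.0, cutoff), is_positive(20.0, cutoff))
```

A cutoff derived this way depends entirely on who is counted as a normal subject, so the reference group should be described whenever the cutoff is reported.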
Recently, a novel data-derived method to select cutoff points, designated subpopulation treatment effect pattern plot (STEPP) analysis, has been proposed [32]. STEPP analysis evaluates outcomes to specific treatments in subpopulations of patients within randomized clinical trials or meta-analyses [32]. For example, it has been proposed that recurrence rates after treatment with chemotherapy should be evaluated in the context of the endocrine responsiveness of tumors, since HR-positive and -negative tumors appear to behave differently. Rather than arbitrarily defining cutoff points for ER positive and negative, the authors performed a STEPP analysis of data from a previously conducted randomized clinical trial and were able to demonstrate a benefit from chemotherapy only in the subset of patients with very low ER values.
Cutoff Point Validation
Regardless of whether cutoff points are chosen arbitrarily or are data-derived, the selected cutoffs require subsequent validation. The initial evaluation should be performed using a "test set" of patients. In the second part of the study, the utility of the cutoff point should then be confirmed using a separate "validation set" composed of a similar, but completely independent, patient population.
For example, the Oncotype DX™ assay is based on the principle of evaluating expression of multiple candidate genes using quantitative RT-PCR [29]. The investigators initially screened more than 200 candidate genes with the aim of developing a test that would predict the likelihood of recurrence of cancer in patients with HR-positive, lymph node-negative breast cancer. Breast cancer tissues from 447 patients with HR-positive, lymph node-negative tumors were used retrospectively to generate an algorithm using 16 of these genes that permitted division of patients into subgroups with very low, intermediate, or very high risk for recurrence. Patients are assigned to these groups based on a "recurrence score" derived from the algorithm. The majority of the test samples were obtained from patients treated with tamoxifen alone in the National Surgical Adjuvant Breast and Bowel Project (NSABP) B-20 trial [33], and the data-derived algorithm was then validated using a separate retrospective cohort of patients from the NSABP B-14 trial [34], who had similar clinical characteristics and were also treated with tamoxifen alone. If the cohorts had different clinical characteristics or had been treated differently, the validation would not have been legitimate. Validation of the results in a separate patient population strongly suggests that the test is reliable and that the results are likely to be meaningful in a larger population, as long as the patients tested are similar to the cohorts included in the original studies.
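The two-step logic described in this section, fixing a cutoff on a test set and then checking it on an independent validation set, can be illustrated with a short sketch. Everything here is hypothetical: the `Patient` record, the function names, and the data. The example compares only crude event rates and is not the statistical method used in the cited studies.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Patient:
    patient_id: str
    marker_value: float
    had_event: bool  # e.g., recurrence during follow-up


def event_rates_by_marker(cohort: List[Patient], cutoff: float) -> Tuple[float, float]:
    """Return (event rate in marker-positive, event rate in marker-negative)."""
    positive = [p for p in cohort if p.marker_value > cutoff]
    negative = [p for p in cohort if p.marker_value <= cutoff]

    def rate(group: List[Patient]) -> float:
        return sum(p.had_event for p in group) / len(group) if group else float("nan")

    return rate(positive), rate(negative)


def validate_cutoff(test_set: List[Patient], validation_set: List[Patient],
                    cutoff: float) -> Tuple[float, float]:
    """Apply a cutoff frozen on the test set to a fully independent cohort."""
    shared = {p.patient_id for p in test_set} & {p.patient_id for p in validation_set}
    if shared:
        raise ValueError(f"cohorts are not independent: {sorted(shared)}")
    # The cutoff is NOT re-tuned here; re-tuning would defeat the validation.
    return event_rates_by_marker(validation_set, cutoff)
```

The independence check only catches patients counted twice. Whether the two cohorts have similar clinical characteristics and treatment, as the text requires, still has to be judged separately.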
Statistical Analysis
Of course, statistical analysis is necessary to determine that the observations are not a result of chance alone. However, once a tumor marker has been identified and validated, it is important to determine the relative value of the marker in the context of previously identified prognostic and/or predictive factors, such as lymph node status and tumor size, using some type of a multivariate analysis. If such an evaluation is not performed, clinicians will be unable to determine the usefulness of incorporating the new marker into routine clinical practice.
| TRIAL DESIGN AND FRAMEWORKS FOR GENERATING, REPORTING, AND EVALUATING TUMOR MARKER RESULTS |
|---|
For example, from the prospective randomized clinical trials of adjuvant trastuzumab versus placebo, we anticipate combined analyses of the multiple underpowered LOE II studies evaluating novel markers for benefit from this drug. These pooled results should help focus trastuzumab therapy in the subgroups of Her-2/neu-positive patients most likely to benefit. The combined analyses of small LOE III studies that contain patients with variable clinical characteristics and treatments, however, are more likely useful to generate new hypotheses than to provide clinically useful and validated results.
For all studies, whether prospective or retrospective, it is important to identify the appropriate patient population to be investigated. All patients should have a similar profile based on known prognostic factors. Importantly, the effects of systemic treatment are critical and must be considered. If the study is addressing the prognostic value of a marker, all patients should have been treated uniformly and without whatever treatment might be considered if the patients have a "poor prognosis." If the question is addressing the utility of adding any treatment, all patients should be untreated. If the question is whether more treatment should be given, then all patients should have received the same treatment.
If the study is addressing predictive factors, a control group that has been treated identically to the study group, with the exception that they did not receive the treatment in question, is essential. Although the control group might be from a selected historical control, predictive factors are ideally studied in the context of prospective, randomized, controlled trials comparing the patients who received the treatment in question with those who did not receive that treatment.
Tumor marker studies should be carefully designed, using the above criteria, to obtain clinically useful information. Researchers frequently have practical difficulties designing such studies, however, because of the need for significant numbers of patients with particular clinical characteristics in order to address a specific clinical question. As discussed above, this can sometimes be overcome by pooling the results of several well-done but underpowered studies. Another significant drawback is obtaining funding for tumor marker studies, as pharmaceutical companies and third-party payers derive relatively smaller financial benefit from the results compared to the enormous payoffs for a "blockbuster" therapeutic agent. Regardless, given the consequences, one has to question why it is acceptable for tumor marker studies to be performed with less scientific rigor than studies of new pharmaceutical agents.
Reporting of tumor marker studies has also been historically haphazard. Recently, in order to standardize reporting of tumor marker study results, the National Cancer Institute-European Organization for Research and Treatment of Cancer (NCI-EORTC) Working Group on Cancer Diagnostics developed REporting recommendations for tumor MARKer prognostic studies (REMARK) [35]. The guidelines outline items that should be addressed by researchers when reporting the results of tumor marker studies, including prospectively defining the question the study is trying to address, identifying the appropriate patient population and controls, determining the end point, and identifying potential confounding factors. Explicit recommendations are given regarding which information must be contained in publications of tumor marker studies, including patient and treatment information, specimen characteristics, assay methods, study design, and statistical analysis methods.
| REAL-WORLD CLINICAL EXAMPLES |
|---|
Her-2/neu
The first report of Her-2/neu as a prognostic factor in breast cancer was published in 1987 [36]. Since then, more than 200 papers addressing this topic have been published, with widely mixed and disparate results [37]. Different authors have concluded that Her-2/neu is associated with poor outcomes, no difference in outcome, or even favorable outcomes. Indeed, a great deal of this confusion could have been avoided if the investigators had addressed the components described above: (a) What is the intended use? (b) What is the magnitude of difference between positive and negative for that use? (c) How reliable is the estimate of the magnitude?
The potential uses for Her-2/neu are for prognosis and prediction, as outlined in Table 6. Issues related to the reliability of the assays have been described in detail above. Overall, studies support that Her-2/neu overexpression is a poor prognostic factor, although its magnitude appears weak. For example, in Adjuvant! Online it has only a relative predictive value of 1.5 [38]. Its role as a prognostic factor thus remains unclear.
Patients with tumors that overexpress Her-2/neu appear to have relative resistance to some chemotherapy regimens, such as cyclophosphamide, methotrexate, and 5-fluorouracil (CMF), but not to others, such as anthracycline-containing regimens [37, 40, 41]. Her-2/neu status is not generally a consideration when choosing a chemotherapy regimen, however, because, as is the case with endocrine therapy, the level of available evidence does not support using Her-2/neu status to predict response to chemotherapy.
In contrast, Her-2/neu status appears to be strongly predictive of response to trastuzumab. A patient with a tumor that overexpresses Her-2/neu is usually treated with trastuzumab in either the adjuvant or metastatic settings [8, 9, 42, 43] because the benefits outweigh the risks in the majority of cases. A patient with a tumor that fails to overexpress Her-2/neu appears not to respond to treatment with trastuzumab [10, 44] and therefore would not be treated with trastuzumab to avoid both unnecessary toxicity and cost. Thus, use of Her-2/neu to select trastuzumab is recommended based on "use" and "magnitude." However, as discussed above, there are substantial apparent difficulties with the technical reliability of all available assays for the marker. Nonetheless, despite the shortcomings of Her-2/neu studies outlined above, at present there is sufficient LOE II evidence to support the routine clinical use of Her-2/neu overexpression for selection of trastuzumab therapy, as indicated in the most recent ASCO tumor marker guidelines [4].
Multigene Expression: Oncotype DX™
A more recent example of development of a new tumor marker is provided by the case of Oncotype DX™. The investigators specifically developed the test to determine prognosis in ER-positive, lymph node-negative patients who were treated with tamoxifen [29]. The available results suggest that, in this group of patients, Oncotype DX™ is a strong prognostic factor because the ratio of the hazard ratios of the high and low recurrence score cohorts is >2 in both the test and validation cohorts [29]. The Oncotype DX™ test is also reliable because it fulfills the criteria outlined above for technical, analytical, and trial design issues (Table 2). The assay is reproducible and was validated in independent test and validation cohorts of patients, as described above.
Oncotype DX™ has also been evaluated as a predictive factor, although less rigorously. In the NSABP B-14 trial, the patient cohorts were treated with tamoxifen versus observation, and comparison of the cohorts suggested that the Oncotype DX™ assay is predictive for tamoxifen [45]. Similarly, analysis of NSABP B-20, in which patients were randomized to CMF and tamoxifen versus tamoxifen alone, permitted the investigators to determine that the assay is predictive for response to chemotherapy [45].
Are the available data sufficient to conclude that Oncotype DX™ has been validated to the extent that patient treatment decisions should be based on the results? Perhaps, but because these studies were all proposed using available samples from trials performed many years ago and represented only subsets of the overall population entered into the trials, concerns have been raised about wholesale clinical adoption of this assay. In that regard, the North American Breast Cancer Intergroup is developing the TAILORx clinical trial to further validate and extend the Oncotype DX™ results. The trial design assumes that the assay is prognostic, and will confirm the ability of the assay to predict response to chemotherapy.
In the TAILORx trial, tumors of patients with ER-positive and lymph node-negative breast cancer will be tested using the Oncotype DX™ assay. Patients with low recurrence scores, who have good prognoses without chemotherapy, will receive hormonal therapy alone. At the other end of the spectrum, patients with high recurrence scores will receive chemotherapy in addition to hormonal therapy. Those patients whose scores fall in the intermediate range will all receive hormonal therapy and be randomly assigned to chemotherapy or not. This trial design will permit validation of the Oncotype DX™ results in a similar patient population in a large prospective clinical trial, and will allow for generation of new data on which to base treatment recommendations for patients whose recurrence scores are intermediate.
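The treatment assignment described for this trial can be summarized as a small decision rule. The sketch below is only a schematic of the design as stated in the text, not the trial's actual randomization procedure. The function and arm names are hypothetical, the recurrence score group is taken as an input because no numeric thresholds are given here, and equal (1:1) allocation in the intermediate group is an assumption.

```python
import random

HORMONAL_ONLY = "hormonal therapy alone"
CHEMO_PLUS_HORMONAL = "chemotherapy plus hormonal therapy"


def assign_arm(recurrence_score_group: str, rng: random.Random) -> str:
    """Assign treatment from the recurrence score group (low/intermediate/high)."""
    if recurrence_score_group == "low":
        return HORMONAL_ONLY
    if recurrence_score_group == "high":
        return CHEMO_PLUS_HORMONAL
    if recurrence_score_group == "intermediate":
        # Assumption: equal allocation between the two arms.
        return rng.choice([HORMONAL_ONLY, CHEMO_PLUS_HORMONAL])
    raise ValueError(f"unknown recurrence score group: {recurrence_score_group!r}")


if __name__ == "__main__":
    rng = random.Random(2006)  # fixed seed so the illustration is repeatable
    for group in ("low", "intermediate", "high"):
        print(group, "->", assign_arm(group, rng))
```

Only the intermediate group is randomized, so the chemotherapy question is tested in the patients for whom the existing data are least informative.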
Given the substantial technical, analytical, and trial design problems with previously performed tumor marker studies, it is imperative to address these issues. It is especially important to standardize the assays for commonly used tumor markers. Otherwise, patients with false-positive test results for predictive factors will receive treatments that are not beneficial but which may cause significant toxicity, and those with false-negative results will not be offered potentially life-saving therapies. In addition, a better understanding of the potential pitfalls in tumor marker study design will allow for the development of new, potentially more useful assays.
| CONCLUSIONS |
|---|
Frameworks such as TMUGS can be useful when designing and conducting these studies to ensure that appropriate components are included, thereby leading to the establishment of new tumor markers for routine clinical use [11]. By progressively generating and refining a hypothesis, based on data derived from increasingly well-developed studies, tumor markers with clinical utility can be identified (Fig. 2). In addition, the new REMARK guidelines should promote better design and conduct of studies specifically focused on tumor marker validation [35]. Implementation of these recommendations when designing tumor marker studies will result in the generation and publication of appropriate and complete clinical data, leading to the adoption of new, well-validated tumor markers for routine clinical use.
| ADDITIONAL READING |
|---|
Hayes DF, Bast RC, Desch CE et al. Tumor marker utility grading system: a framework to evaluate clinical utility of tumor markers. J Natl Cancer Inst 1996;88:1456–1466.
McShane LM, Altman DG, Sauerbrei W et al. REporting recommendations for tumour MARKer prognostic studies (REMARK). Br J Cancer 2005;93:387–391.