The Oncologist, Vol. 12, No. 4, 495-504, April 2007; doi:10.1634/theoncologist.12-4-495 © 2007 AlphaMed Press
Assessing Responsiveness of Cancer-Related Fatigue Instruments: Distribution-Based and Individual Anchor-Based Methods
a School of Nursing, College of Medicine, National Taiwan University, Taipei, Taiwan; b College of Nursing, University of Utah, Salt Lake City, Utah, USA
Key Words. Fatigue • Individual difference • Instrument • Neoplasm • Sensitivity
Correspondence: Shiow-Ching Shun, R.N., Ph.D., School of Nursing, National Taiwan University, No. 1, Jen-Ai Rd. Sec. 1, Taipei 100, Taiwan, R.O.C. Telephone: 886-2-23123456, ext. 8434; Fax: 886-2-23219913; e-mail: scshun@ntu.edu.tw
Received June 8, 2006; accepted for publication February 8, 2007.
Background. The aims of the study were to examine the responsiveness of Chinese versions of the Cancer Fatigue Scale (C-CFS), the Schwartz Cancer Fatigue Scale-revised (C-SCFS-r), and the Fatigue Symptom Inventory (C-FSI) based on effect sizes and patient perceptions of change.
Method. Convenience sampling was used to recruit subjects at a chemotherapy treatment center for outpatients in Taiwan. Data were collected twice: on the day cancer patients were receiving chemotherapy treatment (T1) and 2 days post-treatment (T2).
Results. Questionnaires were completed at T2 by 148 subjects (60.9%). The differences between T1 and T2 were statistically significant for all three scales. The effect sizes for the C-CFS, C-SCFS-r, and C-FSI, which ranged from a medium to a large change, were reported for four groups (self-reported no increase, small increase, moderate increase, and large increase). Generalized estimating equations were used to compare the fatigue scores across the four groups while controlling for the fatigue level at baseline and the time effect. The results indicate that the fatigue scores 2 days after treatment in the three "change" groups were statistically significantly larger than in the "no increase" group. In addition, the pretreatment fatigue level in the "large increase" group was significantly higher than in the other three groups.
Conclusion. Results indicate that the three scales are sensitive to change over 2 days. However, the three scales may not effectively discriminate between a moderate and a large change. Therefore, further testing on cancer patients with severe fatigue is recommended to examine the responsiveness of the three scales in detecting minimal important differences.
Disclosure of potential conflicts of interest is found at the end of this article.
The responsiveness of an instrument, in brief, is its ability to detect differences when patient outcomes show a meaningful or clinically important change in clinical state [1–3]. Responsiveness is important because it supports the longitudinal construct validity of an instrument [3]. However, the evaluation of responsiveness has not kept pace with other forms of instrument development because there is no clear agreement on the best approach for evaluating it. Currently, at least 25 definitions and 31 different measures of responsiveness can be found [3]. Estimating the minimal important difference (MID) for an instrument is a special case of examining responsiveness to change [4]. The MID is the smallest change between two scores that is subjectively meaningful to patients [5]. Estimating the MID for an instrument therefore involves not only calculating the instrument's quantitative scores but also interpreting their qualitative meaning to patients [2]. However, the MID is still an elusive figure because it is multifaceted [6]. In addition, it is a context-specific value rather than a fixed number, and it varies from study to study and across populations [3]. Based on systematic reviews, the mean MID was either closer to 0.30 standard deviation (SD) [4] or almost equal to Cohen's moderate effect size of 0.5 [7], but most results were based on a seven-point response scale and focused on populations with chronic disease (e.g., heart failure and asthma). Distribution-based methods and anchor-based methods have been devised to determine the MID. Distribution-based methods often use statistical significance, calculation of effect size, and standard error of measurement to calculate magnitudes of change scores considered to be significant [3, 8–10].
The procedures of anchor-based methods require changes in scores to be linked to other measures (e.g., type of diagnosis, response to treatment for group change or self-report for individual change) that are themselves subject to interpretation [3]. The differences between the two methods are that distribution-based methods are based on statistical results and anchor-based methods are based on a comparison of patients' or clinicians' judgments of degree of change with actual changes in scores. However, the anchor-based methods suffer from a lack of external validation with other clinical parameters, and statistically significant outcomes are highly sample-size dependent [8, 10]. Cancer-related fatigue (CRF) is a complex, multidimensional and multicausal phenomenon, and it is one of the most common unrelieved symptoms of cancer impacting all dimensions of quality of life [11, 12]. Since fatigue is highly subjective and unique to the person experiencing it, self-report from patients is the only effective method to measure fatigue level. Therefore, a reliable and valid scale is critical for measuring CRF. Despite the importance of CRF, little is known about the MID for CRF instruments. Even though more than 15 scales have been developed to measure CRF [13, 14], only two studies focusing on the MID for CRF have been reported [15, 16] and neither of them has been tested on populations in the Chinese culture. However, fatigue occurs within the context of patients' daily lives and environments [17]. Thus, fatigue in different cultures affects daily activity in different ways and this could affect the perception of fatigue. 
Therefore, the purposes of this study were: (a) to evaluate the responsiveness of three translated instruments (Chinese versions of the Cancer Fatigue Scale, the Fatigue Symptom Inventory, and the Schwartz Cancer Fatigue Scale-revised) based on distribution-based methods using effect size (i.e., standardized response means) and on an anchor-based method by evaluating patients' self-report of the degree of change in comparison with change in scale scores and (b) to compare the effect sizes and fatigue scores among groups of patients with no increase, small increase, moderate increase, and large increase, which were regrouped by self-reported level of change.
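The distribution-based effect size named above, the standardized response mean, can be illustrated with a short sketch. This is a minimal example under stated assumptions, not the authors' analysis code: the function names and the five-patient scores are hypothetical, and the labels follow Cohen's conventional cutoffs (0.2, 0.5, 0.8), which may differ from the thresholds applied in the study.

```python
from statistics import mean, stdev


def standardized_response_mean(t1_scores, t2_scores):
    """Mean of the paired change scores divided by the SD of those changes."""
    if len(t1_scores) != len(t2_scores):
        raise ValueError("T1 and T2 must contain the same patients")
    changes = [after - before for before, after in zip(t1_scores, t2_scores)]
    sd = stdev(changes)  # sample SD (n - 1 denominator)
    if sd == 0:
        raise ValueError("SRM is undefined when all change scores are equal")
    return mean(changes) / sd


def cohen_label(effect_size):
    """Label an effect size with Cohen's conventional cutoffs (an assumption)."""
    size = abs(effect_size)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


# Hypothetical total scores for five patients (not study data).
t1 = [10, 12, 8, 15, 9]
t2 = [13, 14, 9, 18, 13]
srm = standardized_response_mean(t1, t2)
print(round(srm, 2), cohen_label(srm))
```

Because the denominator is the SD of the change scores rather than the baseline SD, the same mean change yields a larger value when patients change by similar amounts.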
The study was approved by the Institutional Review Boards of the University of Utah in the United States of America and the Chang-Gung Memorial Hospital in Taiwan. A prospective, descriptive, repeated measures design was used to generate data. The purpose of this study was to examine whether the scales could detect the changes in level of fatigue experienced by patients. Instead of administering an intervention and comparing with a control group, this study treated chemotherapy as a factor that would interfere with or change the level of fatigue, because a literature review indicated that fatigue levels usually increase sharply during the first 24–48 hours following chemotherapy [18, 19]. Therefore, data were collected twice: on the day when cancer patients received chemotherapy treatment (T1) and 2 days post-treatment (T2). If the patients could not complete the questionnaires at T2 by themselves, the researcher called them to collect data.
Sample
Measurements
The Cancer Fatigue Scale (CFS) is a self-reported, 15-item questionnaire with simple summative scoring (1 = no to 5 = very much) composed of physical (seven items), affective (four items), and cognitive (four items) domains [21]. The Schwartz Cancer Fatigue Scale-revised (SCFS-r) is a six-item self-reported measure using a simple summative scoring scale (1 = not at all to 5 = extremely) and measuring two domains: physical (three items) and perceptual (three items) fatigue [22]. The Fatigue Symptom Inventory (FSI) is a 14-item self-reported measure designed to assess the intensity (four items), daily pattern (one item), and duration (two items) of fatigue, and perceived interference (seven items) [23, 24]. Twelve items use an 11-point summative scale (0 = not at all fatigued to 10 = extremely fatigued) and one item is the number of days in the past week during which fatigue was felt. However, the one item related to daily pattern provides qualitative information and is not summed into the total fatigue score. The reliability and validity of these three scales have been examined and have shown an acceptable range for newly developed instruments [21–24]. The higher the score, the greater the level of fatigue.
Chinese versions of the Cancer Fatigue Scale (C-CFS), the Schwartz Cancer Fatigue Scale-revised (C-SCFS-r), and the Fatigue Symptom Inventory (C-FSI) were rigorously translated and back-translated by the bilingual committee members in this study. This committee approach uses a team of bilingual individuals to translate from the source to the target language. This approach takes advantage of the different perspectives in the translation team and decreases the bias of translators [25]. After examining the psychometric properties of the three Chinese-version scales at T1, the results indicated that the three scales had good internal consistency (Cronbach's […]
A global rating scale (GRS) has been used in studies to evaluate the responsiveness of instruments in terms of self-reported change in fatigue [5, 15, 27]. The GRS includes two items. The first item asks whether the level of fatigue is the same, worse, or improved compared with the baseline. The second asks participants to rate their decline or improvement on an eight-point GRS for change in fatigue, from 0 (no change) to 7 (a large change) [5]. Based on patients' ratings on the GRS, a rating of 0 points indicates no change, 1–3 points indicates a small change in fatigue, 4–5 points indicates a moderate change in fatigue, and 6–7 points indicates a large change in fatigue. Based on a literature review, it is still controversial whether or not worsening states should be treated differently from improving states. Some studies point out that the MIDs for positive and negative changes are approximately equal [5, 28], but one study argued that worsening states should be treated differently from improving states [27]. In this study, we considered that worsening states and improving states might be different. Therefore, patients reporting no change (i.e., fatigue changes of 0 points) and those reporting a decrease were grouped into a "no increase" group based on the GRS.
Patients reporting a fatigue change of 1–3 points, 4–5 points, and 6–7 points were grouped into small, moderate, and large increase groups, respectively.
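The grouping rule described above can be written as a small function. This is an illustrative sketch only: the function name, the string codes for the first GRS item, and the reading of "worse" as an increase in fatigue are assumptions introduced here, not part of the study materials.

```python
VALID_DIRECTIONS = {"same", "worse", "improved"}


def grs_group(direction, rating):
    """Map the two GRS items to one of the four analysis groups.

    direction: answer to the first item ("same", "worse", or "improved");
               "worse" is assumed here to mean that fatigue increased.
    rating:    answer to the second item, 0 (no change) to 7 (a large change).
    """
    if direction not in VALID_DIRECTIONS:
        raise ValueError(f"unknown direction: {direction!r}")
    if not 0 <= rating <= 7:
        raise ValueError("rating must be between 0 and 7")
    # No change and any reported decrease both fall in the "no increase" group.
    if direction != "worse" or rating == 0:
        return "no increase"
    if rating <= 3:
        return "small increase"
    if rating <= 5:
        return "moderate increase"
    return "large increase"


# Hypothetical responses (not study data).
for answer in [("same", 0), ("improved", 4), ("worse", 2), ("worse", 5), ("worse", 7)]:
    print(answer, "->", grs_group(*answer))
```

Folding improvement into "no increase" mirrors the decision to treat worsening and improving states separately rather than assuming symmetric MIDs.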
Data Analysis
Third, the relationships between the mean fatigue change scores in the three scales and the anchor (i.e., GRS) were examined by conducting Spearman's rank correlation. The MID, as defined by Jaeschke et al. [5], was taken as the mean change in scale scores associated with a small change. Because the fatigue scores on the three scales were not normally distributed (positively skewed) and the data from T1 and T2 were correlated, the generalized estimating equation (GEE) was used to adjust for the baseline level of fatigue. The GEE has been commonly used to analyze correlated longitudinal data, and it can be used to analyze clinical data with a non-normal distribution and unequal variances for the groups [29]. Finally, the GEE was used to examine whether there was a group effect (i.e., no increase, small increase, and moderate-to-large increase) while controlling for the baseline fatigue score at T1, in order to check whether the results were consistent with those obtained using the GRS.
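The rank correlation between the anchor and the change scores can be sketched in plain Python as follows. This is a minimal illustration, assuming tied values receive the average of their ranks; the helper names and the six data points are hypothetical, and a statistics package would normally be used in practice.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical GRS ratings and scale change scores (not study data).
grs = [0, 1, 2, 4, 4, 6]
change = [0, 2, 1, 5, 3, 9]
rho = spearman_rho(grs, change)
print(round(rho, 3))
```

A rank-based coefficient suits this setting because the GRS is ordinal and the fatigue scores are described as positively skewed.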
Patient Characteristics
Two hundred forty-three cancer patients participated at T1 and 158 (65%) patients returned the questionnaire, including 24 patients (15.2%) interviewed by phone at T2. After imputation and exclusion of subjects with more than three missing data points, 148 subjects (60.9%) were included in this study. In total, 13 data points for 11 patients were imputed, and the means and SDs before and after imputation showed no statistically significant differences. The demographic characteristics and clinical information for the subjects at T1 and T2 are summarized in Table 1. There were no differences in the demographic and clinical characteristics and fatigue scores at T1 and T2 between the participating patients and patients not completing T2. In the sample included in this report, 57.7% were female, with an age range of 21–86 years and a mean age of 51.16 years. The majority were married (85.2%) and had lower than a high school education (50.3%). The patients had several different types of cancer; over one third of the patients had breast cancer, and of these, over half had been diagnosed within the previous 3 months and 24.2% (n = 36) received concurrent radiotherapy treatment.
Reliability and Ceiling and Floor Effects
Four Groups Based on the GRS
The level of fatigue before chemotherapy in the large increase group was significantly higher than in the other groups (Table 3). This means that the patients experiencing higher levels of fatigue before treatment experienced higher levels of change in fatigue after treatment. Even for the very short time period (i.e., 2 days after treatment) we used to measure the change in fatigue, our results are consistent with previous studies [30–32]: patients with higher baseline fatigue scores are at risk for higher levels of fatigue after treatment, or for chronic fatigue. Therefore, we need to pay attention to the patients with higher levels of fatigue before receiving chemotherapy treatment and offer them more information to manage their fatigue.
Distribution-Based Methods: Paired t-Test and Effect Size
Anchor-Based Methods: Within-Patient Global Change Ratings
Because CRF is a subjective feeling of tiredness, the MID for CRF from the perspective of patients is more important than from the perspectives of clinicians or the general population [33, 34]. The ability to detect change over time and to detect the MID of responsiveness for three translated fatigue instrument scales was examined on cancer outpatients undergoing chemotherapy treatment. Analysis included calculating paired t-tests, effect sizes, and MIDs based on patients' self-report of change. The results of this study provide support for the responsiveness over 2 days of the three scales with findings as follows.
For testing the MID, Wyrwich et al. [9] indicated that instruments should have floor and ceiling effects of <15% and a reliability of at least 0.70–0.80. In this study, the internal consistency […]
In this study, the effect sizes of the fatigue scales (0.70–0.89) were medium to large. In the study of Cella et al. [16], the three samples had effect sizes that varied from small to large (0.2–1.38), and in two other studies, effect sizes for fatigue instruments were small to moderate (effect sizes in Meek et al. [36] were 0.14–0.49; effect sizes in Schwartz et al. [15] were 0.37–0.78). Of the three scales, only the SCFS-r was also used in the study of Schwartz et al. [15], and the effect size for the C-SCFS-r was larger in the current study (effect size, 0.89) than in the Schwartz et al. [15, 36] study (effect size, 0.71). The reason might be that the baseline level of fatigue in this study (M = 8.33, n = 148) was lower than in the Schwartz et al. [15, 36] study (M = 13.1, n = 103), because the baseline level can affect effect sizes. The effect size was the lowest for the C-FSI. The wider response scale (i.e., an 11-point Likert-type scale) or the longer reference period (within the past week) of the C-FSI might be the reason these results differed from those of the other scales (five-point Likert-type scales with 1- or 2-day reference periods).
As expected, chemotherapy treatment profoundly affected the level of fatigue, and these instruments were able to detect that change. This result is consistent with the literature indicating that the first 2 days after a chemotherapy treatment are associated with increased fatigue as a side effect of the treatment [37, 38]. Overall, the level of fatigue increased after treatment and this was a statistically significant change from the baseline level of fatigue (Table 2). However, based on the results of the self-report of change, patients had variable reports of response to treatment.
In total, 32.4% (n = 48) of the patients reported that their level of fatigue did not change after treatment and 10 patients (6.8%) reported that their level of fatigue decreased after the treatment. Yet few empirical studies report amelioration of the level of fatigue after chemotherapy treatment in cancer patients [15]. It is unclear which types of patients are more likely to have a stable or improved condition after treatment. There were no statistical differences between the 10 patients with decreased fatigue and the other 138 patients in terms of demographic and disease characteristics. However, the baseline fatigue scores for these 10 patients (C-CFS, 15.6; C-SCFS-r, 8.4; C-FSI, 44.0) were higher than the scores of the 90 patients who reported that their level of fatigue increased after treatment (C-CFS, 13.76; C-SCFS-r, 8.59; C-FSI, 27.07), especially on the C-FSI. These patients may have perceived a higher level of fatigue before treatment because of factors that later diminished or disappeared, lowering their level of fatigue after treatment. More longitudinal research is needed to explore the relationship between fatigue and chemotherapy treatment in cancer patients before and after treatment. Most studies base this relationship on group-level data, masking the variable response among patients.
The results also show that the level of fatigue before chemotherapy in the large increase group was significantly higher than in the other groups (Table 3). This means that patients experiencing a higher level of fatigue before treatment experience a higher level of change in fatigue score after treatment. Even for the very short time period (i.e., 2 days after treatment) we used to measure the change in fatigue, our results are consistent with previous studies [30–32]: patients with a higher baseline fatigue score are at risk for a higher level of fatigue after 1 week, 6 weeks, or 2.5 years of treatment.
Therefore, we need to pay attention to the patients with higher levels of fatigue before receiving chemotherapy treatment and offer them more information to manage their fatigue.
Several studies have pointed out that the mean change in scores for the MID was approximately 0.5 per item [5, 28, 39]. In this study, the mean changes per item of 0.5 for the small change group for the C-FSI and C-SCFS-r (Table 4) based on the results of the GEE were consistent with this finding [5], but the result for the C-CFS was lower than the previous findings (β = 0.378). Until now, this criterion has mostly been established for chronic diseases (e.g., asthma, heart failure) based on a seven-point Likert-type scale, and the baseline level of fatigue and time effect have not been controlled for [40]. More than half (54.4%) of the patients in this study had been diagnosed within the previous 3 months. Therefore, the CRF in these newly diagnosed patients may be different from the fatigue reported for populations with chronic disease. One study indicated that the trend in the mean change scores per item for the scales increased as the level of quality of life improved; differences of 0.5 represented a small change, differences of 1.0 represented a moderate change, and differences >1.5 represented a large change [28]. The results obtained by controlling for the baseline and time effects (Table 4) indicate that the mean differences in scores for the different levels of change in the three scales did not coincide with this assumption. Therefore, this assumption needs to be further tested. However, the results based on calculating the effect sizes for the four groups (Table 3) were inconsistent with the GEE results (Table 4). All the effect sizes in the group with a large change for the three scales were smaller than for those with a moderate change.
The reasons could be that: (a) the calculation of effect size is highly sample-size dependent [8, 10], and the sample in the "large increase" group consisted of only 10 patients, and (b) the three scales may not effectively discriminate between a moderate and a large change.
Even though this study provides important findings on responsiveness for the three scales, there are limitations. First, the self-report method using the GRS is based on retrospective judgment of change; recall bias could influence the results [7, 15]. Patients' judgments are influenced by their baseline health status, their duration of illness, and response shift [41]. The appropriateness of an anchor is defined as having at least a moderate correlation between the anchor (e.g., GRS) and changes in health-related quality-of-life instruments [4]. However, the strengths of the associations between the mean fatigue change score and the GRS were low to moderate (r² = 0.17–0.30) in this study. A better anchor may need to be explored in further study. In addition, the stability and reproducibility of the GRS have not been evaluated [39]. Second, although the sample consisted of heterogeneous cancer patients, the distribution of fatigue scores based on the three fatigue scales was positively skewed at T1 and T2, and 30.4% of the patients had the lowest scores on the C-SCFS-r at T1. Furthermore, the 35% attrition rate is high and might be a result of the patients suffering the side effects of chemotherapy at home and refusing to complete the questionnaire and return it. Most of the cancer patients with mild or moderate fatigue may have been feeling well enough to complete the questionnaire, while the cancer patients with severe fatigue may not have been willing to participate in this study. The results may therefore not be generalizable to cancer patients with extreme, severe fatigue.
Third, the results in this study were based on the patients reporting increases in fatigue at T2; therefore, these findings may not be generalizable to a population with a decrease in fatigue after treatment.
In conclusion, the three scales are sensitive to change over 2 days. The MIDs for the FSI and SCFS-r were higher than the threshold of small change scores, which is approximately 0.5 per item, but the MID for the CFS was lower than the suggested value based on the GEE results. The increment among the groups based on the effect sizes and the GEE models was inconsistent, and the three scales may not effectively discriminate between a moderate change and a large change. Therefore, further testing on cancer patients with severe fatigue is recommended to examine the responsiveness of the three scales in detecting MIDs. In addition, the results in this study indicate that patients with a higher baseline fatigue score are at risk for a higher level of change in fatigue after treatment. Health care providers should pay more attention to the patients experiencing high levels of fatigue before receiving treatment.
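The per-item comparison against the 0.5 threshold discussed above can be sketched as follows. This is a hedged illustration rather than the study's procedure: the item counts are taken from the scale descriptions (15 for the CFS, 6 for the SCFS-r, and an assumed 13 summed items for the FSI once the daily pattern item is excluded), the function names and example changes are hypothetical, and the 0.5 criterion was mostly established on seven-point scales.

```python
# Number of items summed into each total score (FSI count is an assumption:
# 14 items minus the daily pattern item, which is not summed).
SUMMED_ITEMS = {"C-CFS": 15, "C-SCFS-r": 6, "C-FSI": 13}


def per_item_change(scale, total_change):
    """Convert a change in total score into a mean change per item."""
    return total_change / SUMMED_ITEMS[scale]


def meets_threshold(scale, total_change, threshold=0.5):
    """Check a per-item change against the approximate 0.5-per-item criterion."""
    return per_item_change(scale, total_change) >= threshold


# Hypothetical total-score changes (not study data).
print(per_item_change("C-CFS", 6), meets_threshold("C-CFS", 6))
print(per_item_change("C-SCFS-r", 3), meets_threshold("C-SCFS-r", 3))
```

Expressing change per item makes scales of different lengths comparable, although it does not remove differences in response range between five-point and 11-point items.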
The authors indicate no potential conflicts of interest.
The authors gratefully acknowledge the Chang-Gung Memorial Hospital for their support, the translators (Allison Kuo, Jui-Sheng Pai, Turrando Bih, Sui-Mei Chiu, Troy Catterson, Sharling Yao, Shi-Ying Yang, Fei-Bin Kang, and Yukio Kachi) from the two committees for their help in the process of translation and back translation, and the experts (George Plautz, Atsushi Nara, and Yoko Azuma) who assessed the discrepancy between the original scales and back-translation versions. In addition, the authors gratefully acknowledge Caren J. Frost and Patricia H. Berry for their help in the process of conducting this study.