Background The 21-item Modified Fatigue Impact Scale (MFIS) has been recommended as an outcome measure for use in multiple sclerosis and is commonly used to generate an overall score of fatigue.
Objective To test whether the MFIS total score is valid by applying the Rasch measurement model.
Method The MFIS was sent by post to patients with clinically definite multiple sclerosis in two centres in the UK. Data were fitted to the Rasch model.
Results Analysis was based on 415 records (55% response). The 21-item scale did not fit the Rasch model mainly because of multidimensionality. The scale was found to contain a “physical” dimension and a “cognitive” dimension, consistent with the original subscale structure. Valid physical and cognitive subscales were derived after deletion of some items.
Conclusion The MFIS cannot be used to generate a single overall score of fatigue. The conceptual interaction between the two dimensions remains unclear, which poses problems when interpreting change scores in these individual scales. Studies in which a global MFIS score was used as either an outcome measure or selection tool may need to be re-evaluated.
Fatigue is a common symptom in many neurologic diseases and is particularly prevalent in multiple sclerosis. Several scales have been developed to assess the level of fatigue in general populations and in those with multiple sclerosis.1 2 For example, the Fatigue Impact Scale (FIS) was created in order to measure the impact of fatigue on the quality of life in chronic illnesses.1 The scale was clearly intended to provide a global score of fatigue and subscale scores since measures of internal consistency and other statistical tests were given for the 40-item scale. Indeed, in later work, Fisk3 stated that the high internal consistency of the 40-item scale and the fact that the subscales had originally been determined only on a conceptual basis were sufficient to support the assumption that fatigue impact could be measured as a unitary construct.
The Modified Fatigue Impact Scale (MFIS) was created during the development of the Multiple Sclerosis Quality of Life Inventory (MSQLI)4 by shortening the 40-item FIS. This resulted in a 21-item scale with nine “physical” items, 10 “cognitive” items and two “psychosocial” items. In addition, the response options were reworded, for reasons that are unclear. The scale has been recommended by the American Multiple Sclerosis Council for Clinical Practice Guidelines, which called for further psychometric evaluation to establish its continued suitability as an outcome measure.5
The power of psychometrics to evaluate health outcome scales has developed considerably in recent years through the addition of modern psychometric techniques to supplement traditional test theory. Pre-eminent in this development has been the application of the Rasch measurement model,6 which has special properties consistent with fundamental measurement. The model offers the only way in which ordinal observations of clinical phenomena can be converted into linear measurement.7 Consequently, the objective of the current study was to analyse the MFIS, as used in a UK multiple sclerosis population, by application of the Rasch model.
The MFIS was posted to 753 patients with clinically definite multiple sclerosis,8 identified from research databases at two centres in the UK (Walton Centre for Neurology and Neurosurgery, Liverpool and Imperial College Healthcare Trust, London). The study was approved by the local ethics committee (Sefton EC115.03 and Hammersmith 05/Q0401/7), and all subjects gave written informed consent.
The Rasch measurement model
The Rasch measurement model is a mathematical description of the way in which subjects must interact with test items in order to produce linear measurement. The model states that the probability of a person giving a certain answer to an item is a logistic function of the difference between the person's “ability” and the item's “difficulty”. Both ability and difficulty must refer to the same underlying construct; hence, the Rasch model is unidimensional. In this case, the underlying construct of the scale is fatigue, so a person who is very fatigued would be able to affirm items expressing high levels of fatigue. For example, an item worded “I am exhausted after walking 10 feet” would be very difficult to affirm and would be affirmed only by those with higher levels of fatigue.
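The logistic relationship described above can be illustrated with a minimal sketch. It uses the dichotomous form of the Rasch model for simplicity; the MFIS items have several ordered response options, so an actual analysis would rely on a polytomous extension, which is not shown here. The function name and the example locations (in logits) are illustrative assumptions, not values from this study.

```python
import math


def rasch_probability(person_location: float, item_location: float) -> float:
    """Probability of affirming a dichotomous item under the Rasch model.

    Both locations are expressed in logits on the same underlying construct.
    """
    return 1.0 / (1.0 + math.exp(-(person_location - item_location)))


# Hypothetical item locations in logits (assumed for illustration only).
easy_item = -1.0  # a statement that is easy to affirm
hard_item = 2.0   # a statement such as "I am exhausted after walking 10 feet"

for person in (-2.0, 0.0, 2.0):
    p_easy = rasch_probability(person, easy_item)
    p_hard = rasch_probability(person, hard_item)
    print(f"person={person:+.1f}  P(easy)={p_easy:.2f}  P(hard)={p_hard:.2f}")
```

When the person and item locations coincide, the probability is 0.5; for any given person, the harder item is always less likely to be affirmed than the easier one.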
The Rasch model itself defines measurement, and data from a scale can be tested against model expectations with reference to predefined χ2-based fit statistics. The Rasch analysis was performed using the RUMM 2020 (V. 4.1, build 194) computer software (http://www.rummlab.com.au/). For Rasch analysis, a sample size of 243 will provide accurate estimates of item and person locations irrespective of the scale targeting.9 Failure of items to fit Rasch model expectations led to an iterative procedure using various techniques including grouping items into subscales and item deletion. Since unidimensionality is the foremost requisite for fundamental measurement, an additional post hoc t test of unidimensionality was performed.10 The analysis employed standard techniques and fit criteria, which are described in detail elsewhere.11 12
Of the returned questionnaires, 415 were suitable for analysis (415/753=55% response). Two hundred and ninety-three (70.6%) of the respondents were women. The sample had a distribution of disease types typical of a larger population13 and covered a broad range of disability.
Initial fit to the Rasch model
The 21-item scale did not fit the Rasch model (χ2=253.8, df=156, p<0.001). Multidimensionality was apparent: 23.1% of the t values in the post hoc t test were significant, whereas a proportion of 5% or less is required to confirm unidimensionality.
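The 5% criterion can be made concrete with a short sketch. It assumes one common form of such a test, in which each person receives two location estimates from two different subsets of items and the two estimates are compared with a t statistic built from their standard errors. The function name, the critical value of 1.96 and the data are assumptions for illustration; they are not taken from the study or from the RUMM software.

```python
import math


def proportion_significant_t(estimates_a, errors_a, estimates_b, errors_b,
                             critical=1.96):
    """Share of persons whose two subset-based estimates differ significantly.

    estimates_a / estimates_b: person locations (logits) from two item subsets.
    errors_a / errors_b: the corresponding standard errors.
    critical: assumed two-sided 5% cut-off under a normal approximation.
    """
    if not estimates_a:
        raise ValueError("at least one person is required")
    significant = 0
    for ta, sa, tb, sb in zip(estimates_a, errors_a, estimates_b, errors_b):
        t = (ta - tb) / math.sqrt(sa ** 2 + sb ** 2)
        if abs(t) > critical:
            significant += 1
    return significant / len(estimates_a)


# Hypothetical data for four persons (not from the study).
subset_a = [0.5, 1.2, -0.3, 2.0]
subset_b = [0.4, -0.9, -0.2, 1.8]
se = [0.5, 0.5, 0.5, 0.5]

share = proportion_significant_t(subset_a, se, subset_b, se)
print(f"{share:.1%} of t values significant; criterion met: {share <= 0.05}")
```

In this toy example only the second person's estimates differ by more than the cut-off, so the share is well above 5% and unidimensionality would not be confirmed.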
The physical (including social) items were found to group by difficulty level as did the cognitive items (table 1). Physical manifestations of fatigue were much more common than cognitive manifestations, but misfit to the model shows that they do not form a necessary hierarchical order consistent with a unidimensional construct of fatigue.
Reformatting the MFIS into two subscales
Given the multidimensionality in the scale, two subscales were identified for further analysis based on a principal components analysis of item-fit residuals.
Physical scale (Phys-8)
The physical scale, including the two social items, initially did not fit the Rasch model (χ2=105.6, df=66, p=0.001, post hoc t test 9.9%). Fit was achieved by deleting items 4, 14 and 17 (χ2=56, df=48, p=0.199, person separation index (PSI)=0.902). All individual items then showed acceptable fit. The post hoc t test revealed that only 4.8% of the t values were significant, signifying unidimensionality. Person abilities were generally well matched to item difficulties; there was a small ceiling effect of 2.9% (12/415) and a negligible floor effect.
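For readers unfamiliar with the person separation index, the following sketch shows one commonly given definition: the proportion of the observed variance in person estimates that is not attributable to measurement error. Whether the software used here computes the index in exactly this way (for example, population versus sample variance) is an assumption, and the data are hypothetical.

```python
from statistics import pvariance


def person_separation_index(person_estimates, standard_errors):
    """Share of observed person variance not attributable to measurement error.

    person_estimates: person locations in logits.
    standard_errors: standard error of each person location.
    Uses the population variance, which is an assumption of this sketch.
    """
    observed_variance = pvariance(person_estimates)
    error_variance = sum(se ** 2 for se in standard_errors) / len(standard_errors)
    return (observed_variance - error_variance) / observed_variance


# Hypothetical person locations and standard errors (not from the study).
locations = [-2.0, -1.0, 0.0, 1.0, 2.0]
errors = [0.5, 0.5, 0.5, 0.5, 0.5]

print(f"PSI = {person_separation_index(locations, errors):.3f}")
```

Values close to 1, such as those reported for the two subscales, indicate that persons are well separated along the construct relative to the error in their estimates.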
Cognitive scale (Cog-5)
Initially, the cognitive scale did not fit the Rasch model (χ2=99, df=60, p=0.001, post hoc t test 11.6%); items 1, 2, 3, 5 and 11 were removed on the basis of the fit statistics. The resulting 5-item scale showed a satisfactory fit to model expectations with appropriate unidimensionality (χ2=37.3, df=30, p=0.169, PSI=0.953, post hoc t test 5.3%). Person abilities were less well matched to item difficulties than in the physical scale; there was a small ceiling effect of 3.1% (13/415) and a floor effect of 10.1% (42/415).
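The floor and ceiling percentages quoted for the two subscales are simple proportions of the 415 respondents. The sketch below recomputes them from the reported counts and shows how such effects could be derived from raw scores, assuming that a floor or ceiling effect means a respondent scoring the minimum or maximum possible total. The function name, the assumed 0 to 20 score range and the raw scores are illustrative, not taken from the study.

```python
def floor_ceiling_effects(raw_scores, minimum, maximum):
    """Return (floor share, ceiling share) for a list of raw total scores."""
    n = len(raw_scores)
    at_floor = sum(1 for score in raw_scores if score == minimum)
    at_ceiling = sum(1 for score in raw_scores if score == maximum)
    return at_floor / n, at_ceiling / n


# Counts reported in the text, out of 415 analysed records.
reported = {
    "physical ceiling": 12,
    "cognitive ceiling": 13,
    "cognitive floor": 42,
}
for label, count in reported.items():
    print(f"{label}: {count}/415 = {100 * count / 415:.1f}%")

# Hypothetical raw scores on a scale assumed to range from 0 to 20.
example_scores = [0, 0, 3, 7, 12, 15, 18, 20, 9, 11]
floor_share, ceiling_share = floor_ceiling_effects(example_scores, 0, 20)
print(f"floor={floor_share:.0%}, ceiling={ceiling_share:.0%}")
```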
Despite its widespread use, the 21-item MFIS did not fit Rasch model expectations. One of the main reasons for misfit was multidimensionality, specifically the presence of two dimensions coinciding with the original physical and cognitive subscales. This finding renders the 21-item MFIS total score invalid and does not support the assertion of unidimensionality in the original papers.1 3 Among clinical trials assessing therapy for multiple sclerosis fatigue, some used the total MFIS summed score as the primary outcome measure,14 15 while others used both the total and subscale scores.16 17 One study used a cut point of ≥45 in the total MFIS score as an entry criterion.14 The multidimensionality found in the current study suggests that the results of previous studies using the MFIS total score may be compromised.
In order to achieve fit to the Rasch model, it was necessary to remove some items from the physical and cognitive subscales. From the physical group, item 14, “physically uncomfortable”, showed gross misfit, confirming a face-value observation that this item appeared unusual in the context of the other items. From the cognitive subscale, the discarded items appeared to relate to vigilance: “difficulty completing tasks”, “less alert”, “forgetful”, “decisions”, “paying attention” and “thinking clearly”.
There has been much debate surrounding the theoretical dimensionality of fatigue, and strictly separate concepts of cognitive and physical fatigue are often proposed.18 19 The findings from this study support this assertion, and the subscales derived could be used to investigate the associations between physical and cognitive fatigue, and to investigate their differential impact on key aspects such as participation and quality of life.
The 21-item MFIS does not meet Rasch model expectations, and the summed score of fatigue impact is invalid because of multidimensionality. The scale was found to contain physical and cognitive subscales. The two original social items were found to be simply part of the physical dimension. The conceptual interaction between these dimensions remains unclear, which poses problems when interpreting change scores in these individual scales. Studies in which a global MFIS score was used as either an outcome measure or selection tool may need to be re-evaluated.
The authors would like to thank Dr Richard Nicholas and Dr Omar Malik, of Charing Cross Hospital, for allowing us to approach patients under their care. Thanks also go to Dave Watling and the staff of the Clinical Trials Unit, WCNN, for their assistance with the mailout.
Competing interests None.
Patient consent Obtained.
Ethics approval This study was conducted with the approval of the local ethics committee (Sefton EC115.03 and Hammersmith 05/Q0401/7).
Provenance and peer review Not commissioned; externally peer reviewed.