
Psychometric evaluation of the multiple sclerosis impact scale (MSIS-29) for proxy use
  1. F A H van der Linden1,
  2. J J Kragt1,
  3. M Klein2,
  4. H M van der Ploeg2,
  5. C H Polman1,
  6. B M J Uitdehaag3
  1. 1Department of Neurology, VU University Medical Centre, Amsterdam, the Netherlands
  2. 2Department of Medical Psychology, VU University Medical Centre
  3. 3Department of Clinical Epidemiology and Biostatistics, VU University Medical Centre
  1. Correspondence to:
 Ms F A H van der Linden
 Department of Neurology, VU University Medical Centre, Amsterdam, Netherlands, PO Box 7057, 1007 MB Amsterdam, The Netherlands;


Background: There may be difficulties in the use of self report measurements in patients with cognitive impairment or serious mood disturbances which interfere with reliable self assessment, as may be the case in multiple sclerosis (MS). In such cases proxies may provide valuable information. However, before using any questionnaires in a proxy sample, the questionnaire should be evaluated for proxy use.

Objective: To evaluate the psychometric properties of the 29 item Multiple Sclerosis Impact Scale (MSIS-29) when used by proxies of MS patients.

Methods: A sample of 62 partners of MS patients completed the MSIS-29. The data were evaluated for the psychometric criteria of the MSIS-29, including data quality, scaling assumptions, acceptability, reliability, validity, and responsiveness.

Results: Psychometric evaluation was satisfactory; data quality was high, and scaling assumptions and acceptability were good. Reliability was high (α>0.80). Findings were consistent with results of a psychometric evaluation in a patient sample.

Conclusions: The MSIS-29 can be used reliably in proxies of patients with MS. As a next step the relation between data obtained from patients and proxies needs to be studied, focusing on factors that may affect agreement and discrepancies.

  • EDSS, Expanded Disability Status Scale
  • ES, effect size
  • GNDS, Guy’s Neurological Disability Scale
  • ICC, intraclass correlation coefficient
  • MS, multiple sclerosis
  • MSIS, Multiple Sclerosis Impact Scale
  • multiple sclerosis
  • clinical scale
  • proxy
  • MSIS-29
  • psychometric evaluation


In recent years, there has been increasing use of self report measurements for assessing quality of life, disease impact, or disability. The underlying assumption when using self report measurements is that the patient fully understands the questions and can give a reliable judgment on their situation. Problems may arise when the cognitive or communication abilities are insufficient. Moreover, emotional factors may interfere with self assessment, which will affect reliability. Therefore self report measurements may be less suitable in cases of cognitive dysfunction or severe mood disturbance. When applying self report measurements, the presence of such factors could lead to unreliable information or loss of information. It has been suggested that in these situations the use of proxies (for example, partners, relatives, or close friends) to assess the situation of the patient should be considered.1 These considerations are relevant for multiple sclerosis (MS) because self report measures are often used in MS, and both cognitive decline and severe mood disturbances may be present during the course of the disease.2,3 These disorders could lead to invalid self reporting of quality of life,4–8 although there are also papers suggesting that in cognitively impaired and depressed patients self reports may be valid.9–12 Nevertheless, proxy measurements could be important in MS and useful under certain circumstances. We are not aware of any studies that have assessed systematically the value and limitations of proxy measurements in MS. Of all available self report measurements that can be used in MS, the Multiple Sclerosis Impact Scale (MSIS-29) is disease specific and has been rigorously evaluated for its psychometric properties.13 We therefore evaluated the use of the MSIS-29, which measures physical and psychological disease impact, in proxies. However, the MSIS-29 was not developed for use in proxies of MS patients. 
Thus it is essential that the questionnaire is validated first by using standard psychometric tests before using the scale to evaluate differences and agreement between patients and proxies.1

The focus of the present study was to evaluate the psychometric properties of the MSIS-29 when used by proxies of patients with MS.


Methods

Study sample

For this study, partners of MS patients were asked to complete the MSIS-29 as proxies. Partners were recruited in two ways. The first was through an ongoing study of MS patients at the outpatient clinic which required the presence of a healthy control. Those controls who were partners of the patients were asked to participate in the present study by completing the MSIS-29 during the initial visit and at the time of a six month follow up visit. Secondly, a group of partners of patients who visited the outpatient clinic was asked to complete the MSIS-29 twice, with a two week interval, to measure test-retest reliability. The Expanded Disability Status Scale (EDSS), the Guy’s Neurological Disability Scale (GNDS), and the MS subtype were available for all patients.

The medical ethics committee of the VU University Medical Centre approved the study protocol. Informed consent was obtained from all participants.

Measures and procedures

The MSIS-29 is a 29 item self report questionnaire that assesses the physical and psychological impact of MS on affected individuals; the physical scale consists of 20 items and the psychological scale of 9 items. Scores on the individual items are added within each scale and then transformed to a 0–100 scale, thereby generating two summary scores (for physical and psychological impact). Higher scores indicate worse health.13 For this study the Dutch version of the MSIS-29 was used, which is an in-house translation of the original English version that was subsequently validated in a large study across eight European countries.14 The partners were asked to complete the MSIS-29 after being instructed to keep the following question in mind: “How do you think the patient experiences the impact of MS on his/her life?” Standard psychometric methods were used to evaluate the following psychometric properties: data quality, scaling assumptions, acceptability, reliability, responsiveness, and validity. Evaluation was done according to the methods used in the health technology assessment report of Hobart et al.13 Table 1 gives a summary of the psychometric criteria that were applied.13
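The scoring rule described above can be sketched in a few lines of Python. This is an illustration only, not the scoring code used in the study. It assumes that every item is answered on a 1 to 5 response scale and that items 1–20 form the physical scale and items 21–29 the psychological scale; the function names are hypothetical.

```python
def scale_score(item_responses, min_item=1, max_item=5):
    """Sum the item responses of one scale and rescale the total to 0-100.

    Assumption (not stated in the text): every item is scored from
    min_item to max_item, with higher values meaning greater impact.
    """
    if not item_responses:
        raise ValueError("at least one item response is required")
    for response in item_responses:
        if not min_item <= response <= max_item:
            raise ValueError(f"response {response} outside {min_item}-{max_item}")
    n_items = len(item_responses)
    raw_total = sum(item_responses)
    lowest = n_items * min_item    # total if every item has the lowest answer
    highest = n_items * max_item   # total if every item has the highest answer
    return 100.0 * (raw_total - lowest) / (highest - lowest)


def msis29_scores(responses):
    """Return (physical, psychological) summary scores for 29 responses.

    Assumption: the first 20 responses are the physical items and the
    last 9 are the psychological items.
    """
    if len(responses) != 29:
        raise ValueError("the MSIS-29 has 29 items")
    return scale_score(responses[:20]), scale_score(responses[20:])
```

Under these assumptions a proxy who answers 3 on every item obtains a score of 50 on both scales.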

Table 1

 Psychometric criteria


The MSIS-29 and instructions were given to the patient for the proxy to complete at home. After a two week interval the proxy received the MSIS-29 for the second time by postal survey. Test–retest reliability was calculated by intraclass correlation coefficient (ICC).
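The text does not state which form of the ICC was used. As a sketch under that caveat, the one-way random effects ICC for single measurements can be computed from the two administrations as follows; the choice of this form and the function name are assumptions.

```python
def icc_oneway(ratings):
    """One-way random effects ICC for single measurements.

    ratings: one row per proxy, each row holding that proxy's scale
    scores at the repeated administrations (two in a test-retest design).
    Assumption: the one-way form is used here only for illustration;
    the source does not say which ICC form was calculated.
    """
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2:
        raise ValueError("need at least two proxies and two administrations")
    if any(len(row) != k for row in ratings):
        raise ValueError("every proxy needs the same number of scores")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]

    # Between-proxy and within-proxy mean squares from a one-way ANOVA.
    ss_between = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_within = sum((x - m) ** 2 for row, m in zip(ratings, row_means) for x in row)
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

Identical scores at both administrations give an ICC of 1; the value falls as within-proxy differences grow relative to between-proxy differences.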


Responsiveness

One method of detecting clinically important change over time is to compare change scores (baseline score minus follow up score) with an external criterion of change, such as a transition question. This is also referred to as the retrospective method.13,15,16 In this study we compared the change scores of the MSIS-29 for the two domains, using the following transition question: “When you compare the health status of the patient at this moment with that of six months ago (baseline study), would you say that it is: better, the same, or worse?” Although patients did not have an intervention, change could have been induced by the passage of time.15 Responsiveness can then be determined in different ways. One method is to calculate the correlation between the change scores and the transition score; a high correlation indicates greater responsiveness.13 Another method is to calculate the minimum clinically important difference. This is done by dividing the mean change score for improved/deteriorated patients by the mean change score of unchanged patients. A change of 0.5 is considered small, 1.0 moderate, and 1.5 large.13,15 Finally, effect sizes (ES) were calculated by dividing the mean change score by the standard deviation of the baseline score. Effect sizes are interpreted as small at ES <0.20, medium at ES = 0.50, and large at ES >0.80.13
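The change score and effect size calculations described above can be sketched as follows. The function names are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, as the text does not specify it.

```python
from statistics import mean, stdev


def change_scores(baseline, follow_up):
    """Change score per proxy: baseline score minus follow up score."""
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow up must have the same length")
    return [b - f for b, f in zip(baseline, follow_up)]


def effect_size(baseline, follow_up):
    """Mean change score divided by the standard deviation at baseline.

    Assumption: stdev() is the sample standard deviation.
    """
    return mean(change_scores(baseline, follow_up)) / stdev(baseline)


def mean_change_by_transition(baseline, follow_up, transition):
    """Mean change score for each answer to the transition question."""
    groups = {}
    for change, answer in zip(change_scores(baseline, follow_up), transition):
        groups.setdefault(answer, []).append(change)
    return {answer: mean(values) for answer, values in groups.items()}
```

Grouping the change scores by transition answer ("better", "the same", "worse") gives the group means needed for the comparison with the external criterion.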


Validity

Internal validity was determined by calculating the intercorrelations between the physical and the psychological domain. External validity was examined by correlations between the two MSIS-29 domains and the EDSS and GNDS. Group differences validity was assessed by comparing both scales with variables such as age and sex.

Validity was also assessed by comparing mean MSIS-29 proxy scores between groups of patients defined by GNDS score, EDSS score, and MS subtype. It was hypothesised that the mean MSIS-29 proxy scores for these groups should differ significantly. The GNDS score was divided into two groups based on the median: ⩽15 and >15. The EDSS score was divided into three groups: EDSS 0.0–3.5, EDSS 4.0–6.0, and EDSS ⩾6.5. MS subtype was also divided into three groups: relapsing remitting (RR), secondary progressive (SP), and primary progressive (PP). Independent t tests were done to compare the different groups. The mean MSIS-29 proxy scores for the GNDS and EDSS groups were compared only for the physical domain, as the GNDS and the EDSS do not have a psychological domain. The MS subtypes were compared for both domains, given that the subtype might influence the psychological impact of MS. Table 1 shows the hypotheses, which were defined prospectively. Correlations were calculated for age and sex.
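The grouping and comparison described above can be sketched as follows. The cut-offs are those given in the text; the function names are hypothetical, and the pooled variance (Student) form of the independent t test is an assumption, since the text does not say which variant was used.

```python
from math import sqrt
from statistics import mean, variance


def edss_group(edss):
    """Assign an EDSS score (which moves in 0.5 steps) to one of three bands."""
    if edss <= 3.5:
        return "0.0-3.5"
    if edss <= 6.0:
        return "4.0-6.0"
    return ">=6.5"


def gnds_group(gnds):
    """Split GNDS scores at the median of 15 reported in the text."""
    return "<=15" if gnds <= 15 else ">15"


def pooled_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for two independent samples.

    Assumption: equal variances, so a pooled variance estimate is used.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df
```

Turning the t statistic into a p value requires the t distribution, which is not in the Python standard library; a statistics package would normally be used for that step.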


Results

In all, 62 partners were recruited as proxies for the study. Table 2 shows the characteristics of the partners and the patients.

Table 2

 Characteristics of proxies and patients

Data quality

Data quality results are shown in table 3. The percentage of missing data was low for both scales, and the percentage of computable scale scores was high: scale scores could be calculated for at least 93.5% of proxies. Item test–retest was 0.87 for the physical domain and 0.83 for the psychological domain.

Table 3

 Psychometric properties for the physical and psychological MSIS-29 impact scales

Scaling assumptions

Scaling assumptions are shown in table 3. Similar item means and standard deviations were found for both scales. Frequency distributions were symmetrical for both scales. Skewness for the physical scale was between −1 and 1, but skewness for the psychological scale was slightly out of range (+1.235). Item-total correlations for both scales were high: they ranged from 0.55 to 0.85 for the physical scale and from 0.41 to 0.86 for the psychological scale, so both fulfilled the correlation criterion (>0.30). Definite scaling success was 50% for the physical scale and 44% for the psychological scale, so the criterion of 65% or more was not satisfied.


Acceptability

Scores ranged from 0 to 91.3 for the physical scale and from 0 to 91.7 for the psychological scale (table 3). The physical scale was slightly skewed (−2.31). Floor and ceiling effects were small and did not exceed the predefined maximum of 20%. Mean scores were near the scale midpoints and standard deviations were almost equal.
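Floor and ceiling effects of this kind are usually expressed as the percentage of respondents with the lowest and highest possible scale score. A minimal sketch, assuming 0–100 scale scores and the 20% maximum mentioned above (function names are hypothetical):

```python
def floor_ceiling_percentages(scores, minimum=0.0, maximum=100.0):
    """Return (% of respondents at the minimum, % at the maximum)."""
    if not scores:
        raise ValueError("at least one score is required")
    n = len(scores)
    floor = 100.0 * sum(1 for s in scores if s == minimum) / n
    ceiling = 100.0 * sum(1 for s in scores if s == maximum) / n
    return floor, ceiling


def within_limit(scores, limit=20.0):
    """True when neither floor nor ceiling effect exceeds the limit (in %)."""
    floor, ceiling = floor_ceiling_percentages(scores)
    return floor <= limit and ceiling <= limit
```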


Reliability

Internal consistency was high (table 3): 0.96 for the physical scale and 0.90 for the psychological scale. In all, 30 proxies completed both questionnaires for the test–retest study, which resulted in a test–retest reproducibility of 0.87 for the physical scale and 0.83 for the psychological scale.
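Internal consistency is commonly summarised with Cronbach's α, which is presumably the α reported in the abstract. A minimal sketch of that coefficient for one scale, assuming complete item responses and sample variances (the function name is hypothetical):

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for one scale.

    rows: one list of item responses per proxy, all items of the same scale.
    Assumption: no missing responses; sample variances (n - 1) are used.
    """
    if len(rows) < 2:
        raise ValueError("need responses from at least two proxies")
    k = len(rows[0])
    if k < 2 or any(len(row) != k for row in rows):
        raise ValueError("need at least two items and equal-length rows")

    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

The coefficient approaches 1 when items vary together across proxies and falls when item responses are only weakly related.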


Responsiveness

The correlations between the change score and the transition score were 0.07 for the physical scale and 0.24 for the psychological scale (table 3). The minimum clinically important change was small to moderate. The effect sizes were minimal for both domains.


Validity

Validity assessments are shown in tables 4 and 5. Internal validity was moderate, with an intercorrelation of 0.65 (table 4). The EDSS score correlated more strongly with the physical domain (0.66) than with the psychological domain. The correlation of the GNDS with the physical domain (0.69) was slightly higher than that of the EDSS. Both MSIS-29 scales showed moderate correlations with age and sex.

Table 4

 Pearson correlation for age, sex, EDSS, and GNDS scores with the physical and psychological MSIS-29 impact scales

Table 5


Table 5 shows the hypothesised group differences with accompanying p values. Independent t tests were significant for the physical score, except for the difference between the mean MSIS-29 physical scores of the SP and PP subtypes. The differences in mean MSIS-29 psychological scores between MS subtypes were not significant.


Discussion

Our aim in this study was to evaluate the psychometric properties of the MSIS-29 when used by proxies. Standard techniques were applied to evaluate data quality, scaling assumptions, acceptability, reliability, responsiveness, and validity. This was done according to psychometric criteria that were also used for the psychometric evaluation of the MSIS-29 in a patient sample.13 Psychometric properties were satisfactory for most of the criteria. Data quality was high. Scaling assumptions and acceptability were good, although the scores did not span the full range for either domain and the physical scale was slightly skewed. High values for reliability were found. Responsiveness showed minimal effect sizes and poor correlations. This may reflect the fact that change was caused by natural progression over time rather than by a treatment effect following an intervention. Natural progression may be harder for proxies to detect than the effect of a treatment. Taking into account the relatively small sample size, we think that the responsiveness results should be interpreted with caution.

The validity of the MSIS-29 was established in different ways. Correlation between the EDSS and the physical scale was high, which could be expected as the EDSS measures disability. Correlations between the two scales and the GNDS were both moderate, which is probably explained by the fact that the GNDS contains questions on both physical and psychological topics. Correlations for both MSIS-29 scales and the variables age and sex were low, indicating that neither scale was biased by these variables. Significant differences in mean MSIS-29 scores were found for the different groups defined on the basis of GNDS, EDSS, and MS subtype. Of the 10 prior hypotheses, six were significantly confirmed, especially those referring to the physical domain. Partners of patients with an EDSS score ⩾6.5 scored significantly higher on the MSIS-29 physical score than partners of patients with a low EDSS score of 0.0 to 3.5. Proxies believed that patients with a high EDSS score experience a greater physical impact of MS than patients with a lower EDSS score. The same can be said about patients with a high GNDS score: partners of patients with a GNDS score >15 scored significantly higher on the MSIS-29 physical score than partners of patients with a GNDS score ⩽15. This shows that differences in the EDSS score and the GNDS score are reflected in the MSIS-29 score of proxies, and that the MSIS-29 measures what it is supposed to measure when completed by proxies. The MSIS-29 physical score only differed between the subtypes RR and SP and between RR and PP, indicating that proxies do not make a distinction between the subtypes SP and PP. No significant differences were found between the MS subtypes on the psychological domain. Whether this is in line with the opinion of the patient remains to be investigated. 
A limitation of the present study in this respect is the lack of other quality of life measurements with psychological domains such as the 36 item short form health survey (SF-36) and the general health questionnaire (GHQ), which precludes direct comparisons. Other research showed that proxies are better at detecting the more observable domains, such as physical problems, than the subjective domains, such as psychological problems.17

Overall, the results resembled the psychometric properties found when the scale is used by patients.13,18,19 In summary, our results indicate that the MSIS-29 is also a reliable and valid instrument when used by proxies. This creates a solid basis for further use of MSIS-29 proxy measurements in MS. As a next step the relation between proxy reports and self reports needs to be studied, focusing on factors that may affect agreement and discrepancies.


Acknowledgements

We wish to thank all the patients and their partners who participated in this study. The MS Centre of the VU Medical Centre is partially funded by a programme grant of the Dutch MS Research Foundation.


Supplementary materials

  • Lay Summary

    This paper has been chosen as the December 2005 JNNP Patient Choice. A lay summary for patients is available as a PDF.



  • Competing interests: none declared.