Jenkinson et al show that London handicap scale scores are about the same whether items are weighted equally or by the published scale weights.1 We reached much the same conclusion using the data from which the scale weights were derived.2
Handicap is defined as disadvantage in role performance due to impairments or disabilities, which implies valuation of the extent to which role performance is affected. Value, from the viewpoint of health services research, is quantified as the “utility” of a state of health. The scale weights were derived by asking various population samples to value different combinations of problems, described using the handicap dimensions and items from the handicap scale. These valuations were analysed to determine the contribution of each of the component parts of the description.
The fact that equal weighting gives roughly the same scores as the empirically derived weights is probably because the items were carefully chosen on the basis of clinical experience to be approximately equally spaced across the range of possible severity.
Does it matter if different weighting methods lead to much the same results? Weighting processes are inexact, whether the weights are empirically derived or equal, but the second approach simply increases the level of approximation. The 95% confidence intervals around the agreement between estimated and measured scores were about ±10 (on a 0–100 scale). This measurement imprecision arises because rating health states is difficult, leading to random measurement error, and because the modelling assumed that the overall valuation of a state of health could be estimated by the sum of the component parts of the description, which is almost certainly an oversimplification (although goodness of fit statistics for the model were reasonable). The London handicap scale is primarily an epidemiological tool; that is, it is intended for use in groups (such as in a clinical trial). If scores are calculated for individual patients, for example in clinical practice, there is a further approximation: that between the values and opinions of that individual and the “average” views of the population from which the values were derived. There is some evidence that the handicap dimensions have general validity, and there is some consensus on the values assigned to states of handicap. As part of the revision process of the International Classification of Impairments, Disabilities and Handicaps,3 qualitative studies established strong core transcultural agreement on six domains of participation with potential to be affected by health conditions, and these corresponded to the handicap dimensions (Martin Prince, personal communication).
Furthermore, a comparison of values given to states of health by Hong Kong Chinese showed good agreement with those estimated by using the London handicap scale weights (derived from Londoners).4 Nor was there convincing between-population variation in the scale weights assigned in the original scale development work.2 It is not safe, however, to assume that there are no between-person differences.
We are pleased to see a further independent validation of the London handicap scale. If simplification makes the scale more useful then we welcome it. The additional burden in applying the weights, however, is no more than that of adding six lines of commands in a statistical computer program (for instance, using SPSS syntax). As we have empirically derived estimates of valuations of handicap states, we see no reason why the further approximation of equal weighting is necessary.
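To make the comparison concrete, the two scoring approaches can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the dimension names, the six response levels per item, the direction of scoring (100 = no handicap), and above all the weights are placeholders chosen for the example, not the published scale weights. The function and variable names are likewise hypothetical.

```python
# Illustrative sketch only. The weights below are PLACEHOLDERS chosen for the
# example; they are NOT the published London handicap scale weights.

# Assumed dimension names (one item per dimension).
DIMENSIONS = [
    "mobility",
    "physical_independence",
    "occupation",
    "social_integration",
    "orientation",
    "economic_self_sufficiency",
]
N_LEVELS = 6  # assumed: level 1 = no disadvantage, level 6 = most severe


def unweighted_score(responses):
    """Equal weighting: sum the item levels and rescale to 0-100 (100 = no handicap)."""
    total = sum(responses[d] for d in DIMENSIONS)
    best = len(DIMENSIONS)              # every item at level 1
    worst = N_LEVELS * len(DIMENSIONS)  # every item at level 6
    return 100.0 * (worst - total) / (worst - best)


def weighted_score(responses, weights):
    """Additive model: sum the weight attached to the chosen level of each item."""
    return 100.0 * sum(weights[d][responses[d] - 1] for d in DIMENSIONS)


# Placeholder weights, equally spaced within each dimension and scaled so that
# the best possible state sums to 1 and the worst to 0.
step = 1.0 / (len(DIMENSIONS) * (N_LEVELS - 1))
EQUALLY_SPACED = {
    d: [step * (N_LEVELS - level) for level in range(1, N_LEVELS + 1)]
    for d in DIMENSIONS
}

# A second placeholder set: same end points, mildly uneven spacing.
UNEVEN = {d: [step * x for x in (5.0, 4.3, 3.1, 2.0, 0.8, 0.0)] for d in DIMENSIONS}

# A hypothetical respondent with mild problems on three items.
example = {
    "mobility": 2,
    "physical_independence": 1,
    "occupation": 3,
    "social_integration": 1,
    "orientation": 1,
    "economic_self_sufficiency": 2,
}

print(unweighted_score(example))
print(weighted_score(example, EQUALLY_SPACED))
print(weighted_score(example, UNEVEN))
```

With equally spaced placeholder weights the weighted and unweighted scores agree up to floating-point rounding, which mirrors the argument that items spaced roughly evenly across the severity range make the two methods nearly interchangeable. With the uneven placeholder set the scores diverge slightly, and the size of that divergence depends entirely on the weights used.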
Mant et al reply:
Whether one decides to use the original weighted scoring system of the London handicap scale or the simpler unweighted scoring system that we proposed is essentially a trade-off between the advantages of each approach. As argued by Harwood and Ebrahim, the weighting system will provide what at least seems to be a more accurate estimate of the value of a health state for only limited additional analysis. However, it should be noted that weighted systems more often than not give the appearance of greater accuracy when in fact all they produce is different results.1-1 Further, even if the weights are regarded as the “gold standard” the increase in accuracy is small, there is scope for confusion (different weights have been published),1-2 1-3 and the derivation of the score is more complex, which might lead to computational errors. Perhaps the greatest advantage of the unweighted scoring system, however, is that there is a simpler relation between the questionnaire responses and the final score, which makes interpretation more straightforward. There is a role for both methods of analysing the London handicap scale. Use of the simpler unweighted scoring would be likely to increase uptake and use of the instrument, without any important loss of accuracy. On the other hand, researchers who are already familiar with the weighting system have little to gain from switching to the unweighted system.