Effect sizes can be misleading: is it time to change the way we measure change?

Jeremy C Hobart; Stefan J Cano; Alan J Thompson

doi:10.1136/jnnp.2009.201392

Research paper

Jeremy C Hobart¹,², Stefan J Cano¹,², Alan J Thompson²

¹The Neurological Outcome Measures Unit, Peninsula College of Medicine and Dentistry, Devon, UK
²UCL Institute of Neurology, London, UK

Correspondence to Dr J C Hobart, Department of Clinical Neuroscience, Peninsula College of Medicine and Dentistry, Room N16 ITTC Building, Tamar Science Park, Davy Road, Plymouth, Devon PL6 8BX, UK; jeremy.hobart{at}pms.ac.uk

Abstract

Objectives Previous comparisons of the ability of the Barthel Index (BI) and the Functional Independence Measure motor scale (FIMm) to detect change have implied that the two scales are equally responsive when examined using traditional effect size statistics. Clinically, this is counterintuitive, as the FIMm has greater potential to detect change than the BI, and it raises concerns about the validity of effect size statistics as indicators of rating scale responsiveness. To examine these concerns, this study applied a sophisticated psychometric analysis, Rasch measurement, to BI and FIMm data.

Methods BI and FIMm data from 976 people at a single neurorehabilitation unit were examined. Rasch analysis was used to compare the responsiveness of the BI and FIMm at the group comparison level (effect sizes, relative efficiency, relative precision) and at the individual person level, by computing the significance of each person's change.

Results Group level analyses of both interval measurements and ordinal scores implied that the BI and FIMm had equivalent responsiveness (effect size ranges: BI −0.82 to −1.12; FIMm −0.77 to −1.05). However, individual person level analyses indicated that the FIMm detected significant improvement in almost twice as many people as the BI (50%, n=496 vs 31%, n=298) and recorded fewer people as unchanged on discharge (FIMm=4%, n=38; BI=12%, n=115). This difference was statistically significant (χ²=273.81; p<0.001).

Conclusions These findings demonstrate that effect size calculations at the group comparison level are limited and potentially misleading indicators of rating scale responsiveness. Rasch analysis at the individual person level showed the superior responsiveness of the FIMm, supporting clinical expectation, and demonstrated its added value as a method for examining and comparing rating scale responsiveness.
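
The contrast the abstract draws between group level and individual person level statistics can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The function names, the sign convention (discharge minus admission), the use of the admission SD as the effect size denominator, the three-way classification and the 1.96 threshold are all assumptions made for illustration; the abstract does not specify them.

```python
import math
import statistics


def effect_size(admission, discharge):
    """Group level: mean change divided by the SD of admission scores.

    Assumption: change is discharge minus admission. The opposite
    convention simply flips the sign.
    """
    changes = [d - a for a, d in zip(admission, discharge)]
    return statistics.mean(changes) / statistics.stdev(admission)


def standardized_response_mean(admission, discharge):
    """Group level: mean change divided by the SD of the change scores."""
    changes = [d - a for a, d in zip(admission, discharge)]
    return statistics.mean(changes) / statistics.stdev(changes)


def significance_of_change(m1, se1, m2, se2):
    """Individual level: a person's change in Rasch-derived measurement
    divided by the combined standard error of the two measurements.
    """
    return (m2 - m1) / math.sqrt(se1 ** 2 + se2 ** 2)


def classify_change(z, threshold=1.96):
    """Hypothetical three-way classification of one person's change."""
    if z >= threshold:
        return "significant improvement"
    if z <= -threshold:
        return "significant worsening"
    return "no significant change"


# Illustrative values only; these are not data from the study.
z = significance_of_change(m1=0.0, se1=0.3, m2=1.0, se2=0.4)
print(classify_change(z))
```

Two scales can yield similar group level effect sizes yet classify very different proportions of individuals as significantly changed, because the individual level statistic depends on each person's standard errors of measurement rather than on the spread of scores across the sample.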

Multiple sclerosis
rehabilitation
scales

Footnotes

Linked articles 206409.
Competing interests None.
Ethics approval This study was conducted with the approval of the National Hospital for Neurology and Neurosurgery.
Provenance and peer review Not commissioned; externally peer reviewed.

Linked Articles

Editorial commentary
The measurement of disease

I S J Merkies, R A C Hughes
Journal of Neurology, Neurosurgery & Psychiatry 2010; 81: 943. Published Online First: 19 Jun 2010. doi: 10.1136/jnnp.2010.206409
