Construction and pilot assessment of the Upper Limb Assessment in Daily Living Scale
- Marc Rousseaux1,
- Huei-Yune Bonnin-Koang2,
- Bernadette Darné3,
- Phillippe Marque4,
- Bernard Parratte5,
- Alexis Schnitzler6,
- Patrick Dehail7,
- Nacera Bradai8,
- Jean Michael Viton9,
- Walter Daveluy1,
- Alain Yelnik8,
- Myriam Zadikian10,
- Charles Benaïm11
- 1Service de Rééducation Neurologique, CHRU et Université de Lille, Lille, France
- 2Service de Médecine Physique et de Réadaptation, CHU, Université de Nîmes, France
- 3Monitoring Force Group, Maisons-Laffitte, France
- 4Service de MPR, CHU, Université de Toulouse, Toulouse, France
- 5Service de MPR, CHU, Université de Besançon, Besançon, France
- 6Service de MPR, AP-HP, Université de Versailles-Saint Quentin, France
- 7Service de MPR, CHU, Université de Bordeaux, Bordeaux, France
- 8Service de MPR, AP-HP, Lariboisière-Fernand Widal, Université de Paris, Paris, France
- 9Service de MPR, CHU, Université de Marseille, Marseille, France
- 10Merz Pharma France, Nanterre, France
- 11Service de MPR, INSERM CIE1 and INSERM 803, CHU, Université de Dijon, Dijon, France
- Correspondence to Dr M Rousseaux, Service de Rééducation Neurologique, Hôpital Swynghedauw, CHRU, Lille Cedex 59037, France
Contributors All authors contributed to the conception and design, acquisition of the data or analysis and interpretation of the data; drafting of the article or revising it critically for important intellectual content; and final approval of the version to be published.
- Received 9 July 2011
- Revised 27 December 2011
- Accepted 13 February 2012
- Published Online First 6 March 2012
Objective The upper limb function of hemiplegic patients is currently evaluated using scales that assess physical capacity or daily activities under test conditions. The present scale, the Upper Limb Assessment in Daily Living (ULADL) Scale, was developed to explore the subjective and objective functional capacities of such patients in a proximal to distal sequence.
Methods A group of experts constructed a scale addressing 17 upper limb functions (five active-passive and 12 active) which could be explored by a questionnaire (Q) and a test (T). Reproducibility, internal consistency, concurrent validity (Rivermead Motor Assessment (RMA)) and learning effect were estimated in a multicentre study.
Results 49 stroke patients were each rated three times within 7 days by a total of 21 physicians, yielding a total of 142 ratings. The ULADL took 16±8 min to complete compared with 9±5 min for the RMA. Cronbach's alpha coefficient was 0.95 for Q and 0.97 for the practical tests (T). The global Q and T scores, and in particular the global Q score, were slightly higher at the second rating. The intra-rater intraclass correlation coefficient (ICC) was 0.65 (95% CI (0.44 to 0.79)) for Q and 0.97 (0.95 to 0.98) for T, and the inter-rater ICC was 0.95 for both Q and T. The Bland and Altman method showed good intra- and inter-rater reliability with no systematic trend. Correlation coefficients for ULADL versus RMA were >0.80 for both Q and T.
Conclusions The ULADL Scale has good psychometric properties and can explore patients with different degrees of upper limb impairment.
Upper limb motor recovery after acute stroke is slow and usually follows a proximal to distal sequence.1 Many patients have severe persistent manual motor difficulties that considerably hamper object handling.2 3 Joint movements at the wrist of the affected upper limb can be continuously measured during ambulant monitoring to quantify effective movements in daily life activities.4 However, a specific device is required and no information is provided on the daily activities that are actually performed. Several rating scales have been developed to analyse functional abilities and most have been used in clinical trials of therapeutic interventions.
Such scales can be classified into five main groups. The first group, which includes the Medical Research Council classification5 6 and the Fugl–Meyer Motor Assessment,7 8 focuses on motor control; these scales in fact measure impairment rather than activity. The second comprises object handling tests such as the Nine Hole Peg Test,9 the Box and Block Test9 10 and the Action Research Arm Test,11 which generally assess distal pinch or grip. These scales can only be used in patients with a degree of distal motor function, and the objects used do not correspond to those encountered in daily life. The third category of tests uses common objects but also requires a degree of distal motor control; examples include the Frenchay Arm Test12 13 and the Arm Motor Ability Test.14 The fourth category analyses motor and functional recovery in a shoulder to hand sequence. These scales include the Rivermead Motor Assessment (RMA)15 16 and the Wolf Motor Function Test.17 They are probably the most sensitive to all degrees of motor impairment because they also cover proximal movements. However, several items evaluate motor control while others assess object handling. The fifth category of tests evaluates the main objective of patient assessment, namely performance in daily life activities. These tests include the Motor Activity Log18–20 and the Abilhand questionnaire.21 However, most items address activities executed with the hand and only poorly consider proximal functions.
These different scales have been used in therapeutic trials aimed at demonstrating the effect of functional electrical stimulation,22 23 constraint induced therapy,17 18 24 botulinum toxin injection,25 26 robot assisted rehabilitation,27 28 mirror therapy29 or combined interventions.30 But the efficacy of these interventions was often difficult to demonstrate, especially when using tests of motor control or functional abilities in a shoulder to hand sequence.28
Simple scales assessing upper limb function in daily life are still needed for use in both clinical practice and therapeutic trials. Indeed, the objective of rehabilitation and specific therapeutic interventions is to improve proximal as well as distal functions in real life conditions. Many patients do not recover the ability to grasp objects, but therapeutic interventions can help them improve proximal motor control, which is necessary for placing the hand on the table or for holding an object on the table with the help of the non-affected hand. Such scales should ideally be able to investigate patient performance in daily life and also assess movements of increasing complexity, with both passive functions (eg, cleaning and dressing of the affected limb) and active functions (usually distal activities). Comparison of performance in daily life and in formal testing is also important for enhancing the patient's awareness of what he or she is able to perform at a given time in the recovery process.
We have developed a new scale, the Upper Limb Assessment in Daily Living (ULADL) Scale, which was designed to fulfil these criteria by exploring physical capacity in real life conditions. Here we report a pilot assessment of its validation in patients with stroke.
The test was designed during three meetings of a group of French specialists in physical medicine and rehabilitation or neurology (MR, HYBK, PM, BP, AS and CB). All were involved in post-stroke rehabilitation programmes, including those using botulinum toxin injections or robot assisted rehabilitation. The aim was to construct a test evaluating both passive and active functions, focusing on actions executed during activities of daily living. About one-third of these activities were designed to be passive and two-thirds active (object handling). It was planned to begin the evaluation of active functions with an object blocking task (the simplest), followed by transport tasks requiring direct grasp or pinch, and ending with handling functions (global then distal grip). It was decided to evaluate a maximum of 20 tasks in order to minimise the time taken to administer the scale. Firstly, the patient's subjective estimation of his or her ability in each task was assessed using a self-administered questionnaire and then actual performance in each task was examined in a practical test. The experts used their own experience of patient performance and needs in daily life activities to determine the first distribution between passive and active functions and the number of tasks. A first list of questions and general instructions for the examiner and patient was established. The items were tested by each expert on 2–4 patients. These patients were asked to give their opinion on the presentation of the test and the activities in the list. Items were corrected and changed before finalisation of the scale and the wording was then reviewed by two independent experts.
The final version of the ULADL Scale (see data supplement, available online only) comprised 17 items and two consecutive sections: a 17 item questionnaire (Q) on daily activities, followed immediately by a practical test (T) of performance in the same 17 activities. During the Q step, questions are asked by the investigator, who can mimic the movement for better understanding. The patient rates his or her estimated ability when using the affected limb by moving the bar of a visual analogue scale (VAS), with 0 corresponding to “I cannot perform the task” and 100 mm corresponding to “I can perform the task without any difficulty”. The scale is presented in a vertical position. The VAS is a reference scale for the assessment of pain and quality of life, and has been widely used in other domains, such as fatigue, restriction in activities of daily living31 and physical functioning.32 In the test section (T), the patient is required to perform the tasks previously covered by the questionnaire. When the task requires handling of objects, these are placed on the table, 20 cm from the edge, opposite the shoulder of the affected limb. Global assessment of the quality and effectiveness of the gesture is then performed by the examiner using the same VAS, from 0 to 100 mm.
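The scoring described above can be illustrated with a minimal Python sketch. It assumes each of the 17 items is recorded as a VAS reading in millimetres for Q and for T, and that a global score is the sum of the item scores (maximum 1700). The class name `UladlAssessment` and its methods are hypothetical names chosen for illustration and are not part of the published scale.

```python
from dataclasses import dataclass
from typing import List

N_ITEMS = 17       # items in the final ULADL Scale
VAS_MAX_MM = 100   # upper end of the visual analogue scale, in mm


@dataclass
class UladlAssessment:
    """One administration of the scale (hypothetical container)."""
    q_scores: List[float]  # patient ratings, one per item, in mm
    t_scores: List[float]  # examiner ratings, one per item, in mm

    def __post_init__(self):
        for name, scores in (("Q", self.q_scores), ("T", self.t_scores)):
            if len(scores) != N_ITEMS:
                raise ValueError(f"{name} needs {N_ITEMS} item scores")
            if any(not 0 <= s <= VAS_MAX_MM for s in scores):
                raise ValueError(f"{name} scores must lie in 0..{VAS_MAX_MM} mm")

    def global_q(self) -> float:
        # Global questionnaire score: sum of the 17 item ratings
        return sum(self.q_scores)

    def global_t(self) -> float:
        # Global practical test score: sum of the 17 item ratings
        return sum(self.t_scores)

    def item_gaps(self) -> List[float]:
        # T minus Q per item; positive values mean the examiner rated
        # performance higher than the patient's own estimate
        return [t - q for q, t in zip(self.q_scores, self.t_scores)]
```

For example, `UladlAssessment([50] * 17, [60] * 17).item_gaps()` gives a gap of 10 mm on every item, the pattern expected when a patient underestimates his or her capacities.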
Organisation and objectives
The multicentre validation consisted of a non-interventional longitudinal study involving 10 French university hospitals. Technical support was provided by the Monitoring Force Group, France. The protocol was approved by the French Data Protection Authority (CNIL).
The main objectives were to estimate inter- and intraobserver reproducibility and to identify a possible learning effect. Secondary endpoints were test duration, scores of the different items of the scale and correlation with the arm section of the RMA, the most comparable reference scale.15 16
Inclusion criteria were as follows: adults (>18 years) of both sexes with a first stroke resulting in hemiplegia and spasticity of the upper limb (score ≥1 on the 4 point modified Ashworth Scale33 for the fingers, wrist and elbow flexors), with ongoing rehabilitation and some experience of post-stroke life at home (permanent or temporary home residence after rehabilitation). Patients with all degrees of upper limb impairment were to be included. All patients received full information on the study before their participation.
Patients with a Boston Diagnostic Aphasia Examination global severity score of 3/5 or less34 were excluded, as were patients having received botulinum toxin within the previous 6 months and patients whose medical treatment, especially antispastic and psychotropic, was scheduled to be modified less than 7 days after the enrolment date. The assessments were to be performed no less than 1 h after a physical rehabilitation session.
General design and endpoints
The 21 examiners were rehabilitation specialists, consultant neurologists or physiotherapists, and seven had taken part in the expert group meetings.
Each patient underwent three assessments (A1, A2 and A3) within 7 days. Two assessments were conducted by one examiner and one assessment by a second examiner. The order of assessments was randomly allocated between the examiners at the start of the study by opening a sealed envelope. During each of the three assessments, the ULADL and RMA scales were both administered, in random order. The examiner learned the randomisation order by scratching off a dedicated area on the assessment form. On completion of an assessment, study forms were systematically sent in pre-stamped envelopes to the Monitoring Force Group so that the next examiner could not have access to the results of the previous assessment.
The primary endpoint for intra- and interobserver reproducibility was the difference in the global Q and T scores and in the scores for each item between two assessments.
Quality criteria included completeness of execution of the Q and T components of the test, observance of the randomisation schedule and observance of the time interval between the first and third assessments.
Q and T scores were analysed independently. Internal consistency was assessed with Cronbach's alpha coefficient.35 Intraobserver and interobserver reliability was analysed with the intraclass correlation coefficient (ICC)36 and the Bland and Altman method.37 38 ICC is equivalent to a quadratic weighted kappa. An ICC value <0.20 represents slight agreement, a value of 0.21–0.40 fair agreement, a value of 0.41–0.60 moderate agreement, a value of 0.61–0.80 substantial agreement and a value of 0.81 or higher indicates almost perfect agreement.35 39 A global score, corresponding to the sum of all items, was calculated for Q, T and Q + T. The Pearson correlation coefficient was used to compare the ULADL and RMA global scores. All statistical tests were two sided and significance was assumed at p<0.05. SAS V.9.1 software was used for all analyses.
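As an illustration of two of the statistics named above, the following Python sketch computes Cronbach's alpha from a patients-by-items table and maps an ICC value to the agreement bands quoted in the text. The function names are hypothetical, the study itself used SAS, and the treatment of values falling exactly on or between the quoted band limits (for example 0.20 or 0.205) is an assumption.

```python
from statistics import pvariance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per patient."""
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("need at least two items and equal-length rows")
    # Variance of each item across patients
    item_variances = [pvariance([r[j] for r in rows]) for j in range(k)]
    # Variance of the global (summed) score across patients
    total_variance = pvariance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def agreement_label(icc):
    """Agreement band for an ICC value (boundary handling is an assumption)."""
    if icc <= 0.20:
        return "slight"
    if icc <= 0.40:
        return "fair"
    if icc <= 0.60:
        return "moderate"
    if icc <= 0.80:
        return "substantial"
    return "almost perfect"
```

With these bands, an ICC of 0.65 is labelled substantial agreement and values of 0.95 or 0.97 are labelled almost perfect agreement.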
Patients and assessments
Fifty patients were initially enrolled. None refused testing and all assessments were performed in one session. No adverse events were recorded. One patient who did not meet the selection criteria was excluded at the outset. Because of missing data, intraobserver reproducibility was evaluated in only 46 patients, interobserver reproducibility in 47 patients and the learning effect in 46 patients. A total of 142 assessments were analysed.
Mean age was 56.7±14.0 years (range 25–82), 26 (53.1%) of the patients were men and 46 patients (93.9%) were right handed. The critical event was most often a first (n=48; 98.0%) ischaemic (n=41; 83.7%) stroke, usually in the right hemisphere (n=30; 61.2%). The patients were studied an average of 14.1±25.9 months (median 3.7 months) after the stroke. The Barthel Index score40 at inclusion was 78.7±17.4 (/100), with a relatively wide range of performance (35–100), showing that patients with all degrees of difficulties were included. The specific assessment of activities requiring upper limb participation (feeding, bathing, grooming, dressing, toilet use) showed a mean score of 28.8±7.5 (/40), also with a wide range of difficulties (10–40). Some degree of spasticity was found in 38/49 patients, with an Ashworth score of 1.51±1.14 (range 0–4).
The mean interval between the first and third assessments was 4.6±2.6 days. Q scores (/1700) tended to increase from A1 (912±450) to A3 (1039±501) (table 1) while the mean RMA score (/15) remained stable across the three assessments. The ULADL took 16±8 min to complete compared with 9±5 min for the RMA. The time required to administer the Q and T components, and the RMA scale, tended to decrease from A1 to A3.
No difference in global T and Q scores or the item scores was observed between patients with right and left upper limb involvement, except for item 4 (p<0.01).
There were four minor protocol deviations at A2: one assessment was performed <1 h after a rehabilitation session and three assessments were performed on the same day by the two observers, <1 h apart. There were three minor deviations at A3: the order of randomisation between the ULADL and RMA scales was not respected in one case and, in two cases, the required interval between A1 and A3 was not respected. There were also two major deviations during A2, the questionnaire being administered after instead of before the practical test.
A significant difference in global Q and T scores was noted between the two assessments made by the same observer (121±226, p=10⁻⁴ and 46±114, p=0.018, respectively). Q scores for items 9, 11, 12, 13 and 17 and T scores for items 6 and 15 were significantly different between the two assessments.
ICC was 0.65 (95% CI (0.44 to 0.79)) for the global Q score and 0.97 (0.95 to 0.98) for the global T score. ICC for each item of the questionnaire was >0.70, except for items 3, 5, 7, 11, 12 and 17 (>0.60). ICC for each item of the practical test was >0.80, except for items 1 (0.67), 3 (0.71) and 5 (0.76). The global scores for the questionnaire and practical test displayed good reproducibility and no systematic trend, as reflected by the horizontal lines on Bland and Altman38 graphs (figure 1).
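The Bland and Altman analysis referred to here can be sketched as follows. This is a minimal illustration that assumes the conventional formulation (mean difference as bias, with 95% limits of agreement at bias ± 1.96 SD of the differences); the function name and the returned keys are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(first, second):
    """Bias and 95% limits of agreement between two paired score lists."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two paired lists with at least two entries")
    diffs = [b - a for a, b in zip(first, second)]
    averages = [(a + b) / 2 for a, b in zip(first, second)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return {
        "bias": bias,
        "lower": bias - 1.96 * sd,
        "upper": bias + 1.96 * sd,
        # (average, difference) pairs are the points of the plot
        "points": list(zip(averages, diffs)),
    }
```

A bias close to zero, with points scattered evenly around a horizontal line, corresponds to the absence of a systematic trend.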
No significant difference in global scores was noted between the two assessments of the two different observers (Q: p=0.86; T: p=0.84). In contrast, there was a significant difference between Q scores for item 13 (p<0.01) and T scores for item 2 (p<0.01).
ICC was 0.95 for both the global Q score and the global T score. ICC for each item of the questionnaire was >0.70, except for items 5 (0.60) and 6 (0.68). ICC for each item of the practical test was >0.70, except for items 3 (0.68), 4 (0.56), 5 (0.69) and 6 (0.57). ICC tended to increase as the test proceeded, and was >0.80 for the last items. Global Q and T scores displayed good reproducibility and no systematic trend (figure 2).
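The text does not state which form of the ICC was computed. As a hedged illustration only, the sketch below assumes the two-way random-effects, absolute-agreement, single-measure form (often written ICC(2,1)), derived from the mean squares of a two-way analysis of variance. The function name `icc_2_1` is hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per patient and one column per rater.

    Assumed form: two-way random effects, absolute agreement, single measure.
    """
    n = len(ratings)      # number of patients
    k = len(ratings[0])   # number of raters (or rating occasions)
    if n < 2 or k < 2 or any(len(r) != k for r in ratings):
        raise ValueError("need at least 2 patients and 2 ratings per patient")
    grand = sum(sum(r) for r in ratings) / (n * k)
    row_means = [sum(r) / k for r in ratings]
    col_means = [sum(r[j] for r in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in ratings for x in r)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)               # between-patient mean square
    ms_cols = ss_cols / (k - 1)               # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1)) # residual mean square
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

Under this form, a constant offset between raters lowers the coefficient even when patients are ranked identically, because absolute agreement is penalised by the between-rater mean square.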
Global Q and T scores both correlated strongly with the RMA score (r=0.80 and 0.88, respectively; p<10⁻⁴) (figure 3).
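The concurrent validity check relies on the Pearson correlation coefficient between global ULADL scores and RMA scores. A minimal sketch of that coefficient, written without external libraries and using a hypothetical function name, is:

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired lists with at least two entries")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)
```

Here `x` would hold one global Q (or T) score per patient and `y` the matching RMA arm scores; the significance test reported in the text is not reproduced in this sketch.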
Cronbach's α coefficient was 0.95 for the three administrations of the questionnaire and 0.97 for the three administrations of the practical test. Moreover, global Q and T scores correlated strongly with each other at each assessment (A1, A2, A3: p<0.0001). Individual Q and T scores for the different items also correlated strongly with each other, except for the first (A1) assessment of item 2 (p=0.047).
Order of increasing difficulty
The item order was 5, 1, 3, 2, 4, 6, 7, 8, 12, 10, 11, 13, 14, 15, 9, 17 and 16 for the questionnaire, and 1, 3, 4, 2, 5, 7, 6, 8, 12, 11, 10, 13, 9, 15, 14, 17 and 16 for the test.
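The text does not describe how this ordering was derived. One plausible approach, shown here purely as an assumption, is to rank items by their mean VAS score across patients, with the highest mean (the easiest item) first. The function name is hypothetical.

```python
def difficulty_order(rows):
    """Item numbers (1-based) from easiest to hardest.

    rows holds one list of item scores per patient. Ranking by descending
    mean score is an assumption, not the authors' documented method.
    """
    n_items = len(rows[0])
    if any(len(r) != n_items for r in rows):
        raise ValueError("all rows must have the same number of items")
    means = [sum(r[j] for r in rows) / len(rows) for j in range(n_items)]
    # Higher mean score = easier item; ties are broken by item number
    return sorted(range(1, n_items + 1), key=lambda i: (-means[i - 1], i))
```

Applying the same function separately to the Q and T tables would give two orderings that can be compared item by item.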
Questionnaires examining post-stroke capacity for activities of daily living have only recently been developed.18–21 These questionnaires assess active functions, with specific emphasis on hand use, and suffer from a threshold effect because many stroke victims are unable to perform even simple activities. As a result, the use of available questionnaires is restricted to patients with moderate to mild deficits, with relatively fair recovery of motor control. In addition, activities such as inserting the arm into a sleeve or washing the armpit are considered passive by some authors because help from caregivers is usually required, but active participation is also important to consider. Assessing and improving these abilities in patients with severe hemiplegia is a goal for rehabilitation efforts. The ULADL Scale can contribute to the investigation of these proximal abilities, whether in an active or passive manner.
Concerning distal abilities, the scale measures three upper limb functions which are essential for daily living—namely, object blocking, transport and manipulation. Object blocking is often the first function that stroke patients recover, as it may be performed passively in patients without active finger extension, with the help of the unaffected hand. Grasping, transporting and releasing objects become possible when active control of finger extension and wrist position has been recovered. Picking up objects (fruit, small ball) involves wrist pronation and sometimes flexion, the latter facilitating active finger extension. Lateral grasping (glass) using the first commissure requires supination and opening of the first commissure, which tend to be much more difficult. Distal pinching of the tip of the thumb and index finger requires recovery of fine finger motor control and opposition. In the ULADL, these two abilities are tested in succession. The third function corresponds to object handling. The simplest activities (eg, cutting with a knife) require global grasp and mobilisation by proximal movements of the shoulder and elbow, with appropriate wrist orientation. Conversely, manipulations involving distal finger pinch (eg, using a key) require fewer proximal movements but precise wrist mobilisation in pro-supination and/or flexion–extension, and also motor control of the fingers. All of these abilities are tested successively in the ULADL. Fine motor activities involving the fingers, such as sewing and handling thin objects, are not included, but few hemiplegic patients recover this ability. A fourth type of function, such as carrying a bag or opening a door by pressing on a handle, requires proximal movements of the shoulder and elbow and relatively limited control of finger flexion. This function is also covered by the ULADL Scale.
Importantly, the scale compares the patient's subjective estimation of his or her abilities with objective capacity in a practical test. The accuracy of the patient's estimation usually increased gradually, from underestimation at the start to near accuracy at the third assessment. This might be partially explained by a possible ‘unlearning’ of the use of a limb, which is frequent in this setting.19 On the other hand, many patients were at the secondary phase post-stroke (that is, in the first 4–5 months) and were still in the active rehabilitation process. Therefore, at the first assessment, they probably underestimated their capacities compared with patients who have been living at home for years since the stroke. Administering the ULADL can help raise awareness of patient capabilities in order to define concrete operational objectives before therapeutic interventions.18
We found good psychometric qualities, especially in terms of intraobserver and interobserver reproducibility. The assessment of intraobserver reproducibility was somewhat hindered by the learning effect in the questionnaire when the difference between the two examinations was analysed. However, the global ICC was good. Interobserver reproducibility was excellent, whatever the statistical method used. Internal consistency was satisfactory, suggesting that all of the items contributed to the measurement of a coherent functional unit from the proximal to the distal parts of the upper limb. External validity, determined by comparison with the RMA Scale, was also excellent, both for the questionnaire and the practical test. Finally, most items showed a significant correlation between the questionnaire and practical test, indicating that the patients' subjective and examiners' objective assessments were very close, even if the patients slightly underestimated their capacities during the first session. The order of increasing difficulty differed slightly between the patient report and the examiner's estimation when the task was actually performed, especially for item 5, which patients considered easier, and for item 9, which they considered more difficult. In question 5 (opening fingers), the emphasis was put on a passive movement performed with the help of the healthy hand, while for questions 1–4, the emphasis was put on more active movement of the shoulder, arm or wrist. This difference between passive and active movements may have contributed to the perceived difference in difficulty. Holding a yogurt pot with the affected hand (item 9) requires a lateral grip by the first commissure of the hand, and was perceived as much more difficult than the same activity performed with the help of the healthy hand (item 8); in addition, the item is relatively similar to item 14 (moving a glass).
The test was well accepted by the patients. Total time taken to execute both parts of the test was relatively short, making it suitable for use in daily clinical practice. We confirm that, by comparison with test conditions, self-reporting assessment is acceptable for patients, easy to administer and psychometrically comparable.41 The ULADL Scale should also be suitable for clinical research purposes, particularly for evaluating treatment efficacy in patients with relatively severe motor deficit.
Several quality criteria remain to be determined for this scale. These include sensitivity to spontaneous and treatment induced changes in functional status, construct validity (ie, the presence or absence of relationships with other measures of patients' difficulties) and predictive validity (ie, the capacity to predict the outcome after a relatively long time interval). These criteria are the subject of an ongoing investigation. Another problem is the generalisation of our results to other subpopulations of patients, especially older and more aphasic subjects. However, the oldest patients in the series were 80–82 years, and no specific difficulties were encountered during their evaluation.
In conclusion, the ULADL Scale evaluates the global upper limb functional capacity of stroke patients in a real world setting, which is a main issue within the context of the rehabilitation process, and allows direct comparison between the patient's subjective capacity and his or her objective ability in formal testing. These possibilities are not offered by more classical tests of upper limb activities.
Funding This research was supported by Merz Pharma France (Nanterre).
Competing interests None.
Ethics approval The observational study was approved by the French Data Protection Authority (CNIL).
Provenance and peer review Not commissioned; externally peer reviewed.