Comparison of patient rated treatment response with measured improvement in Parkinson's disease
- Correspondence to Dr D J M McGhee, Division of Applied Health Sciences, University of Aberdeen, Polwarth Building, Foresterhill, Aberdeen AB25 2ZD, UK
Contributors MBD undertook the statistical analysis and interpretation of the data, and drafted and revised the manuscript. DJMM undertook the statistical analysis and interpretation of the data, and drafted and revised the manuscript. He was also involved in acquisition of the data through the ongoing Parkinsonism Incidence in North-East Scotland (PINE) study. CEC devised the PINE study, and produced the idea for this manuscript. He revised the manuscript and provided overall supervision for this work. He obtained funding for the PINE study and was involved in acquisition of the data.
- Received 13 March 2012
- Revised 19 April 2012
- Accepted 24 April 2012
- Published Online First 24 May 2012
Background A marked response to dopamine replacement therapy is important in supporting a diagnosis of idiopathic Parkinson's disease (PD). The aim of the study was to compare PD patients' subjective rating of improvement with measured improvement on a number of scales.
Methods People with clinically defined PD were identified from a prospective long term follow-up study of incident parkinsonian patients. Changes in the Unified Parkinson's Disease Rating Scale (UPDRS) (activities of daily living and motor subsections), timed tests and Parkinson's Disease Questionnaire, full version, between assessments immediately before starting adequate dopamine replacement and the two subsequent follow-up assessments (mean 6 and 12 months after baseline) were calculated. These were compared with the patients' own subjective ratings of improvement (nil, slight, moderate, good, excellent).
Results 133 patients were included (mean age 71 years, 56% men). Thirty-eight patients were treated with a dopamine agonist and 95 with l-dopa (median l-dopa equivalent dose 300 mg). Most patients showed improvements in their measured scores but there was no statistically significant association between these scores and the patient subjective response, except for the motor UPDRS at the first follow-up. A third of those who showed no improvement in their motor UPDRS at the first follow-up rated their improvement as moderate or better, while 29% of those whose motor UPDRS improved by over 50% said they had no or slight improvement.
Conclusion PD patients' subjective ratings of their degree of improvement often do not accurately reflect the degree of objective change in parkinsonian impairment or disability. Clinicians should record a simple measure of motor impairment before and after treatment to assess treatment response more accurately.
In both clinical practice and formal research criteria,1–3 one of the major supportive criteria used to make a diagnosis of idiopathic Parkinson's disease (PD), as opposed to another parkinsonian syndrome, is a marked improvement in the motor features with adequate dopaminergic replacement therapy. Response to treatment is also a key component in subsequent dose alteration. This raises the issue of how response to therapy is assessed.
In clinical practice, one of the major ways of assessing treatment response is by asking the patient to subjectively rate their improvement. However, it can also be measured objectively by performing motor tasks, such as timed tests, or completing an assessment of motor impairment such as the Unified Parkinson's Disease Rating Scale (UPDRS).4 To our knowledge, no studies have assessed how well the subjective reporting of improvement by patients correlates with objectively measured improvement.
The aim of this study was, therefore, to compare PD patients' subjective assessment of their response to treatment with objective improvement, in order to assess the usefulness of the subjective response when clinicians are deciding whether a patient has improved significantly with treatment.
The Parkinsonism Incidence in North-East Scotland (PINE) study aimed to measure the incidence and prognosis of parkinsonism, including PD, in Aberdeen and the surrounding area. The original pilot study5 was extended to recruit a total of 377 patients with possible or probable parkinsonism (defined as two or more of bradykinesia, rest tremor, rigidity, postural instability) who were offered lifelong follow-up. The PINE study was conducted with the adequate understanding and written consent of the patients involved. The research was approved by the North of Scotland Research Ethics Committee.
Patients could consent to simple clinical follow-up, including yearly assessment of the motor part of the UPDRS and Mini-Mental State Examination (MMSE),6 or more detailed follow-up, including yearly assessment of the full UPDRS as well as timed tests of walking and upper limb function (tapping with either hand between two counters),7 the 39 item Parkinson's Disease Questionnaire (PDQ-39),8 the Mini-Mental Parkinson's (MMP)9 and the 15 item Geriatric Depression Scale (GDS-15).10
At each follow-up appointment the most likely clinical diagnosis of the cause of their parkinsonism was made by a movement disorder specialist (eg, PD, dementia with Lewy bodies, vascular parkinsonism, a Parkinson-plus syndrome) using all the available information, including the clinical features, response to treatment, motor complications and imaging. The UK Parkinson's Disease Society Brain Bank criteria1 were used to guide a clinical diagnosis of PD but not rigorously enforced, partly because those with early PD often cannot fulfil three supportive criteria as they have had insufficient follow-up. All patients had to meet the definition for parkinsonism and not have any major exclusion criteria (repeated strokes with stepwise decline, cerebellar features, early severe autonomic involvement, supranuclear gaze palsy, early dementia or apraxia) but a family history, use of neuroleptics (if an FP-CIT scan was abnormal) and an isolated extensor plantar were not regarded as absolute exclusion criteria.
Patients could be seen for an interim appointment between their yearly assessments if they or their general practitioner felt they needed treatment, at which point they had a motor UPDRS assessment. Factors including clinical presentation (tremor less likely to respond than bradykinesia), the impact of symptoms on the patient's lifestyle and patient preference were taken into consideration, but no specific scores were used to guide initiation of treatment. There was no standard firstline dopaminergic medication: the treatment choice was made after discussion of the advantages and disadvantages of each option with the patient.
Once started on treatment, patients were seen again after approximately 4–6 months to assess their response to treatment, both subjectively and by repeating the motor UPDRS (and the timed tests and PDQ-39 in those who had consented to more detailed follow-up if the visit coincided with a yearly assessment). Given that none of the patients were fluctuating, all treatment scores were in the ‘on’ state. Post-treatment scores were made without knowledge of pretreatment scores and were usually made by the same study doctor. The subjective response was measured on a 5 point scale by simply asking the patient to rate whether they had experienced no response to treatment, a slight response (<25% improvement), a moderate response (25–49% improvement), a good response (50–75% improvement) or an excellent response (>75% improvement). Similar subjective scales have been used in previous studies.11
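The 5 point scale described above can be written as a simple lookup. This is an illustrative sketch only and not part of the study procedures: patients chose a category directly, so the function name `subjective_category` and the idea of starting from a perceived percentage are assumptions, as is the treatment of non-integer values between 49% and 50% as moderate.

```python
def subjective_category(perceived_improvement_pct: float) -> str:
    """Map a perceived percentage improvement to the 5 point subjective scale.

    Bands follow the text: nil, slight (<25%), moderate (25-49%),
    good (50-75%) and excellent (>75%). Boundary handling is an assumption.
    """
    if perceived_improvement_pct <= 0:
        return "nil"
    if perceived_improvement_pct < 25:
        return "slight"
    if perceived_improvement_pct < 50:
        return "moderate"
    if perceived_improvement_pct <= 75:
        return "good"
    return "excellent"


for pct in (0, 10, 30, 60, 90):
    print(pct, subjective_category(pct))
```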
For this analysis, patients were included if all four of the following conditions were met: (i) their latest or final (if they had died) clinical diagnosis was idiopathic PD; (ii) they had a pretreatment (baseline) assessment of their motor UPDRS from either a yearly or interim visit no more than 6 months before starting treatment (any longer and it was felt that disease progression may have significantly altered their baseline scores); (iii) they had a follow-up appointment to assess the effectiveness of treatment at least 3 months after starting treatment, to allow time for a treatment effect to have developed, but no more than 12 months after their pretreatment assessment; and (iv) they were on adequate treatment at follow-up, which was conservatively defined as at least 3 months of l-dopa or a dopamine agonist at an l-dopa equivalent daily dose (LEDD) of 150 mg/day or more. Patients treated with only an anticholinergic or a monoamine oxidase B inhibitor were excluded because these therapies have only a very mild effect on PD symptomatology and did not fulfil the criteria for an adequate treatment trial (eg, selegiline 10 mg is equivalent to only 100 mg of l-dopa). Patients who were on treatment before their first PINE study appointment were also excluded as were those started on both a dopamine agonist and l-dopa concurrently, as we wished to compare treatment effects in each group separately.
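The inclusion and exclusion rules above can be summarised as a single eligibility check. The sketch below is hypothetical: the `PatientRecord` fields and the `is_eligible` function are invented for illustration and do not describe the PINE database, and the thresholds are taken directly from the text.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    # All field names are illustrative assumptions, not PINE database columns.
    final_diagnosis: str
    drug_class: str                            # e.g. "l-dopa" or "dopamine agonist"
    ledd_mg_per_day: float                     # l-dopa equivalent daily dose at follow-up
    months_baseline_to_treatment_start: float
    months_on_treatment_at_follow_up: float
    months_baseline_to_follow_up: float
    treated_before_first_visit: bool = False
    started_both_drug_classes: bool = False


def is_eligible(p: PatientRecord) -> bool:
    """Apply conditions (i) to (iv) and the two stated exclusions."""
    adequate_treatment = (
        p.drug_class in {"l-dopa", "dopamine agonist"}
        and p.ledd_mg_per_day >= 150
        and p.months_on_treatment_at_follow_up >= 3
    )
    return (
        p.final_diagnosis == "idiopathic PD"                  # (i)
        and 0 <= p.months_baseline_to_treatment_start <= 6    # (ii)
        and p.months_on_treatment_at_follow_up >= 3           # (iii) lower bound
        and p.months_baseline_to_follow_up <= 12              # (iii) upper bound
        and adequate_treatment                                # (iv)
        and not p.treated_before_first_visit
        and not p.started_both_drug_classes
    )


example = PatientRecord("idiopathic PD", "l-dopa", 300, 1, 5, 6)
print(is_eligible(example))
```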
The date of the pretreatment appointment and the dates of the first two follow-up appointments (FU1 and FU2) after 3 months of adequate treatment were extracted from the patients' records and corresponding data values for these appointments were extracted from the PINE database. The second follow-up was used to assess whether changes seen at the first follow-up were maintained. However, to avoid disease progression affecting any changes seen, data from the second follow-up appointments more than 18 months after the pretreatment appointment were not included. Data extracted included baseline patient demographics, treatments given, along with the UPDRS, PDQ-39, GDS-15, MMSE and MMP measured closest to FU1, timed test scores (averaged 12 min get up and go walk, right and left hand taps over 30 s) and subjective response to treatment. Drug doses were converted to LEDD.12
Statistical analysis was performed using IBM SPSS Statistics V.19.0. Comparisons between patients taking l-dopa and those taking dopamine agonists were made using the Mann–Whitney test unless otherwise stated. Changes in scores between groups were compared using the t test, unless otherwise stated, and the association between changes in scores and the subjective response by the Kruskal–Wallis test. The influence of possible depression and cognitive impairment on subjective responses was analysed using the continuity corrected χ2 test.
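For readers unfamiliar with the Kruskal–Wallis test, the following is a minimal pure Python sketch of the statistic used to compare score changes across subjective response groups. It is not the SPSS procedure used in the study: the function names are assumptions, the example data are entirely hypothetical, and the p value helper uses a closed form that is valid only for an even number of degrees of freedom (five response groups give df=4).

```python
import math


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of non-empty groups.

    Assumes at least two observations in total.
    """
    pooled = sorted(value for group in groups for value in group)
    n = len(pooled)
    rank_of = {}
    tie_term = 0
    start = 0
    while start < n:
        end = start
        while end < n and pooled[end] == pooled[start]:
            end += 1
        # Tied values occupy ranks start+1 .. end; each gets the average rank.
        rank_of[pooled[start]] = (start + 1 + end) / 2
        ties = end - start
        tie_term += ties**3 - ties
        start = end
    rank_sum_term = sum(
        sum(rank_of[v] for v in group) ** 2 / len(group) for group in groups
    )
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    correction = 1 - tie_term / (n**3 - n)
    return h / correction if correction > 0 else 0.0


def chi2_sf_even_df(x, df):
    """Upper-tail chi-squared probability; closed form for even df only."""
    if df % 2 != 0:
        raise ValueError("closed form requires an even number of degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(half**k / math.factorial(k) for k in range(df // 2))


# Hypothetical changes in motor UPDRS for the five subjective response groups.
changes_by_response = [
    [-3, 2, 0, -10],    # nil
    [-5, -1, -8],       # slight
    [-9, -4, -12],      # moderate
    [-11, -15, -7],     # good
    [-20, -14, -18],    # excellent
]
h = kruskal_wallis_h(changes_by_response)
print(h, chi2_sf_even_df(h, df=len(changes_by_response) - 1))
```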
Of 214 incident patients with a current diagnosis of PD (maximum duration of follow-up 9 years), eight did not consent to clinical follow-up and were excluded. A further 73 patients were excluded for the following reasons: 19 were on treatment at the time of their first PINE study appointment; six had no available pretreatment data; four had a pretreatment appointment more than 6 months prior to starting treatment; 23 did not have an adequate treatment trial; nine had an adequate treatment trial but did not have a follow-up within the allotted time period; five had no follow-up data (one died); six stopped treatment before follow-up occurred; and one was on two drug therapies concurrently (cabergoline and co-beneldopa).
In the final cohort of 133 patients, 38 (29%) were on a dopamine agonist and 95 (71%) were on l-dopa. Twenty-seven of the included patients did not contribute data to the second follow-up after starting treatment because they had not had a second follow-up (death or not yet due), had stopped/changed treatment or their second follow-up was more than 18 months after their pretreatment appointment.
Table 1 details the baseline characteristics for patients who were commenced on either a dopamine agonist or l-dopa. As expected, those starting l-dopa had significantly worse disease at baseline (in terms of motor UPDRS scores and timed walks) and were older than those commencing a dopamine agonist.
Mean time between the pretreatment visit and FU1 was 6.1 months (SD 2.1), and between the pretreatment visit and FU2 was 12.4 months (SD 2.2).
Comparison between baseline and follow-up scores
As can be seen in table 2, a general improvement in most test scores was seen after treatment at FU1 (except for the PDQ-39). This improvement was sustained at FU2. The differences in improvement between those on l-dopa and those on a dopamine agonist were not significant, except for a greater improvement with l-dopa in timed walks at FU1 (p=0.002, Mann–Whitney test), motor UPDRS at FU1 (p=0.049) and motor UPDRS at FU2 (p=0.007) (data not shown).
Changes in scores compared with subjective responses
The changes in the measured test scores stratified by the patient's subjective response score at both FU1 and FU2 are shown in table 3. Some of the groups had very small numbers of patients. For most measures, the median scores showed improvement across all subjective response groups although the differences from pretreatment were often small. While the improvement was always greatest in those who rated themselves >75% improved, the only statistically significant association between changes in a measured score and subjective treatment response was for the motor UPDRS at FU1 (p=0.004). Even here, the median change in motor UPDRS score from baseline to FU1 for those reporting no subjective improvement in their parkinsonism was −3.0 (IQR −10.0 to 3.0) suggesting that, despite patients' subjective views, the motor impairment had actually improved in most. Stratification of measured test scores by subjective response was repeated for both the l-dopa treated and dopamine agonist treated patients separately, but unfortunately the numbers in many groups after stratification were too small to allow a meaningful analysis.
The subjective response was also analysed according to the percentage change in the patients' motor UPDRS from pretreatment to FU1. The percentage UPDRS improvement was categorised as worse or no change, >0–25% improvement, >25–50% improvement and >50% improvement (figure 1). There was a significant trend for more patients to rate themselves as significantly better (good or excellent) with increasing objective improvement (χ2 test for linear trend 12.54, df=1, p=0.0004). However, significant disagreement remained: 33% (95% CI 16% to 55%) of patients whose motor UPDRS deteriorated or remained unchanged reported a moderate or better response to therapy while only 38% (95% CI 19% to 59%) of these patients reported no response. Furthermore, 29% (95% CI 8% to 58%) of patients whose motor UPDRS improved by more than 50% reported either no or slight improvement.
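The categorisation of objective change described above can be sketched as follows. Lower motor UPDRS scores indicate less impairment, so improvement is expressed as the percentage fall from the pretreatment score. The function names and the example scores are hypothetical, and placing values of exactly 25% and 50% in the lower band is an assumption based on the ">25" and ">50" labels.

```python
def percent_improvement(baseline: float, follow_up: float) -> float:
    """Percentage fall in motor UPDRS from baseline; positive means improvement."""
    if baseline <= 0:
        raise ValueError("baseline motor UPDRS must be positive")
    return (baseline - follow_up) / baseline * 100


def objective_category(improvement_pct: float) -> str:
    """Assign the four bands of objective change used in the analysis."""
    if improvement_pct <= 0:
        return "worse or no change"
    if improvement_pct <= 25:
        return ">0-25% improvement"
    if improvement_pct <= 50:
        return ">25-50% improvement"
    return ">50% improvement"


# Hypothetical pretreatment and FU1 scores for three patients.
for baseline, fu1 in [(20, 23), (40, 30), (40, 18)]:
    pct = percent_improvement(baseline, fu1)
    print(baseline, fu1, objective_category(pct))
```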
The influence of possible coexistent depression on patients' subjective responses at FU1 was also analysed by assessing the number of patients with a GDS-15 score of ≥5, the cut-off that suggests possible depression.13 One hundred and eight patients had a GDS-15 score, which was taken from the visit closest to FU1. For those with >25% improvement in the motor UPDRS from pretreatment to FU1, significantly more (p=0.010) patients reporting no, slight or moderate improvement (13/20, 65.0%) than those reporting good or excellent improvement (7/29, 24.1%) had a GDS-15 of ≥5. However, for those with ≤25% improvement in the motor UPDRS, there was no significant difference in the percentage of patients with a GDS-15 of ≥5 between these groups (12/42, 28.6% vs 5/17, 29.4%; p=1.000).
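The two comparisons above are 2×2 tables, so the continuity corrected χ2 test can be illustrated directly from the counts given in the text. This is a sketch under stated assumptions rather than the SPSS procedure used in the study: the function name `yates_chi2_2x2` is invented, and the standard Yates correction (with the corrected difference not allowed to fall below zero) is assumed. Under those assumptions the p values should round to the reported 0.010 and 1.000.

```python
import math


def yates_chi2_2x2(a, b, c, d):
    """Continuity corrected chi-squared test for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Yates' correction shrinks |ad - bc| by n/2, but never below zero.
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    chi2 = n * corrected**2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# >25% motor UPDRS improvement: GDS-15 >= 5 in 13/20 vs 7/29 patients.
chi2_high, p_high = yates_chi2_2x2(13, 7, 7, 22)
# <=25% motor UPDRS improvement: GDS-15 >= 5 in 12/42 vs 5/17 patients.
chi2_low, p_low = yates_chi2_2x2(12, 30, 5, 12)
print(round(p_high, 3), round(p_low, 3))
```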
Similarly, the impact of possible coexistent cognitive impairment, defined as either a MMSE of ≤24/30 or a MMP of ≤27/32, on patients' subjective responses at FU1 was analysed. One hundred and thirty-two patients had a MMSE score and 109 a MMP score, which were taken from the visit closest to FU1. For those with >25% improvement in the motor UPDRS, there was no significant difference (MMSE p=0.533; MMP p=0.520) in the percentage of patients with possible cognitive impairment between those reporting no, slight or moderate improvement (MMSE 25/28, 89.3%; MMP 11/28, 52.4%) and those reporting good or excellent improvement (MMSE 30/32, 93.8%; MMP 19/32, 59.4%). Furthermore, for those with ≤25% improvement in the motor UPDRS, there was again no significant difference in the percentage of patients with possible cognitive impairment between these groups (MMSE 50/54, 92.6% vs 16/19, 84.2%, p=0.539; MMP 26/54, 61.9% vs 11/19, 64.7%, p=1.000).
This study found a poor association between the degree of patient reported subjective improvement after commencing treatment and improvements in objective measures of motor impairment or self-rated measures of disability or quality of life. Only the motor UPDRS had a significant association with patient rated improvement, but even this was limited: the upper limit of the IQR for those who subjectively reported no or slight improvement at the first follow-up was minus 10 points, implying that 25% of patients in these groups improved by more than 10 points. There was also substantial misclassification of patient reported improvement within the different categories of objective change in motor UPDRS, with significant proportions both underestimating and overestimating their objective improvement. This discrepancy between subjective and objective responses highlights the inaccuracy of using the patient's self-reported improvement or lack of it to judge the actual degree of improvement. This may in turn lead to inappropriate decisions about changing medication or even misdiagnosis (eg, patients reporting no improvement with adequate treatment may be misdiagnosed as not having PD) either in a clinical setting or in applying research criteria.
There are several possible reasons for the discrepancies between subjective and objective measures of improvement following commencement of treatment in PD. First, the subjective overestimation of improvement by some patients may simply reflect a desire of the patient to please the treating physician. Alternatively, it may be that they noticed an improvement in aspects of their PD (for example, non-motor features) which are not measured by the objective tests used in this study. However, the PDQ-39 assesses many non-motor features and showed no better correlation with subjective scores. By contrast, patients' underestimation of improvement may relate to the fact that treatment did not improve the symptoms they were most concerned about. For example, tremor is often the most visible and embarrassing aspect of PD to a patient but often responds poorly to treatment. Low mood may also cause patients to underestimate their treatment response, which is partially supported by our GDS-15 analysis. We did not, however, find any evidence to suggest that cognitive impairment influenced patients' subjective responses.
In other cases, discrepancies between subjective and objective responses may reflect difficulties patients have in recalling what they were like prior to treatment compared with the time at which they are reassessed. Clearly this is more likely to be an issue when the time period is longer and, therefore, this was less likely to be a factor at FU1, which was only 6 months after the baseline appointment.
While this study raises a number of interesting issues, there are some limitations. First, the number of patients contributing data to some of the objective measures was small, which means these analyses were underpowered. However, for motor UPDRS, which was generally collected at every patient encounter, there were over 100 patients. The lack of standardised timing for follow-up appointments and the variation in objective measures used at each follow-up is another possible criticism. This simply reflects the real life nature of this prospective incidence study in which normal clinical care was provided to the participants as well as collecting research data. We used a conservative definition of adequate treatment, but this probably gave a greater spectrum of objective improvement, allowing us to examine for associations with subjective responses across a wider range of UPDRS change. In addition, the median LEDD at FU1 was adequate (table 1). Confounding of measured improvements by disease progression could also be a limitation of this study but is unlikely to be a major problem, especially at the first follow-up visit. We did not specifically collect a carer's view of the response to treatment but we are aware that sometimes this did influence the patient's response. Anecdotally, this was usually to highlight a better response than the patient initially suggested. When this happened we recorded the final ‘consensus’ view of the patient and carer.
Our use of an expert clinical diagnosis of PD guided by research criteria, rather than rigorous application of formal research criteria, reflected the real life nature of our study and the fact that in early disease some of the criteria are not applicable (eg, the presence of motor complications). Moreover, expert clinical diagnosis has a positive predictive value similar to that of a research criteria diagnosis.14 Some may question why patients in this study who worsened or had little improvement with treatment were classified as having PD. However, their final diagnoses were made after longer follow-up, outwith the scope of this analysis, during which they improved with increased doses of treatment or alternative therapy, or developed motor complications. Moreover, about 10% of patients with pathologically proven PD have little response to treatment15 and so occasionally it can be appropriate to make a diagnosis of PD in patients with an asymmetrical parkinsonian syndrome with a classical resting tremor without atypical features, despite a poor response to treatment. In addition, some patients with the early stages of atypical parkinsonian disorders can have a sustained treatment response for several years, which further complicates diagnosis. Therefore, even if a small number of patients with a misdiagnosis of PD were included in our study (which is the case in any study without 100% post mortem diagnosis), it should not alter the validity of our results because it is still reasonable to assess their subjective response to treatment.
In conclusion, in many patients the subjective rating of their degree of improvement with dopamine replacement therapy is not a reliable measure of objective improvement and could mislead clinicians if used as the sole source of information about treatment response. We recommend that in routine clinical practice, a patient's subjective response is complemented by an objective measure, such as the motor UPDRS or a brief simplified version which could be used in a short follow-up clinic. In addition, it may be appropriate to modify research criteria for PD to define more precisely what is meant by a good response to treatment and how this should be measured.
The authors thank Susan Kilpatrick for secretarial support, Clare Harris and Hazel Forbes for helping with patient assessments, and Katie Wilde for maintaining the PINE study database.
MBD and DJMM are joint first authors and contributed equally to the manuscript.
Funding The PINE study was funded by Parkinson's UK (grant Nos G-0502, G-0914), the BMA Doris Hillier Award, NHS Grampian Endowments, SPRING (Special Parkinson's Research Interest Group) and the RS MacDonald Trust. DJMM is supported by a grant from the National Institute for Health Research (NIHR) (grant No RP-PG-0707-10124).
Competing interests None.
Ethics approval Ethics approval was provided by North of Scotland Research Ethics Committee.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement The authors are open to discussion with interested parties regarding using data from the ongoing PINE study.