Abstract
Background The lack of reliable biomarkers to track disease progression is a major problem in clinical research of chronic neurological disorders. Using Huntington's disease (HD) as an example, we describe a novel approach to model HD and show that the progression of a neurological disorder can be predicted for individual patients.
Methods Starting with an initial cohort of 343 patients with HD that we have followed since 1995, we used data from 68 patients that satisfied our filtering criteria to model disease progression, based on the Unified Huntington’s Disease Rating Scale (UHDRS), a measure that is routinely used in HD clinics worldwide.
Results Our model was validated by: (A) extrapolating our equation to model the age of disease onset, (B) testing it on a second patient data set by loosening our filtering criteria, (C) cross-validating with a repeated random subsampling approach and (D) holdout validating with the latest clinical assessment data from the same cohort of patients. With UHDRS scores from the past four clinical visits (over a minimum span of 2 years), our model predicts disease progression of individual patients over the next 2 years with an accuracy of 89–91%. We have also provided evidence that patients with similar baseline clinical profiles can exhibit very different trajectories of disease progression.
Conclusions This new model therefore has important implications for HD research, most obviously in the development of potential disease-modifying therapies. We believe that a similar approach can also be adapted to model disease progression in other chronic neurological disorders.
- HUNTINGTON'S
- MOVEMENT DISORDERS
- CLINICAL NEUROLOGY
- STATISTICS
Introduction
The lack of reliable biomarkers to track disease progression is a major problem in clinical research of chronic neurological disorders.1 This problem has gained prominence as disease-modifying therapies have started to enter the clinic,2 especially as some of these novel therapies involve direct delivery into the brain, such that randomised controlled trials (RCTs) are not always possible.3 Furthermore, RCTs in neurological disorders with a low prevalence, such as Huntington's disease (HD), are complicated by difficulties in patient recruitment.
HD is a genetic neurodegenerative disorder that affects 2.71/100 000 persons worldwide.4 The pathology of HD is caused by an expansion in a trinucleotide CAG repeat in exon 1 of the huntingtin gene, and the length of this repeat predicts disease onset in patients.5,6 Models predicting disease onset enable researchers to study HD before the start of overt disease features and, in so doing, open up the possibility of delivering novel therapies at disease onset.7 While these models are useful, disease progression once the condition has started remains poorly modelled. We therefore sought to model progression using our extensive database of 343 patients, whom we have followed longitudinally since 1995. We propose a model that tracks and predicts the natural progression of manifest HD based on the motor and functional components of the Unified Huntington’s Disease Rating Scale (UHDRS), which are routinely used in HD clinics.
We have found that patients with similar initial clinical profiles can have very different patterns of disease progression, which renders conventional regression analysis (which estimates a common slope among groups of patients) inappropriate. In contrast, our novel approach enables researchers to predict disease progression of individual patients for the next 2 years, based on assessments from the past four clinical visits (spanning a minimum of 2 years). We have interrogated the quality of our prediction by: (A) extrapolating our equation to model the age of disease onset, (B) testing it on a second patient data set by loosening our filtering criteria, (C) cross-validating with a repeated random subsampling approach, as well as (D) holdout validating with the latest clinical assessment data from the same cohort of patients, which were unavailable at the time of our original modelling.
We believe that our model will benefit clinicians and researchers in studying HD, especially for those developing potential disease-modifying therapies. Furthermore, our results enable researchers to reassess their existing data based on different profiles of patients’ predicted disease progression. A similar approach can also be adapted to model disease progression in other chronic disorders.
Materials and methods
Patient recruitment and assessments
Data were collected from participants who attended the HD clinic at the John van Geest Centre for Brain Repair, UK, between 1995 and 2013, either as part of their routine clinical care, or through participation in related studies. This study was approved by the Cambridge University Hospitals NHS Foundation Trust, in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki and its later amendments. All participants consented to their data being shared between research studies in an anonymised form. Motor and functional impairments were assessed using the UHDRS total motor score, functional assessment and functional capacity scales, conducted by an experienced rater. The UHDRS total motor score ranges from 0 (no motor features detected) to a maximum score of 124. Manifest disease was defined as a total motor score ≥5, as done previously.8 Demographic information collected on patients included their CAG repeat size (where available), age and gender.
Modelling methods
A detailed description can be found in the online supplementary file. In short, the initial data were filtered using the following three criteria: (A) patients beyond the prodromal stage of disease, with a generalised index (GI) score (explained below) of ≥15 at their last clinical visit (up to 2012); (B) patients with at least five clinical visits and (C) patients not showing a large negative validity score (see online supplementary table S1). The validity score was created to avoid potential complications from medications, based on an assumption that patients were not expected to improve with time, which is consistent with a recent finding.9 Patients who showed improvement between two consecutive visits were penalised in their validity score, making their data more prone to exclusion. Most patients were filtered out using criteria (A) and (B) above and, at the end of the filtering process, 68 patients were eligible for modelling (see online supplementary figure S2).
Clinical data from the UHDRS were then transformed into GI scores by deducting the chorea and dystonia scores, which have higher interrater variability,10 could in our experience fluctuate over short periods of time, and did not correlate over time in our data (see online supplementary figure S1). The GI represents an average of the motor and functional components of the UHDRS, normalised to lie between 0 (no features) and 100 (all features). For each patient, the function that best described his/her GI progression was fitted using linear (GI=B0+B1*Age), quadratic (GI=B0+B1*Age+B2*Age^2) and exponential (GI=exp(B0+B1*Age)) models. The fitness of each model was quantified as described in the online supplementary file and previously.11 We then searched for the optimum coefficient for individual patients (B1 for linear, B2 for quadratic; both indicate the rate of disease progression) within a range (see online supplementary table S2) defined by the maximum and minimum values from the patient data (see online supplementary table S3).
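The three candidate functions can be illustrated with a minimal sketch. It assumes ordinary least squares for the linear and quadratic forms and a straight-line fit through log(GI) for the exponential form; the fitness measure used in the study is described only in the online supplementary file, so the residual sum of squares reported here is a stand-in, and the function names and data are hypothetical. Ages are centred before fitting, which leaves the rate coefficients (B1 for linear, B2 for quadratic) unchanged but shifts the intercept.

```python
import math

def solve(a, b):
    """Solve a small linear system a*x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_poly(x, y, degree):
    """Least-squares polynomial coefficients [B0, B1, ...] via the normal equations."""
    k = degree + 1
    a = [[sum(v ** (i + j) for v in x) for j in range(k)] for i in range(k)]
    b = [sum(w * v ** i for v, w in zip(x, y)) for i in range(k)]
    return solve(a, b)

def rss(predicted, observed):
    """Residual sum of squares (assumed stand-in for the study's fitness measure)."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))

def fit_candidates(ages, gi):
    """Return {model name: (coefficients on centred age, residual sum of squares)}."""
    mean_age = sum(ages) / len(ages)
    x = [a - mean_age for a in ages]  # centring keeps the normal equations well conditioned
    out = {}
    for name, degree in (("linear", 1), ("quadratic", 2)):
        c = fit_poly(x, gi, degree)
        pred = [sum(cj * v ** j for j, cj in enumerate(c)) for v in x]
        out[name] = (c, rss(pred, gi))
    # Exponential form GI = exp(B0 + B1*Age): fit a line through log(GI); requires GI > 0.
    c = fit_poly(x, [math.log(g) for g in gi], 1)
    pred = [math.exp(c[0] + c[1] * v) for v in x]
    out["exponential"] = (c, rss(pred, gi))
    return out

# Hypothetical visits for one patient (age in years, GI on the 0-100 scale).
ages = [45.0, 46.1, 47.0, 48.2, 49.0, 50.1]
gi = [20.0, 23.5, 26.0, 30.1, 32.0, 35.6]
for name, (coeffs, error) in fit_candidates(ages, gi).items():
    print(name, [round(c, 3) for c in coeffs], round(error, 3))
```

Note that a quadratic fit can never have a larger residual sum of squares than a linear fit to the same data, so choosing between the two requires a criterion that accounts for the extra parameter; this sketch only reports the fits and does not perform that selection.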
Prediction and model validation
A detailed description can be found in the online supplementary file. In short, the first n samples of each patient were used to classify and create a model that would describe the disease progression pattern for that patient (see online supplementary figure S3). The optimum coefficient was then generated as described above, while the other parameters were derived from the first n samples from that particular patient. Prediction is conducted in a moving-horizon sense; as soon as the (n+1)th sample is available, the above algorithm reclassifies and refines the model for that patient. Our predictions were validated using four different methods. First, we extrapolated our model to predict the age of disease onset and compared that with a benchmark model built using large cohorts of patients.5 Second, we included data from 31 additional patients who had previously been excluded for failing the validity score criterion. Third, we took a repeated random subsampling approach by partitioning our 68 patients into training groups of 50 patients and testing groups of 18 patients. The procedure was repeated 40 times to eliminate selection bias. Finally, 23 of our 58 patients revisited our clinic in 2013; their latest clinical assessment data were unavailable at the time of modelling, and we could therefore use these data to compare the predicted with the actual GI scores.
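The moving-horizon idea can be sketched for a linearly progressing patient as follows. This is an illustration under stated assumptions, not the study's algorithm: the rate B1 is taken as a fixed, previously optimised value, the intercept B0 is assumed to be the mean of GI−B1*Age over the first n visits, and the reclassification step between linear and quadratic forms is omitted. The function names and data are hypothetical.

```python
def predict_next(ages, gi, b1, n):
    """Predict GI at visit n+1 from the first n visits, given a fixed rate b1."""
    # Assumed estimator: the intercept is the mean residual after removing the b1 trend.
    b0 = sum(g - b1 * a for a, g in zip(ages[:n], gi[:n])) / n
    return b0 + b1 * ages[n]

def moving_horizon(ages, gi, b1, min_visits=4):
    """Re-estimate and predict each time a new visit becomes available.

    Returns a list of (visits used, predicted GI, actual GI).
    """
    results = []
    for n in range(min_visits, len(ages)):
        results.append((n, predict_next(ages, gi, b1, n), gi[n]))
    return results

# Hypothetical patient with six visits and an assumed rate of 3 GI points per year.
ages = [45.0, 45.6, 46.3, 47.1, 47.8, 48.5]
gi = [20.0, 22.1, 23.8, 26.5, 28.9, 30.2]
for n, predicted, actual in moving_horizon(ages, gi, b1=3.0):
    print(n, round(predicted, 2), actual)
```

Each pass uses only the visits available up to that point, so the prediction for a given visit never depends on the visit itself.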
Data analysis
Normality of data was verified using either the Kolmogorov-Smirnov (>50 samples) or the Shapiro-Wilk (≤50 samples) tests. For univariate and multivariate analyses we used parametric (eg, analysis of variance (ANOVA), mixed models ANOVA) and non-parametric methods (Friedman test, Kruskal-Wallis and Mann-Whitney U test) depending on the distribution of residuals. For mixed model analyses of variance with multiple levels of repeated measures, data transformation (such as square root transformation) was performed to obtain normality, if required. Univariate analyses were corrected for multiple comparisons (Bonferroni) in order to avoid type I statistical error. Matlab (V.7.9) was used for data modelling. SAS (V.9.1) and SPSS (V.21) were used for statistical analysis.
Results
The initial data set contained a total of 343 patients that we had followed since 1995. The majority of the patients whose data failed to meet our filtering criteria were those having fewer than five clinical assessments, followed by those in the premanifest or prodromal stage.
Disease progression, denoted by the GI, is shown for the final 58 patients that we were able to model, representing 85.3% of all eligible patients (n=68) (figure 1). Among these patients, 41 exhibited disease progression that could be described by a linear equation (GI=B0+B1*Age, left panels), while disease progression in the other 17 patients could be described by a quadratic equation (GI=B0+B1*Age+B2*Age^2, right panels).
We then compared the demographic information between different subgroups of patients, including data sets from a further 31 patients used in validation test 2 by relaxing the validity score. Clinically, the subgroups could not be distinguished from one another in terms of the age of patients at their last clinical assessment (up to 2012, F(4,493)=1.306, p=0.267), gender distribution (H(4)=5.460, p=0.243), CAG repeat size (H(4)=11.953, p=0.018) and the average years of follow-up (H(4)=13.543, p=0.009; table 1, post hoc analysis of the latter two revealed no real difference). There were significant differences between patient subgroups in their UHDRS total motor (H(4)=55.216, p<0.001) and functional (H(4)=61.724, p<0.001) assessments as measured at the patient's last clinical assessment (table 1). Post hoc analysis revealed that all patients eligible for modelling (n=68), as well as patients exhibiting a linear (n=41) or a quadratic disease progression (n=17), were significantly, or had a strong tendency to be, more impaired in their UHDRS total motor and functional assessments, compared with the whole cohort (n=343) or with those patients who failed the validity score requirements and were used for validation test 2 (n=31). Such a difference could be attributed to the presence of the validity score, which assumed gradual deterioration of HD features over time and was consistent with previous findings.9,12 However, there were no intergroup differences between all patients eligible for modelling and those demonstrating linear or quadratic disease progressions. Overall, this indicates that patients sharing similar clinical profiles could exhibit very different patterns of disease progression.
As there is no comparable model to predict HD progression, we started validating our approach by extrapolating our model to predict the age of disease onset. Comparison was made with the most popular existing model to predict disease onset, constructed using large, independent cohorts of patients.5 This helps deal with a major limitation of our model to predict disease progression, namely that we were only studying a relatively small cohort of patients from the east of England. To do this, data were selected from the 41 patients with linear disease progression and the age of disease onset (T0) was defined as T0=−(B0/B1). Our predictions were modelled by using an approach similar to Langbehn’s, as well as by evaluating the maximum fitness criterion. Our resulting equation (T0=22.24+exp(9.844–0.156*CAG)), has very similar coefficients to Langbehn’s (T0=21.54+exp(9.556–0.146*CAG)) (figure 2). This indicates that, despite the fact that our model was built using a smaller cohort, our approach is comparable with the benchmark disease onset model constructed using 2913 individuals.
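As a worked illustration of the two steps above, the sketch below computes the onset age implied by a linear fit (the age at which GI=B0+B1*Age crosses zero) and evaluates the two CAG-based equations quoted in the text over a range of repeat lengths. The function names and the example B0 and B1 values are hypothetical; only the equation coefficients come from the text.

```python
import math

def onset_from_linear_fit(b0, b1):
    """Age at which the fitted line GI = B0 + B1*Age reaches GI = 0, ie, T0 = -(B0/B1)."""
    return -b0 / b1

def onset_this_study(cag):
    """Onset age from the equation fitted to the 41 linearly progressing patients."""
    return 22.24 + math.exp(9.844 - 0.156 * cag)

def onset_langbehn(cag):
    """Onset age from the benchmark equation as quoted in the text."""
    return 21.54 + math.exp(9.556 - 0.146 * cag)

# Hypothetical patient whose GI rises by 2 points per year with B0 = -70.
print(onset_from_linear_fit(-70.0, 2.0))

# Compare the two curves across a range of CAG repeat lengths.
for cag in range(40, 51, 2):
    print(cag, round(onset_this_study(cag), 1), round(onset_langbehn(cag), 1))
```

Both curves fall steeply with increasing CAG repeat length, which is the behaviour compared in figure 2.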
Our present modelling approach was based on the assumption that patients were not expected to improve over time, consistent with a recent observation.9 Therefore, patients demonstrating large improvements over consecutive clinical assessments were penalised, making their data sets prone to exclusion at the filtering stage. We then revisited our assumption by including data from 31 patients who were previously excluded because of an excessive validity score penalty. The results were analysed in terms of the percentage change between the predicted and actual GI score at the latest clinical visit, which took place <6, 6–12, 12–18 or 18–24 months after the last visit used for modelling. Furthermore, we sought to investigate whether the number of prior clinical assessments used for modelling affected the accuracy of prediction. The prior clinical assessments were grouped in categories ranging from 3 to ≥8 prior visits, although our classification algorithm requires at least four prior clinical assessments to properly assign individual patients to their respective type of disease progression. Because the validity score was removed, this test represents the maximum level of prediction error we would expect from our modelling approach.
We used mixed model analyses of variance to estimate fixed effects of time of prediction and number of prior assessments on prediction error. The mean prediction error was 8.4% (±5.3%). The multivariate analyses yielded a strongly significant main effect of time of prediction (F(3,73)=6.97, p=0.0003), with no interaction between the two factors. This was in line with expectations that increasing time elapsed from the last clinical assessment would increase the prediction error, from an average of 5.9% (±4.2%) when predicting over a <6-month period, to an average of 11.7% (±7.2%) over an 18–24-month period (figure 3). The number of prior assessments used for the prediction did not yield a significant effect in the model (F(5,17)=1.70; p=0.19), despite a tendency for an average decrease of 7% in prediction error when ≥8 prior visits were used as compared with 3. This was also in accordance with expectations. Our results indicate that the accuracy of prediction using our model depends on the time elapsed since the last clinical assessment and is not significantly affected by the amount of prior data used for that prediction. It has to be noted that the significant decrease in prediction accuracy of almost 6% on average, from the shortest to the longest prediction time period used in our prediction model when validity scores were removed, is still relatively small as compared with the overall accuracy of 91% (±8%).
We then attempted to cross-validate our approach by randomly partitioning the 68 patients into ‘training’ and ‘testing’ groups, in order to avoid overfitting of our model. Patient data from a group of 50 random patients (training) were used to obtain the new optimum parameters (B1 for linear, B2 for quadratic), while the remaining 18 patients (testing) were used to evaluate the predictive power of these newly derived equations using the same statistical models as presented above. This process was repeated 40 times to avoid random selection bias. The mean level of error between predicted and actual GI across the 40 random cycles of shuffling was 8.6% (±1.2%). When the predicted clinical assessment was conducted within 6 months of the last visit, the prediction error was on average 6.8% (±1%) and increased to only 12.4% (±2.8%) when the prediction time was increased to 18–24 months. Nevertheless, this effect was highly significant across the 40 trials, with a median p value of 0.0002 (IQR: 0.004; figure 4). In conformity with the previous test, the accuracy of prediction did not change when the number of prior visits was increased from 3 to ≥8, yielding a median p value of 0.38 (IQR: 0.6) across the 40 random trials. Similarly, there was no interaction between the two factors.
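The grouping of prediction errors by time window can be sketched as below. Two points are assumptions rather than details given in the text: the windows are treated as half-open intervals (a visit at exactly 6 months falls in the 6–12 month window), and the percentage error is taken as the absolute difference between predicted and actual GI relative to the actual GI. The function names and records are hypothetical.

```python
def horizon_bin(months):
    """Map months since the last modelled visit to one of the four windows."""
    if months < 0 or months > 24:
        raise ValueError("horizon outside the 0-24 month range analysed")
    if months < 6:
        return "<6"
    if months < 12:
        return "6-12"
    if months < 18:
        return "12-18"
    return "18-24"

def percent_error(predicted_gi, actual_gi):
    """Absolute difference as a percentage of the actual GI (assumed definition)."""
    return 100.0 * abs(predicted_gi - actual_gi) / actual_gi

def mean_error_by_bin(records):
    """records: iterable of (months ahead, predicted GI, actual GI)."""
    grouped = {}
    for months, predicted, actual in records:
        grouped.setdefault(horizon_bin(months), []).append(percent_error(predicted, actual))
    return {label: sum(errors) / len(errors) for label, errors in grouped.items()}

# Hypothetical predictions at different horizons.
records = [(3, 44.0, 40.0), (4, 36.0, 40.0), (20, 50.0, 40.0)]
print(mean_error_by_bin(records))
```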
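The partitioning scheme described above (68 patients split into 50 training and 18 testing patients, repeated 40 times) can be sketched as follows. The re-optimisation of B1 and B2 on each training group and the evaluation on each testing group are not shown, since they depend on details given in the online supplementary file; the patient identifiers, seed and per-split error values are hypothetical.

```python
import random
import statistics

def random_splits(patient_ids, n_train=50, n_repeats=40, seed=0):
    """Return n_repeats independent (training, testing) partitions of the patients."""
    rng = random.Random(seed)  # fixed seed so the partitions are reproducible
    splits = []
    for _ in range(n_repeats):
        shuffled = list(patient_ids)
        rng.shuffle(shuffled)
        splits.append((shuffled[:n_train], shuffled[n_train:]))
    return splits

def summarise(errors_per_split):
    """Mean and sample SD of the prediction error across the repeated splits."""
    return statistics.mean(errors_per_split), statistics.stdev(errors_per_split)

patients = list(range(68))  # placeholder identifiers for the 68 eligible patients
splits = random_splits(patients)
training, testing = splits[0]
print(len(splits), len(training), len(testing))

# Hypothetical per-split errors, summarised as mean and SD.
print(summarise([8.0, 9.0, 10.0]))
```

Because every partition is drawn afresh from the full set of patients, a given patient can appear in the testing group of several repeats, which distinguishes repeated random subsampling from k-fold cross-validation.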
We finally performed a holdout validation by assessing our model against the latest clinical assessment data from patients who had come back to the clinic in 2013, which were unavailable during model training. This holdout data set consisted of 23 of the original 58 patients we used to construct our model, with a total of 53 new clinical assessment data sets. Using a series of non-parametric analyses we could not observe any statistical differences between the predicted and the actual GI score in these patients (figure 5A). The median prediction error was 11.2 (IQR: 17.1). Using Spearman's coefficient we found a highly significant correlation between the predicted and actual GI score (rs=0.91, p<0.001, figure 5B). Similar to the previous validation tests, we then sought to analyse whether there were any differences in the quality of prediction with the duration between the last (up to 2012) and present (2013) clinical assessments. However, we found no significant effect of time elapsed since last clinical assessment on the accuracy of the prediction (figure 5C).
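For readers unfamiliar with the rank correlation used in this comparison, a minimal sketch of Spearman's coefficient is given below. It is computed as the Pearson correlation of the ranks, with tied values sharing their average rank; the p value is not computed. The data are hypothetical and are not taken from the study.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical predicted and actual GI scores for five holdout assessments.
predicted = [22.0, 35.5, 41.0, 58.2, 63.0]
actual = [25.0, 33.0, 45.5, 52.0, 70.1]
print(spearman(predicted, actual))
```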
Although the size of CAG repeat length was not used to calculate the GI score, it is the major determinant of the age of disease onset, and we were therefore interested to examine if the number of CAG repeats affects the rate of disease progression (see online supplementary figure S4 and table S4). For this, we used data from the 41 linearly progressing patients and calculated the optimum B1 (rate) for each of the motor and functional components of individual patients (see online supplementary figure S5).
We divided patients into three subgroups according to their CAG repeat size and noted that patients with longer CAG repeats had more rapid disease progression compared with patients with shorter CAG repeats (table 2). This is consistent with what had been reported previously.13,14 Furthermore, as the CAG repeat size increases, the UHDRS total motor score deteriorates at a quicker rate than the functional components (figure 6). We also compared medications among different subgroups and did not observe any effects on the disease progression profile (see online supplementary figure S6).
Discussion
In the present study, we describe a novel modelling approach that can be used to track, as well as to predict, HD progression in manifest patients. The primary strength of our UHDRS-based model is that it is derived from measures routinely assessed in HD clinics worldwide, and its quality has been scrutinised using four validation methods. With at least four prior clinical assessments over a minimum span of 2 years, we can faithfully predict HD progression for individual patients over the next 2 years. Patients with similar clinical profiles (age, CAG repeat length, UHDRS) can also exhibit very different profiles of disease progression, and we can model this along with providing evidence that patients with longer CAG repeat size have a quicker rate of disease progression. Further studies will, however, be required to determine the underlying causes for the latter two observations.
Over the past few years, much effort has gone into uncovering potential biomarkers to track HD progression. For example, the level of striatal brain-derived neurotrophic factor (BDNF) was shown to be substantially reduced in patients with HD.15 Therefore, the level of plasma BDNF in 398 patients with HD was studied, before concluding that neither the serum level of BDNF protein nor mRNA could be reliably matched to stages of HD severity.16 However, Weiss et al17 have demonstrated that the level of mutant huntingtin (mtHtt) aggregation in the peripheral immune cells was significantly increased when comparing patients with premanifest HD with those with manifest HD. In addition, there was also a significant correlation between the disease burden scores of individual patients and the level of mtHtt aggregation in the peripheral immune cells,17 although there was considerable intraindividual variability in the level of aggregation between samples from the same participant. Furthermore, Tabrizi et al18 have also systematically evaluated the utility of a range of biomarkers in large cohorts of patients (TRACK-HD). They demonstrated that the rates of change in the motor and functional components of the UHDRS were associated with disease progression.18 However, all these studies took a categorical approach by grouping patients according to their disease stages for analysis. In contrast, patients in our study were tracked longitudinally as disease progressed and deteriorated, while the degree of GI changes was analysed on an individual basis. We believe that such an approach could better represent the heterogeneity of disease progression in individual patients, as we have observed in our cohort. Similar longitudinal strategies have also been employed in two very recent reports.7,9 In the study by Dorsey et al,9 the motor and functional components of the UHDRS, as well as several cognitive measures, were found to consistently deteriorate in patients with HD followed for 3 years. In the study by Tang et al,7 the authors used functional imaging tools to demonstrate their potential as biomarkers to track preclinical HD, as the metabolic activity of the neural network was linearly associated with disease progression in patients with premanifest HD.
There are, however, limitations to our study. Most notably, all our patients were recruited from a single centre and the generalisability of our findings remains to be demonstrated. We have, however, demonstrated that the age of disease onset derived from our cohort was very similar to that described by an international, multicentre study using larger cohorts of patients.5 Furthermore, our approach enables disease modelling and progression to be analysed on an individual basis, while the optimum coefficients can be redefined for specific cohorts of patients. Such flexibility could facilitate the translation of our approach to other research centres. Another limitation is that all patients in our study were examined by a single clinician; this removes issues to do with interrater variability, although it has previously been shown that there is a high correlation coefficient for the UHDRS total motor score between clinicians.10 Finally, to address the possibility of overfitting our model, we have cross-validated our modelling and prediction by using a repeated random subsampling approach, and have performed a holdout validation using data from the patients’ latest clinical assessments, which took place during 2013 and were unavailable at the time of model training.
In conclusion, using HD as an example we have developed a model to track the natural history of disease progression in manifest patients. With data from the previous four clinical visits based on the conventional UHDRS assessment, we can predict disease progression that is statistically not different from the actual progression over the next 24 months. We believe that our model will be an extremely valuable tool, in terms of enabling researchers to reassess their existing data according to patients’ different types of predicted disease progression, as well as facilitating the development of novel disease-modifying therapies in the future. We also believe that similar approaches can be adapted to model clinical progression of other chronic neurodegenerative disorders.
Supplementary materials
Supplementary Data
This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.
Files in this Data Supplement:
- Data supplement 1 - Online supplement
Footnotes
WLK and AK are first authors who contributed equally.
RAB and JG are senior authors who contributed equally.
Contributors WLK and JG conceived the experiment. RAB and SLM collected and organised the data. AK and YY performed the experiments. WLK and ASL analysed and interpreted the data. WLK, AK, YY, SLM and ASL wrote the manuscript. RAB and JG supervised the study, and edited the manuscript.
Funding This study was supported by the Cotswold Trust, the Rosetrees Trust, donations to the Huntington's disease clinic in the John van Geest Centre for Brain Repair and an NIHR award of the Biomedical Research Centre—Cambridge University NHS Foundation Trust. This project was also supported by EPSRC through projects EP/I03210X/1 and EP/G066477/1.
Competing interests None.
Patient consent Obtained.
Ethics approval Cambridge University Hospitals NHS Foundation Trust.
Provenance and peer review Not commissioned; externally peer reviewed.