Comparison of clinical rating scales in genetic frontotemporal dementia within the GENFI cohort

Background Therapeutic trials are now underway in genetic forms of frontotemporal dementia (FTD) but clinical outcome measures are limited. The two most commonly used measures, the Clinical Dementia Rating (CDR)+National Alzheimer’s Disease Coordinating Center (NACC) Frontotemporal Lobar Degeneration (FTLD) and the FTD Rating Scale (FRS), have yet to be compared in detail in the genetic forms of FTD.
Methods The CDR+NACC FTLD and FRS were assessed cross-sectionally in 725 consecutively recruited participants from the Genetic FTD Initiative: 457 mutation carriers (77 microtubule-associated protein tau (MAPT), 187 GRN, 193 C9orf72) and 268 family members without mutations (non-carrier control group). Of these, 231 mutation carriers (51 MAPT, 92 GRN, 88 C9orf72) and 145 non-carriers had longitudinal data available at a follow-up time point.
Results Cross-sectionally, the mean FRS score was lower in all genetic groups compared with controls: GRN mutation carriers mean 83.4 (SD 27.0), MAPT mutation carriers 78.2 (28.8), C9orf72 mutation carriers 71.0 (34.0), controls 96.2 (7.7), p<0.001 for all comparisons, while the mean CDR+NACC FTLD Sum of Boxes was significantly higher in all genetic groups: GRN mutation carriers mean 2.6 (5.2), MAPT mutation carriers 3.2 (5.6), C9orf72 mutation carriers 4.2 (6.2), controls 0.2 (0.6), p<0.001 for all comparisons. Mean FRS score decreased and CDR+NACC FTLD Sum of Boxes increased with increasing disease severity within each individual genetic group. FRS and CDR+NACC FTLD Sum of Boxes scores were strongly negatively correlated across all mutation carriers (rs=−0.77, p<0.001) and within each genetic group (rs=−0.67 to −0.81, p<0.001 in each group). Nonetheless, discrepancies in disease staging were seen between the two scales, and between each scale and clinician-judged symptomatic status. Longitudinally, the magnitude of annualised change in both FRS and CDR+NACC FTLD Sum of Boxes scores initially increased with disease severity level before decreasing in those with the most severe disease: controls −0.1 (6.0) for FRS and −0.1 (0.4) for CDR+NACC FTLD Sum of Boxes; asymptomatic mutation carriers −0.5 (8.2) and 0.2 (0.9); prodromal disease −2.3 (9.9) and 0.6 (2.7); mild disease −10.2 (18.6) and 3.0 (4.1); moderate disease −9.6 (16.6) and 4.4 (4.0); severe disease −2.7 (8.3) and 1.7 (3.3). Sample sizes were calculated for a trial of prodromal mutation carriers: over 180 participants per arm would be needed to detect a moderate sized effect (30%) for both outcome measures, with sample sizes lower for the FRS.
Conclusions Both the FRS and CDR+NACC FTLD measure disease severity in genetic FTD mutation carriers throughout the timeline of their disease, although the FRS may be preferable as an outcome measure. However, neither addresses a number of key symptoms in the FTD spectrum, for example, motor and neuropsychiatric deficits, which future scales will need to incorporate.


INTRODUCTION
Frontotemporal dementia (FTD) is a spectrum of heterogeneous disorders characterised by neurodegeneration of the frontal and temporal lobes. A total of 20%-30% of FTD cases are genetic, 1 2 with the majority caused by autosomal dominant mutations in three genes 3 : chromosome 9 open reading frame 72 (C9orf72), 4 progranulin (GRN) 5 and microtubule-associated protein tau (MAPT). 6 Clinical syndromes span changes in behaviour (behavioural variant FTD, bvFTD), 7 language (primary progressive aphasia, PPA) 8 and motor function (progressive supranuclear palsy, PSP; corticobasal syndrome, CBS; and FTD with amyotrophic lateral sclerosis, FTD-ALS). [9][10][11] Age of symptom onset, disease progression and disease duration vary between and within genetic groups. 12 The ability to accurately evaluate disease stage and track clinical change in FTD across the spectrum of phenotypes is critical for the design of future trials of disease-modifying therapies.
Two candidate global severity measures specific to FTD are the Clinical Dementia Rating (CDR) Dementia Staging Instrument and the FTD Rating Scale (FRS). The CDR is a widely used scale that was developed to stage the severity of dementia in the Alzheimer's disease spectrum. 13 14 Impairment in six cognitive and functional domains is assessed by a neurologist through semistructured interview with both the patient and caregiver. The CDR was extended for FTD by introducing a behaviour and a language domain, taken from the National Alzheimer's Disease Coordinating Center (NACC) Frontotemporal Lobar Degeneration (FTLD) module (CDR+NACC FTLD). 15 16 A version of the global CDR scoring system 17 (without the emphasis on the memory domain) has been developed to apply to the CDR+NACC FTLD, which classifies cases into five severity levels based on the number and severity of the ratings given for the eight domains. 18 The CDR+NACC FTLD has shown the ability to detect mild to severe symptoms in sporadic and genetic FTD cohorts 15 16 18 19 and to capture disease progression over 1-2 years. 15 20
The FRS is a 30-item caregiver questionnaire developed with the aim of staging FTD severity based on behavioural changes and functional decline. 21 The scale captures six levels of impairment from very mild to profound. Disease severity according to the FRS has been found to correlate with the CDR 21 22 and CDR+NACC-FTLD, 23 but a detailed evaluation of the measure across the range of presymptomatic and symptomatic FTD has not been reported.
Few studies have directly compared the FRS and CDR+NACC FTLD staging tools, particularly in relation to the increasingly used CDR+NACC FTLD global scoring system. The objectives of this study were to: (1) evaluate and compare how the FRS and CDR+NACC FTLD scales characterise disease stage and severity in the spectrum of presymptomatic and symptomatic genetic FTD, using cross-sectional data from the Genetic FTD Initiative (GENFI) cohort; (2) examine and compare longitudinal change in the scales using data at 1-year follow-up and (3) estimate the sample sizes required to detect a small or moderate size effect on disease progression based on the two candidate outcome measures.

METHODS
Cohort
From the fifth data freeze of the GENFI study, 725 participants with both FRS and CDR+NACC FTLD data available for at least one time point were included in the study: 457 mutation carriers (77 MAPT, 187 GRN, 193 C9orf72) and 268 family members without mutations (non-carrier control group).

Measures
All participants underwent a standardised history and examination including the Mini-Mental State Examination (MMSE), with symptomatic status judged by the assessing clinician according to consensus diagnostic criteria.

Frontotemporal dementia Rating Scale (FRS)
The FRS is a 30-item questionnaire covering seven areas: behaviour, outing and shopping, household chores and telephone, finances, medications, meal preparation and eating, and self care and mobility. The FRS was completed by an informant (family member or caregiver) by rating the frequency of difficulties in these areas ('all the time', 'sometimes', 'never'). Raw scores are converted to a percentage (total number of 'never' responses/total number of applicable questions) to exclude any items that were not applicable to the patient. Lower percentage scores therefore denote greater impairment of everyday abilities and behavioural change. In the original development of the scale, six severity stages were identified and operationalised in 75 patients with FTD (very mild, 100%-97%; mild, 96%-80%; moderate, 79%-41%; severe, 40%-13%; very severe, 12%-3%; profound, 2%-0%). 21 One modification was made to these classifications for use in the GENFI cohort because the FRS is also collected on non-carrier family members: a score of 100% was considered 'asymptomatic' rather than 'very mild'. The 'very mild' category in this study therefore encompasses scores of 97%-99% instead of 97%-100%.
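
The scoring described above can be sketched in a few lines. This is a minimal illustration, not the study's scoring code: the function names are hypothetical, not-applicable items are represented as `None`, and the handling of non-integer percentages (which fall between the published integer bands) is an assumption, here assigned to the lower band.

```python
VALID_RESPONSES = {"never", "sometimes", "all the time"}


def frs_percentage(responses):
    """Return the FRS percentage score: 'never' responses / applicable items.

    `responses` holds one entry per item; None marks a not-applicable item
    and is excluded from the denominator.
    """
    applicable = [r for r in responses if r is not None]
    if not applicable:
        raise ValueError("at least one applicable item is required")
    for r in applicable:
        if r not in VALID_RESPONSES:
            raise ValueError(f"unexpected response: {r!r}")
    never = sum(1 for r in applicable if r == "never")
    return 100.0 * never / len(applicable)


def frs_category(score):
    """Map a percentage score to the severity bands used in this study.

    Assumption: a score between two integer bands (eg, 96.7) is placed
    in the lower band, because only lower bounds are checked.
    """
    if score >= 100:
        return "asymptomatic"  # modification for GENFI: 100% only
    if score >= 97:
        return "very mild"
    if score >= 80:
        return "mild"
    if score >= 41:
        return "moderate"
    if score >= 13:
        return "severe"
    if score >= 3:
        return "very severe"
    return "profound"


# Illustrative questionnaire: 24 'never', 4 'sometimes', 2 'all the time'.
example = ["never"] * 24 + ["sometimes"] * 4 + ["all the time"] * 2
score = frs_percentage(example)
print(score, frs_category(score))
```

Excluding not-applicable items from the denominator means two informants who answer different numbers of questions still produce comparable percentages.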

CDR+NACC-FTLD
The eight domains of the CDR+NACC FTLD assess memory, orientation, judgement and problem solving, community affairs, home and hobbies, personal care, overall behaviour and overall language. Based on a semistructured interview with the patient and an informant, the presence of impairment in each of these domains is rated by a clinician using scores of 0 (absent), 0.5 (questionable/very mild), 1 (mild), 2 (moderate) and 3 (severe). [15][16][17] The sum of boxes score (CDR+NACC-FTLD-SB) is calculated by summing the ratings given for the eight domains. Thus, a higher sum of boxes value denotes greater symptomatology. The CDR+NACC FTLD global rating was determined using the published scoring rules, 18 whereby a rating on a five-point scale is given (0, 0.5, 1, 2, 3) based on the severity of the ratings given for the eight domains. All eight domains are given equal weighting when calculating the global score, so it does not relate to a specific FTD variant, and if any domain has a rating above 0 then the global score is at least 0.5. Therefore, cases with no impairment in any domain are given a global rating of 0, those with mild cognitive, behavioural or language impairment are rated 0.5, those with mild but definite symptomatology are intended to receive a rating of 1, those with moderate dementia a rating of 2 and those with severe dementia a rating of 3. Global ratings can be reduced into three broader disease severity levels: normal or asymptomatic (0), very mild or prodromal cognitive, behavioural or language impairment (0.5) and fully symptomatic (≥1). 18
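
As a rough illustration of the quantities just described, the sketch below computes the sum of boxes, the lower bound on the global rating implied by the text (at least 0.5 if any domain is rated above 0), and the three broader severity levels. It deliberately does not implement the full published global scoring rules, which are not spelled out here; the domain keys and function names are hypothetical.

```python
DOMAINS = (
    "memory",
    "orientation",
    "judgement and problem solving",
    "community affairs",
    "home and hobbies",
    "personal care",
    "behaviour",
    "language",
)
ALLOWED_RATINGS = {0, 0.5, 1, 2, 3}


def _check(ratings):
    """Validate that every domain is present with an allowed rating."""
    if set(ratings) != set(DOMAINS):
        raise ValueError("ratings must cover exactly the eight domains")
    for domain, value in ratings.items():
        if value not in ALLOWED_RATINGS:
            raise ValueError(f"invalid rating for {domain}: {value!r}")


def sum_of_boxes(ratings):
    """CDR+NACC-FTLD-SB: the sum of the eight domain ratings (range 0-24)."""
    _check(ratings)
    return float(sum(ratings[d] for d in DOMAINS))


def global_rating_floor(ratings):
    """Lower bound on the global rating: 0 if all domains are 0, else 0.5.

    This is only the constraint stated in the text, not the full rule set.
    """
    _check(ratings)
    return 0.0 if all(ratings[d] == 0 for d in DOMAINS) else 0.5


def severity_level(global_rating):
    """Collapse a global rating into the three broader severity levels."""
    if global_rating not in ALLOWED_RATINGS:
        raise ValueError(f"invalid global rating: {global_rating!r}")
    if global_rating == 0:
        return "asymptomatic"
    if global_rating == 0.5:
        return "prodromal"
    return "fully symptomatic"


# Illustrative participant: mild behavioural and very mild language change.
ratings = {d: 0 for d in DOMAINS}
ratings["behaviour"] = 1
ratings["language"] = 0.5
print(sum_of_boxes(ratings), global_rating_floor(ratings))
```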

Statistical analysis
Descriptive statistics and group comparisons
Data were analysed using SPSS V.26 or STATA V.16. Demographic variables were compared between groups using independent-samples t-tests, or Mann-Whitney U tests when n<30. Sex was compared between groups using χ² tests. A linear regression model was used to compare both FRS and CDR+NACC-FTLD-SB scores between groups; bootstrapping with 1000 repetitions was used for data that were not normally distributed. Correlations between FRS percentage score and CDR+NACC FTLD Sum of Boxes scores were generated using Spearman rank correlation coefficients (two-tailed), as were correlations of both scales with disease duration (years since clinician-judged symptom onset; analysis restricted to symptomatic participants) and MMSE score.
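
For readers unfamiliar with the Spearman coefficient, the sketch below shows the computation in plain Python: both variables are converted to ranks (ties share their average rank) and the Pearson correlation of the ranks is returned. The study itself used SPSS or STATA; the function names and the example values are illustrative only and are not GENFI data.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Made-up scores: FRS falls as Sum of Boxes rises, so rho is negative.
frs_scores = [100, 90, 60, 30]
sum_of_boxes_scores = [0, 1.5, 6, 14]
print(spearman_rho(frs_scores, sum_of_boxes_scores))
```

Because the coefficient depends only on ranks, it suits scales such as these, whose scores are bounded and strongly skewed.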

Longitudinal analyses
Of the baseline sample, 231 mutation carriers (51 MAPT, 92 GRN, 88 C9orf72) and 145 non-carriers had FRS and CDR+NACC FTLD data available at a follow-up time point. Mean time between baseline and follow-up was 1.3 years (SD=0.5). For both scales, annualised change was calculated as: ([follow-up score] − [baseline score])/(time between baseline and follow-up in years). Annualised change was compared between the mutation carrier group and controls using a linear regression model; bootstrapping with 1000 repetitions was used for data that were not normally distributed.
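
The annualised change calculation amounts to a score difference divided by the follow-up interval. A minimal sketch, with a hypothetical function name and made-up scores:

```python
def annualised_change(baseline_score, follow_up_score, years_between):
    """Change in score per year between baseline and follow-up.

    The difference is taken first and then divided by the interval,
    so a falling FRS gives a negative value and a rising
    CDR+NACC-FTLD-SB gives a positive one.
    """
    if years_between <= 0:
        raise ValueError("follow-up must be after baseline")
    return (follow_up_score - baseline_score) / years_between


# Illustrative values only: FRS drops from 90 to 84 over 1.5 years.
print(annualised_change(90, 84, 1.5))
```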

Sample size calculation
To explore the use of the FRS and CDR+NACC-FTLD-SB scores as potential outcome measures in treatment trials, sample sizes per arm of a two-arm trial of a disease-modifying therapy (with 1:1 randomisation to placebo vs active treatment) were calculated using an analysis of covariance method. The analysis focused on mutation carriers with a baseline CDR+NACC FTLD global rating of 0.5 (ie, a prodromal trial), with the desired treatment effect hypothesised as a reduction in progression from the mean score of the outcome measure in the global 0.5 CDR+NACC FTLD group to the mean score of the outcome measure in the global 1 CDR+NACC FTLD group, that is, slowing of progression from prodromal to fully symptomatic. The formula used the following quantities: ρ is the correlation between baseline and follow-up scores of the outcome measure in mutation carriers; σ is the SD of scores at follow-up; δ is the treatment effect (difference in mean score between the prodromal (0.5) group and mild symptomatic (1) group); α is the significance level, set at 0.05; and 1−β is the power to detect a treatment effect, set at 80% (ie, β=0.2).
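
To show how these quantities combine, the sketch below assumes the standard analysis of covariance approximation for a two-arm trial, n = 2(z₁₋α/₂ + z₁₋β)²(1 − ρ²)σ²/δ² per arm. That formula is an assumption consistent with the symbols defined above, and the study's exact expression may differ. The function name and all input values are hypothetical and are not GENFI estimates.

```python
from math import ceil
from statistics import NormalDist


def ancova_n_per_arm(rho, sigma, delta, alpha=0.05, power=0.80):
    """Participants per arm for a 1:1 two-arm trial analysed by ANCOVA.

    Assumed formula (standard approximation, not confirmed by the source):
        n = 2 * (z_{1-alpha/2} + z_{1-beta})**2 * (1 - rho**2) * sigma**2 / delta**2
    """
    if not -1 < rho < 1:
        raise ValueError("rho must lie strictly between -1 and 1")
    if sigma <= 0 or delta == 0:
        raise ValueError("sigma must be positive and delta non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # power equals 1 - beta
    n = 2 * (z_alpha + z_beta) ** 2 * (1 - rho ** 2) * sigma ** 2 / delta ** 2
    return ceil(n)  # round up to a whole participant


# Hypothetical inputs: halving the detectable difference roughly
# quadruples the required sample size.
n_full = ancova_n_per_arm(rho=0.5, sigma=10.0, delta=3.0)
n_half = ancova_n_per_arm(rho=0.5, sigma=10.0, delta=1.5)
print(n_full, n_half)
```

The (1 − ρ²) term is what adjusting for the baseline score buys: the more strongly baseline predicts follow-up, the fewer participants are needed.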

RESULTS
Demographics
The demographic and clinical characteristics of the participants in each genetic group at their baseline time point are summarised in table 1. The groups shared similar demographic profiles, except that the MAPT mutation carriers and the controls were younger than the C9orf72 mutation carriers (MAPT t=−3.207, p=0.002; controls t=−4.030, p<0.001) and GRN mutation carriers (MAPT t=−2.875, p=0.004; controls t=−3.501, p=0.001).

Comparison of both FRS and CDR+NACC-FTLD-SB between groups
The mean FRS% score in all genetic groups was lower than controls (p<0.001 for all comparisons): GRN mutation carriers mean 83.4 (SD 27.0), MAPT mutation carriers 78.2 (28.8), C9orf72 mutation carriers 71.0 (34.0), controls 96.2 (7.7) (table 1 and online supplemental table 1). There was also a significant difference between the C9orf72 group and both the GRN group (p<0.001) and the MAPT group (p=0.032).

Comparison of both FRS and CDR+NACC-FTLD-SB within genetic groups by disease severity
Mean scores on the FRS according to CDR+NACC FTLD severity level (0, 0.5, ≥1) for each genetic group are reported in table 2, and according to individual CDR+NACC FTLD global rating (0-3) are presented in figure 1. GRN, MAPT and C9orf72 mutation carriers with a global rating of 0 had comparable FRS scores to controls (online supplemental table 3). Within both the GRN and C9orf72 mutation carriers, the mean FRS score was significantly lower in cases with a global rating of 0.5 compared with those with 0. Within every genetic group, the cases with a global rating of ≥1 had significantly lower FRS scores than those with 0 or 0.5. For comparison, mean CDR+NACC-FTLD-SB scores according to severity level for each genetic group are also reported in table 2. The mean CDR+NACC-FTLD-SB scores were higher in those with a global rating of 0.5 and ≥1 than either controls or those with a global rating of 0 in all three genetic groups (online supplemental table 4).

Comparison of FRS and CDR+NACC-FTLD-SB by severity categories
The percentages of mutation carriers in each FRS severity category according to their CDR+NACC FTLD global rating, and vice versa, are shown in figure 3 (and individually for GRN, MAPT and C9orf72 mutation carriers in online supplemental figure 2). Mutation carriers who had an FRS score in the 'asymptomatic' range most frequently had a global rating of 0 (84.0%); cases in the 'very mild' FRS category also predominantly had …
Table 2 Baseline FRS scores according to CDR+NACC FTLD severity level, by genetic group
The mean ratings for the CDR+NACC FTLD domains (ie, the severity) in each of the FRS levels are shown in figure 4B for mutation carriers and controls, and for the individual genetic groups in online supplemental figure 3B. Comparing the mean domain score of mutation carriers at each FRS stage against the mean score in controls for that domain: in the asymptomatic and very mild FRS stages, none of the domains were different from controls; in the mild stage, the memory (p=0.009), community affairs (p=0.040) and behaviour (p=0.002) domains had higher ratings than controls; and in the moderate, severe and very severe/profound FRS stages, all of the CDR+NACC FTLD domains had more severe ratings than controls.

Correlation of both FRS and CDR+NACC-FTLD-SB with other measures of disease severity
The FRS score was moderately negatively correlated with disease duration in symptomatic participants except in …
The distributions of these diagnoses across the FRS severity categories and CDR+NACC FTLD global rating groups are shown in online supplemental table 5. Each rating scale placed four participants who had been judged as symptomatic in its lowest severity category (asymptomatic for FRS: 2 bvFTD, 1 PPA, 1 ALS/FTD-ALS; 0 for CDR+NACC-FTLD: 1 bvFTD, 2 ALS/FTD-ALS, 1 with a parkinsonian disorder). With increasing FRS severity and CDR+NACC FTLD global rating, an increasing proportion of participants was judged to be symptomatic. By FRS severity: very mild 6.5%, mild 9.8%, moderate 60.6%, severe 95.2%, very severe/profound 100.0%. By CDR+NACC FTLD global rating: rating 0.5, 16.2%; rating 1, 70.0%; rating 2, 98.0%; rating 3, 100.0%.

Longitudinal change in the FRS and CDR+NACC-FTLD
Annualised change on the FRS and CDR+NACC-FTLD-SB in controls and according to baseline CDR+NACC FTLD severity level in mutation carriers is reported in table 3 and online supplemental table 6.
Table 4 shows the number of participants required to demonstrate efficacy on change in FRS percentage score and CDR+NACC FTLD Sum of Boxes score as potential outcome measures when assuming small (10%) to moderate (30%) effect sizes. For a trial entering prodromal mutation carriers (with a global rating of 0.5), over 180 participants per arm would be needed to detect a moderate sized effect (30%) for both outcome measures. Power calculations using the FRS yielded projected sample sizes that were more favourable than those using the CDR+NACC FTLD Sum of Boxes.

Sample size calculations
As the treatment effect is based on preventing progression from a global CDR+NACC FTLD rating of 0.5 to a rating of 1, the length of the trial depends on the natural history of this progression. A survival analysis in the GENFI cohort previously showed that ~50% of mutation carriers progress from a global rating of 0.5 to 1 in 3 years (Poos et al, in submission). Against this background, a 6-year trial of prodromal mutation carriers would therefore be required to detect the proposed treatment effect (eg, for a 30% effect on FRS, N=181), or a 3-year trial of the same treatment would require the sample size equivalent to assuming half the percentage change in the target value (eg, if there was a 30% effect on FRS, as only 50% of people will have progressed, the sample size would be equivalent to a 15% effect on FRS, that is, N=725).
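
The jump from N=181 to N=725 follows from sample size scaling with the inverse square of the detectable effect, so halving the effect multiplies the sample size by about four. A minimal sketch of that arithmetic, assuming all other inputs are held fixed (the function name is hypothetical):

```python
from math import ceil


def rescale_sample_size(n_reference, effect_reference, effect_new):
    """Scale a per-arm sample size when only the detectable effect changes.

    Assumes n is proportional to 1 / effect**2, with variance, correlation,
    significance level and power held fixed.
    """
    if effect_reference <= 0 or effect_new <= 0:
        raise ValueError("effects must be positive")
    return ceil(n_reference * (effect_reference / effect_new) ** 2)


# Effects given in percent: moving from a 30% to a 15% effect on the FRS.
print(rescale_sample_size(181, 30, 15))  # 724
```

The result of 724 differs from the reported 725 by one participant, which is consistent with the reported figures having been rounded up from unrounded values.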

DISCUSSION
This study has systematically evaluated and compared disease staging and progression defined by the FRS against the widely used CDR+NACC FTLD scale in a large cohort covering the spectrum of genetic FTD. Scores on both scales are strongly related to disease severity in FTD, and in GRN, C9orf72 and MAPT mutation carriers, FRS scores decreased with progression while CDR+NACC-FTLD-SB increased. In direct comparison, both scores were strongly correlated with each other in all three genetic groups.
However, disease staging and severity were not entirely consistent between the two scales. Analysis indicated that the FRS might capture more subtle changes associated with disease progression. A notable proportion of cases were asymptomatic according to the CDR+NACC FTLD (zero cognitive, behavioural or language impairments recorded) despite a mild or moderate degree of functional and/or behavioural change being reported via the FRS questionnaire. Conversely, a number of cases with a global rating of 0.5, and a small number with a rating of 1, scored 100% on the FRS (indicating zero behavioural or functional changes). In line with previous studies, 22 23 our data suggest that the CDR+NACC FTLD may be more likely to underestimate disease severity when compared with FRS scores: 41% of cases with an asymptomatic CDR+NACC FTLD global rating had a degree of disability or behavioural change according to the FRS, vs 16% of cases with an asymptomatic FRS score having some symptomatology according to the CDR+NACC FTLD. Although both scales broadly centre on everyday functioning and behaviour, there are differences between them. For example, the CDR+NACC FTLD evaluates language impairment, which the FRS does not. On the other hand, the CDR+NACC FTLD may not capture other changes apparent to the caregiver as comprehensively: behaviour is captured as a single domain, which may underestimate social and personality impairments that rely on subjective report and are difficult to operationalise. Another consideration is that several of the activities of daily living probed by the FRS have the potential to be affected by apathy or depression (four items begin with 'Lacks interest in… (activity)'), which are symptoms less relevant to the domains of the CDR+NACC FTLD. Whether depressive symptoms are directly related to evolving FTD pathology or are distinct and related to the impact of living at risk of FTD is challenging to disentangle.
Responses to the individual items of the FRS questionnaire were not available in the GENFI cohort, so trends among the cases with discrepant FRS and CDR+NACC FTLD scores could not be explored, but this is a consideration for future studies of the scales. There were also discrepancies seen between both scales and symptomatic status, with a small number of participants being judged to be symptomatic by clinicians despite an asymptomatic or very mild score on the two scales. This may relate at least in part to a further issue with both scales, which is the lack of an assessment for motor or neuropsychiatric symptoms. Parkinsonian symptoms are seen across all of the genetic forms of FTD, 25 while ALS is seen mainly in those with C9orf72 mutations. Such motor deficits are associated with disease progression 26 and impact on function in genetic FTD, but are poorly captured by the FRS 27 and not measured at all in the CDR+NACC FTLD. In this cohort, half of the participants diagnosed with ALS/FTD-ALS were in the asymptomatic, very mild or mild FRS severity categories or had asymptomatic (0) or very mild (0.5) CDR+NACC FTLD global ratings. Similarly, neuropsychiatric symptoms are also prevalent across the different forms of genetic FTD, 28 29 particularly in carriers of the C9orf72 expansion where they can be a defining feature. 30 31 Neither of the scales directly measures these features (ie, hallucinations, delusions, etc) and both are therefore likely to underestimate any effect of such symptoms on function and disease progression. Overall, given the heterogeneity in clinical presentation and disease course within people that share the same underlying genetic cause, 12 the inclusion of assessments of motor and neuropsychiatric symptomatology into clinical rating scales will be important for achieving accurate evaluation of disease stage. In turn, this will allow the full spectrum of FTD phenotypes to be included within the same clinical trial.
To evaluate the scales' ability to track progression, annualised change was analysed in the cases with a follow-up time point, stratified by global impairment at baseline according to the CDR+NACC FTLD. On both scales, change over 1 year is small in the prodromal stages and then accelerates in carriers with a global rating considered to be symptomatic. Previous studies have reported significant changes in CDR+NACC FTLD scores over 1 year 15 and 2 years 20 in patients with FTD. We found that annualised change also accelerated moving from an asymptomatic global rating to a very mild 0.5 rating, and moving from 0.5 to 1. Our data align with previous findings that the FRS is able to detect deterioration over 1 year in symptomatic patients, 21 and show that this is the case particularly in those with 'mild' and 'moderate' FTD defined by the CDR+NACC FTLD global score.
Lastly, we estimated the sample sizes required to achieve at least 80% power to detect small to moderate sized effects of a disease-modifying therapy on change in the two scales as outcome measures. The sample sizes generated for both scales, even with a moderate (30%) treatment effect, suggest that a trial entering mutation carriers at a prodromal starting point (of CDR+NACC FTLD global rating 0.5) in an unselective way (ie, that does not further distinguish cases that are likely to soon progress) will require large numbers (with even greater numbers being required if randomisation was unequal rather than 1:1) and several years. The period in close proximity to phenoconversion is a useful target period for disease-modifying therapies, but for such a trial to require achievable sample sizes, this study suggests that better stratification will be needed, potentially combining clinical stage with neuroanatomical and/or fluid biomarkers to accurately identify likely converters. For example, a study involving GENFI and another genetic FTD cohort has recently shown that mutation carriers whose score worsens on the CDR+NACC FTLD over the next 1-2 years have high plasma neurofilament light chain concentrations at baseline compared with non-converters. 32

Limitations
By including a large number of mutation carriers at varied proximities to symptom onset, this study was able to evaluate the utility of disease staging tools across the spectrum of genetic FTD. However, the study cohort at baseline contains a larger proportion of asymptomatic than symptomatic carriers, and once stratified, individual group numbers were smaller. We took a transdiagnostic approach to the study, incorporating all phenotypes in the analysis. We, therefore, did not establish whether the scales were better at evaluating one phenotype over the other, although this is difficult as our study contained mainly people with a bvFTD phenotype (as is the case for genetic FTD), and few with PPA or FTD-ALS. 27 We were also not able to directly assess the ability of the scales to specifically measure the presence of prodromal symptoms as we did not have another marker of this stage, for example, clinician judgment. As discussed above, it may be that both scales (but particularly the CDR+NACC FTLD) are not sensitive enough to adequately capture this stage, and further studies should try to address this point.

CONCLUSIONS
Global rating scales such as the CDR+NACC FTLD and FRS serve a helpful purpose in clinical trials in providing a single score that can condense clinical judgement about disease severity. Although the CDR+NACC FTLD has become the most prominent clinical rating scale in FTD, there are potential issues with its use in clinical trials. In this study we show that there are similarities to the FRS as well as differences, and highlight the potential benefits of using the FRS both in clinical stratification and as an outcome measure in prevention trials of genetic FTD mutation carriers. However, neither measure fully captures the entire spectrum of FTD symptomatology, and future improvements to the scales should consider the inclusion of motor and neuropsychiatric deficits.