Prospective population-based registers of amyotrophic lateral sclerosis (ALS) have operated in Europe for over two decades, and have provided important insights into our understanding of ALS. Here, we review the benefits that population registers have brought to the understanding of the incidence, prevalence, phenotype and genetics of ALS and outline the core operating principles that underlie these registers and facilitate international collaboration. Going forward, we offer lessons learned from our collective experience of operating population-based ALS registers in Europe for over two decades, focusing on register design, maintenance, identification and management of bias and the value of cross-national harmonisation and integration.
- MOTOR NEURON DISEASE
Prospectively designed population-based registers with complete ascertainment of all affected individuals in a defined geographic catchment area are of increasing value in epidemiological research, as evidenced by the success of European population-based registers for amyotrophic lateral sclerosis (ALS).1 2
Although a rare disease with an incidence in Europe of 2–3/100 000 person-years,1 2 ALS provides an excellent model to study neurodegenerative diseases. ALS has a rapid progression and is uniformly fatal, and there is high clinicopathological correlation (autopsy diagnosis matches clinical diagnosis), rendering in vivo diagnosis accurate in the vast majority of cases. As is the case with other neurodegenerations, it is now recognised that ALS is a heterogeneous condition associated with more than one pathogenic mechanism and with different clinical manifestations and trajectories. With evolving recognition of disease heterogeneity as an important factor in a precision medicine approach towards the development of new therapeutics, population-based datasets can play a crucial role in defining the full range of disease phenotype and demography.
ALS population-based registers capture all cases regardless of age, health or socio-economic status, and can thus provide a wealth of information about disease incidence, prevalence, spatial distribution, heterogeneity in clinical phenotype, outcome and analysis of risk. The Irish and Italian ALS registers have been in continuous operation for over two decades, and the Dutch and English for over a decade, during which insights into the pathophysiology, epidemiology and genetics of ALS have developed greatly. Each of these registers has provided valuable country-specific data regarding epidemiology, disease progression, clinical phenotyping and genetics, health service planning, quality of life assessment and mapping of the patient's journey.2–7
Some of the older European ALS registers have combined their data as part of the European ALS Epidemiology Consortium (EURALS). This has enabled harmonised collaboration across European registers, providing insights that would not be otherwise possible. For example, combined appraisal of registers has indicated the presence of population-based variability, has provided evidence that ALS is a multistep process,5 and has facilitated unbiased studies of spatial epidemiology.8–10
The overall aim of this review is to provide a detailed analysis of what has been learnt from existing ALS registers, to identify and recognise hidden biases and confounders and to explore future challenges to population-based registers and how these might be met.
Incidence and prevalence of ALS
Comparative analysis of ALS across European registers has provided incidence rates of 2.6/100 000 person-years and prevalence rates of 7–9/100 000 persons, with a mean life expectancy of 30 months from first symptoms.2 11 12 European ALS registers are advantaged by the presence of stable populations with defined geographic borders, which help to ensure the accuracy of incident and prevalent figures.2 13 The core principle of population-based registers is that they capture all patients; nobody is too old, too poor or too sick, and a stable population structure with limited mobility also reduces the risks of loss to follow-up.
In general, multiple data sources provide the best mechanism for accurate data capture.1 13 Depending on the type of health system, European ALS registers have ascertained patients by a combination of unique patient identifiers (UK, Italy), referrals through networks of clinical professionals and death certification (Ireland), self-reporting and coding and face-to-face or telephone-based interviews with self-reported questionnaires (Netherlands). To assess the degree of underascertainment when two or more independent sources are used, a capture–recapture system is often used,12 14 although this is less accurate if data sources are linked in any way as has been shown by the Irish Register.15
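The two-source capture–recapture calculation mentioned above can be sketched as follows. This is a minimal illustration rather than the method of any particular register: it uses Chapman's version of the Lincoln-Petersen estimator, the function names are hypothetical, and the counts in the usage note are invented. The estimator assumes the two sources are independent, so linked sources (which inflate the overlap) would lead it to understate the number of missed cases.

```python
def chapman_estimate(n_source_a: int, n_source_b: int, n_both: int) -> float:
    """Estimate the total number of cases from two overlapping sources.

    Uses Chapman's nearly unbiased form of the Lincoln-Petersen estimator.
    Assumes the two sources ascertain cases independently of each other.
    """
    if min(n_source_a, n_source_b, n_both) < 0:
        raise ValueError("counts must be non-negative")
    if n_both > min(n_source_a, n_source_b):
        raise ValueError("overlap cannot exceed either source")
    return (n_source_a + 1) * (n_source_b + 1) / (n_both + 1) - 1


def ascertainment_completeness(n_source_a: int, n_source_b: int, n_both: int) -> float:
    """Fraction of the estimated total that was actually observed."""
    observed = n_source_a + n_source_b - n_both  # distinct cases seen by either source
    return observed / chapman_estimate(n_source_a, n_source_b, n_both)


if __name__ == "__main__":
    # Invented counts: 80 cases from source A, 60 from source B, 50 found in both.
    total = chapman_estimate(80, 60, 50)
    share = ascertainment_completeness(80, 60, 50)
    print(f"estimated total cases: {total:.1f}; completeness: {share:.1%}")
```

With these invented counts, 90 distinct cases are observed against an estimated total of about 96, suggesting roughly 94% ascertainment under the independence assumption.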
How hidden bias can affect estimates of incidence
A number of studies have suggested that the incidence of ALS is increasing,16–18 but this assertion is not fully supported by data from long-running population-based European registers.13 Careful evaluation of data held within European registers suggests that apparent increases in incidence are more likely explained by three factors. First, there is increased recognition of phenotypes that might in the past have been excluded from collection, including some patients with ‘possible ALS’ as per the El Escorial diagnostic criteria and those with predominant features of frontotemporal dementia (FTD) and an associated motor degeneration. This can lead to subtle shifts in ascertainment, as registers evolve to include patients who might otherwise have escaped ascertainment. European population-based registers suggest that this is likely to account in part for the observed upward shift in disease incidence of ALS, particularly in later life. This apparent increase in cases is contemporaneous with the evolving recognition that ALS patients can experience a range of cognitive changes which in some cases may be the presenting symptom.19 20 Second, all registers, no matter how well they are designed, will miss some cases at the beginning. The development of a national register, or indeed the presence of a specialist service within a particular region, can have a secondary effect of increasing awareness of the condition, which also improves ascertainment as time goes on. Third, the demographics of European countries are such that an increasing proportion of the population has entered the age range associated with increased risk of developing ALS. In this instance, although the total number of cases of ALS might increase within a population, the standardised rates may not have changed.
Failure to recognise the inherent biases of subtle changes in disease recognition, coupled with improvements in ascertainment strategies and changes in population demography can drive assumptions about increasing frequencies of ALS that are not fully supported by appropriately adjusted data.
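The effect of population ageing on crude versus standardised rates can be illustrated with direct age standardisation. The sketch below is purely illustrative: the age bands, person-years, case counts and standard-population weights are all invented, and the function names are hypothetical. The age-specific rates are deliberately identical in both periods, so any change in the crude rate arises from the age structure alone.

```python
def crude_rate(cases, person_years):
    """Crude incidence per 100 000 person-years."""
    return 1e5 * sum(cases) / sum(person_years)


def directly_standardised_rate(cases, person_years, standard_weights):
    """Directly age-standardised incidence per 100 000 person-years.

    Each age-specific rate is weighted by that age band's share of a
    fixed standard population.
    """
    total_weight = sum(standard_weights)
    return 1e5 * sum(
        (w / total_weight) * (c / py)
        for c, py, w in zip(cases, person_years, standard_weights)
    )


# Invented data for three age bands: <50, 50-69 and 70+ years.
# Age-specific rates are 0.5, 4 and 8 per 100 000 in BOTH periods.
STANDARD_WEIGHTS = [0.65, 0.25, 0.10]      # hypothetical standard population
EARLY_PY = [3_000_000, 1_000_000, 400_000]
EARLY_CASES = [15, 40, 32]
LATE_PY = [2_800_000, 1_200_000, 600_000]  # older population structure
LATE_CASES = [14, 48, 48]

if __name__ == "__main__":
    for label, cases, py in (("early", EARLY_CASES, EARLY_PY),
                             ("late", LATE_CASES, LATE_PY)):
        print(label,
              f"cases={sum(cases)}",
              f"crude={crude_rate(cases, py):.2f}",
              f"standardised={directly_standardised_rate(cases, py, STANDARD_WEIGHTS):.3f}")
```

In this invented example the number of cases rises from 87 to 110 and the crude rate rises with it, while the standardised rate is the same in both periods.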
How biases in datasets can affect survival estimates
Population-based registers have demonstrated that up to 70% of patients with ALS die within 3 years of first symptoms, and that approximately 5% to 10% can survive for more than 8 years.11 21 Some clinical trial-based datasets (eg, ProACT (https://nctu.partners.org/ProACT)) have suggested that survival of ALS patients has increased over time,22 an effect that has been attributed in part to increased use of interventions such as non-invasive ventilation. However, while there is evidence that non-invasive ventilation may affect survival in some patients, data from European population-based registers have not identified a significant overall effect on survival at population level since the inception of non-invasive ventilation (NIV) as a standard of care in ALS management.11 21 Moreover, detailed analysis of population-based registers suggests that the apparent increase in survival described in some clinical trial-based datasets is more likely to be a function of systematic bias. European population-based registers have demonstrated that those patients who participate in clinical trials, regardless of whether the therapeutic agent has been efficacious, have a different disease trajectory compared with the disease trajectory of the overall population-based cohort.23 Table 1 and figure 1 exemplify this by comparing participants of the lithium in patients with amyotrophic lateral sclerosis (LiCALS) trial to the parent register population, that is, the South-East England ALS (SEALS) register.24 Trial datasets usually comprise prevalent patients who attend specialist clinics and who are sufficiently well to meet trial entry criteria and sufficiently motivated to enrol in clinical trials.
It has long been recognised that differences in disease trajectories between incident and prevalent cohorts can bias survival analyses, and that this bias underpins the differences in clinical characteristics between clinic-based (primarily prevalent) and population-based (incident and prevalent) cohorts.25 It is therefore more likely that improvements in survival in data repositories such as ProACT are due to subtle differences in the composition of disease cohorts rather than a true shift in disease behaviour. However, it is also the case that direct comparison of datasets (including registers) across different epochs as a means of determining changes in disease outcome can lead to unintentional bias due to cohort effects. This is one of the reasons why the use of historical controls for comparative purposes is not recommended, even when using population-based data from registers.
European ALS registers have shown that data captured in the early years of registers are unlikely to be of the same complexity or quality as subsequently captured data. For example, in the Irish ALS Register, a subtle shift in age profile can be ascertained within the study cohort.26 This is most likely to reflect a transition in awareness of disease within the elderly population among referring practitioners. Maturing registers eventually shift detection from a mixture of prevalent and incident cases in the earlier register years towards ascertainment of incident cases as the register becomes established (figure 2). Comparative analyses within registers and across different periods must therefore take these potential confounders into account in drawing conclusions about disease behaviour.
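The difference between incident and prevalent cohorts can be demonstrated with a small simulation. The sketch below is a hedged illustration only. It assumes exponentially distributed survival with a mean of 30 months (a simplification chosen for convenience, not a fitted model) and cases arising uniformly over a 20-year window. It then compares all incident cases with those who happen to be alive on a single survey date, as a clinic-based sample might be. All parameter names and values are hypothetical.

```python
import random
import statistics


def simulate_cohorts(n_patients=20_000, mean_survival_months=30.0,
                     window_months=240.0, survey_month=200.0, seed=0):
    """Return (mean survival of incident cohort, mean survival of prevalent cohort).

    The prevalent cohort contains only patients alive at `survey_month`, so
    patients with long disease duration are over-represented in it.
    """
    rng = random.Random(seed)
    incident, prevalent = [], []
    for _ in range(n_patients):
        onset = rng.uniform(0.0, window_months)
        survival = rng.expovariate(1.0 / mean_survival_months)
        incident.append(survival)           # every case enters the incident cohort
        if onset <= survey_month < onset + survival:
            prevalent.append(survival)      # alive on the survey date
    return statistics.mean(incident), statistics.mean(prevalent)


if __name__ == "__main__":
    incident_mean, prevalent_mean = simulate_cohorts()
    print(f"incident cohort mean survival:  {incident_mean:.1f} months")
    print(f"prevalent cohort mean survival: {prevalent_mean:.1f} months")
```

Under these assumptions the prevalent sample shows markedly longer total survival than the incident cohort, although every simulated patient is drawn from the same survival distribution.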
Expanding the ALS phenotype
Italian and Irish ALS registers provide detailed analyses of disease subphenotypes that have helped to characterise the clinical and cognitive changes associated with ALS,27–30 demonstrating how registers provide new insights that can help to accurately classify patients into different clinical and prognostic subgroups. This can be helpful for clinical trial stratification. For example, Irish and Italian registers have shown that cognitive and behavioural changes are intrinsic features of ALS, affecting up to 50% of patients,28 and are associated with significant prognostic implications.29 30 Interrogation of individual datasets can also provide important differentiating features that allow exclusion of possible mimic syndromes.31
Registers have also been helpful in characterising the presence of possible endo-phenotypes among probands and their extended family members. Such population-based observations can in turn lead to novel and previously unrecognised pathogenic mechanisms. For example, a recent population-based study from Ireland has shown higher rates of psychosis and suicide in first-degree and second-degree relatives of ALS patients compared with controls.32 This observed family aggregation of neuropsychiatric disorders in ALS kindreds provided the necessary hypothesis to undertake a combined analysis of genome-wide association study (GWAS) summary statistics for ALS and schizophrenia, which has revealed a hitherto unrecognised 14% polygenic overlap between ALS and schizophrenia, suggesting the presence of shared pathogenic mechanisms between these two clinically divergent disorders.32
Ascertaining environmental risk
While registers themselves cannot define risk, well-established population-based prospective registers such as the Dutch ALS Register can support detailed population-based case-control studies aimed to assess environmental risk, including physical activity, body mass index, consumption of alcohol and fat, smoking and other exposures.6 33–35 Because ALS is a rare disease, large case-control studies require extensive collaboration between different centres and across different geographic regions. Subtle differences in ascertainment and disease definition can introduce bias unless the data collection has been standardised with respect to ascertainment and characterisation. The recent Euro-MOTOR study comprising cases and controls from countries with population-based registers demonstrates that combined ascertainment can significantly enhance power for risk assessment.36 The Euro-MOTOR project has now established a repository of over 1500 population-based incident cases and 3000 matched controls with extensive phenotype, environmental and genomic characterisation.36 The Euro-MOTOR design has since been exported to other regions, most recently to three Latin American countries (Cuba, Chile and Uruguay), which have formed the Latin American Epidemiology Network for ALS.
Registers can also demonstrate how non-uniformity of access to health systems can bias analyses of risk. This is exemplified by conflicting observations relating to the association of disease risk and socio-economic status. For example, an association of ALS with area-based socio-economic status was reported in New Jersey,37 where access to health services is not uniform and those in lower socio-economic groups may not be captured. This contrasts with recent analyses from the European registers, which have not demonstrated any association between social deprivation and ALS incidence.8–10
Registers can also be used to address claims of disease clustering. Many studies have been published suggesting increased rates of ALS in regions thought to be associated with specific risk. This is exemplified by the suggestion that environmental pollutants or cyanotoxin exposure are associated with ALS. A geographic ascertainment bias (the so-called ‘Texas sharp-shooter phenomenon’) is generated by examination of reported clusters in the absence of complete surveillance data.38 By contrast, careful population-based analysis using well-established registers has to date failed to identify a spatial association between specific environmental pollutants and disease risk, as exemplified most recently by the negative evidence for clustering in a heavily polluted region of Italy.39 Indeed, to date, other than Guam and the Kii Peninsula of Japan, no reproducible areas of clustering have been noted. A region of reduced incidence (‘cold spot’) has recently been reported by the Irish Register.40 The reasons for this are unclear but may be related to subtle historical differences in local population structure.
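The ‘Texas sharp-shooter phenomenon’ can be illustrated by simulating a disease with no true spatial variation. The following sketch is a hypothetical illustration and not an analysis of any register. It scatters a fixed number of cases at random across regions of equal population and then reports the regions with the most and fewest cases. The function name, the number of cases and the number of regions are all invented.

```python
import random
from collections import Counter


def simulate_null_case_map(n_cases=500, n_regions=100, seed=1):
    """Allocate cases uniformly at random to equally sized regions.

    Every region has the same underlying risk, so any apparent hot or
    cold spot in the output is due to chance alone.
    """
    rng = random.Random(seed)
    counts = Counter(rng.randrange(n_regions) for _ in range(n_cases))
    return [counts.get(region, 0) for region in range(n_regions)]


if __name__ == "__main__":
    counts = simulate_null_case_map()
    expected = sum(counts) / len(counts)
    print(f"expected cases per region: {expected:.1f}")
    print(f"highest regional count:    {max(counts)}")
    print(f"lowest regional count:     {min(counts)}")
```

Selecting the highest-count region after looking at the map, and then testing it as though it had been specified in advance, would make a chance fluctuation look like a cluster. Complete surveillance data allow every region to be assessed against the same expectation.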
Making sense of genetics
Defining familial disease
Prior to the recognition of the importance of the C9orf72 repeat expansion as a causative gene in ALS, familial ALS (FALS) was reported to account for 5% of cases.41 More detailed analysis of family history and genotyping of at least one population-based register (Ireland) now suggests that the true proportion of FALS is closer to 16%–20% of all ALS cases.32 Low reported rates of familial disease are most likely a function of biased study designs that do not ascertain cases within a population-based setting. It is also the case that incident patients may not be aware of a family history or may not recognise the link between the proband and a family history of progressive neurological decline. Longer running registers can provide important insights in this regard, as new patients are ascertained within kindreds that had previously been classified as ‘sporadic’.
It must also be noted that genetic studies that do not recognise the presence of variations in population structure can confound analysis, as they make assumptions of uniform prevalence of gene variants and clinical phenotype.42 43 Important variation in the prevalence of at risk genes is known to be the case for at least two major ALS genes: the frequency of the C9orf72 repeat expansion is high in populations of European extraction and low in Asian populations,44 while variants in SOD1 account for 13% of FALS in Italy but are not found in Ireland and are rare in the Netherlands.45 The presence of population isolates can also affect the genetic epidemiology of ALS. For example, higher rates of ALS have been identified in a population isolate in the Netherlands, leading to the discovery of ALS-associated variants in the NEK1 gene.3 Similarly, higher rates of FALS have been noted in Sardinia due to founder effects with respect to TDP43 and C9orf72.43
It is important to note that registers can not only help to identify new genes, as in the case of NEK1, but can also limit the impact of referral bias on genetic studies.46 Many genetic studies are of necessity clinic based, and since it is known that clinic-based cohorts are phenotypically distinct from population-based cohorts (table 1), it can be assumed that reports of the prevalence of at risk genotypes are also biased.
Complex genetics, ancestral origin and disease risk
Interrogation of incidence, prevalence and risk in genetically admixed populations is of increasing interest. As noted, there is now considerable evidence that the incidence of ALS varies significantly across countries,47 48 and the phenotype and outcome of the disease vary in relation to population ancestral origin.49 South American populations of mixed ancestral origin may have lower rates of ALS compared with those reported in Europe.47 50 A population-based mortality study from Cuba has reported different rates of ALS in different ancestral populations, with higher rates in those of European origin and lower rates in the admixed population (which corresponds to the ‘Latino’ population in the USA).47 Population-based registers which ascertain within a region of mixed ancestral origin are therefore of particular interest from genetic and environmental perspectives. Differences in ancestral risk may be significant sources of bias in the generation of registers in countries such as the USA, where differential access to and utilisation of health services is linked to race, ethnicity, language, rurality and socio-economic status.51 52 The design of European population-based ALS registers can counteract such biases by capturing all patients using multiple different sources and care pathways.2
Population-based registers can inform health services. The availability of precise incident, prevalent and clinical trajectory data can permit detailed service planning and can enable projection of future societal needs (table 2). While registers cannot of themselves provide sufficiently rich datasets to inform the entire patient journey, well-constructed and compatible registers within different jurisdictions can also permit comparative analyses of different types of services, as has been demonstrated in Ireland.53 Comparison of survival outcome between registers in the Republic of Ireland and Northern Ireland, which have similar population structures but which provide different types of specialist care for ALS patients, has shown that multidisciplinary care within a single clinic is superior to devolved care provided within a defined ‘hub and spoke’ model of care.53 Registers also permit high level comparative studies of different interventions within individual geographic regions using outcomes such as hospitalisations and survival. In Puglia, the model of care is such that there is no survival difference between patients attending local neurologists and those receiving care in a specialist multidisciplinary clinic.54 Similarly, nested work by Dutch researchers shows that the additional availability of a regional care worker does not improve quality of life among caregivers,55 although analyses of outcome using population-based datasets in the Netherlands and Lombardy (Italy) show that multidisciplinary clinics are also better value for money, reduce hospitalisations and enhance quality of life of patients.56 57
Health services, clinical trials and ‘real-world data’
There is an increasing recognition that clinical trials by necessity select patients that are not representative of the true population, rendering decisions regarding the generalisation of trial findings challenging from a health policy perspective.58 While the best study design for assessment of treatment effectiveness is the randomised clinical trial, therapeutic effectiveness can also be assessed in more generalisable context using prospective observational studies nested within registers, as has been demonstrated in the case of Riluzole.59 Prospective cohort studies using population-based registers in which the outcomes are collected after exposure or intervention in patients can provide valuable ‘real-world’ information regarding the longer term effect of a therapeutic intervention.
Sustainability of population-based registers
Registers are difficult to fund as they are often viewed as infrastructure by research bodies and as research initiatives by health services. Many registers rely on the energy and interest of a single founder and are challenged at the time of retirement of the key principal investigator, as occurred in the case of the Scottish Register on the retirement of the founding clinician, Dr Robert Swingler. Fortunately, recent recognition of the value of the population-based register for ALS by the Scottish health authorities has enabled the re-establishment of this valuable resource with the provision of ring-fenced funding.
Long-term sustainability of registers can also be eroded by limitations on the types of data disease registers are permitted to record. While issues of privacy and data protection must be clearly addressed in registers as part of an overarching governance structure, recent changes in European legislation are of potential concern to the operation of true population-based registers. For example, inclusion of data relating to living patients without their express informed consent is now in breach of European data protection laws. As institutional review boards are taking an increasingly stringent position regarding patient autonomy with the introduction of ‘consent to contact’ requirements, there is now a real risk of underascertainment of cases. Without derogation based on the principles of public health benefit, it becomes increasingly difficult to create and sustain accurate population-based registers for most conditions.
Legislation providing derogation will require an understanding and recognition by the public of the important potential societal benefits of population-based epidemiological research, and in particular the potential public health benefit of identifying and communicating data regarding regional variations in disease incidence, prevalence and survival. This principle is implicit in the case of notifiable infectious and communicable diseases and is of particular import in the case of rare neurodegenerative diseases. This has been recognised in some jurisdictions for some types of registers (eg, the Irish Cancer Register and the US ALS Registry) and by the International Rare Disease Research Consortium (http://www.irdirc.org/). However, there remains a disappointingly limited recognition within the European legislature of the significant benefits of population-based registers.
Prospective population-based disease registers are invaluable in patient-oriented research of rare diseases. As exemplified by the success of European ALS registers, population-based databases can identify and address biases that are intrinsic to other types of data. While some biases cannot be completely eliminated, their recognition can provide the necessary caution in data interpretation. Notwithstanding their limitations, registers can provide unique and often unexpected insights into disease epidemiology and pathobiology and can inform the types of healthcare that are of greatest benefit to patients.
It is imperative that funding agencies, healthcare providers and institutional review bodies recognise the value of these types of registers, particularly in the case of rare diseases such as ALS, and that forthcoming data protection legislation, while well intentioned and appropriate in many ways, does not compromise our ability to fully understand disease heterogeneity and to continue to improve the lives of patients with ALS and related neurodegenerations.
Clearly defined case definitions should be used including inclusion and exclusion criteria.
Include a clearly labelled register subsection for cases that should be tracked but do not fulfil the formal inclusion criteria.
Register variables should be carefully selected and a ‘core content’ paradigm should be agreed in advance.
International collaborative efforts and/or national merger of data in large countries using multiple registers to cover different regions are advisable for rare diseases.
Dedicated staff time to ensure effective set-up and maintenance of the register
Defined capture methodology including multiple sources
Regular comparison of ascertainment rates and patient demographics
Investigation of ‘ascertainment holes’
Employment of careful statistical analyses of data collected in the first 3–5 years to account for ‘start-up bias’
Exclude the most recent 1–2 years of data capture, particularly for survival analyses.
Security is paramount for system software, yet flexibility to accommodate a shifting knowledge base is essential.
Including population-based controls in a register enables valuable case-control studies for studying environment/lifestyle/genetic risk factors.
The authors would like to thank Dr Katy Tobin who kindly prepared figure 2
Competing interests OH is funded by the Health Research Board Clinician Scientist Programme. Prof Hardiman has received speaking honoraria from Novartis, Biogen Idec, Sanofi Aventis and Merck-Serono. She has been a member of advisory panels for Biogen Idec, Allergen, Ono Pharmaceuticals, Novartis, Cytokinetics and Sanofi Aventis. She serves as Editor-in-Chief of Amyotrophic Lateral Sclerosis and Frontotemporal Dementia. AAC declares associations with OrionPharma, Cytokinetics, Mitsubishi-Tanabe Pharma, OneWorld Publications and Cold Spring Harbor Laboratory Press. CB declares no conflicts of interest. EB reports grants from UCB-Pharma, grants from Shire, grants from EISAI, personal fees from Viropharma, grants from Italian Ministry of Health, grants from European Union, grants from Fondazione Borgonovo, grants from Associazione IDIC 15, outside the submitted work. LHB serves on scientific advisory boards for the Prinses Beatrix Spierfonds, Thierry Latran Foundation, Biogen and Cytokinetics; received an educational grant from Baxalta; serves on the editorial board of Amyotrophic Lateral Sclerosis And Frontotemporal Degeneration and the Journal of Neurology, Neurosurgery, and Psychiatry; and receives research support from the Prinses Beatrix Spierfonds, Netherlands ALS Foundation, The European Community's Health Seventh Framework Programme (grant agreement n° 259867), The Netherlands Organization for Health Research and Development (Vici Scheme) and JPND (SOPHIA, STRENGTH, ALSCare).
Funding The research leading to these results has received funding from the Health Research Board Interdisciplinary Capacity Enhancement Programme, the European Community’s Seventh Framework Programme (FP7/2007-2013) under the Health Cooperation programme and the project EUROMOTOR (number 259867), from the European Joint Programme in Neurodegeneration (SOPHIA and ALS-CarE), and the Charities Research Motor Neuron and Irish Motor Neuron Disease Association. The funding sources played no role in the preparation of this manuscript.
Provenance and peer review Commissioned; externally peer reviewed.
Correction notice This paper has been corrected since it published Online First. The author Ammar Al-Chababi's surname has been corrected.