Objectives Recent advances in amyotrophic lateral sclerosis (ALS) genetics have revealed that mutations in any of more than 25 genes can cause ALS, mostly as an autosomal-dominant Mendelian trait. Detailed knowledge about the genetic architecture of ALS in a specific population will be important for genetic counselling but also for genotype-specific therapeutic interventions.
Methods Here we combined fragment length analysis, repeat-primed PCR, Southern blotting, Sanger sequencing and whole exome sequencing to obtain a comprehensive profile of genetic variants in ALS disease genes in 301 German pedigrees with familial ALS. We report C9orf72 mutations as well as variants in consensus splice sites and non-synonymous variants in protein-coding regions of ALS genes. We furthermore estimate their pathogenicity by taking into account type and frequency of the respective variant as well as segregation within the families.
Results 49% of our German ALS families carried a likely pathogenic variant in at least one of the earlier identified ALS genes. In 45% of the ALS families, likely pathogenic variants were detected in C9orf72, SOD1, FUS, TARDBP or TBK1, whereas the relative contribution of the other ALS genes in this familial ALS cohort was 4%. We identified several previously unreported rare variants and demonstrated the absence of likely pathogenic variants in some of the recently described ALS disease genes.
Conclusions We here present a comprehensive genetic characterisation of German familial ALS. The present findings are of importance for genetic counselling in clinical practice, for molecular research and for the design of diagnostic gene panels or genotype-specific therapeutic interventions in Europe.
- amyotrophic lateral sclerosis
- whole exome sequencing
Genetic factors contribute substantially to the neurodegenerative disease amyotrophic lateral sclerosis (ALS). Approximately 3%–10% of patients newly diagnosed with ALS report a positive family history.1
To date, mutations in any of more than 25 genes have been suggested to cause familial ALS (fALS) in a monogenic manner.2–4 ALS-causing mutations can also manifest as frontotemporal dementia (FTD), sometimes in the same family or even patient (ALS/FTD comorbidity).5–7
While a considerable number of ALS/FTD disease genes have been identified since 1993, few common cell biological pathways involved in ALS pathogenesis emerge when grouping these genes according to their known physiological functions.8 For example, several ALS disease genes are involved in RNA synthesis and processing, protein homoeostasis or cytoskeletal functions. However, beyond novel insights into basic molecular mechanisms of ALS, genetic discoveries may also lead to genotype-specific, improved treatment options in the near future. Examples are knockdown of SOD1 expression by intrathecal administration of antisense oligonucleotides in a clinical trial (ClinicalTrials.gov: NCT01041222) or reduction in the concentration of SOD1 protein in the cerebrospinal fluid,9 both studies being performed exclusively in patients with SOD1 mutations. Consequently, detailed knowledge about the genetic architecture of ALS in a specific population will be important for genetic counselling but also for future gene-specific or even mutation-specific therapeutic interventions. Furthermore, novel mutations identified in known genes represent important starting points and tools to foster research on molecular mechanisms of the disease. Therefore, we here report the spectrum of variants in the consensus splice sites and protein-coding regions of all currently known monogenic ALS genes and their contribution to ALS in a large central European cohort of ALS families.
An estimated 85% of the disease-causing inherited mutations are located in the protein-coding regions of the human genome and in consensus splice sites.10 Therefore, exome capture and high-throughput sequencing is an efficient method of analysing a patient’s DNA to discover the genetic cause of a genetically heterogeneous disease.11 Consequently, most fALS index patient DNA samples of our cohort were subject to whole exome sequencing (WES) subsequent to screening for mutations in the most frequently mutated ALS genes C9orf72 and SOD1, in order to define the frequency of known mutations and to discover novel mutations in known genes. To define likely pathogenic variants, we applied stringent parameters with regard to the type, frequency and disease cosegregation of the observed variant.
Materials and methods
Overall, 301 pedigrees with familial ALS were recruited at German Clinical ALS Research Centres in Ulm, Berlin, Bochum, Essen, Hannover, Jena, Würzburg, Aachen and Munich from 1995 through 2016. All patients had been evaluated by neuromuscular specialists and were diagnosed according to the El Escorial criteria.12 The diagnosis of familial ALS was based on the presence of at least one first-degree or second-degree relative with ALS or FTD spectrum disorder. In a few cases, if other sources were not available, the diagnosis of familial ALS was based on the patient’s or other family members’ reporting of symptoms compatible with ALS or FTD. Whenever possible, the information was confirmed by collecting medical records and by scrutinising death certificates and other available documents. In total, 10.5% of the patients included in the German ALS network MND-NET, which was the patient resource for this study, met the definition of familial ALS.
Initially, all patients were screened for mutations in the most frequently mutated ALS genes C9orf72 and SOD1.13 Furthermore, some patients with ALS-associated mutations in other more rare genes were identified in previous studies.14–20 All DNA samples that did not reveal a mutation in a known ALS gene by targeted genotyping were subject to WES, a total of 226 samples from 173 pedigrees.
This study was approved by the local medical ethics committees. All patients gave written informed consent in accordance with the Declaration of Helsinki (WMA, 1964). In agreement with this approval, patients and healthy probands were informed about positive results only if requested before testing. Moreover, healthy probands (eg, healthy relatives of patients with an ALS mutation) were informed only after undergoing genetic counselling, in accordance with the German gene diagnosis law.
DNA was extracted from whole EDTA-containing venous blood samples as described.21 Analysis of the C9orf72 repeat length was performed by fragment length analysis and repeat-primed PCR (RP-PCR) using previously published primers.22 23 Since PCR-based methods cannot determine the size of larger expanded repeat-alleles, samples with a sawtooth pattern in the RP-PCR were further analysed using Southern blot.24
For the SOD1 screen and to confirm some variants detected in the WES analysis, the patient’s DNA was tested by Sanger sequencing. We designed forward and reverse m13-tailed primers. After the amplification, the fragments covering the variant sites were treated with ExoSAP-IT (Affymetrix). For the sequencing reaction, the BigDye Terminator V3.1 Cycle Sequencing Kit (Life Technologies) was used in accordance with the manufacturer’s instructions.
Electrophoresis was performed on an ABI PRISM 3130 Genetic Analyzer (Life Technologies). Data were analysed using the Peak Scanner (fragment length analysis and RP-PCR) and Sequence Scanner V1.0 (sequencing) software, respectively.
The WES was performed as 100 bp paired-end reads on HiSeq2000/2500/4000 systems (Illumina).25 We generated on average 10 gigabases of sequence resulting in an average depth of 125× with 95% of the target regions covered at least 20 times.
Enrichment for exome sequencing was performed with SureSelect Human All Exon 50 Mb kits, V3, V4, V5 or V6. Burrows-Wheeler Aligner (BWA V0.5.9) with standard parameters was used for read alignment against the human genome assembly hg19 (GRCh37). We performed single-nucleotide variant and small insertion and deletion (indel) calling specifically for the regions targeted by the exome enrichment kit using SAMtools (V0.1.18). Structural variants were analysed with Pindel26 and ExomeDepth.27 Custom scripts and database application are available on request (https://ihg4.helmholtz-muenchen.de/cgi-bin/mysql/snv-vcf/login.pl). The 35 investigated genes are well covered. Overall, 476 and 487 of the 491 target regions were covered at least 20 times in the V5 and V6 kits, respectively. The mean coverage of the 35 investigated genes was 131 (±34 SD) in a representative exome (online supplementary table 1).
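The coverage figures above can be reproduced from per-region depth values with a few lines of Python. The following is only a minimal sketch: the function name `coverage_summary`, the example depth values and the reading of 'covered at least 20 times' as a per-region mean depth of 20 or more are illustrative assumptions, not part of the published pipeline.

```python
from statistics import mean, stdev


def coverage_summary(region_depths, min_depth=20):
    """Summarise per-region depths (hypothetical helper, not the authors' script).

    A region counts as covered if its mean depth is at least min_depth.
    """
    covered = sum(1 for depth in region_depths if depth >= min_depth)
    return {
        "regions": len(region_depths),
        "covered": covered,
        "fraction_covered": covered / len(region_depths),
        "mean_depth": mean(region_depths),
        # The sample SD is only defined for two or more values.
        "sd_depth": stdev(region_depths) if len(region_depths) > 1 else 0.0,
    }


# Made-up depths for five target regions, for illustration only.
example_depths = [131.0, 18.5, 97.2, 160.4, 20.0]
summary = coverage_summary(example_depths)

# Fractions implied by the counts reported in the text (491 target regions).
fraction_v5 = 476 / 491
fraction_v6 = 487 / 491
print(f"V5: {fraction_v5:.1%}, V6: {fraction_v6:.1%}")
```

With the reported counts, this corresponds to roughly 97% (V5) and 99% (V6) of the target regions reaching the 20-fold threshold.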
We searched for variants in known ALS disease genes (table 1). To define likely pathogenic variants, we applied strict parameters with regard to the type, frequency and disease cosegregation of the variant (see the Results section). To assess the potential functional consequences of each sequence variation, we used three bioinformatic tools designed to predict possible impacts of an amino acid substitution on the structure and known function(s) of a human protein, PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/), SIFT (http://sift.jcvi.org/) and MutPred (http://mutpred.mutdb.org/). To assess the conservation of affected amino acids for the respective protein, we aligned the sequences within the Mammalia including elephant, chimpanzee, cow, mouse and platypus and within the Vertebrata including Xenopus tropicalis, zebrafish, green sea turtle, parrot and lizard.
Overall, we analysed index patients from 301 ALS families. Additionally, in order to test for cosegregation with disease and penetrance, 81 affected and 25 unaffected individuals from respective families were sequenced. Unaffected individuals were genotyped only if they were informative, that is, older than the latest onset of disease in this family. A subset of the index patients displayed also cognitive or behavioural symptoms of FTD. All patients were of European origin.
Of the 301 ALS families included in this study, a subset of 128 index patients had been screened by Sanger sequencing or fragment length analysis combined with RP-PCR and Southern blotting (for C9orf72) for the genes discovered until 2011 in previous projects. Specifically, in 75 out of the 301 index patients, a Southern blot-confirmed C9orf72 hexanucleotide repeat expansion (HRE) was detected (figure 1). Thirty-seven index patients turned out to carry non-synonymous variants in SOD1 (variants with a minor allele frequency (MAF) 1:10 000 or lower (according to the ExAC dataset), except for the known pathogenic p.D91A mutation with MAF of 1:891), and further mutations were found in TARDBP (three pedigrees detected with two different mutations14), in FUS (eight pedigrees detected with six different mutations15 16), in OPTN (one pedigree detected with one mutation17), in PFN1 (one pedigree detected with one mutation18), in SETX (two pedigrees detected with two different mutations19) and in ALS2 (one pedigree detected with one mutation20).
Whole exome sequencing
Overall, the prescreening of the 301 index patients led to the detection of non-synonymous variants with an MAF <1:10 000 in known ALS genes or a C9orf72 HRE in a total of 128 index patients (42.5%, figure 1). The remaining 173 index patients as well as 81 affected and 25 informative unaffected relatives of these index patients were subject to WES to obtain a comprehensive mutational profile in this fALS cohort. WES revealed non-synonymous variants with an MAF <1:10 000 in known ALS genes in an additional 43 index patients. All rare variants were found in a heterozygous state, except for the homozygous SOD1 p.D91A in four pedigrees, the homozygous ALS2 p.T185Lfs*5 in one male with juvenile-onset ALS (age of onset 12 years) and an index patient with a homozygous loss-of-function mutation in OPTN p.E135*.
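The rarity filter used throughout this section (MAF <1:10 000, with SOD1 p.D91A retained despite its MAF of 1:891) can be sketched as follows. The function name, the treatment of variants absent from the reference dataset as rare and the example values other than those quoted in the text are assumptions made for illustration.

```python
MAF_THRESHOLD = 1 / 10_000

# Known pathogenic variant retained despite a higher MAF (1:891 in the text).
RARITY_EXCEPTIONS = {("SOD1", "p.D91A")}


def passes_rarity_filter(gene, protein_change, maf):
    """Return True if a variant is rare enough to be reported.

    maf is the minor allele frequency in the reference dataset; None means
    the variant is absent there, which this sketch treats as rare
    (an assumption, not stated in the text).
    """
    if (gene, protein_change) in RARITY_EXCEPTIONS:
        return True
    return maf is None or maf < MAF_THRESHOLD


# Share of index patients explained by prescreening, as reported above.
prescreen_fraction = 128 / 301
print(f"{prescreen_fraction:.1%}")
```

The last two lines simply recompute the reported proportion of index patients with a prescreening finding (128 of 301).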
Categorisation according to probable pathogenicity
Table 2 summarises all sequence variants we identified in this study with an MAF <1:10 000. To allow an approximation of the relative contribution of each gene to the pathogenesis of fALS in Germany, we grouped the resulting variants in known ALS genes according to their likely pathogenicity. We divided the sequence variants into two groups according to whether they are (1) ‘likely pathogenic’ or (2) ‘variants of uncertain significance’ (VUS). The following variants were considered to be ‘likely pathogenic’: (1) pathologically expanded hexanucleotide repeats in C9orf72 (all expansions in our cohort displayed a length of 50 to thousands of hexanucleotide repeats); (2) non-synonymous variants in protein-coding regions with an MAF <1:10 000 in the ExAC dataset (http://exac.broadinstitute.org/) that were found in two different families (taken together this and previously published3 5 9 14 15 17–20 22 28–60 work) and present in all affected members of these families as far as DNA was available for genotyping; (3) any variant in a known ALS disease gene with an MAF <1:10 000 that cosegregates over at least five meioses, that is, is found in two affected relatives separated by at least five meioses and not found in unaffected family members (or otherwise reported as a possible indication for incomplete penetrance); and (4) loss-of-function variants (frameshifts, premature STOP codons/nonsense mutations, consensus splice site mutations, STOP loss) in genes with haploinsufficiency as the likely molecular genetic mechanism of toxicity (ie, FUS, TBK1, OPTN, NEK1, NEFH).
All other non-synonymous variants were classified as VUS. Thus, besides cosegregation data, we put emphasis on the low frequency of specific variants for our classification, based on the observation that rare and unique alleles contribute most to the heritability of ALS,61 and known monogenic causes of familial ALS represent mostly rare or even private mutations. One exception was principally made for loss-of-function variants in NEK1 with an MAF above 1:10 000, as NEK1 variants have a greatly reduced penetrance,62 although loss-of-function variants in NEK1 were lacking in our German fALS cohort. The second exception is the known pathogenic p.D91A mutation with an MAF of 1:891.
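The classification rules and the two exceptions described above can be summarised as a small decision function. This is a hedged sketch of one possible reading, not the authors' implementation: the `Variant` fields, the function names and the simplification of criterion (3) to 'not found in an unaffected relative' are assumptions, as is applying the rarity requirement to loss-of-function variants in all haploinsufficiency genes except NEK1.

```python
from dataclasses import dataclass
from typing import Optional

MAF_THRESHOLD = 1 / 10_000
HAPLOINSUFFICIENCY_GENES = {"FUS", "TBK1", "OPTN", "NEK1", "NEFH"}
LIKELY_PATHOGENIC = "likely pathogenic"
VUS = "VUS"


@dataclass
class Variant:
    """Hypothetical record; field names are illustrative only."""
    gene: str
    protein_change: str = ""
    maf: Optional[float] = None          # None: absent from reference dataset
    is_repeat_expansion: bool = False    # pathological C9orf72 HRE
    is_loss_of_function: bool = False
    families_with_variant: int = 1       # this study plus published work
    in_all_genotyped_affected: bool = False
    cosegregating_meioses: int = 0
    in_unaffected_relative: bool = False


def is_rare(v: Variant) -> bool:
    return v.maf is None or v.maf < MAF_THRESHOLD


def classify(v: Variant) -> str:
    # Criterion 1: pathological C9orf72 repeat expansion.
    if v.gene == "C9orf72" and v.is_repeat_expansion:
        return LIKELY_PATHOGENIC
    # Exception: known pathogenic SOD1 p.D91A despite MAF of 1:891.
    if (v.gene, v.protein_change) == ("SOD1", "p.D91A"):
        return LIKELY_PATHOGENIC
    # Criterion 4, with the NEK1 exception to the rarity requirement.
    if v.is_loss_of_function and v.gene in HAPLOINSUFFICIENCY_GENES:
        if is_rare(v) or v.gene == "NEK1":
            return LIKELY_PATHOGENIC
    if not is_rare(v):
        return VUS
    # Criterion 2: rare variant seen in at least two families, in all affected.
    if v.families_with_variant >= 2 and v.in_all_genotyped_affected:
        return LIKELY_PATHOGENIC
    # Criterion 3 (simplified): cosegregation over at least five meioses.
    if v.cosegregating_meioses >= 5 and not v.in_unaffected_relative:
        return LIKELY_PATHOGENIC
    return VUS


# A rare missense variant seen in one family only stays a VUS; a rare
# loss-of-function variant in TBK1 is likely pathogenic.
print(classify(Variant(gene="FUS", protein_change="p.R524G")))
print(classify(Variant(gene="TBK1", protein_change="p.Y185*", is_loss_of_function=True)))
```

Note that bioinformatic prediction scores are deliberately not an input here, since the classification relies on variant type, frequency and segregation only.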
Based on this classification, we identified likely pathogenic variants in 49% and VUS in 8% of the 301 index patients (figure 2). In the remaining 43% of the families, no rare variant in any of the known ALS genes was detected by our screening approach. Thus, in total, 51% of all families were lacking a likely pathogenic variant according to the definition above. However, it has to be taken into account that a substantial proportion of the other rare variants that were found only in one family so far could also be causal, although this is hard to prove without segregation data supporting their role in ALS pathogenesis.
We detected no index patient with more than one likely pathogenic mutation. However, double or triple mutations may have escaped detection, as DNA of the patients who were positive in the C9orf72 or SOD1 prescreening was not subject to further analysis by WES. Moreover, we observed an index patient with three rare variants, although the trigenic inheritance could not be formally proven. The patient had DCTN1 p.I195L, FUS p.R524G (both according to our strict definition classified as VUS) and TBK1 p.Y185*. Interestingly, the patient had a substantially earlier onset compared with the other family members with only one of the three genetic alterations. Furthermore, in two index patients we identified a TARDBP p.N352S mutation together with an ANXA11 p.P87T or p.G162R variant (both ANXA11 variants classified as VUS).
Overall, based on the likely pathogenic variants, the five most frequently mutated genes in our German cohort were C9orf72, SOD1, FUS, TARDBP and TBK1 (figure 2, table 3). We additionally observed likely pathogenic variants in the more rarely mutated genes OPTN, CHCHD10, UBQLN2, SETX, VAPB, VCP, NEFH and ALS2. Collectively, the latter genes are found mutated in a total of 4% of index patients in our cohort. Moreover, table 3 provides an overview of the clinical features of the study population.
WES of unaffected relatives
We performed WES also in a total of 25 unaffected relatives of patients from 17 families. We had chosen only informative unaffected family members, defined as individuals who were lacking symptoms of ALS or FTD at an age at least as old as the latest known onset of disease in the same family. In some instances, for example, for variants in CHCHD10 (p.R15L), SETX (p.F458L and p.H1962R) and ERBB4 (p.T271I), the variant was found not only in the index patient but also in an informative relative without ALS. This argues for possible reduced penetrance of the respective variant (in case of likely pathogenic variants) (table 2). At the same time, a caveat has to be expressed, as the presence of variants in unaffected informative family members could also indicate that the found variant is not causal, and thus, the criteria for likely pathogenicity were still too liberal.
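The definition of an informative unaffected relative given above translates into a simple check. The sketch below is illustrative only; the function name and the example ages are hypothetical.

```python
def is_informative_unaffected(age_at_last_exam, family_onset_ages, has_symptoms):
    """Return True if an unaffected relative is informative for segregation.

    A relative is informative if free of ALS/FTD symptoms at an age at least
    as old as the latest known age of onset in the same family.
    """
    if has_symptoms or not family_onset_ages:
        return False
    return age_at_last_exam >= max(family_onset_ages)


# Hypothetical family with onsets at 52 and 61 years.
print(is_informative_unaffected(70, [52, 61], has_symptoms=False))  # True
print(is_informative_unaffected(58, [52, 61], has_symptoms=False))  # False
```

A carrier who passes this check without disease is what prompts the two competing interpretations discussed above: reduced penetrance or a non-causal variant.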
Known ALS genes without mutation in our cohort
We identified several previously described mutations. Moreover, several novel potentially or likely pathogenic variants that have not been described in other families so far were observed (table 2). On the other hand, we also demonstrate the absence of variants in some recently described ALS genes in our cohort. Specifically, no variant with an MAF of <1:10 000 was found in ANG, ATXN2, C21orf2, CCNF, CHMP2B, DAO, GLE1, HNRNPA2B1, MAPT, MATR3, SIGMAR1, TIA1 or TUBA4A. Moreover, no homozygous variants were found in SPG11. SPG11 mutations are most frequently associated with autosomal recessive spastic paraplegia with thin corpus callosum; an autosomal-dominant inheritance has so far not been reported.
In our work, we present the genetic characterisation of a large cohort of patients with ALS from Central Europe, in order to estimate the frequency of known mutations and discover novel mutations important for clinical testing as well as the design of gene-specific therapeutic trials. Moreover, novel mutations described in this work could be the starting point for mechanistic molecular research.
While we identified known pathogenic variants in a subset of index patients, we also found novel variants in established ALS disease genes. In order to be able to classify these variants, we defined two principal categories: ‘likely pathogenic’ and VUS. We chose a strict definition for ‘likely pathogenic’. We put a strong emphasis on classical segregation analysis and rarity of the respective variant, considering that low-frequency alleles contribute most to heritability of ALS.61 In contrast, we did not take into account bioinformatic prediction results, since bioinformatic algorithms are designed to predict impairment of known protein function, but detrimental effects of a given mutation could also be due to, for example, toxicity by a gain of novel function instead of a loss-of-function of the protein.
All remaining variants not fulfilling our above mentioned criteria were categorised as VUS. We thus perform a dichotomic separation of variants based on a strict, but in our view plausible threshold for pathogenicity. It has to be emphasised that a substantial number of VUS may still be causative. Nevertheless, variants that do not fulfil our high evidence standards for pathogenicity are hard to interpret in clinical settings and are not recommended for experimental work-up because the results would remain inconclusive.
We observed likely pathogenic variants in 49% of the 301 ALS families, whereas 43% and 8% of the families remained genetically unexplained or harboured a VUS, respectively. Generally, this cohort of patients with familial ALS reveals a heterogeneous genetic architecture, with variants in several rarely mutated genes, and a relatively small contribution even of the most frequently mutated genes C9orf72 and SOD1 when compared with other populations that historically went through a genetic ‘bottleneck’. For example, the relative contribution of the C9orf72 mutation to familial ALS is 25% in our study, whereas it reached 46% in populations in Sweden or Finland23 and even 51.1% in patients of Sardinian ancestry.63 The genetic heterogeneity of our cohort could also be responsible for the comparably high proportion of familial patients in whom a genetic cause could not be established, because of the contribution of a relatively high number of very rare and therefore so far undiscovered disease genes. Moreover, polygenic inheritance of variants with lower effect size may account for additional familial ALS cases. Furthermore, an unknown fraction of regulatory variants can only be identified by means of whole genome sequencing.
However, also in this German cohort, some mutations are detected that are found identical in multiple, seemingly unrelated families and most likely represent founder mutations. For example, the most frequent SOD1 mutation in Germany is p.R116G, which has not been described in any other population so far.37 64
We discovered also several novel variants, for example, in HNRNPA1, TARDBP, OPTN and NEFH, although in some instances, their pathogenicity will remain unclear until additional evidence for cosegregation with disease or a second patient with the same variant becomes available.
In line with the usually dominant mode of inheritance, the vast majority of mutations were found in a heterozygous state. An index patient with a homozygous OPTN loss-of-function mutation represents an exception, in agreement with the biallelic OPTN mutations previously observed in patients with ALS.65 66 The p.D91A mutation in SOD1 is another rare instance of ALS-causing mutations detected in both heterozygous and homozygous state, as confirmed in this study. The SOD1 p.D91A mutation carriers are all of German descent.
Moreover, mutations in several of the rarely mutated ALS disease genes were absent in the study cohort. Specifically, no rare variants were observed in ANG, ATXN2, C21orf2, CCNF, CHMP2B, DAO, GLE1, HNRNPA2B1, MAPT, MATR3, SIGMAR1, TIA1 and TUBA4A, and no homozygous variants were found in SPG11. WES did not allow us to scrutinise the ATXN2 poly-Q-repeat, which is an established risk factor for ALS at an intermediate length.67
A higher frequency of patients with mutations in more than one ALS disease gene than expected by chance has been suggested before.68 In our cohort, we observed only three index patients with more than one rare variant, although the digenic or trigenic inheritance could not be formally proven because, according to our strict definition, the second and third variant(s) in the respective index patient are not classified as ‘likely pathogenic’ but as VUS. Furthermore, it has to be emphasised that patients who were positive in the C9orf72 or SOD1 prescreening or patients from previous studies were not subject to further analysis by WES, which concerns a total of 128 index patients (42.5%). Thus, double or triple mutations may have escaped detection.
Overall, the clinical phenotype/genotype association was similar to what had been described before. For example, the high prevalence of FTD comorbidity, more rapid disease progression and more bulbar onsets in patients with the C9orf72 HRE has been described before.24 69 As expected, the homozygous ALS2 mutation was connected to a juvenile-onset motor neuron disease. Interestingly, one of the patients with a SOD1 mutation (p.H49R) displayed mild symptoms that were principally in agreement with a beginning behavioural variant FTD, which is rarely observed in patients with SOD1 mutations.70 In addition, CSF analysis was consistent with Alzheimer’s disease in this patient, therefore possibly representing a rare mixed degenerative phenotype caused by this SOD1 mutation.
Taken together, we here present a comprehensive genetic characterisation of German fALS. We delineate the contribution of all known Mendelian ALS genes and reveal several novel mutations. Our work should represent a valuable resource for genetic counselling as well as the design of ALS multigene panels for diagnostics. Moreover, the novel mutations described here could be starting points for molecular genetic work-up of ALS disease mechanisms. Finally, the dataset could turn out to be pivotal for the development and clinical evaluation of gene-specific or mutation-specific therapies based on, for example, antisense oligonucleotide techniques in the near future.
We are indebted to the patients and healthy control persons for their participation in this project.
Contributors KM and JHW conceived the study. DB, PW, TMey, TG, SP, JG, JS, AEV, GB, CK, TK, DZ, SJ, MS, SK, AK, KG, JW, KGC, BS, A-DS, AH, MO, JD, TMei, TMS, PMA and ACL helped with the implementation. All authors contributed to the refinement of the study protocol and approved the final manuscript.
Funding This work was supported by grants from the German Society for Patients with Neuromuscular Diseases (DGM) and German Federal Ministry of Education and Research (BMBF; STRENGTH project and the German ALS network (MND-NET)). The work of AEV was funded by the Deutsche Forschungsgemeinschaft (DFG, VO 2028/1-1).
Competing interests None declared.
Patient consent Obtained.
Ethics approval This study was approved by the local medical ethics committees.
Provenance and peer review Not commissioned; externally peer reviewed.
Collaborators Ute Weyen, Andreas Hermann, Martin Regensburger, Jürgen Winkler, Ralf Linker, Beate Winner, Tim Hagenacker, Jan Christoph Koch, Paul Lingor, Bettina Göricke, Stephan Zierz, Berit Jordan, Petra Baum, Joachim Wolf, Andrea Winkler, Peter Young, Ulrich Bogdahn, Johannes Prudlo, Jan Kassubek, Karin Danzer.