Abstract
Objectives To determine whether the familial clustering of amyotrophic lateral sclerosis (ALS) cases and the phenotype of the disease may help identify the pathogenic genes involved.
Methods We conducted a targeted next-generation sequencing analysis on 235 unrelated French familial ALS (FALS) probands to identify mutations in 30 genes linked to the disease. The genealogy (number of cases and of generations with ALS), gender, age of onset, site of onset and duration of the disease were analysed.
Results Regarding the number of generations, 49 pedigrees had only one affected generation, 152 had two affected generations and 34 had at least three affected generations. Among the 149 pedigrees (63.4%) for which a deleterious variant was found, an abnormal G4C2 expansion in C9orf72 was found in 98 cases, and SOD1, TARDBP or FUS mutations in 30, 9 and 7 cases, respectively. Considering pedigrees by the number of generations, abnormal G4C2 expansion in C9orf72 was more frequent in pedigrees with pairs of affected ALS cases, which represented 65.2% of our cohort. SOD1 mutations were found in all types of pedigrees. Neither TARDBP nor FUS mutations were present in monogenerational pedigrees. TARDBP mutations predominated in bigenerational pedigrees with at least three cases, and FUS mutations in multigenerational pedigrees with more than seven cases, on average, and with an age of onset younger than 45 years.
Conclusion Our results suggest that familial clustering, phenotypes and genotypes are interconnected in FALS, and thus it might be possible to target the genetic screening from the familial architecture and the phenotype of ALS cases.
Data availability statement
All data relevant to the study are included in the article or uploaded as supplementary information. All data are deidentified and available at the Biochemistry and Molecular Biology Department of the CHU of Tours.
Introduction
Amyotrophic lateral sclerosis (ALS) is the most frequent motor neuron disease (MND) in adults. This neurodegenerative disease is characterised by the progressive death of upper motor neurons and lower motor neurons in bulbar and spinal areas. This disease remains dreadful, with a median survival of 36 months after weakness onset.1 After decades of research, the pathophysiology remains inadequately understood, but it is accepted worldwide that environmental and genetic factors play key roles in the occurrence of the disease.
Although the first description of familial ALS (FALS) was reported by Aran in 1850, in a man in his 40s whose sister and two maternal uncles suffered from ALS, the existence of familial forms became formally accepted only after the pioneering publications of Mulder and Kurland in 1954.2 3 Over the last 65 years, a plethora of literature has focused on this topic, leading to the identification in 1993 of mutations in the superoxide dismutase 1 (SOD1) gene in FALS and, over the last 30 years, more than 30 pathogenic genes linked to ALS have been described.4 5 Variants of these genes are implicated in several pathological mechanisms, including neuroinflammation, excitotoxicity, defects in axonal transport, and abnormalities in protein homeostasis, autophagy, RNA metabolism and mitochondrial function.6 7
Currently, FALS accounts for 5%–10% of ALS cases, and four genes (C9orf72, SOD1, TARDBP, FUS) account for approximately 60% of FALS pedigrees and less than 10% of sporadic ALS (SALS) cases.5 8 9
Although the key role of genetics in the pathophysiology of ALS has become widely accepted, numerous situations remain unexplained because of the difficulty of defining the mechanism of heredity in families with few ALS cases, and of drawing a clear correlation between the genotype and an atypical ALS phenotype. These complex situations would be improved by a predictive tool that integrates phenotypic and familial data and allows targeted molecular biology screening.10–12
The aim of this study is to perform a thorough analysis of the epidemiology, the phenotype and the genotype of 235 French FALS pedigrees in order to determine whether the familial clustering of ALS cases and the phenotype of the disease may be directly related to the genotype. A better understanding of these relationships will make it possible to propose a targeted diagnostic approach and, thus, could accelerate and improve personalised patient care.
Materials and methods
Patients with FALS
Two hundred and thirty-five unrelated Caucasian index ALS cases, recruited in the 19 French ALS Centers from 2000 to 2014, were included in this study. All index cases fulfilled the criteria for definite, probable laboratory-supported or probable ALS and were considered familial since at least one first-degree or second-degree relative had previously suffered from ALS.13 14 A genealogical tree was available for each family, recording the MND cases diagnosed in the family.
For each family, we collected the number of cases, the number of generations affected by ALS, the parental relationships between the proband and the other familial ALS cases, and the number of men and women with ALS. Considering the number of generations affected by ALS, we split the cohort into three subgroups: MonoG (for monogenerational) when the disease was present in one generation, BiG (for bigenerational) when the disease was present in two generations and MultiG (for multigenerational) when the disease was present in at least three generations. The following clinical features were collected for all probands with FALS: date of birth, gender, age of onset, site of onset (defined as the first site of weakness) and date of death or of the last update.
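The subgrouping rule described above can be sketched in a few lines of Python. This is only an illustration of the definition, not the authors' tooling; the function name `classify_pedigree` is hypothetical, and the code assumes that MultiG covers three or more affected generations, as stated in the abstract.

```python
def classify_pedigree(affected_generations: int) -> str:
    """Return the subgroup label for a pedigree, given its number of affected generations."""
    if affected_generations < 1:
        raise ValueError("a FALS pedigree has at least one affected generation")
    if affected_generations == 1:
        return "MonoG"   # monogenerational
    if affected_generations == 2:
        return "BiG"     # bigenerational
    return "MultiG"      # assumption: three or more affected generations


labels = [classify_pedigree(n) for n in (1, 2, 3, 5)]
```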
Genescan analysis
All patients gave their informed consent for DNA analysis. Genomic DNA was extracted from blood. The presence of a pathological hexanucleotide expansion GGGGCC (G4C2) (repeat number >30) in the C9orf72 gene was evaluated using repeat-primed PCR followed by fragment sizing on a 3130xl Genetic Analyser (ThermoFisher).15 Data were analysed using the GeneScan software.
Next-generation sequencing
All probands underwent a next-generation sequencing (NGS) analysis studying the following gene panel: ALS2, ANG, CHCHD10, CHMP2B, DAO, DCTN1, DPYSL3, FIG4, FUS, GRN, MAPT, MATR3, NEFH, OPTN, PRPH, PSEN1, PSEN2, SETX, SIGMAR1, SOD1, SPG11, SQSTM1, TAF15, TARDBP, TBK1, TREM2, TUBA4A, UBQLN2, VAPB and VCP. A HaloPlex target enrichment system was used to sequence the coding regions and intron–exon boundaries of these 30 ALS genes (Agilent Technologies, Santa Clara, California, USA). We used the online SureDesign tool from Agilent to design HaloPlex probes, targeting 436 amplicons of 30 genes, representing a total sequenceable size of 108 332 base pairs (bp). Libraries were sequenced using a MiSeq sequencer according to the manufacturer’s instructions with 150 bp paired-end sequencing (MiSeq Reagent Kit V2; 300 cycles) (Illumina, San Diego, California, USA).
The sequences were analysed against the human reference genome UCSC hg19 using the bioinformatic pipeline of the ALS Center of Tours. Briefly, reads were aligned with the BWA algorithm (V.0.7.17), variants were called using GATK tools (V.3.4) and annotated using the ANNOVAR software, and coverage was analysed with samtools (V.1.8).16 Variants were selected with a minimum coverage of 30× and an allelic frequency <0.01% in the 1000 Genomes Project and ExAC databases for all populations (synonymous variants were excluded). Variants identified by NGS were validated by Sanger sequencing using a 3130xl Genetic Analyser (ThermoFisher). They were classified into categories according to ACMG Guidelines: deleterious (D), variants of uncertain significance (VUS; V) and non-deleterious (N), on the basis of results obtained with pathogenicity prediction software (SIFT, Polyphen-2, Mutation Taster), information in databases (ClinVar, Ensembl), the literature (PubMed) and family data.17
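The selection criteria stated above (coverage of at least 30×, population allele frequency below 0.01%, synonymous variants excluded) can be expressed as a simple filter. The sketch below is a minimal illustration under assumptions: the `Variant` record, its field names and the four example variants are invented for the demonstration and do not come from the study or from the ANNOVAR output format.

```python
from dataclasses import dataclass

MIN_COVERAGE = 30        # minimum read depth (30x), as stated in the text
MAX_POP_FREQ = 0.0001    # 0.01% expressed as a fraction


@dataclass
class Variant:
    """Hypothetical annotated variant record (field names are assumptions)."""
    gene: str
    coverage: int
    max_population_frequency: float  # highest frequency across reference populations
    effect: str                      # e.g. "missense", "frameshift", "synonymous"


def passes_filters(v: Variant) -> bool:
    """Apply the three selection criteria described in the text."""
    return (
        v.coverage >= MIN_COVERAGE
        and v.max_population_frequency < MAX_POP_FREQ
        and v.effect != "synonymous"
    )


# Invented example data, for illustration only.
candidates = [
    Variant("SOD1", 120, 0.0, "missense"),
    Variant("FUS", 18, 0.0, "frameshift"),    # fails the coverage threshold
    Variant("SETX", 85, 0.002, "missense"),   # too frequent in the population
    Variant("TBK1", 60, 0.0, "synonymous"),   # synonymous variants are excluded
]
selected = [v.gene for v in candidates if passes_filters(v)]
```

Variants that pass such a filter would still need Sanger validation and ACMG classification, as described above.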
The results of the genetic study defined two subgroups of families considering the presence (FALS-GENE KNOWN, which corresponds to D) or the absence (FALS-GENE UNKNOWN which corresponds to V and N) of a deleterious mutation.
Statistical analysis
The following comparisons were performed: FALS-GENE KNOWN versus FALS-GENE UNKNOWN, MonoG versus BiG versus MultiG, FALS with two cases versus FALS with at least three cases, and C9orf72 versus SOD1 versus TARDBP versus FUS. The significance threshold was set at 0.05. Gender and site of onset were compared between groups using the χ2 test with Yates’ correction if necessary, and age was analysed using analysis of variance (ANOVA) or the Kruskal-Wallis test, depending on the normality of the distribution. Survival analysis was performed using Kaplan-Meier curves and the log-rank test to evaluate the relation between the different mutations and disease duration. Data were censored at the date of the last consultation in the absence of a date of death. Statistical analysis was performed with JMP software (SAS, V.15.0).
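To make the censoring rule concrete, here is a minimal sketch of the Kaplan-Meier product-limit estimator in plain Python. It is not the JMP procedure used in the study; the function name `kaplan_meier` and the four toy observations are assumptions chosen for illustration. A patient without a date of death is entered with `observed=False` and a duration running to the last consultation.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival)] at each time where at least one death is recorded.

    durations: follow-up time in months for each patient.
    observed:  True if death was recorded, False if the patient was censored
               at the last consultation.
    """
    if len(durations) != len(observed):
        raise ValueError("durations and observed must have the same length")
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        # Patients still followed at time t (censored patients count until they leave).
        at_risk = sum(1 for d in durations if d >= t)
        deaths = sum(1 for d, e in zip(durations, observed) if d == t and e)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
    return curve


# Toy data: four patients, one censored at 8 months.
toy_curve = kaplan_meier([5, 8, 8, 12], [True, True, False, True])
```

The log-rank test then compares such curves between groups by contrasting observed and expected deaths at each event time.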
Results
Genealogy of the FALS cohort reveals heterogeneity between families
Two hundred and thirty-five French FALS probands were enrolled in the study. Gender ratio (man/woman) was 0.93 (113/122), mean age of onset was 57.6±12.1 years, site of onset was bulbar in 60 cases (26.3%), upper limbs in 69 cases (30.3%) and lower limbs in 99 cases (43.4%). Median duration of the disease was 30.0±40.4 months, ranging from 5.0 to 221.0 months (n=149) (table 1).
One hundred and forty-seven (62.5%) pedigrees included only two affected patients with ALS. These pairs involved first-degree relatives in 125 cases (89 parent–child pairs and 36 sibling pairs) and second-degree relatives in 22 cases (6 cousin pairs and 16 uncle–nephew or grandfather–grandchild pairs).14 The remaining 88 pedigrees involved 3 affected relatives in 49 (20.9%) pedigrees, 4 affected relatives in 17 (7.2%) pedigrees, 5 affected relatives in 8 (3.4%) pedigrees, 6 affected relatives in 4 (1.7%) pedigrees, 9 and 10 affected relatives in 3 (1.3%) pedigrees each, and 7, 8, 11 and 15 affected relatives in one (0.4%) pedigree each (figure 1/table 2).
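As a quick consistency check, the pedigree counts listed above can be tallied in Python. The dictionary below transcribes the counts from the paragraph, reading the last two items as three pedigrees each for 9 and 10 affected relatives and one pedigree each for 7, 8, 11 and 15 (an interpretation, under which the counts add up to the 88 pedigrees with at least three cases). Variable names are illustrative.

```python
# Number of pedigrees for each number of affected relatives (from the text).
pedigrees_by_size = {
    2: 147, 3: 49, 4: 17, 5: 8, 6: 4,
    7: 1, 8: 1, 9: 3, 10: 3, 11: 1, 15: 1,
}

total = sum(pedigrees_by_size.values())
three_or_more = sum(n for size, n in pedigrees_by_size.items() if size >= 3)

# Share of the cohort for each pedigree size, in percent.
shares = {size: 100 * n / total for size, n in pedigrees_by_size.items()}
```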
There was a significant difference between FALS with two cases and FALS with at least three cases concerning the age of onset (t-test, p<0.003). No difference was observed concerning the sex ratio (χ2 test, p=0.10), the site of onset (χ2 test, p=0.34) and the duration of the disease (log-rank test, p=0.66) (table 1).
Considering the number of generations, there were 49 (20.8%) MonoG, 152 (64.7%) BiG and 34 (14.5%) MultiG (table 2) families. A statistically significant difference was observed between these three subgroups concerning the age of onset (ANOVA test, p<0.0001). No difference was present concerning the sex ratio, the site of onset and the duration of the disease (table 2).
Genetics of the FALS cohort involves a dozen genes
An abnormal G4C2 expansion in the C9orf72 gene and/or a genetic variant in another gene was identified in 149 (63.4%) pedigrees (FALS-gene known) (figure 2). For the remaining 86 (36.6%) probands of the FALS-gene unknown group, C9orf72 and NGS analyses were normal. Variants identified using NGS were considered deleterious (D) in 59 cases, variants of uncertain significance (VUS) in 34 situations and neutral (N) in 39 situations. All deleterious variants are reported in the appendix.
Among all candidate variants, we observed 50 different deleterious variants in only 11 out of 31 genes (online supplemental table 1). Ninety-eight (41.7%) out of the 235 families were linked to C9orf72. An SOD1 mutation was found in 30 (12.8%) probands, a TARDBP mutation in 9 (3.8%) probands and a FUS mutation in 7 (3.0%) probands. The remaining mutations, in TBK1 (R440fs, E476fs), FIG4 (I41T, S787N), ALS2 (L238F, R1421fs), SETX (L1976F, M2017I) and DPYSL3 (I255V), were found in two probands for each gene, and a DCTN1 (G601A), PFN1 (M114T) or SPG11 (G2317D) mutation in one proband each (online supplemental appendix). While 43 of the variants were already reported in previous studies, 7 new variants were identified in five genes: SOD1 (pF46S, pK129fs, 17-18del), FUS (167-168del), TBK1 (pR440fs), SETX (pL1976F), DCTN1 (pG601A). Lastly, 25 neutral (N) variants and 24 variants of uncertain significance (VUS) were also found using the NGS analysis (data not shown).
FALS patients with pathogenic variants have earlier ages of onset of the disease
We then studied the demographic and disease characteristics of the two groups, FALS-gene unknown and FALS-gene known pedigrees (tables 1 and 2). In the FALS-gene unknown group, 61 (41.5%) pedigrees had two affected relatives and 25 (28.4%) pedigrees at least three affected relatives. Considering the number of generations, there were 17 (34.7%) MonoG, 60 (39.5%) BiG and 9 (26.5%) MultiG. NGS disclosed 8 VUS and 16 N variants (see online supplementary data).
In the FALS-gene known group (n=149), there were 86 (58.5%) pedigrees with two affected relatives and 63 (71.6%) pedigrees with at least three affected relatives. Of note, a deleterious variant was found in 79 (61.8%) pedigrees with a pair of first-degree relatives with ALS and in only 7 out of 22 (31.2%) pedigrees in which the relationship was more remote, with a pair of second-degree relatives with ALS. Considering the number of generations, there were 32 (65.3%) MonoG, 92 (60.5%) BiG and 25 (73.5%) MultiG.
Eight pedigrees carried two deleterious variants. In seven of them, an abnormal expansion in the C9orf72 gene was combined with a mutation in ALS2, FIG4 (n=2), DPYSL3, SOD1 (het D91A), SPG11 or TARDBP (N291H). In the last pedigree, an SOD1 and a SETX gene mutation were associated. Concerning the phenotype in these eight pedigrees, male/female sex ratio was 1.25, mean age of onset was 64.3±13.1 years (n=8), and site of onset was bulbar in three cases, in the arms in three cases and in the legs in two cases. Finally, median duration of the disease was 28.8±29.3 months, ranging from 7.0 to 82.1 months (n=6).
The comparison of FALS-gene known and FALS-gene unknown pedigrees highlighted a significant difference concerning the age of onset, younger in the FALS-gene known group (p=0.006). There was no difference considering both the number of cases/pedigree and the number of affected generations within families and the other clinical characteristics between the two groups.
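For readers who wish to reproduce a comparison of this kind, the sketch below computes a χ2 statistic with Yates' correction for a 2×2 table using only the Python standard library. It is an illustration, not the authors' JMP analysis; the function name `chi2_yates_2x2` is hypothetical. The example table uses the counts reported above: pedigrees with two cases (86 gene known, 61 gene unknown) versus pedigrees with at least three cases (63 gene known, 25 gene unknown).

```python
import math


def chi2_yates_2x2(a, b, c, d):
    """Chi-square with Yates' correction and p value (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    # Continuity correction: shrink |ad - bc| by n/2, never below zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * corrected ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: two cases, at least three cases. Columns: gene known, gene unknown.
chi2, p_value = chi2_yates_2x2(86, 61, 63, 25)
```

With these counts the p value stays above the 0.05 threshold, in line with the absence of a difference in the number of cases per pedigree reported above.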
Relationships exist between genealogy, phenotype and genotype in FALS
We then investigated whether the genotype could be associated with certain characteristics of the families or of the disease (figure 3, table 3).
We found that families with multiple generations of affected patients were more likely to carry mutations in FUS (85.7%), and bigenerational families to carry mutations in TARDBP (77.7%). In contrast, families with a single generation of affected patients were more likely to carry mutations in C9orf72 (24.5%) and SOD1 (23.3%). Interestingly, there was no FUS mutation in FALS pedigrees with two affected ALS cases.
We found that the age of onset was lower among families carrying mutations in FUS (around 40 years), while C9orf72 abnormal expansion predominated in families with an older mean age of onset, around 53.1 years. Finally, a patient with MultiG ALS whose disease began in the lower limbs at around 45 years was rather suggestive of an SOD1 mutation. Of note, the frequency of pathogenic mutations was the lowest (30.4%) in pedigrees in which the two ALS cases did not have a direct parental relationship (parent–child or sibling pairs).
Discussion
We report here an epidemiologic, phenotypic and genetic analysis of 235 FALS index cases and their pedigrees, one of the largest cohorts to be analysed for such parameters.18 19 The most striking result of this study was the presence of a link between the pattern of familial ALS clustering and the genotype.
Our cohort is representative of the FALS population: the phenotypic characteristics were in accordance with previous reports; 62% of pedigrees had only two cases of ALS, the vast majority (85%) having a parent–child pair; deleterious variants in the C9orf72, SOD1, TARDBP and FUS genes were described in 61.3% of FALS pedigrees in our cohort; and the clinical descriptions in these four groups matched data in the literature.5 8–12 20–22
One third of the pedigrees remained genetically unexplained after C9orf72 and targeted NGS analysis. In the cohort of FALS-gene unknown cases, the age of onset was significantly older and the duration of the disease was shorter. In FALS-gene known, the lowest frequency of pathogenic mutation was found in pedigrees in which the two ALS cases did not have a direct parental (parent–child or siblings) relationship with around 70% of this type of pedigree genetically unexplained. Considering the lifetime risk of ALS of 1 out of 350, those elements may suggest that the occurrence of two ALS cases in these families is not due to inheritance but could rather represent two sporadic ALS cases occurring in two relatives sharing common risk factors, either genetic or environmental.5 We cannot exclude, however, that a causal gene, still unknown and with incomplete penetrance, may be implicated.
We found that 3.8% of the probands had two pathogenic mutations, similar to previous observations.23 Two probands had mutations in two of the four main causative genes in ALS: abnormal expansion in C9orf72/het p.D91A SOD1 mutation, and abnormal expansion in C9orf72/het p.N291H TARDBP mutation. The other probands had a C9orf72 abnormal expansion with a mutation found by NGS analysis in genes rarely linked to ALS. Contrary to what is expected, we did not observe a more severe phenotype in these cases.22–25 Nevertheless, this observation supports the hypothesis of an oligogenic model in some ALS cases, a model that has already been developed.23 26
Although modern and efficient, targeted NGS had quite a low yield in FALS for identifying mutations in genes other than SOD1, TARDBP and FUS, with only seven families carrying a pathogenic mutation in genes other than these major genes. In our cohort, only 11 genes were pathogenically linked to ALS, and 2 out of 3 of the genes of the panel were not found to be causative. This might justify reserving NGS for patients with ALS in whom linkage to one of the four major genes has been excluded.
We observed a decrease in the age of onset of the disease within families over generations: age of onset was 7 years earlier in BiG than in MonoG and, once again, 7 years earlier in MultiG than in BiG. Although the approach developed here was different from that classically used to highlight an anticipation process, this result might show, in a different way, that the disease occurs earlier in the latest-born generation than in the preceding one.27 28 In this study, we found a 7-year anticipation of the onset of the disease from one generation to the following one. This also recalls the results published by Bradley et al, who mentioned that death occurred 7 years earlier on average in the second generation compared with what was observed in the first one, while there was no difference in the duration of the disease between the two generations.27
We showed here that the clustering of the disease within the family might direct the molecular genetic study for C9orf72, SOD1, TARDBP and FUS. Two out of three of the C9orf72 abnormalities were found in pedigrees with pairs of affected ALS cases, while SOD1 mutations were found in all types of pedigrees. This completely differed from what we noticed with TARDBP and FUS, which were never found mutated in MonoG, but were rather linked to pedigrees characterised by clustered ALS cases and more specifically BiG with few ALS cases in the family. While FUS mutation was absent in MonoG, the architecture of pedigrees linked to FUS was in almost all cases MultiG and, systematically, involved numerous ALS cases in the family. Although TDP-43 and FUS proteins share numerous structural and functional similarities, the pedigrees linked to the TARDBP and FUS genes appeared completely different: FUS was predominantly linked to MultiG, while TARDBP was linked to BiG with at least three cases.29 One explanation might be a negative effect of mutant TARDBP on fertility, given the observation of aberrant expression of TDP-43 in testes and sperm of infertile men. This might explain the clustering of ALS cases in FALS-gene known linked to TARDBP.30 While our population did not allow for statistical analysis, these results may suggest that, in MultiG, an age of onset around 45 years should lead physicians to first focus on FUS screening, and to favour C9orf72 in case of an onset after 50 years and bulbar origin. Therefore, the age of onset and the site of onset seem to be effective items for a targeted molecular genetic approach in MultiG FALS.
It should be noted that the study has some limitations. First, it is a retrospective study, with no systematic and thorough familial inquiry nor large genetic testing of pedigrees. Consequently, the clinical parameters described are limited to probands, and it is possible that a larger investigation of cases within families might show additional or different patterns. This was the case in the large French cohort with C9orf72 abnormal G4C2 expansion, where a genetic determinant for bulbar onset was suggested as well as a predominant maternal transmission of the trait.31 Second, as the follow-up of the families was lacking, it is not known whether new cases occurred in the families studied. Third, information about the existence of cognitive involvement is lacking in the vast majority of the cases, and it would probably be useful for strengthening gene targeting, as is already the case for families with C9orf72 abnormal G4C2 expansion.
In conclusion, we suggest that in familial ALS, we should put more emphasis on the study of pedigrees and clinical phenotypes to improve molecular diagnosis. In addition, our results suggest that, at the time of diagnosis, fewer than 10 of the 30 ALS-related genes should be targeted, and that four genes should be given priority attention, taking into account familial and clinical information. We emphasise that these four genes should be studied more comprehensively, that is, all exonic, intronic and regulatory regions. The frequencies of mutations in genes that we have described in the French population are close to the frequencies in other European and North American populations.18 We can therefore assume that our observations in ALS families may also be valid in other European and North American countries, but this will have to be confirmed.
Once confirmed, this study should change our clinical practice in the genetic screening in FALS by prioritising some gene analysis owing to the familial clustering of ALS cases. Using a more targeted indication for the NGS, it might be possible to put in place a more accurate genetic screening exploring, for example, the intronic and exonic sequences of such causative genes.
Footnotes
Contributors P Corcia, WC, HB, FL and PV conceived the study, had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the analysis. P Corcia, WC, P Couratier, JPC, P Cintas, AV, MHS, CD, MCF, NG, JC, FV, SP, VDB, YK, GLM, FS, EB, PFP and VM helped with the implementation. PV, CB, RH, DL, SM and CA performed the genetic analysis. HB, P Corcia and WC performed the statistical analysis. All authors contributed to the refinement of the study protocol and approved the final manuscript.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.