Background The diagnosis of dementia with Lewy bodies (DLB) is based on diagnostic clinical criteria, which were updated over the years.
Objective To evaluate, through a systematic review, accuracy of the diagnostic criteria, testing a possible improvement over time.
Methods We searched on MEDLINE and SCOPUS databases for studies reporting diagnostic parameters regarding the clinical diagnosis of DLB until October 2016. We performed meta-analysis, using a Bayesian approach, on those using pathological examination as gold standard, subclassified based on the different diagnostic criteria used.
Results We selected 22 studies on 1585 patients. Pooled sensitivity, specificity and accuracy were 60.2%, 93.8%, 79.7%, respectively, for criteria antecedents to McKeith 1996. For McKeith 1996-possible, pooled sensitivity, specificity and accuracy were 65.6%, 80.6%, 77.9% in early stages and 72.3%, 64.3%, 66% in late stages, respectively. For McKeith 1996-probable, pooled sensitivity, specificity and accuracy were 19.4%, 95.1%, 77.7% in early stages and 48.6%, 88%, 79.2% in late stages, respectively. McKeith criteria 2005 were evaluated only in late stages: pooled sensitivity, specificity and accuracy were 91.3%, 66.7% and 81.6%, respectively, for possible diagnosis (only one study) and 88.3%, 80.8%, 90.7% for probable diagnosis, decreasing to 85.6%, 77.1% and 81.7% if only considering clinical settings focused on dementia diagnosis and care.
Conclusions and relevance Diagnostic criteria have become more sensitive and less specific over time, without substantial change in the accuracy. Based on current data, about 20% of DLB diagnoses are incorrect. Future studies are needed to evaluate whether the recently released revised consensus criteria will improve the diagnostic accuracy of DLB.
- lewy body dementia
- clinical neurology
- systematic reviews
The diagnosis of dementia with Lewy bodies (DLB) may be challenging, especially early in the course, as the clinical presentation is extremely variable among individual patients. Early diagnosis is important for therapeutic choice, as patients have in general good responsiveness to cholinesterase inhibitors, but increased risk of sensitivity to certain medications such as neuroleptics and anticholinergics, which can lead to increased morbidity and mortality in these patients.1
The identification of DLB as a distinct disease is relatively recent.2 Until 1996, the scientific community lacked a consensus on terminology and diagnosis, for example, referring to ‘dementia associated with cortical Lewy bodies’ or ‘senile dementia of Lewy body type’ and using different sets of diagnostic criteria (such as the Nottingham Criteria3 and the Newcastle Criteria4). In 1996, during the First International Workshop of the Consortium on DLB,5 the term dementia with Lewy bodies (DLB) was proposed, together with new clinical and pathological diagnostic criteria. These criteria were first revised in 20051 and a second time in June this year.6
Our aim was to perform an up-to-date systematic review of the studies on diagnostic accuracy in DLB and to evaluate the sensitivity, specificity and accuracy of the diagnostic criteria: we analysed the validity of the different criteria, hypothesising an improvement over time.
We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) and Meta-analysis of Observational Studies in Epidemiology (MOOSE) guidelines for systematic review and meta-analysis.7 8 We performed electronic searches of the MEDLINE and SCOPUS databases using a combination of medical subject heading (MeSH) and free-text terms (eg, ‘dementia with Lewy bodies’, ‘Lewy-body dementia’, ‘diagnostic accuracy’, ‘sensitivity’, ‘specificity’, ‘positive predictive value’, ‘negative predictive value’) (online supplementary table 1). Duplicates were eliminated and all relevant articles were retrieved. We placed no restrictions on language or date. We performed the last search in October 2016. We excluded abstracts and book chapters. We carefully reviewed the reference lists of articles for additional articles missed in the search. We included articles if they reported any of the diagnostic parameters or raw data, specifically regarding the clinical diagnosis of DLB. We decided to perform the meta-analysis only on those studies that used pathological examination as the gold standard and excluded studies using a different gold standard. Because our aim was to evaluate the accuracy of clinical diagnosis using specific diagnostic criteria, we also excluded studies that did not specify the criteria used or that used multiple diagnostic criteria.
supplementary table 2
The diagnostic parameters extracted, in full or in part, were: sensitivity, the proportion of patients with DLB who were correctly identified with the diagnosis of DLB using clinical criteria; specificity, the proportion of patients without DLB who were correctly identified as not having the disease using clinical criteria; positive predictive value (PPV), the proportion of patients with an initial diagnosis of DLB based on clinical criteria who truly had the disease; negative predictive value (NPV), the proportion of patients with an initial diagnosis of non-DLB based on clinical criteria who truly did not have the disease; and diagnostic accuracy, the proportion of all diagnoses that were correct.
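As a minimal sketch, the five parameters defined above can be computed from a 2×2 table that compares the clinical diagnosis with the pathological gold standard. The class name, function name and example counts below are hypothetical and are not taken from any of the included studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Clinical diagnosis cross-tabulated against pathology (gold standard)."""
    tp: int  # clinical DLB, pathology DLB
    fp: int  # clinical DLB, pathology not DLB
    fn: int  # clinical non-DLB, pathology DLB
    tn: int  # clinical non-DLB, pathology not DLB


def diagnostic_parameters(t: TwoByTwo) -> dict[str, float]:
    """Return sensitivity, specificity, PPV, NPV and accuracy as proportions."""
    total = t.tp + t.fp + t.fn + t.tn
    return {
        "sensitivity": t.tp / (t.tp + t.fn),
        "specificity": t.tn / (t.tn + t.fp),
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
        "accuracy": (t.tp + t.tn) / total,
    }


# Hypothetical counts, chosen only to keep the arithmetic easy to follow.
example = TwoByTwo(tp=40, fp=5, fn=10, tn=45)
for name, value in diagnostic_parameters(example).items():
    print(f"{name}: {value:.3f}")
```

The sketch does not guard against empty rows or columns (division by zero), which a real extraction pipeline would need to handle.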
Two authors (GR and MA) independently performed the literature search, selected all potentially relevant papers, screened the full texts and extracted data from the eligible studies. Disagreements were resolved by asking the opinion of a third reviewer (RS). When relevant information was missing, we tried to contact study authors by email.
We carefully read the methods of the studies to determine which diagnostic criteria were used and whether they were applied in the early (<3 years) or later (>3 years) stage of disease. We defined three categories of diagnostic criteria: ‘criteria antecedents to those of McKeith 1996’, ‘McKeith criteria 1996’ and ‘McKeith criteria 2005’.
Some studies reported accuracy based on different diagnostic criteria in the same population and on diagnosis of possible or probable separately (for McKeith criteria 1996 and 2005); therefore, these studies were included in the meta-analysis with more than one record.
Given that PPV and NPV are affected by the different prevalence of DLB or other diseases evaluated in each specific setting, we only meta-analysed sensitivity and specificity values, and in addition accuracy values, as providing an overall measure.
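The dependence of PPV and NPV on prevalence follows from Bayes' theorem and can be illustrated with a short sketch. The sensitivity and specificity used are the pooled values reported in this review for McKeith 2005-probable in dementia settings (85.6% and 77.1%); the prevalence values and the function name are hypothetical, chosen only for illustration.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Pooled values from this review (McKeith 2005-probable, dementia settings).
SENSITIVITY, SPECIFICITY = 0.856, 0.771

# Hypothetical prevalences of DLB in three different settings.
for prevalence in (0.05, 0.20, 0.50):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With sensitivity and specificity held fixed, PPV rises and NPV falls as prevalence increases, which is why these two parameters are not comparable across settings.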
Bayesian methods offer a flexibility, which allows the approach to be extended to consider complex likelihood functions other than normal. Bayesian methods might also perform better and provide robust credible intervals in applications with a relatively small number of studies. Details on the statistical analysis are reported in the online supplementary material 1.
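To illustrate what a credible interval is and why it widens when data are scarce, the sketch below uses a simple conjugate Beta-binomial model for the sensitivity of a single study. This is not the random-effects model used in this meta-analysis (which is described in the supplementary material); the uniform Beta(1, 1) prior, the counts and the function names are all assumptions made for illustration.

```python
import random


def beta_posterior(successes: int, failures: int,
                   prior_a: float = 1.0, prior_b: float = 1.0) -> tuple[float, float]:
    """Posterior Beta(a, b) parameters after observing binomial data."""
    return prior_a + successes, prior_b + failures


def credible_interval(a: float, b: float, level: float = 0.95,
                      draws: int = 20000, seed: int = 0) -> tuple[float, float]:
    """Equal-tailed credible interval estimated by Monte Carlo sampling."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    tail = (1.0 - level) / 2.0
    lower = samples[int(tail * draws)]
    upper = samples[int((1.0 - tail) * draws) - 1]
    return lower, upper


# Hypothetical small study: 12 of 20 pathologically confirmed DLB cases
# were correctly diagnosed clinically.
a, b = beta_posterior(successes=12, failures=8)
lower, upper = credible_interval(a, b)
print(f"posterior mean sensitivity: {a / (a + b):.3f}")
print(f"95% credible interval: {lower:.3f} to {upper:.3f}")
```

Repeating the calculation with ten times as many patients at the same proportion gives a visibly narrower interval, which is the behaviour that makes credible intervals informative when few studies are available.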
Out of 1377 studies that were identified, we included in this systematic review 22 studies,9–30 from 22 populations and 1585 patients (see figure 1 for details). From these 22 studies, 19 clinic based9–12 14–17 19–29 and three community based,13 18 30 we extracted 39 records for meta-analysis (table 1): (1) six records from five studies9–13 for criteria antecedents to those of McKeith 1996 (specifically Nottingham Criteria3 and Newcastle Criteria4), applied in the early phases in one10 and in the late stages of disease in the others9 11–13; (2) 29 records from 17 studies11–27 for McKeith criteria 1996, with diagnosis of possible DLB in early stages in two,16 23 of probable DLB in early stages in two,16 23 of possible DLB in late stages in eight,12 14–17 19 22 23 of probable DLB in late stages in 12,11 12 14–19 and of mixed possible+probable DLB in late stages in five records13 24–27 (the last ones not included in the subanalyses based on possible or probable diagnosis) and (3) four records from three studies28–30 for McKeith criteria 2005, with diagnosis of possible DLB in late stages in one28 and of probable DLB in late stages in three records.28–30 Among these, one study30 included only patients with parkinsonism and, unlike the other studies in this meta-analysis, made the differential diagnosis only among patients with other forms of parkinsonism and not among patients with dementias such as Alzheimer’s disease (AD) or frontotemporal dementia. This single study reported a very high accuracy (sensitivity 95.97%, specificity 96.71%, accuracy 97.4%).
Pooled diagnostic parameters (sensitivity, specificity and diagnostic accuracy) are reported in figures 2–4, according to the different diagnostic criteria used.
Criteria antecedents to those of McKeith 1996: the pooled sensitivity, specificity and accuracy were, respectively, 60.2%, 93.8% and 79.7%.
McKeith criteria 1996: the pooled sensitivity, specificity and accuracy were, respectively, 56.5%, 86.5% and 75%. Looking at the ‘possible’ or ‘probable’ DLB diagnosis separately, the sensitivity, specificity and accuracy were, respectively, 65.6%, 80.6% and 77.9% for McKeith criteria 1996-possible in early stages of disease, 72.3%, 64.3%, 66% for McKeith criteria 1996-possible in late stages, 19.4%, 95.1% and 77.7% for McKeith criteria 1996-probable in early stages, 48.6%, 88%, 79.2% for McKeith criteria 1996-probable in late stages.
McKeith criteria 2005: the pooled sensitivity, specificity and accuracy were, respectively, 88.8%, 77.5% and 88.4% (reducing to 87.2%, 74.5% and 81.5% excluding Savica et al 30). Only one study28 is available for McKeith criteria 2005-possible in late stages reporting values of 91.3%, 66.7% and 81.6%. The pooled sensitivity, specificity and accuracy for McKeith criteria 2005-probable in late stages were, respectively, 88.3%, 80.8% and 90.7% (reducing to 85.6%, 77.1% and 81.7% excluding Savica et al 30). No studies evaluated McKeith criteria 2005 in early stages.
Our meta-analysis showed an improvement in sensitivity and a worsening of specificity, without substantial change in the global accuracy of the clinical diagnosis of DLB over the years, across the different diagnostic criteria adopted in clinical practice and research. The first crucial step in the field of DLB was the development of the first consensus criteria in 1996,5 which refined earlier criteria proposed for dementia with cortical Lewy bodies3 and for senile dementia of Lewy body type.4 The consensus diagnostic criteria identified the categories of possible and probable DLB, based on central, core, supportive and unsupportive features. The central feature is progressive cognitive decline interfering with social/occupational function. The diagnosis of probable DLB required the presence of two core features among fluctuating cognition, recurrent visual hallucinations and spontaneous parkinsonism, while the diagnosis of possible DLB required the presence of only one. Supportive features alone (repeated falls, syncope, transient losses of consciousness, neuroleptic sensitivity, systematised delusions and hallucinations in other modalities) did not allow the DLB diagnosis. As for pathological assessment, the consensus considered essential only the presence of Lewy bodies, detected using H&E for the brainstem and H&E and/or ubiquitin for the cortex, and regarded other features, such as Lewy-related neurites, plaques, neuronal loss and neurofibrillary tangles, as associated but not essential. Most of the studies included in this meta-analysis used these criteria, whose accuracy changed only mildly with respect to the earlier criteria, with a low sensitivity and a high specificity. Indeed, the sensitivity ranged from 19.4% for probable diagnosis in early stages to 72.3% for possible diagnosis in late stages, whereas specificity ranged from 95.1% for probable diagnosis in early stages to 64.3% for possible diagnosis in late stages.
This is in line with the results of a multicenter study on a very large sample of patients with DLB, not included in this meta-analysis due to the lack of a clear documentation about clinical diagnosis for every subject.31
AD was the most frequent misdiagnosis, probably because of two factors: (1) the cognitive symptoms may overlap and (2) in all patients, but particularly in older ones, the underlying pathology can be mixed, with α-synuclein, amyloid and tau pathology coexisting. The issue of mixed pathology, not always fully detailed and reported among the studies selected for this meta-analysis, represents a central diagnostic problem. This mainly concerns AD pathology, which variably coexists in the majority of patients with DLB and can underlie part of the cognitive symptomatology,26 27 32 and which led to the refinement of the neuropathological DLB criteria in terms of likelihood, based on the relative degree of α-synuclein and AD pathology (see below).1 An additional component, especially in the oldest group of patients, may be vascular pathology, more common in AD than in other degenerative dementias,33 which contributes to the cognitive impairment but also to extrapyramidal manifestations, in the form of mild parkinsonian signs or axial motor impairment.33 34
The second most frequent misdiagnosis is Parkinson’s disease (PD), reflecting the poorly defined boundaries between the two diseases. Indeed, DLB has clinical and pathological characteristics that overlap with PD and Parkinson’s disease dementia (PDD), sharing the typical Lewy body deposits. We recently reported that PD could be misdiagnosed as DLB, representing up to 10.8% of false positives and up to 3.5% of false negatives among PD diagnoses.35
Consensus criteria 1996 suggested that if dementia occurred before or within 12 months of the onset of parkinsonism, the patient should be assigned a diagnosis of DLB, otherwise of PDD, but acknowledging the possible need of revision of such cut-off.5
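The 1-year rule described above amounts to a single comparison of onset times. A minimal sketch follows, assuming both onsets are known and expressed in months from a common reference date; the function and parameter names are hypothetical.

```python
def one_year_rule(parkinsonism_onset_month: float,
                  dementia_onset_month: float) -> str:
    """Assign DLB if dementia began before, or within 12 months after,
    the onset of parkinsonism; otherwise assign PDD."""
    delay = dementia_onset_month - parkinsonism_onset_month
    return "DLB" if delay <= 12 else "PDD"


print(one_year_rule(parkinsonism_onset_month=0, dementia_onset_month=6))
print(one_year_rule(parkinsonism_onset_month=0, dementia_onset_month=36))
```

The sketch treats a delay of exactly 12 months as DLB, which is one reading of ‘within 12 months’; the arbitrariness of this cut-off is the reason the consensus flagged it for possible revision.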
The suboptimal sensitivity of McKeith criteria 1996, apparently due to difficulties in recognising the core feature of fluctuation and to a low rate of all core features in the presence of neocortical neurofibrillary tangle pathology, led the Consortium on DLB to revise both clinical and pathological criteria in 2005.1 The revised pathological assessment included the recommendation to use immunohistochemical staining for α-synuclein, rather than ubiquitin immunohistochemistry, since it has been shown to be, by far, the most sensitive and specific method available for detecting Lewy bodies and Lewy-related pathology. Furthermore, the revision introduced an approach based on the likelihood that the neuropathological findings predict the clinical syndrome of DLB, a likelihood directly related to the severity of Lewy-related pathology and inversely related to the severity of concurrent AD-type pathology. Diagnosis of DLB was permitted with an intermediate or high likelihood. Indeed, lower Braak stages improve the clinical sensitivity and specificity of DLB,36 and diagnostic accuracy is positively related to the extent of Lewy body pathology and negatively related to the severity of Alzheimer’s neuritic plaque pathology.27 Conversely, in the previous criteria, the only neuropathological requirement for DLB was the presence of Lewy bodies in the brain of a patient with a clinical history of dementia, leading to the inclusion of actual cases of AD as pathologically confirmed DLB and contributing to the view that the clinical criteria had low sensitivity.
As for clinical criteria, the 2005 consensus confirmed the same central and core features and added three suggestive features (rapid eye movement (REM) sleep behaviour disorder, severe neuroleptic sensitivity, low dopamine transporter uptake in basal ganglia demonstrated by single-photon emission CT (SPECT) or positron emission tomography (PET) imaging),1 usable for the diagnosis of possible or probable DLB in combination with core features, to increase sensitivity. As in the previous criteria, a number of clinical features (repeated falls and syncope, transient unexplained loss of consciousness, severe autonomic dysfunction, hallucinations in other modalities, systematised delusions, depression, relative preservation of medial temporal lobe structures on CT/MRI scan, generalised low uptake on SPECT/PET perfusion scan with reduced occipital activity, low uptake on metaiodobenzylguanidine (MIBG) myocardial scintigraphy, prominent slow wave activity on electroencephalography (EEG) with temporal lobe transient sharp waves) were indicated as only supportive. The arbitrary ‘1-year rule’ continued to be recommended for the distinction between DLB and PDD, although only in research studies in which a distinction needs to be made between the two phenotypes, while in other research settings, including clinicopathological studies and clinical trials, the single category of Lewy body disease or α-synucleinopathy could be considered.1 In this regard, a recent clinicopathological study did not reveal a clear delineation of AD neuropathology groups according to clinical phenotype, PDD or DLB, and so did not support such a categorical clinical distinction.32 More generally, the borders of DLB as a diagnostic category are currently a subject of intense debate.37 38
Only three studies using McKeith criteria 2005 are available in the literature.28–30 One of these30 reported very high diagnostic accuracy but was probably affected by a different type of referral, because patients were selected if they presented with parkinsonism, while all the other studies focused on patients referred because dementia was suspected. Using parkinsonism as a selection criterion implies the automatic presence of at least one core feature in all patients in the sample, increasing the probability of a correct diagnosis. Indeed, the main reason for ‘missing’ DLB clinically in a prospective clinicopathological study was the absence of extrapyramidal signs.19
The remaining two studies28 29 that applied the revised consensus criteria in a dementia setting showed increased sensitivity but reduced specificity compared with McKeith 1996, with unchanged accuracy (about 80%). Similarly, the most frequent misdiagnosis was AD. Increasing the sensitivity with the new criteria has, conversely, reduced the specificity. Unfortunately, data on the accuracy of McKeith criteria 2005 in early stages are missing. Indeed, in the early stages, the lack of some features could affect the sensitivity of the criteria. Conversely, in the late stages, mixed pathology and non-specific features could increase the sensitivity and decrease the specificity, as was the case for McKeith criteria 1996, where the specificity of probable diagnosis decreased, even if only slightly, from early (95.1%) to late stage (88%). Despite the paucity of available studies, it seemed evident that further improvements needed to be made to the McKeith criteria 2005. Accordingly, new revised DLB consensus criteria have recently been released,6 following the discussion at the International DLB Conference in 2015 in Florida. These revised criteria now distinguish clearly between clinical features, divided into essential (that is, the presence of dementia), core and supportive, and diagnostic biomarkers, divided into indicative and supportive. Probable DLB can be diagnosed if: (1) two or more core clinical features of DLB are present, with or without the presence of indicative biomarkers or (2) only one core clinical feature is present, but with one or more indicative biomarkers. Probable DLB should not be diagnosed on the basis of data from biomarkers only. Possible DLB can be diagnosed if: (1) only one core clinical feature of DLB is present, with no indicative biomarker evidence or (2) one or more indicative biomarkers are present but there are no core clinical features. For the distinction between DLB and PDD, the existing 1-year rule continues to be recommended.
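The rules for probable and possible DLB in the revised criteria, as summarised above, can be written as a small decision function. This is a sketch of the logic only: the function name, the argument names and the use of simple counts are assumptions, and the sketch does not encode how each feature or biomarker is assessed clinically.

```python
from typing import Optional


def classify_dlb(has_dementia: bool, core_features: int,
                 indicative_biomarkers: int) -> Optional[str]:
    """Return 'probable', 'possible' or None (criteria not met)."""
    # Dementia is the essential feature: without it neither category applies.
    if not has_dementia:
        return None
    # Two or more core features: probable, with or without biomarkers.
    if core_features >= 2:
        return "probable"
    # One core feature: probable only if an indicative biomarker is present.
    if core_features == 1:
        return "probable" if indicative_biomarkers >= 1 else "possible"
    # No core features: biomarkers alone support at most a possible diagnosis.
    if indicative_biomarkers >= 1:
        return "possible"
    return None


print(classify_dlb(has_dementia=True, core_features=1, indicative_biomarkers=1))
print(classify_dlb(has_dementia=True, core_features=0, indicative_biomarkers=2))
```

The last branch reflects the statement that probable DLB should not be diagnosed on the basis of biomarkers only.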
As for core clinical features, REM sleep behaviour disorder (RBD) has been added to the three already present in the previous criteria. Indeed, Ferman et al 28 had previously suggested that the inclusion of RBD in the core criteria could improve the diagnostic accuracy. Supportive clinical features now include: severe sensitivity to antipsychotic agents; postural instability; repeated falls; syncope or other transient episodes of unresponsiveness; severe autonomic dysfunction, for example, constipation, orthostatic hypotension, urinary incontinence; hypersomnia; hyposmia; hallucinations in other modalities; systematised delusions and apathy, anxiety and depression.
As for biomarkers, indicative biomarkers include reduced dopamine transporter uptake in basal ganglia demonstrated by SPECT or PET, abnormal (low uptake) 123iodine-MIBG myocardial scintigraphy and polysomnographic confirmation of REM sleep without atonia. The ‘promotion’ of abnormal MIBG scintigraphy was supported by the results of a number of studies.39 40 Indeed, a meta-analysis39 reported a pooled sensitivity of MIBG scintigraphy in the detection of DLB of 98% and a specificity of 94%, although no study on autopsy-proven patients was available. A recent comparative study showed that 123I-MIBG myocardial scintigraphy has similar sensitivity to 123I-FP-CIT SPECT for detecting DLB (93% vs 90%), but higher specificity for excluding non-DLB dementias (100% vs 76%).40
Further biomarkers are considered ‘supportive’ as, like the supportive clinical features, they are commonly present but lack clear diagnostic specificity. Supportive biomarkers include relative preservation of medial temporal lobe structures on CT/MRI scan, generalised low uptake on SPECT/PET perfusion/metabolism scan with reduced occipital activity±the cingulate island sign on fluorodeoxyglucose-PET imaging, and prominent posterior slow-wave activity on EEG with periodic fluctuations in the pre-alpha/theta range. Only minor modifications to pathological methods and criteria are recommended, adding previously omitted Lewy-related pathology categories (amygdala-predominant and olfactory bulb only, both associated with low likelihood) and including assessments for substantia nigra neuronal loss.
These revised criteria will probably, and hopefully, increase DLB diagnostic accuracy. However, some problems are likely to remain open, such as the ‘atypical’ phenotypes of rapidly progressive dementia41 or multiple system atrophy-like presentation.42 Regarding the latter, autonomic failure in the context of cognitive impairment should probably receive greater emphasis in the diagnostic criteria.
Furthermore, despite all efforts to improve the clinical diagnosis of DLB, an important step will be the availability of accessible biomarkers of α-synucleinopathy, which remain elusive to date. However, interesting data have been produced,43 44 such as the detection of phosphorylated α-synuclein deposits in skin nerves of patients with DLB.44
The strength of our meta-analysis is the evaluation of the diagnostic accuracy of DLB in a large population using pathological assessment as the gold standard, although the heterogeneity of the included studies could represent a limitation. We addressed this problem by using random effects within the Bayesian framework. Furthermore, we divided the studies into more homogeneous subgroups. However, heterogeneity of the studies remains the major weakness, as suggested by the wide credible intervals of the pooled values. A number of aspects contribute to such heterogeneity. The most important is probably the different application of clinical diagnostic criteria, which was retrospective in most cases,9–18 20 21 23 26 27 30 implying that data about core features would often be either absent or not documented in a standard form. However, two28 29 out of three studies using McKeith criteria 2005 applied them prospectively. A further difference concerned the setting, although only two studies13 18 using McKeith criteria 1996 and one30 using McKeith criteria 2005 were community based. All other studies were clinic based. The setting could also be relevant depending on whether the focus of clinical work is dementia or parkinsonism, as the challenges of diagnosis are different. In our analyses, all studies but one conducted at Mayo30 were performed in dementia or memory clinics. Therefore, we ran the analyses with and without the Mayo clinic study. Finally, differences in the neuropathological examination, also considering the changes in analyses and staging over time, have surely affected sensitivity and specificity. Unfortunately, we were not able to accurately evaluate differences in neuropathological methods, as they were not reported in detail in all studies, although all studies referred to the guidelines for diagnosis of the different forms of dementia at the time of study. However, two changes in the pathological assessment must be mentioned.
The first is the introduction of immunohistochemical staining for α-synuclein, rather than ubiquitin immunohistochemistry, which was integrated in the McKeith criteria 2005 but was also used, in all22 24–27 or a part of patients,20 by a minority of studies evaluating McKeith criteria 1996. This was probably a major contributor to the heterogeneity of accuracy findings for those criteria. The second is the introduction of a diagnostic DLB likelihood based on the proportion of Lewy-related pathology and AD-type pathology, which should be taken into account when comparing the accuracy of the 1996 criteria with that of the 2005 criteria.
In conclusion, despite such unavoidable heterogeneity, current data indicate that the diagnosis of one out of five cases of DLB is still incorrect. Hopefully, the new diagnostic criteria could further improve the accuracy of DLB diagnosis in the future. The role of fluid biomarkers and imaging in improving the clinical diagnosis needs further evaluation.
We thank all those authors of papers selected for this review who helped us to recover missing data. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Contributors GR and GL had full access to the data set of this study and take responsibility for the integrity of data and the accuracy of data analysis. Conception and design: GR and GL. Literature review: GR and MA. Articles screening, data collection and data set management: GR, SA, RS and MA. Statistical analysis and interpretation of data: GR, SA, MC, AF, RL and GL. First draft of manuscript: GR, SA and MC. Critical revision of manuscript: SA, RS, RL and GL. Supervision: GL. All authors made substantial contributions to the intellectual content of the paper and gave final approval for the final version of the manuscript.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement Data from single studies used for this meta-analysis are available on request.