

Atrophy of medial temporal lobes on MRI in “probable” Alzheimer's disease and normal ageing: diagnostic value and neuropsychological correlates
  1. Philip Scheltens,
  2. Laura van de Pol
  1. Alzheimer Centre and Department of Neurology, VU University Medical Centre, Amsterdam, the Netherlands
  1. Correspondence to Professor P Scheltens, Department of Neurology, VU University Medical Centre, Neurology, PO Box 7057, Amsterdam 1007 HV, Netherlands; p.scheltens{at}



Authors: Scheltens P, Leys D, Barkhof F, et al.

Published: 1992;55:967–72

Philip Scheltens and Laura A van de Pol of the Alzheimer Centre and Department of Neurology, VU University Medical Centre, Amsterdam, ask: after 20 years of visual rating of medial temporal lobe atrophy on MRI in dementia, what have we learnt?

Historical perspective

In the last decades of the previous century, interest in diagnosing Alzheimer's disease (AD) was rising. At the same time, MRI was being discovered as an exciting, non-invasive, high resolution method to study brain changes in vivo. Autopsy studies had shown that the medial temporal lobe structures, including the hippocampus, were the structures affected earliest and most severely in AD.1 The first quantitative MRI studies assessing hippocampal volumes in AD described a volume reduction of the hippocampus of up to 40% in AD patients compared with control subjects.2 ,3 Since volumetric analysis of brain structures on MRI scans is a time consuming process which is not routinely available, the need arose to have a more user friendly alternative: a quantitative visual rating scale. In the early 1990s, during my neurology training, I had been offered the possibility of doing a six month research fellowship at the Department of Radiology, led by Professor Jaap Valk. At that time my good friend and coworker Frederik Barkhof was working on his thesis on MRI in multiple sclerosis and I did my research on MRI in AD. We sat down and discussed ways to rate hippocampal atrophy but the actual development was done drinking beer and using the beer mat to draw on. We became so enthusiastic that we simultaneously developed a rating scale for white matter changes on MRI, published in 1993.4 From then on, there were two ‘Scheltens’ scales, of which the medial temporal lobe scale has proven to be the most useful.

Although everything is seen in a different perspective 20 years later, and events may seem to have taken less time than they did in reality, I still think that it took us probably no longer than 30 min to come up with the definitive scale that was ultimately presented in our paper ‘Atrophy of medial temporal lobes on MRI in “probable” AD and normal ageing: diagnostic value and neuropsychological correlates’ in 1992. Before submission, however, we had a meeting with the coworkers, including my friend and colleague Didier Leys from Lille (France), in which we practised the use of the scale and did some inter-rater reliability assessments. At that time it was all done with hard copies and I remember the logistics were quite a challenge. It was probably written in the stars already, but 12 years later, he and I joined forces again as editors of JNNP.

In this paper, Laura van de Pol, one of my first PhD students who worked on the scale, and I aim to give an overview of the impact this scale has had on further research, 20 years after its ‘conception’.

Medial temporal lobe atrophy visual rating scale

The so-called MTA scale (medial temporal lobe atrophy visual rating scale) is based on a visual score of the height of the hippocampus and the width of the surrounding CSF space. The severity of medial temporal lobe atrophy (MTA) is scored from 0 (no atrophy) to 4 (most severe atrophy), originally on one side, and later modified to each side of the brain on a coronal T1 weighted MRI sequence (table 1). The original study showed that MTA scores could distinguish AD patients from control subjects, and further validated the new scale by demonstrating correlations of the MTA scores with linear MRI measurements of the hippocampus and performance on the Mini-Mental State Examination (MMSE)5 and a delayed recall measure. Importantly, the MTA scale proved to be easy to learn and quite reliable, which is expressed by good intra-rater reliability6 as well as fair to good inter-rater reliability, as shown in a study in which four raters rated 100 MRI scans.7
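As an illustration of how agreement between two raters on the 0 to 4 scale could be quantified, the sketch below computes an unweighted Cohen's kappa. This is an assumption made for illustration only: the text does not state which reliability statistic the cited studies used, and the function name and example scores are hypothetical, not data from any study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b, categories=range(5)):
    """Unweighted Cohen's kappa for two raters scoring the same scans.

    Scores are assumed to be integer MTA grades from 0 to 4.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of scans")
    n = len(rater_a)
    # Proportion of scans on which the two raters gave the same grade.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's grade frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical grade throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented scores for eight scans, purely to show the call.
scores_rater_1 = [0, 1, 2, 3, 4, 2, 1, 0]
scores_rater_2 = [0, 1, 2, 2, 4, 3, 1, 0]
print(round(cohen_kappa(scores_rater_1, scores_rater_2), 2))
```

Because the grades are ordered, a weighted kappa that penalises large disagreements more than small ones may be the more natural choice in practice; the unweighted form is shown only because it is the simplest.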

Table 1

Medial temporal lobe atrophy visual rating scale

Validation in Alzheimer's disease

Over the years, the MTA visual rating scale has been further validated against other measurements in the context of AD. We will address the validation against volumetrics of the hippocampus, neuropsychology and pathology.


In contrast with more sophisticated methods, such as manual or (semi-)automated volumetry, visual rating is relatively independent of scan protocol or quality and therefore easily applicable for neurologists and radiologists. There have been a number of studies comparing the visual rating scale with quantitative methods, showing that MTA scores form a good estimate of medial temporal lobe and hippocampal volumes in AD patients and control subjects.8–11

For longitudinal analysis, the MTA score is probably less suitable. As a 5 point instrument, it might not be sensitive enough to measure longitudinal volume change over a relatively short period of time, as shown in a study in 47 AD patients with a follow-up of 1 year.9


The severity of MTA also correlates well with neuropsychological measures. In the original study, Scheltens et al showed a correlation between severity of MTA and performance on the MMSE and memory tests. Several other studies showed correlations of the MTA score with various neuropsychological tests: the clock drawing test in a cohort of 84 patients with memory complaints12; scores on the MMSE, Clinical Dementia Rating Scale and measures of delayed recall of memory tests in a cohort of 238 AD patients13; performance on the Alzheimer Disease Assessment Scale-cognitive/mild cognitive impairment delayed recall of the New York University paragraph recall test; and the Digit Symbol Coding test in a cohort of 896 patients with mild cognitive impairment (MCI) in a large clinical trial.14


Recent post mortem studies have correlated AD pathology with MTA scores. Burton et al found a strong correlation between MTA scores and Braak stages throughout a cohort of patients with AD, dementia with Lewy bodies (DLB) and vascular dementia.15 Barkhof et al studied MTA scores on post mortem MRI in patients aged over 85 years and also demonstrated a strong correlation between AD pathology and MTA score. However, medial temporal lobe atrophy was also observed in patients with other types of dementia, such as DLB.16

Diagnostic tool

Alzheimer's disease

As a diagnostic tool, MTA scores differentiate between AD patients, with moderate to severe dementia, and control subjects, with a sensitivity of 70–100% and a specificity of 67–96%.8 ,9 ,11 ,17 In combination with clinical information, it seems justified to take the severity of MTA into account when diagnosing AD in the individual patient in clinical practice.
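To make the terms concrete, the sketch below shows how sensitivity and specificity follow from dichotomising MTA scores at a cut-off. The cut-off value, the function name and the example data are all hypothetical and chosen only for illustration; they do not come from the cited studies.

```python
def sensitivity_specificity(scores, has_ad, cutoff):
    """Return (sensitivity, specificity) when scores >= cutoff count as positive.

    scores : MTA grades (0 to 4), one per subject
    has_ad : True for AD patients, False for control subjects
    cutoff : hypothetical threshold; not a validated value
    """
    tp = fn = tn = fp = 0
    for score, diseased in zip(scores, has_ad):
        positive = score >= cutoff
        if diseased and positive:
            tp += 1  # AD patient correctly flagged
        elif diseased:
            fn += 1  # AD patient missed
        elif positive:
            fp += 1  # control subject wrongly flagged
        else:
            tn += 1  # control subject correctly cleared
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one patient and one control")
    return tp / (tp + fn), tn / (tn + fp)


# Invented data: four AD patients followed by four control subjects.
example_scores = [3, 2, 4, 1, 0, 1, 2, 0]
example_labels = [True, True, True, True, False, False, False, False]
print(sensitivity_specificity(example_scores, example_labels, cutoff=2))
```

Raising the cut-off trades sensitivity for specificity, which is one reason reported values can span a wide range across studies that dichotomise the scale differently.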

Other types of dementia

Atrophy of medial temporal lobe structures is not entirely specific to AD. A visit by Clare Galton to our centre resulted in an adaptation of the scale, applied to patients with frontotemporal dementia, and in-house, Laura van de Pol did the same. Both studies showed a clear overlap in MTA scores with the AD control group.18 ,19 Also, in patients with vascular dementia, increased MTA scores were found, independently contributing to global cognitive impairment.20 The creation of the MTA scale brought me several times to Newcastle. In a wonderful and long-lasting collaboration, John O'Brien took on the validation in DLB with his group and myself, and showed MTA to be less present in pure cases of DLB.17 MTA scores are not very helpful in differentiating between the dementias when used in isolation from the clinical information.

Mild cognitive impairment

MCI, as described by Petersen et al,21 refers to non-demented individuals with memory impairment who have an increased risk of developing AD. Longitudinal studies in MCI showed that baseline MTA scores of MCI patients predicted progression to AD. In a group of 190 subjects with MCI, a greater than twofold increase in progression to AD within the next 3 years was observed in subjects with a mean MTA score >2.22

One of the crucial spinoffs of the MTA scale lies within the new research criteria for AD. MTA scoring is incorporated in the algorithm to judge hippocampal atrophy on MRI.23 ,24 For both sets of new criteria, further operationalisation and standardisation of MTA scoring are needed to provide reliable and practical cut-off values.


The story of MTA scoring shows that simple things may have great value. The MTA visual rating scale has been validated in multiple ways over the past 20 years and has served its purpose to fuel research into its value in multiple settings and multiple types of dementia. It has been shown to serve as a robust tool for the assessment of medial temporal lobe atrophy related to AD, in daily clinical practice, as well as in large (multicentre) study cohorts. The recent focus on biomarkers in AD and its prodromal stage highlights the importance of practical tools to assess these markers and reconfirms the importance of visual rating of MTA.


The authors thank all those who have contributed to the use and spread of the MTA scale. Special thanks to Frederik Barkhof, Henri Weinstein and Didier Leys.



  • Competing interests None.

  • Provenance and peer review Commissioned; not externally peer reviewed.
