Emmanuel Jammeh (1), Camille Carroll (2), Stephen Pearson (2), Javier Escudero (1), Athanasios Anastasiou (1), John Zajicek (2), Emmanuel Ifeachor (1)

(1) Plymouth University School of Computing and Mathematics
(2) Plymouth University Peninsula Schools of Medicine and Dentistry


Background Up to 50% of patients with dementia may not receive a formal diagnosis, limiting access to appropriate services. It may be possible to build a picture of ‘underlying undiagnosed dementia’ from a profile of symptoms recorded in routine clinical practice.

Aim To develop a machine learning tool that analyses routinely collected NHS data to identify patients who may have underlying dementia but have not yet received a formal diagnosis.

Method Routinely collected, Read-coded NHS data were obtained from 18 consenting GP surgeries across Devon, UK, totalling 26,483 records of patients aged >65 years. Of these, 539 patients were identified as having dementia within the 2-year study period (June 2010 to June 2012). We determined which other codes assigned to these patients may contribute to dementia risk. The dataset was used to train a supervised classifier (Naive Bayes) to discriminate between patients with underlying dementia and healthy controls, using a ten-fold cross-validation approach.
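
The general technique named in the Method can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each patient is represented as a binary vector marking the presence or absence of a clinical code, uses a Bernoulli variant of Naive Bayes with Laplace smoothing, and runs on synthetic data. The function names, the smoothing parameter `alpha`, and the code rates are all hypothetical.

```python
import math
import random

def train_bernoulli_nb(X, y, alpha=1.0):
    """Fit Bernoulli Naive Bayes on binary rows X with labels y (1 = dementia)."""
    n_features = len(X[0])
    model = {}
    for c in (0, 1):
        rows = [x for x, label in zip(X, y) if label == c]
        # Laplace-smoothed probability that each code is present in class c
        p = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
             for j in range(n_features)]
        prior = (len(rows) + alpha) / (len(X) + 2 * alpha)
        model[c] = (math.log(prior), p)
    return model

def predict(model, x):
    """Return the class with the higher log-posterior for one binary row x."""
    scores = {}
    for c, (log_prior, p) in model.items():
        s = log_prior
        for xj, pj in zip(x, p):
            s += math.log(pj if xj else 1.0 - pj)
        scores[c] = s
    return max(scores, key=scores.get)

def cross_validate(X, y, k=10, seed=0):
    """Return out-of-fold predictions, in input order, from k-fold cross-validation."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    preds = [None] * len(X)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = train_bernoulli_nb([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            preds[i] = predict(model, X[i])
    return preds

# Synthetic data only (hypothetical rates): cases carry each of 5 codes
# with probability 0.7, controls with probability 0.2.
rng = random.Random(1)
X, y = [], []
for _ in range(200):
    label = 1 if rng.random() < 0.3 else 0
    rate = 0.7 if label else 0.2
    X.append([1 if rng.random() < rate else 0 for _ in range(5)])
    y.append(label)

preds = cross_validate(X, y, k=10)
print(sum(preds), "of", len(preds), "synthetic patients flagged")
```

Each patient is scored only by a model that never saw that patient during training, which is the point of the cross-validation step.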

Results The model obtained a sensitivity of 72.31% and a specificity of 83.06% for identifying dementia.
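
For readers less familiar with these two measures, the sketch below shows how sensitivity and specificity are computed from true and predicted labels. The labels are hypothetical toy values, not study data, and the function name is an illustrative choice.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels where 1 = dementia."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls wrongly flagged
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity

# Hypothetical toy labels: 4 cases and 6 controls
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2%}, specificity={spec:.2%}")
```

Sensitivity is the share of true dementia cases the classifier flags; specificity is the share of controls it correctly leaves unflagged.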

Conclusion Routinely collected NHS data can be used to identify patients who are likely to have undiagnosed dementia. This approach shows promise for increasing dementia diagnosis rates within primary care.
