Background Up to 50% of patients with dementia may not receive a formal diagnosis, limiting access to appropriate services. It may be possible to build a picture of ‘underlying undiagnosed dementia’ from a profile of symptoms recorded in routine clinical practice.
Aim To develop a machine learning tool that analyses routinely collected NHS data to identify patients who may have underlying dementia but have not yet received a formal diagnosis.
Method Routinely collected NHS READ-encoded data were obtained from 18 consenting GP surgeries across Devon, UK, totalling 26,483 records of patients aged >65 years. A total of 539 patients were identified as having dementia within the 2-year study period (June 2010 to June 2012). We identified the other codes assigned to these patients that may contribute to dementia risk. The dataset was used to train a supervised classifier (Naive Bayes) to discriminate between patients with underlying dementia and healthy controls using a ten-fold cross-validation approach.
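The training and evaluation procedure described in the Method can be sketched as follows. This is a minimal illustration, not the study's implementation: it assumes each patient is represented by binary features marking the presence or absence of a code, uses a Bernoulli variant of Naive Bayes with Laplace smoothing, and runs on randomly generated synthetic data. The function names (`train_bernoulli_nb`, `predict`, `cross_validate`, `make_synthetic`), the feature count, and the smoothing value are all hypothetical choices made for the example.

```python
import math
import random


def train_bernoulli_nb(X, y, alpha=1.0):
    """Fit a Bernoulli Naive Bayes model (assumed variant) with Laplace smoothing."""
    n_features = len(X[0])
    model = {}
    for c in (0, 1):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = (len(rows) + alpha) / (len(X) + 2 * alpha)
        # Smoothed estimate of P(code j present | class c)
        probs = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[c] = (math.log(prior), probs)
    return model


def predict(model, x):
    """Return the class (0 or 1) with the highest log posterior score."""
    scores = {}
    for c, (log_prior, probs) in model.items():
        score = log_prior
        for present, p in zip(x, probs):
            score += math.log(p if present else 1.0 - p)
        scores[c] = score
    return max(scores, key=scores.get)


def cross_validate(X, y, k=10, seed=0):
    """Pool predictions over k held-out folds; return (sensitivity, specificity)."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # unstratified folds, for simplicity
    tp = fn = tn = fp = 0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = train_bernoulli_nb([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            pred = predict(model, X[i])
            if y[i] == 1:
                tp += int(pred == 1)
                fn += int(pred == 0)
            else:
                tn += int(pred == 0)
                fp += int(pred == 1)
    return tp / (tp + fn), tn / (tn + fp)


def make_synthetic(n=400, n_features=5, seed=1):
    """Generate synthetic records (NOT study data): label 1 stands in for a case."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.2 else 0
        p = 0.6 if label else 0.1  # cases carry each code more often
        X.append([1 if rng.random() < p else 0 for _ in range(n_features)])
        y.append(label)
    return X, y


X, y = make_synthetic()
sensitivity, specificity = cross_validate(X, y, k=10)
print(f"sensitivity={sensitivity:.2%} specificity={specificity:.2%}")
```

Sensitivity here is the proportion of true cases flagged by the classifier, and specificity is the proportion of controls correctly left unflagged, both pooled across the ten held-out folds. With real data, where cases are a small minority of records, stratified folds may be preferable so that every fold contains some cases.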
Results The model obtained a sensitivity of 72.31% and a specificity of 83.06% for identifying dementia.
Conclusion Routinely collected NHS data can be used to identify patients who are likely to have undiagnosed dementia. This approach shows promise for increasing dementia diagnosis rates in primary care.