Systematic review
Neurocognitive and psychiatric outcomes associated with postacute COVID-19 infection without severe medical complication: a meta-analysis
  1. Sarah A B Knapp1,2,
  2. David S Austin1,
  3. Stephen L Aita1,3,
  4. Joshua E Caron1,3,
  5. Tyler Owen4,
  6. Nicholas C Borgogna4,5,
  7. Victor A Del Bene6,
  8. Robert M Roth7,8,
  9. William P Milberg9,10,
  10. Benjamin D Hill11
  1. 1 Department of Mental Health, VA Maine Healthcare System, Augusta, Maine, USA
  2. 2 Department of Mental Health, White River Junction VA Medical Center, White River Junction, Vermont, USA
  3. 3 Department of Psychology, University of Maine System, Orono, Maine, USA
  4. 4 Department of Psychological Sciences, Texas Tech University, Lubbock, Texas, USA
  5. 5 Department of Psychology, University of Alabama at Birmingham School of Social and Behavioral Sciences, Birmingham, Alabama, USA
  6. 6 Department of Neurology, The University of Alabama at Birmingham Heersink School of Medicine, Birmingham, Alabama, USA
  7. 7 Department of Psychiatry, Dartmouth Health, Lebanon, New Hampshire, USA
  8. 8 Department of Psychiatry, Dartmouth College Geisel School of Medicine, Hanover, New Hampshire, USA
  9. 9 Geriatric Research, Education and Clinical Center (GRECC) and Translational Research Center for TBI and Stress Disorders (TRACTS), Boston VA Medical Center, Boston, Massachusetts, USA
  10. 10 Department of Psychiatry, Harvard Medical School, Cambridge, Massachusetts, USA
  11. 11 Department of Psychology, University of South Alabama, Mobile, Alabama, USA
  1. Correspondence to Dr Stephen L Aita; stephen.aita{at}va.gov

Abstract

Background Cognitive symptoms are often reported by those with a history of COVID-19 infection. No comprehensive meta-analysis of neurocognitive outcomes related to COVID-19 exists despite the influx of studies after the COVID-19 pandemic. This study meta-analysed observational research comparing cross-sectional neurocognitive outcomes in adults with COVID-19 (without severe medical/psychiatric comorbidity) to healthy controls (HCs) or norm-referenced data.

Methods Data were extracted from 54 studies published between January 2020 and June 2023. Hedges’ g was used to index effect sizes, which were pooled using random-effects modelling. Moderating variables were investigated using meta-regression and subgroup analyses.

Results Omnibus meta-analysis of 696 effect sizes extracted across 54 studies (COVID-19 n=6676, HC/norm-reference n=12 986; average time since infection=~6 months) yielded a small but significant effect indicating patients with COVID-19 performed slightly worse than HCs on cognitive measures (g=−0.36; 95% CI=−0.45 to –0.28), with high heterogeneity (Q=242.30, p<0.001, τ=0.26). Significant within-domain effects were yielded by cognitive screener (g=−0.55; 95% CI=−0.75 to –0.36), processing speed (g=−0.44; 95% CI=−0.57 to –0.32), global cognition (g=−0.40; 95% CI=−0.71 to –0.09), simple/complex attention (g=−0.38; 95% CI=−0.46 to –0.29), learning/memory (g=−0.34; 95% CI=−0.46 to –0.22), language (g=−0.34; 95% CI=−0.45 to –0.24) and executive function (g=−0.32; 95% CI=−0.43 to –0.21); but not motor (g=−0.40; 95% CI=−0.89 to 0.10), visuospatial/construction (g=−0.09; 95% CI=−0.23 to 0.05) and orientation (g=−0.02; 95% CI=−0.17 to 0.14). COVID-19 samples with elevated depression, anxiety, fatigue and disease severity yielded larger effects.

Conclusion Mild cognitive deficits are associated with COVID-19 infection, especially as detected by cognitive screeners and processing speed tasks. We failed to observe clinically meaningful cognitive impairments (as measured by standard neuropsychological instruments) in people with COVID-19 without severe medical or psychiatric comorbidities.

Data availability statement

Data are available on reasonable request to the corresponding author.

WHAT IS ALREADY KNOWN ON THIS TOPIC

  • Is milder postacute COVID-19 infection without severe medical/psychiatric comorbidity associated with neurocognitive deficits?

WHAT THIS STUDY ADDS

  • Meta-analysis of 696 effects extracted from 54 studies (pooled n’s=6676 for COVID-19 and 12 986 for controls/norm-reference) revealed a small significant overall effect (g=−0.36), largest for cognitive screeners (g=−0.55) and smallest for orientation tasks (g=−0.02). Largest effects were observed in COVID-19 samples with elevated depression, anxiety and fatigue symptoms; as well as in patients requiring hospitalisation and mechanical ventilation.

HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY

  • Small cognitive effects observed following less severe COVID-19 infection are associated with psychiatric symptoms, fatigue and/or illness severity. In the absence of these factors, neuropsychological instruments are unlikely to reveal any clinically meaningful objective cognitive deficits.

Introduction

Common neurobehavioural symptoms associated with COVID-19 include headache, ‘brain fog’, myalgia, hyposmia, chronic fatigue, insomnia and dysautonomia.1 Those with mild-spectrum COVID-19 infection generally recover from acute symptoms within 7–10 days while more severe cases have protracted recovery (3–6 weeks) from the acute disease period.2 The terms ‘long-COVID’ and ‘postacute sequela of COVID-19’ (PASC) are used interchangeably to describe sequelae persisting ≥3 months from infection.2 Over one-third of COVID-19 cases report neuropsychiatric symptoms 6 months post-COVID-19 infection.3–7 Base rates of persisting sequela approach 50% for those requiring intensive care unit (ICU) placement.3–7

ICU admission and mechanical ventilation are significant risk factors for cognitive deficits following COVID-19 infection, especially for older adults and those with multiple comorbidities.8 9 Cognitive outcomes in those who do not require ICU placement or life-saving interventions are less clear. Previous studies indicate that patients with milder infection report more cognitive complaints and greater psychiatric concerns than those with more severe infections despite better objective cognitive performance for mild cases.10 11 Pooled base rates of self-reported fatigue (32%) and subjective cognitive dysfunction (22%) ≥12 weeks post-COVID-19 infection are high according to prior meta-analysis.12 A comparable base rate of brain fog (31%) was observed in a PASC cohort 12 months postinfection, and this symptom was associated with physical, social and employment-related functional outcomes independent of disease severity.13 In a separate vein, recent literature also highlights a robust association between performance validity test (PVT) failure and poor cognitive performance in those with PASC.14 Importantly, the extent to which PVTs are incorporated in this literature is not known.

Despite focused attention on cognitive decline related to COVID-19 infection,15–17 no comprehensive meta-analyses of objective neuropsychological measures exist. Three studies exclusively meta-analysed cognitive screeners (ie, brief cognitive tests used to screen for the presence of cognitive deficits) across a small subset (k=5) of recovered COVID-19 samples. These yielded inconsistent and heterogeneous small (standardised-mean-difference=−0.41; 95% CI=−0.55 to –0.27),16 moderate (g=−0.68; 95% CI=−1.05 to –0.31)17 and strong (unstandardised mean difference=−0.94; 95% CI=−1.59 to –0.29)15 pooled effects.

The current study primarily aimed to comprehensively meta-analyse objective cognitive performance predominantly in persons with milder (ie, less severe) COVID-19 infection not requiring ICU and/or other life-saving intervention. A secondary aim was to explore potential psychiatric, fatigue and infection severity moderating variables. Two hypotheses were evaluated: (1) COVID-19 samples will perform worse on cognitive tests when compared with healthy controls (HCs) or normative-referenced criteria and (2) emotional distress, fatigue and illness severity markers, such as hospitalisation and/or mechanical ventilation, will moderate effects such that more severe samples will yield stronger effects.

Methods

Study design and registration

This meta-analysis followed the Meta-Analyses of Observational Studies in Epidemiology (MOOSE) guidelines.18 The protocol was preregistered on the Open Science Framework (https://osf.io/23j5s).

Eligibility criteria for studies

Inclusion criteria were peer-reviewed research published in the English language, original empirical research, cross-sectional design or availability of baseline data in the context of a longitudinal study, adult (≥18 years), use of clinical cognitive measures (ie, used in clinical settings) and sufficient data provided for estimation of effect size(s). Studies were included if they had an HC sample and/or reported cognitive outcomes with norm-referenced data. Studies were excluded if the COVID-19 sample had a severe disease course, operationalised as the majority of the sample (ie, >50%) requiring ICU care or ventilator intervention. Studies of COVID-19 samples with documented/reported severe medical, psychiatric and/or neurological comorbidities were also excluded. That is, studies were excluded if participants had significant medical comorbidities that were non-COVID-related (ie, pre-existing) and/or COVID-19 related (eg, severe medical complications within the disease course requiring life-saving intervention). Lastly, studies were excluded if COVID-19 diagnoses were self-reported without confirmation of infection. Our first (and primary) study aim was to study neurocognition in less severe COVID-19, and as such, we elected to exclude patients with severe psychiatric and medical comorbidity/complication. That being said, we still included investigation of psychiatric and illness severity variables in our second aim as we foresaw heterogeneity of COVID-19 samples in the literature. Thus, the second aim was devised to quantitatively account for these factors with respect to neurocognitive outcomes.

Search strategy and selection of studies

A medical librarian built the search terms for three databases (Embase, PubMed and PsycINFO) to identify studies examining neurocognition among patients with COVID-19 infection, published between 1 January 2020 and 8 June 2023 (see online supplemental appendix 1). Search results were exported to Covidence, an online platform for systematic literature review management.19 First, two reviewers (SABK/DSA) independently screened titles/abstracts for relevance. Next, appropriate citations were independently full-text reviewed according to inclusion/exclusion criteria. Discrepancies were resolved by a third reviewer (SLA). Inter-rater reliability was assessed with Cohen’s kappa (κ).
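As a minimal illustration of the reliability statistic named above, Cohen’s κ for two reviewers making include/exclude decisions can be computed from the four cells of their agreement table. The function name and the counts in the example are hypothetical; they are not this review’s screening data.

```python
def cohens_kappa(both_include, a_only, b_only, both_exclude):
    """Cohen's kappa for two raters making a binary include/exclude decision."""
    n = both_include + a_only + b_only + both_exclude
    # Observed agreement: proportion of records where the raters agree.
    p_observed = (both_include + both_exclude) / n
    # Chance agreement from each rater's marginal inclusion rate.
    a_incl = (both_include + a_only) / n
    b_incl = (both_include + b_only) / n
    p_expected = a_incl * b_incl + (1 - a_incl) * (1 - b_incl)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical counts for illustration only.
print(cohens_kappa(both_include=20, a_only=5, b_only=10, both_exclude=15))
```

Because κ corrects for chance agreement, it can sit well below the raw per cent agreement when most records are excluded by both reviewers, which is typical of title/abstract screening.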

Data extraction and outcomes

The following data were extracted into spreadsheets by two independent coders (SABK/DSA): first author name, publication year, COVID-19 subgroup (eg, severity and hospitalisation/intervention information), cognitive measure and the domain assessed (ie, cognitive screener, global cognition, orientation, processing speed, basic/complex attention, learning/memory, executive functions, language, visuospatial/construction and motor; see online supplemental tables 1−10 for tasks in each domain), group descriptive statistics for cognitive outcomes, demographics (ie, M age and years of education, % male, % white), cognitive test score type (ie, raw vs norm-calibrated/standardised), and whether the COVID-19 group was being compared with an actual HC group or against norm-references. Cognitive screener refers to scores from any brief assessment used to identify (ie, ‘screen’) cognitive impairment (eg, Montreal Cognitive Assessment), whereas global cognition was used to indicate composite (ie, overall) scores from a test battery.

We also coded whether studies used PVTs to appraise engagement with cognitive testing. Discrepancies were resolved on a case-by-case basis. When COVID-19 sample performances were compared against norm-referenced data (ie, M=0.0, SD=1.0 for z-scores; M=10, SD=3 for Scaled Scores; M=100, SD=15 for Standard Scores and M=50, SD=10 for T-scores), the clinical sample size was used for the control sample size to estimate an effect size. Clinical characteristics were also extracted across studies: per cent of sample hospitalised, time since infection, length of hospitalisation, per cent on ventilator, per cent admitted to the ICU and length of ICU stay. Lastly, COVID-19 group means for self-rating data on depression, anxiety and fatigue measures were coded when available. These were then categorised into 0=within normal limits (WNL) and 1=elevated if the group M crossed the threshold of mild-or-greater severity according to respective self-report measure norms/scoring criteria.
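The norm-referenced comparison described above can be sketched as follows. This is an illustrative calculation, not the authors’ software workflow: it uses the standard pooled-SD formula for Hedges’ g with the usual small-sample correction, takes the reference means and SDs from the text, and sets the ‘control’ n equal to the clinical n as stated. The function names and the example sample are hypothetical.

```python
import math

# Reference (mean, SD) for each score metric, as listed in the text.
NORMS = {
    "z": (0.0, 1.0),
    "scaled": (10.0, 3.0),
    "standard": (100.0, 15.0),
    "t": (50.0, 10.0),
}


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardised mean difference with the small-sample correction J."""
    pooled_sd = math.sqrt(
        ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    )
    d = (m1 - m2) / pooled_sd
    j = 1 - 3 / (4 * (n1 + n2 - 2) - 1)
    return j * d


def g_versus_norms(mean, sd, n, metric):
    """Compare a clinical sample with norms, reusing the clinical n as the control n."""
    norm_mean, norm_sd = NORMS[metric]
    return hedges_g(mean, sd, n, norm_mean, norm_sd, n)


# Hypothetical sample: 40 patients with a mean T-score of 46 (SD 10).
print(round(g_versus_norms(46.0, 10.0, 40, "t"), 3))
```

A negative value indicates that the clinical sample scored below the normative mean, matching the sign convention used throughout the analyses.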

Risk of bias assessment

Risk of bias was coded by TO and subsequently reviewed by SLA. The National Institutes of Health (NIH) Study Quality Assessment Tool for Observational Cohort and Cross-Sectional Studies was used to quantify study quality/bias across 14 criteria.20 As three of these were not applicable across virtually all studies, we qualified studies with a score of 0–3 as poor (high bias risk), 4–7 as fair (moderate bias risk) and 8–11 as good (low bias risk; see online supplemental appendix 2 for selected criteria).
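The modified scoring bands can be written as a small lookup. This is only a sketch of the categorisation rule stated above (11 applicable criteria, one point each); the function name is hypothetical.

```python
def quality_category(score):
    """Map a modified NIH quality score (0-11) to the bands given in the text."""
    if not 0 <= score <= 11:
        raise ValueError("modified score must be between 0 and 11")
    if score <= 3:
        return "poor"  # high bias risk
    if score <= 7:
        return "fair"  # moderate bias risk
    return "good"      # low bias risk


print([quality_category(s) for s in (2, 5, 9)])
```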

Data analysis plan

Comprehensive Meta‐Analysis (V.4) statistical software was used.21 Analyses were modelled using random effects with ‘study’ set as the level of analysis (with the exception of moderator analyses in which ‘subgroup’ was the level of analysis). Hedges’ g was the index of pooled effect sizes to adjust findings for sample size-related bias.22 Negative g coefficients were interpreted as lower cognitive performance in the COVID-19 versus HCs/norm-reference, and magnitude was appraised as |0.20–0.49|=small, |0.50–0.79|=medium and |≥0.80|=large.23 Ferguson’s recommended cut-off for the minimum effect to be considered practically significant (|g≥0.40|) was applied to capture areas of cognition with observable change.24 Effects were aggregated across domains and within specific cognitive subdomains. For studies with complex data structures (eg, multiple clinical subgroups and/or multiple outcome measures in one or more domains), effects were treated as dependent. These were combined within each study using the random effects model, which assumes a common variance component.25
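To make the handling of dependent effects concrete, one common textbook approach is to average the effects within a study and compute the variance of that average under an assumed correlation between outcomes. The sketch below shows that approach only; it is not necessarily the computation performed by the software used here, and the function name, the correlation r and the example values are all hypothetical.

```python
import math


def combine_dependent(effects, variances, r=1.0):
    """Mean of dependent effects and the variance of that mean.

    r is an assumed correlation between outcomes (hypothetical); r=1.0 is
    the most conservative choice because it gives the largest variance.
    """
    m = len(effects)
    mean_effect = sum(effects) / m
    total = sum(variances)
    # Add the covariance term for every ordered pair of distinct outcomes.
    for i in range(m):
        for j in range(m):
            if i != j:
                total += r * math.sqrt(variances[i] * variances[j])
    return mean_effect, total / m**2


# Two hypothetical outcomes from one study.
print(combine_dependent([-0.2, -0.4], [0.04, 0.04], r=1.0))
```

Setting r=0 would treat the outcomes as independent and halve the variance in this two-outcome example, which overstates precision when the same participants contribute to both outcomes.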

Heterogeneity among studies was estimated using Cochran’s Q test, with significant values indicating variation due to ‘true’ between-study differences rather than sampling error.26 We also computed tau (τ), which is derived from the variance of true effect sizes (τ²). Thus, τ signifies the SD of true between-study effect sizes and is interpreted in units of Hedges’ g.26 We also computed 95% prediction intervals (PI) of g to display the range of predicted true intergroup performance differences.26 Publication bias was assessed using visual inspection of funnel plot symmetry, one-tailed p values from Egger’s regression test, and Duval and Tweedie’s trim-and-fill method.26 Funnel plots and bias analyses were conducted across domains and for specific domains with k≥10 studies.25
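The heterogeneity quantities described above can be illustrated with a short sketch. It assumes the DerSimonian-Laird estimator of the between-study variance, which the text does not name, and it leaves the critical value of the prediction interval as a parameter. Function names and input values are hypothetical.

```python
import math


def random_effects(g, v):
    """Pool effects g with sampling variances v (DerSimonian-Laird, assumed)."""
    k = len(g)
    w = [1 / vi for vi in v]
    fixed = sum(wi * gi for wi, gi in zip(w, g)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (gi - fixed) ** 2 for wi, gi in zip(w, g))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights add tau^2 to each study's sampling variance.
    w_star = [1 / (vi + tau2) for vi in v]
    pooled = sum(wi * gi for wi, gi in zip(w_star, g)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return {"g": pooled, "se": se, "Q": q, "tau": math.sqrt(tau2)}


def prediction_interval(pooled, se, tau, crit=1.96):
    """Range expected to contain the true effect of a comparable new study."""
    half = crit * math.sqrt(tau**2 + se**2)
    return pooled - half, pooled + half


# Two hypothetical studies with discrepant effects.
print(random_effects([-0.8, 0.0], [0.01, 0.01]))
```

The prediction interval is wider than the confidence interval because it adds τ² to the squared standard error of the pooled estimate. A t-based critical value with k−2 degrees of freedom is often used in place of 1.96.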

We entered the following continuous variables into separate meta-regression models to test whether these account for heterogeneity: age, education, sex, race/ethnicity and NIH bias score; and per cent of COVID-19 sample requiring hospitalisation, ICU admission and ventilator treatment.26 We also entered the following binary variables: referent group (actual HC group vs norm-reference), PVT (used vs not used), score type (raw vs standardised), depression (WNL vs elevated), anxiety (WNL vs elevated) and fatigue (WNL vs elevated). Age, education, sex and race/ethnicity were coded separately for clinical and control groups. These were then linked and entered in multivariate meta-regression models to test respective superordinate prediction. Barring these linked variables, all moderators were tested using univariate regression models (ie, predictors entered singularly) to optimise statistical power. Finally, pooled effects were computed within each stratum of the categorical variables listed above (eg, elevated vs WNL depression in clinical samples) and formally compared using subgroup analysis.26
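A univariate meta-regression of the kind described above amounts to a weighted least-squares fit of effect size on one moderator. The sketch below assumes the between-study variance τ² is already known and is passed in, which is a simplification; it also shows one common definition of pseudo-R² as the proportional reduction in τ² after adding the moderator. All names and values are hypothetical.

```python
import math


def meta_regression(x, g, v, tau2=0.0):
    """Weighted least-squares slope of effect size g on a single moderator x."""
    w = [1 / (vi + tau2) for vi in v]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    g_bar = sum(wi * gi for wi, gi in zip(w, g)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (gi - g_bar) for wi, xi, gi in zip(w, x, g))
    slope = sxy / sxx
    return {
        "intercept": g_bar - slope * x_bar,
        "slope": slope,
        "se_slope": math.sqrt(1 / sxx),
    }


def pseudo_r2(tau2_without, tau2_with):
    """Share of between-study variance accounted for by the moderator."""
    return max(0.0, 1 - tau2_with / tau2_without)


# Hypothetical moderator (eg, proportion hospitalised) for three subgroups.
print(meta_regression([0.0, 1.0, 2.0], [0.0, -0.1, -0.2], [0.01, 0.01, 0.01]))
```

A negative slope here would mean that effects become more negative (ie, worse COVID-19 performance) as the moderator increases.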

All moderator analyses were conducted at the omnibus level (ie, across domains). The minimum requisite number of subgroups (k) for continuous variables in meta-regression models was k≥6, and k≥3 for each layer of categorical variables in subgroup analyses.25 27 We conducted separate sensitivity analyses repeating the main analyses (a) for studies with an actual HC comparison group only, (b) for studies using norm-referenced data comparisons only and (c) with screening measure-derived outcomes removed. Finally, we computed meta-analysis statistics pooling task-specific effects for common clinical measures: Verbal Fluency, Digit Span, Trail Making Test and Stroop Test.

Results

Literature search and characteristics of included studies

Figure 1 displays a Preferred Reporting Items for Systematic Reviews and Meta-Analyses28 diagram depicting the screening/selection process of 4997 citations from the search strategy. Of the 3513 titles/abstracts screened, there were 113 conflicts between reviewers (96.78% agreement), and Cohen’s κ indicated a substantial degree of inter-rater reliability (κ=0.67; 95% CI=0.62 to 0.73). Online supplemental table 11 provides detailed information for included studies. 54 studies were included, with 696 effects extracted from 63 unique COVID-19 samples (n=6676 COVID-19 and n=12 986 HCs represented across analyses). Number (k) of studies published in 2020=4, 2021=8, 2022=25 and 2023=17. Using the modified study quality categories, most (k=45) studies were graded as fair followed by good (k=6) and poor (k=3). The original score categories resulted in k=21 as poor (0–5) and the remaining k=33 studies as fair (6–10). See online supplemental table 12 for study-by-study quality ratings. 50 studies reported the time of neurocognitive testing since COVID-19 diagnosis, which was approximately 6 months postinfection (M=172.7 days, SD=133.0 days; Med=142 days). Severity information was provided for only k=10 COVID-19 samples (asymptomatic=1, mild=5, mild to moderate=1 and moderate=3), whereas most samples (k=53, 84.1%) were characterised as PASC of unspecified severity. Virtually all COVID-19 samples were explicitly described as postacute (ie, recovered; k=58, 92.1%). Data for depression (k=31; 38.7% elevated, of which 72.7% mild and 27.3% moderate severity), anxiety (k=27; 51.9% elevated, of which 85.7% mild and 14.3% moderate severity) and fatigue (k=18; 83.3% elevated) were reported for COVID-19 samples. For cognitive outcomes, k=33 (61.1%) studies featured raw scores and k=23 (43.4%) used norm-calibrated scores (note: k=2 studies featured both). Actual HC samples were used as a referent in k=36 (66.7%) studies, and k=19 (35.2%) compared COVID-19 sample data against norm-references (note: one study featured both types of comparison). Finally, only k=3 studies (5.6%) incorporated PVTs in tandem with neurocognitive assessments.

Figure 1

Preferred Reporting Items for Systematic Reviews and Meta-Analyses flow chart of the study identification, screening and selection process.

Omnibus (across-domain) pooled estimates and moderator analyses

Figure 2 displays an across-domain forest plot of pooled effects for the primary meta-analysis. Omnibus meta-analysis of across-domain effects resulted in a small negative effect, indicating COVID-19 samples performing −0.36 SD units (95% CI=−0.45 to –0.28) below HCs and normative referents. Between-study effects heterogeneity was high (Q=242.30, p<0.001, τ=0.26 g-units; 95% PI=−0.89 to 0.17). Figure 3 displays the omnibus (ie, across-domain) funnel plot, which showed some asymmetry (negative skew) in effects at the lower end of the plot (ie, yielded by small n studies). Consistently, Egger’s regression test was significant (t(52)=2.15, one-tailed p=0.018), indicating a trend of increased effect sizes in relation to studies with larger standard error (ie, smaller n’s). However, the trim-and-fill procedure did not identify any studies likely missing due to publication bias; thus, the pooled estimate was unchanged.

Figure 2

Summary descriptive information, pooled effects forest plots and summary meta-analysis statistics for primary and secondary analyses. Solid black diamonds reflect omnibus (ie, across-domain) pooled estimates (g) whereas squares reflect effects aggregated within domain. The grey line over 0 reflects the ‘null’ (ie, no between-group difference) and pooled effects to the left of 0 indicate worse performance in the COVID-19 versus HC samples. ES, effect size; HC, healthy control; k, number of unique studies. *Cochran’s Q test p<0.05.

Figure 3

Omnibus (ie, across-domain) funnel plot of SE by Hedges’ g. The black filled in diamond and lack of black filled in circles indicates no studies removed by the trim-and-fill procedure. Thus, unadjusted (white) and adjusted (black) pooled estimates (g) are equal.

Meta-regression summary statistics for the omnibus (across-domain) model are displayed in table 1. The following moderators were significant predictors of between-study effect heterogeneity: higher per cent males in HC sample (though this only explained 4% of variance), higher per cent of COVID-19 sample requiring hospitalisation (pseudo-R2=0.29) and ventilation (pseudo-R2=0.64), and elevated fatigue symptoms (pseudo-R2=0.35). One non-significant moderator nonetheless accounted for a noteworthy proportion of variance: the per cent of the COVID-19 sample requiring ICU placement, which accounted for 38% of effects heterogeneity in its respective model.

Table 1

Summary meta-regression statistics across all studies (omnibus)

Summary statistics and forest plots for visualisation of comparisons for select categorical moderators are presented in figure 4. Omnibus subgroup analyses indicated that pooled negative effects yielded by COVID-19 samples with elevated fatigue symptoms (g=−0.35) were significantly larger versus those with minimal symptoms (g=0.02). While not reaching significance, there was a consistent trend of relatively larger effects in COVID-19 samples with elevated versus WNL depression (g=−0.41 vs −0.24) and anxiety (g=−0.35 vs −0.22) symptoms.

Figure 4

Subgroup analysis summary statistics and forest plots across studies (omnibus). The grey line over 0 reflects the ‘null’ (ie, no between-group difference) and pooled effects to the left of 0 indicate worse performance in the COVID-19 versus HC samples. *Depression, anxiety and fatigue moderators coded for the clinical (COVID-19) samples. HC, healthy control; k, number of unique subgroups.

Domain-specific pooled estimates

Figure 2 presents within-domain pooled effect estimates across all studies; online supplemental figures 1−10 present study-by-study effect forest plots for each domain. Cognitive screeners produced the largest effect, which was of moderate magnitude (COVID-19 samples performing −0.55 SD units (95% CI=−0.75 to –0.36) below HCs and normative referents). Small but significant negative effects were observed for processing speed (g=−0.44; 95% CI=−0.57 to –0.32), global cognition (g=−0.40; 95% CI=−0.71 to –0.09), simple/complex attention (g=−0.38; 95% CI=−0.46 to –0.29), learning/memory (g=−0.34; 95% CI=−0.46 to –0.22), language (g=−0.34; 95% CI=−0.45 to –0.24) and executive function (g=−0.32; 95% CI=−0.43 to –0.21). Motor (g=−0.40; 95% CI=−0.89 to 0.10), visuospatial/construction (g=−0.09; 95% CI=−0.23 to 0.05) and orientation (g=−0.02; 95% CI=−0.17 to 0.14) domains were not significant.

Heterogeneity was considerable across analyses. Aside from the orientation domain, all Cochran Q tests were significant (p’s<0.001), with τ ranging from 0.21 (simple/complex attention) to 0.47 (motor) g units (see online supplemental figures 1−10 for domain-specific PIs). With regard to domain-specific publication bias analyses, Egger’s regression test was significant for cognitive screener (t(18)=4.75, one-tailed p<0.001) and processing speed (t(34)=1.78, one-tailed p=0.042) domains. The trim-and-fill procedure identified probable missing studies due to bias for cognitive screener (k=6, adjusted g=−0.36), simple/complex attention (k=3, adjusted g=−0.34) and language (k=3, adjusted g=−0.30) domains.

Sensitivity/secondary analyses

Figure 2 displays across-domain and within-domain meta-analysis findings for sensitivity analyses looking at pooled effects for studies with an HC sample referent, normative referent, and with screener-derived outcomes removed. In line with the moderator analyses, omnibus aggregated effects produced from studies with HC (g=−0.41; 95% CI=−0.53 to –0.30) and normative (g=−0.30; 95% CI=−0.43 to –0.18) referents were similar. Domain-specific findings were also comparable, with most significant domains falling between −0.30 and −0.40 SD units, and non-significant for motor and visuospatial/construction. Next, as screeners tend to yield robust effects, all related outcomes were removed, and analyses were repeated to determine whether effects were inflated due to cognitive screeners. The across-domain (omnibus) pooled estimate with screeners removed (g=−0.34; 95% CI=−0.43 to –0.25) was nearly identical to the primary model with all effects included, and commensurate significant within-domain effects emerged ranging between −0.30 and −0.40 SD units. Additional statistics for commonly administered specific task paradigms (eg, Trail Making Test) are presented in figure 5, most falling between −0.30 and −0.40 SD units (in line with primary analyses). Finally, we examined whether duration from COVID-19 infection is associated with effects using meta-regression. Mean or median days from COVID diagnosis at the time of neurocognitive testing were available and coded for 55 clinical samples. Analysis indicated duration from COVID infection was not associated with effect heterogeneity (B=0.0002, SE B=0.0003, p=0.570, pseudo-R2<0.01). ‘Minimum’ number of days postinfection from the date of testing was available and coded for four additional samples; repeat analysis including these was unchanged (p=0.563).

Figure 5

Summary meta-analysis statistics for specific neuropsychological tests commonly used in practice. The grey line over 0 reflects the ‘null’ (ie, no between-group difference) and pooled effects to the left of 0 indicate worse performance in the COVID-19 versus HC samples. *Cochran’s Q test p<0.05. ES, effect size; HC, healthy control; TMT, Trail Making Test.

Discussion

This study provided a comprehensive quantitative synthesis of the relation between COVID-19 and cognitive functioning by comparing the cognitive performance profiles of persons with a history of generally mild COVID-19 to HCs or normative referents. The first hypothesis was that COVID-19 samples would perform worse on cognitive tests compared with HCs and/or normative-referenced criteria. Meta-analytical evidence supported this hypothesis as COVID-19 samples showed mildly reduced cognitive performance relative to HCs and norm-reference data. The largest (moderate magnitude) negative effect was observed for cognitive screeners, whereas global cognition, processing speed, learning/memory, language, simple/complex attention and executive functions yielded small but significant pooled effects. Findings were unchanged across sensitivity analyses. Considerable heterogeneity was found across analyses, though our pooled estimates were broadly smaller than previous studies (typically >−0.40 SD units).15–17 This may reflect the balance of our robust and representative sample of studies, thorough inclusion criteria of non-complex/largely mild COVID-19, and conservative pooled estimate index (g). More broadly, pooled estimates from our meta-analytic synthesis fall in line with recent findings from a large-scale community-based long COVID sample (n=112 964), which noted small effects for memory and executive function tasks (−0.33 to −0.20 SD units).29

The second hypothesis was that emotional distress and fatigue would moderate effects such that COVID-19 samples with elevated symptoms would yield stronger effects. Moderator analyses generally supported this hypothesis as fatigue significantly predicted heterogeneity in meta-regression analysis, and larger negative effects were associated with COVID-19 samples with elevated depression, anxiety and fatigue symptoms in subgroup analyses. However, depression and anxiety failed to reach significance, which may reflect limited statistical power (ie, small number of subgroups in respective analyses), high heterogeneity and our studying a limited range of affective severity (as we excluded studies/samples with known psychiatric comorbidities). Indeed, among COVID-19 samples with elevated mood symptoms, the majority fell in the ‘mild’ range. Next, markers of COVID-19 severity, particularly per cent of sample requiring hospitalisation and mechanical ventilation, accounted for a sizeable proportion of effect heterogeneity (ICU placement also predicted considerable variance but did not reach significance likely due to limited power and high heterogeneity). This is consistent with data from a recent large community-based study which found greater cognitive deficits in hospitalised patients (−0.35 SD units).29 Post hoc analysis also revealed that duration from COVID-19 infection was not associated with neurocognitive outcomes; this finding is consistent with our focus on patients with milder (ie, less severe) COVID-19.

Given the large number of people who report cognitive concerns following COVID-19 infection, the current findings suggest that the subjective cognitive impairment experienced by many with COVID-19 is associated with mood symptoms, fatigue and disease severity, in keeping with prior literature.30 31 Cognitive deficits are more likely when disease severity is high, such as when ICU or ventilator treatment is required. ICU admission is associated with a variety of postdischarge complications (eg, psychiatric symptoms, delirium, cognitive dysfunction), and the constellation of these symptoms is collectively referred to as ‘postintensive care syndrome’.32 Rates and extent of cognitive impairment are similar between patients admitted for acute respiratory distress syndrome and general ICU admission, and there are numerous plausible pathophysiological mechanisms for such, including multiorgan failure, neuroinflammation, hypoxia/anoxia, medication effects, metabolic factors and immune response.33 Given the shared characteristics of those with COVID and non-COVID pathologies admitted to ICU (eg, older age, hypoxia/anoxia, respiratory/cardiac failure, medication exposures, mechanical ventilation), those admitted to ICU with severe/critical COVID likely experience similar postdischarge clinical complications to those with non-COVID pathologies.

Our findings mirror those reported in the postinfective chronic fatigue syndrome literature. Early investigations identified mild attention and concentration deficits among those with chronic fatigue syndrome, and degree of neurocognitive impairment was proportionate to depression and fatigue symptoms.34 Findings in postviral fatigue are mixed as objective cognitive test performance is often better than what patients’ subjective (self-reported) concerns would indicate.35 Notably, postviral fatigue is not exclusive to long-COVID; rather, it is commonly seen in a variety of infectious diseases such as Epstein-Barr, dengue, chikungunya, Ebola, Coxiella burnetii and Giardia lamblia.36 While there is no consensus with respect to the definition of postviral fatigue and its pathogenic mechanisms,37 38 initial illness severity, pre-existing medical conditions, and baseline psychiatric functioning are key predictors of postinfective fatigue rather than postviral activity.36 Much like the literature on patients with mild TBI-related cognitive symptoms,39 those with cognitive sequela following COVID-19 infection may benefit from education about the expected recovery from milder infection courses in combination with treatment for mood symptoms and/or other symptoms that are maintaining them (eg, sleep disturbance, fatigue).40

Beyond psychiatric and other somatosensory symptoms, central nervous system alterations have been observed in patients across the COVID-19 disease severity spectrum via indirect (eg, hypoxia, neuroinflammation, immunologic/hypercytokinaemia) and direct (eg, cerebrovascular) pathophysiological mechanisms.41 In addition, there is mounting evidence of neuroaxonal damage in relation to COVID-19 infection, as measured by neurofilament light chain protein (NfL), which may contribute to perceived and observed cognitive deficits.42 In addition to indexing neuronal damage, this biomarker is linked with a variety of health outcomes in the context of COVID-19, including disease duration and severity, as well as likelihood of ICU admission, mechanical ventilation and death.43 Meta-analytical evidence also indicates that blood NfL concentration is elevated (compared with HCs) in the acute phase of COVID-19, even in patients with no or mild-grade neurological symptoms,43 suggesting NfL’s sensitivity to subclinical/subtle neuronal damage.42 On the other hand, patients with severe infection accompanied by neuroinflammatory conditions (eg, meningitis, encephalitis) show higher cerebrospinal fluid NfL levels relative to mild-severity referents, thus suggesting compounded axonal injury in the setting of neurologic complications superimposed on COVID infection.42

In a separate vein, this meta-analysis highlighted a central limitation of the COVID-19 literature: only three of 54 studies used PVTs, which are integral in appraising task engagement. Factitious and somatisation presentations are more common in milder illnesses with prolonged or poorly defined courses (eg, concussion),39 44–46 and outcomes differ in studies without PVTs. Preliminary evidence also shows this in patients 6 months postrecovery from COVID-19 infection.14 47 Lack of task engagement assessment via PVTs should be a concern to the broader postinfective chronic fatigue literature, as this clinical presentation shows an increased predisposition to suboptimal task engagement.48 Collectively, additional research is warranted to clarify whether cognitive deficits are attributable to fatigue, psychiatric symptoms, inadequate task effort and/or postviral neurobiological changes among those with long-COVID.49

Limitations

This study had several strengths, including our adherence to MOOSE guidelines,18 assessment of study quality using NIH criteria, the inclusion of objective neurocognitive test data from standard measures that are routinely used in clinical practice, and the large globally representative sample of studies analysed (with 37% of studies featuring participants from diverse geographical regions outside of USA/Western Europe). Further, no studies were primarily excluded during the selection process due to their being published in a non-English language. However, several limitations, some of which extend from problems with the literature base, warrant discussion. Causal inferences cannot be drawn from the findings of this meta-analysis as we exclusively synthesised data from cross-sectional studies. Thus, it remains possible that cognitive changes observed in the COVID-19+ samples analysed may have been caused by factors other than COVID-19 infection. Relatedly, without longitudinal data, we are unable to rule out the possibility that patients with pre-existing cognitive problems (and other vulnerabilities such as psychiatric problems) are at increased risk of having persistent sequelae following COVID-19 infection. In this vein, mood symptoms were associated with worse cognitive performance. Alternatively, recent empirical evidence observed reduced peripheral serotonin among patients with PASC, which is thought to contribute to both mood and neurocognitive symptoms via vagal nerve and hippocampal dysfunction.50

Next, heterogeneity was high in most analyses, which is not surprising given the multidimensional nature of neurocognitive tests.51 We explored sources of heterogeneity, though some moderator analyses likely suffered from limited statistical power. In this vein, some neurobehavioural symptoms and COVID-19 severity variables were non-significant, though this is partly by design as we restricted clinical samples to those medically and psychiatrically non-complex. Despite our stringent inclusion criteria, many studies we analysed featured a mixed-severity sample rather than a purely non-hospitalised or asymptomatic group. This was by virtue of the literature as there were insufficient studies exclusively on non-hospitalised individuals. Thus, we included studies with mixed samples if the majority were not clearly or fully made up of severe cases (ie, history of ICU or ventilation). While attempts were made to control for the effects of disease severity, it was not always possible to determine the proportion of cases with severe illness. Further, some studies were included when it was not entirely certain if the hospitalised group received ICU care or not, which may have inflated our findings.
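To make the notion of "high heterogeneity" concrete, the following is a minimal sketch of how between-study heterogeneity is conventionally quantified with Cochran's Q and the I² statistic. This passage does not state which heterogeneity statistic was used, so the choice of Q and I² is an assumption, and the effect sizes and variances below are hypothetical values chosen only for illustration; they are not data from this meta-analysis.

```python
def cochran_q_and_i2(effects, variances):
    """Return Cochran's Q and I^2 (as a percentage) for a set of study effects.

    effects   -- per-study effect sizes (e.g., standardised mean differences)
    variances -- per-study sampling variances of those effects
    """
    # Inverse-variance weights, as in a fixed-effect pooled estimate.
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

    # Q: weighted sum of squared deviations from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # I^2: share of total variation attributable to between-study differences.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2


# Hypothetical inputs (NOT taken from the meta-analysis).
hypothetical_effects = [0.1, 0.5, 0.9]
hypothetical_variances = [0.01, 0.01, 0.01]

q, i2 = cochran_q_and_i2(hypothetical_effects, hypothetical_variances)
print(f"Q = {q:.2f}, I^2 = {i2:.2f}%")
```

In this illustrative case the three effects differ far more than their sampling variances would predict, so most of the observed variation is attributed to between-study differences rather than chance.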

Notably, the robust pooled effects yielded by cognitive screeners have been seen in other meta-analyses of distinct clinical populations.52 53 This finding may in part be an artefact of relatively small measurement error, range restriction and ceiling effects that accompany most cognitive screeners. In this context, group SDs are often small for these tasks, which thereby produces robust effect sizes (even when group mean differences appear trivial). As such, we do not purport that cognitive screening instruments are more sensitive to milder COVID-19 infection than formal neuropsychological test batteries based on our findings (nor are they a suitable substitute). Future investigations may wish to clarify the incremental value of cognitive screeners versus comprehensive neuropsychological batteries with respect to COVID-19 outcomes such as recovery.
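The arithmetic behind this point can be sketched briefly. The example below uses Cohen's d with a pooled SD purely for simplicity; it is an assumption that this approximates the effect size metric used in the analyses, and all means, SDs and sample sizes are hypothetical values chosen for illustration rather than data from any included study.

```python
import math


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardised mean difference (group A minus group B) using the pooled SD."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


# Hypothetical 30-point cognitive screener: controls score 1 point higher,
# but ceiling effects keep the group SDs small (1.5 points).
screener_d = cohens_d(27.0, 1.5, 50, 26.0, 1.5, 50)

# Hypothetical T-score test from a full battery: controls score 3 points higher,
# with the conventional T-score SD of 10.
battery_d = cohens_d(50.0, 10.0, 50, 47.0, 10.0, 50)

print(f"screener d = {screener_d:.2f}, battery d = {battery_d:.2f}")
```

Under these assumed values, a 1-point raw difference on the screener yields a larger standardised effect than a 3-point difference on the battery test, solely because the screener's SD is smaller.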

Future research would benefit from greater utilisation of PVTs to increase the validity of study conclusions. Additionally, further cross-sectional and longitudinal studies examining milder illness courses exclusively will provide greater clarity about the neurocognitive effects for the majority of people who have been exposed to COVID-19 but did not require hospitalisation or experience confounding complications from the illness that may better explain the presence of cognitive sequelae. Longitudinal data will also help clarify the temporal association between cognitive and psychiatric symptoms in the context of COVID-19 infection (especially in relation to risk of developing long-COVID). Inconsistent documentation practices of factors such as illness severity (with only 15.9% of samples explicitly designated a disease severity by study authors) and time since infection precluded our ability to analyse these important variables.31 Lastly, future studies should consider using established harmonised neuropsychological test batteries to mitigate the marked heterogeneity we observed and enhance the comparability of findings, even within domains.54

Conclusion

This meta-analysis demonstrated that milder COVID-19 infection is associated with a small reduction in performance across cognitive domains, with deficits observed in cognitive screeners, global cognition, processing speed, learning and memory, language, simple/complex attention and executive functioning. Fatigue, mood symptoms and disease severity are associated with effect sizes, suggesting that those with milder COVID-19 infection who do not develop mood symptoms or fatigue, and who did not require hospitalisation/ICU or mechanical ventilation, are unlikely to demonstrate clinically significant differences on neurocognitive testing relative to HCs. Importantly, these findings may generalise to patients with long-COVID/PASC as over three-fourths of our samples were characterised as such. We recommend practitioners prioritise treating mood and fatigue symptoms among patients reporting persistent brain fog (among other cognitive sequelae) following the postacute phase of COVID-19 infection.

Data availability statement

Data are available on reasonable request to the corresponding author.

Acknowledgments

We thank medical librarians John M. Shewfelt and Jason G. Smith for their assistance with creating the search strategy for the meta-analysis. We also extend appreciation to William Otero, Ph.D., Eli M. Dapolonia, Ph.D., and Sherry L. Thrasher, Psy.D. for their support related to this project. Authors Sarah A. B. Knapp, Ph.D., David S. Austin, Psy.D. and Stephen L. Aita, Ph.D. share co-first authorship of this manuscript.

References

Footnotes

  • X @AitaPhD

  • Contributors Concept and design: SABK, DSA, SLA, JEC and BDH. Acquisition, analysis or interpretation of data: SABK, DSA, SLA. Drafting of the manuscript: SABK, DSA, SLA, JEC. Critical revision of the manuscript for important intellectual content: All authors. Statistical analyses: SLA, TO and NCB. Obtaining funding: N/A. Administrative, technical or material support: TO, NCB and BDH. Supervision: SLA, JEC, NCB, VADB, RMR, WPM and BDH. SLA is responsible for the overall content as guarantor.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.