Elsevier

Journal of Neuroscience Methods

Volume 253, 30 September 2015, Pages 254-261

Computational neuroscience
Amygdalar and hippocampal volume: A comparison between manual segmentation, Freesurfer and VBM

https://doi.org/10.1016/j.jneumeth.2015.05.024

Highlights

  • Comparison of amygdala and hippocampus volume between manual and automatic segmentation with Freesurfer and VBM8.

  • Concordance coefficients reveal weak agreement between methods.

  • Volume estimation with VBM8 is a feasible alternative to Freesurfer V5.

Abstract

Automated segmentation of the amygdala and the hippocampus is of interest for research on large datasets, where manual segmentation of T1-weighted magnetic resonance tomography images is less feasible for morphometric analysis. Manual segmentation remains the gold standard for subcortical structures such as the hippocampus and the amygdala. A direct comparison of VBM8 and Freesurfer is rarely done, because VBM8 results are most often used for voxel-based analysis. We used the same region-of-interest (ROI) for Freesurfer and VBM8 to relate automated and manually derived volumes of the amygdala and the hippocampus. We processed a large manually segmented dataset of n = 92 independent samples with an automated segmentation strategy (VBM8 vs. Freesurfer Version 5.0). For statistical analysis, we not only calculated Pearson's correlation coefficients but also used methods developed for method comparison, such as Lin's concordance coefficient. The correlation between automatic and manual segmentation was high for the hippocampus [0.58–0.76] and lower for the amygdala [0.45–0.59]. However, concordance coefficients point to higher concordance for the amygdala [0.46–0.62] than for the hippocampus [0.06–0.12]. VBM8 and Freesurfer segmentation performed at a comparable level relative to manual segmentation.

We conclude (1) that correlation alone does not capture systematic differences (e.g. of hippocampal volumes), (2) that calculation of ROI volumes with VBM8 gives measurements comparable to Freesurfer V5.0 when using the same ROI and (3) that systematic and proportional differences are caused mainly by different definitions of anatomic boundaries and only to a lesser extent by different segmentation strategies. This work underscores the importance of using method comparison techniques and demonstrates that even with high correlation coefficients, there can still be large differences in absolute volume.
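The gap between correlation and concordance described above can be illustrated with a minimal sketch. It assumes Lin's concordance correlation coefficient in its usual form, 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), with population (1/n) moments. The function names and the synthetic volumes are illustrative assumptions, not the study's data or code.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation: sensitive to linear association only."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])


def lin_ccc(x, y):
    """Lin's concordance correlation coefficient.

    Unlike Pearson's r, the denominator grows with the squared
    difference of the means, so a constant offset between two
    methods lowers the coefficient.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mean_x, mean_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()  # population variances (ddof=0)
    cov_xy = ((x - mean_x) * (y - mean_y)).mean()
    return float(2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2))


# Synthetic example (assumed values, in mm^3): the "automated" volumes
# track the "manual" ones closely but are shifted by a constant offset.
rng = np.random.default_rng(0)
manual = rng.normal(3000.0, 300.0, size=92)
automated = manual + 900.0 + rng.normal(0.0, 100.0, size=92)

print("Pearson r:", round(pearson_r(manual, automated), 2))
print("Lin's CCC:", round(lin_ccc(manual, automated), 2))
```

In this toy setting the correlation stays high while the concordance coefficient drops sharply, because the offset between the two sets of volumes enters the denominator of the concordance coefficient but cancels out of Pearson's r.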

Introduction

Volumetric measurement of hippocampal or amygdalar volume is not only of interest because it reflects physiological processes but it might also gain clinical significance as a neuroimaging biomarker for diagnosis and prognostic evaluation.

Changes in hippocampal and amygdalar volumes have been reported in the literature for several mental and neurological disorders. Loss of hippocampal volume has been described for epilepsy (Tasch et al., 1999); for Alzheimer's disease, it has been discussed as a diagnostic biomarker. Other disorders include depression or PTSD (Bremner et al., 1995, Bremner et al., 1997), where the neurotoxic effect of stress-related glucocorticoid excretion has been discussed. Changes in amygdalar volume have also been implicated in fear memory. Thus, changes in amygdalar volumes have been discussed for PTSD (Rogers et al., 2009), obsessive-compulsive disorder (Szeszko et al., 2004) or borderline personality disorder (Ruocco et al., 2012). As these studies were conducted in a clinical context, a conclusion often drawn from these results is that independent diagnostic markers can be derived from them.

With data sets becoming larger, the need for automated instead of time-consuming manual segmentation has emerged. Several software packages enable such an automated estimation of volume, with Freesurfer, FSL (FMRIB Software Library) and VBM (voxel-based morphometry, namely the MATLAB-based toolbox VBM8) being among the most popular. There have been several methodological studies looking at the effects of different image-processing strategies for segmentation in general, or comparing different software packages with each other or with a manual “gold standard”. Most of these studies concentrated on the comparison of manual versus automated segmentation of the hippocampus or the amygdala in young and healthy subjects, with correlation coefficients ranging from 0.6 to 0.9 (Wenger et al., 2014, Klauschen et al., 2009, Nugent et al., 2012), strongly depending on the type of image processing and the brain region involved. Some studies examined whether automated or manual segmentation maximized group differences in mental and neurological disorders. In a recent genetic meta-analysis, hippocampal volume estimates from different sources and methods from 5000 participants were pooled to find a genomic association, illustrating the need to understand the agreement between different measures (Stein et al., 2012).

Although comparative studies exist, our study adds novel aspects to the question of comparability. First, most studies on methodological comparisons concentrated on Pearson's correlation coefficient or related intraclass correlation coefficients (ICCs). Here we have also added methods that more adequately capture details of the method comparison, involving Bland–Altman plots, Passing–Bablok regression and Lin's concordance coefficient, to test the methods against the gold standard (manual segmentation). Second, as these software packages have been implemented with new algorithms in recent versions, it is interesting to examine the performance of the latest versions under scientific “everyday” conditions. VBM8 employs innovative Markov-Random-Fields (MRF) and a high-dimensional non-linear warping; the latest FreeSurfer version, for example, makes skull stripping with graph cuts available and uses extensive “look-up-tables”. Third, as we had access to a large (n = 92) set of manually segmented amygdalae and hippocampi, we were interested in comparing two widely used software packages (FreeSurfer and VBM8) to manual segmentation. This is especially interesting because most methodological studies compare FreeSurfer with FSL. VBM is most often utilized in the context of statistical parametric mapping (SPM) for voxel-wise analysis (Ashburner and Friston, 2000). However, it is possible to use this workflow for calculation of region-based volumes. As this might be an interesting alternative strategy for users, we provide an exemplary comparison to the Freesurfer V5.0 package. Finally, most other studies reported on smaller samples when manual segmentation was used (cf. Table 3).
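Passing–Bablok regression, named above as one of the method comparison techniques, can be sketched as follows. This is a simplified point-estimate version under stated assumptions: it uses the shifted median of all pairwise slopes, omits confidence intervals and the linearity test, and assumes the two methods are positively related. The function name `passing_bablok` is hypothetical and does not refer to the analysis code used in the study.

```python
import numpy as np


def passing_bablok(x, y):
    """Point estimates of slope and intercept (simplified sketch).

    A slope different from 1 suggests a proportional difference
    between two methods; an intercept different from 0 suggests a
    constant (systematic) difference.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    slopes = []
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = x[j] - x[i]
            dy = y[j] - y[i]
            if dx == 0 and dy == 0:
                continue  # identical points carry no slope information
            if dx == 0:
                slopes.append(np.inf if dy > 0 else -np.inf)
                continue
            s = dy / dx
            if s == -1:
                continue  # slopes of exactly -1 are discarded
            slopes.append(s)
    slopes = np.sort(np.array(slopes))
    m = len(slopes)
    k = int(np.sum(slopes < -1))  # offset that shifts the median
    if m % 2 == 1:
        slope = slopes[(m + 1) // 2 + k - 1]
    else:
        slope = 0.5 * (slopes[m // 2 + k - 1] + slopes[m // 2 + k])
    intercept = float(np.median(y - slope * x))
    return float(slope), intercept
```

Called as `passing_bablok(manual_volumes, automated_volumes)`, where both arguments are assumed to be equally long sequences of volumes for the same subjects, the function returns the slope and intercept of the fitted line.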

Participants

Ninety-two participants were recruited as part of an ongoing study on predictors of posttraumatic stress disorder in not yet trauma-exposed individuals (paramedics at the beginning of their training). Participants shared a common educational background and were in the same age range (18–34 years; mean: 21.64; standard deviation: 2.57). Data on this sample were reported earlier in a study on hippocampal volume and fear conditioning (Pohlack et al., 2012). Subjects with mental disorder as

Visualization with Bland–Altman plots

The Bland–Altman charts and scatterplots (see right side of Fig. 1) allow a simple qualitative inspection of scattering and distortion of the data, as well as inspection of outliers. We restricted them to VBM8 versus manual segmentation of the right hemisphere, as the left hemisphere and Freesurfer showed the same pattern when inspected visually. As the Bland–Altman plots showed a deviation of the differential value (delta on the y-axis) from zero, this indicates a systematic difference
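The quantities behind a Bland–Altman plot can be computed with a short sketch. It assumes the conventional definitions: the per-subject mean of the two methods on the x-axis, their difference on the y-axis, the mean difference as bias, and bias ± 1.96 standard deviations as limits of agreement. The function name and the direction of the difference (first argument minus second) are assumptions for illustration.

```python
import numpy as np


def bland_altman(method_a, method_b):
    """Return the values needed to draw a Bland-Altman plot.

    mean : per-subject average of both methods (x-axis)
    diff : per-subject difference a - b (y-axis, the "delta")
    bias : mean difference; a value away from zero points to a
           systematic difference between the methods
    loa  : lower and upper 95% limits of agreement
    """
    a = np.asarray(method_a, dtype=float)
    b = np.asarray(method_b, dtype=float)
    mean = (a + b) / 2.0
    diff = a - b
    bias = float(diff.mean())
    sd = float(diff.std(ddof=1))  # sample standard deviation
    loa = (bias - 1.96 * sd, bias + 1.96 * sd)
    return {"mean": mean, "diff": diff, "bias": bias, "loa": loa}
```

Plotting `diff` against `mean` with horizontal lines at `bias` and the two `loa` values gives the usual chart; a trend in the point cloud, rather than a flat band, would hint at a proportional difference.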

Discussion

The purpose of this study was to compare manual and automatic segmentation of hippocampus and amygdala. In addition, we present and discuss two alternative methods for segmentation (VBM8) and agreement analysis (CCC). Automated segmentation techniques are heterogeneous. In model-based segmentation methods, an MRI atlas that was previously manually labeled by an expert rater is matched to target images using nonlinear registration methods. The resulting nonlinear transformation is applied to the

Acknowledgments

This work was supported by a grant from the Deutsche Forschungsgemeinschaft to HF (SFB636/C1).

References (51)

  • K.K. Leung et al.

    Brain MAPS: an automated, accurate and robust brain extraction technique using a template library

    Neuroimage

    (2011)
  • V.A. Magnotta et al.

    Structural MR image processing using the BRAINS2 toolbox

    Comput Med Imaging Graph

    (2002)
  • N.V. Malykhin

    Three-dimensional volumetric analysis and reconstruction of amygdala and hippocampal head, body and tail

    Psychiatry Res

    (2007)
  • R.A. Morey et al.

    A comparison of automated segmentation and manual tracing for quantifying hippocampal and amygdala volumes

    Neuroimage

    (2009)
  • J. Pipitone et al.

    Multi-atlas segmentation of the whole hippocampus and subfields using multiple automatically generated templates

    Neuroimage

    (2014)
  • M.M. Plichta et al.

    Test-retest reliability of evoked BOLD signals from a cognitive-emotive fMRI test battery

    Neuroimage

    (2012)
  • M.A. Rogers

    Smaller amygdala volume and reduced anterior cingulate gray matter density associated with history of post-traumatic stress disorder

    Psychiatry Res

    (2009)
  • A.C. Ruocco et al.

    Amygdala and hippocampal volume reductions as candidate endophenotypes for borderline personality disorder: a meta-analysis of magnetic resonance imaging studies

    Psychiatry Res

    (2012)
  • R. Wolz et al.

    LEAP: learning embeddings for atlas propagation

    Neuroimage

    (2010)
  • E. Alanen

    Everything all right in method comparison studies?

    Stat Methods Med Res

    (2012)
  • K. Amunts et al.

    BigBrain: an ultrahigh-resolution 3D human brain model

    Science

    (2013)
  • J.M. Bland et al.

    Measuring agreement in method comparison studies

    Stat Methods Med Res

    (1999)
  • J.D. Bremner et al.

    MRI-based measurement of hippocampal volume in patients with combat-related posttraumatic stress disorder

    Am J Psychiatry

    (1995)
  • F. Cardinale et al.

    Validation of Freesurfer-estimated brain cortical thickness: comparison with histologic measurements

    Neuroinformatics

    (2014)