Elsevier

NeuroImage

Volume 184, 1 January 2019, Pages 180-200
Retrospective harmonization of multi-site diffusion MRI data acquired with different acquisition parameters

https://doi.org/10.1016/j.neuroimage.2018.08.073

Highlights

  • A multi-site diffusion MRI harmonization method that can remove scanner-specific effects across sites.

  • The proposed method accounts for minor differences in acquisition parameters such as b-value, spatial resolution and number of gradient directions.

  • Biological differences due to gender and age in different groups are preserved.

  • At least 16 to 18 well-matched healthy controls from each site are needed to reliably capture scanner-related differences.

Abstract

A joint and integrated analysis of multi-site diffusion MRI (dMRI) datasets can dramatically increase the statistical power of neuroimaging studies and enable comparative studies pertaining to several brain disorders. However, dMRI datasets acquired on multiple scanners cannot be naively pooled for joint analysis due to scanner-specific nonlinear effects as well as differences in acquisition parameters. Consequently, for joint analysis, the dMRI data has to be harmonized, which involves removing scanner-specific differences from the raw dMRI signal. In this work, we propose a dMRI harmonization method that is capable of removing scanner-specific effects, while accounting for minor differences in acquisition parameters such as b-value, spatial resolution and number of gradient directions. We validate our algorithm on dMRI data acquired from two sites: Philadelphia Neurodevelopmental Cohort (PNC) with 800 healthy adolescents (ages 8–22 years) and Brigham and Women's Hospital (BWH) with 70 healthy subjects (ages 14–54 years). In particular, we show that gender and age-related maturation differences in different age groups are preserved after harmonization, as measured using effect sizes (small, medium and large), irrespective of the test sample size. Since we use matched control subjects from different scanners to estimate scanner-specific effects, our goal in this work is also to determine the minimum number of well-matched subjects needed from each site to achieve the best harmonization results. Our results indicate that at least 16 to 18 well-matched healthy controls from each site are needed to reliably capture scanner-related differences. The proposed method can thus be used for retrospective harmonization of raw dMRI data across sites despite differences in acquisition parameters, while preserving inter-subject anatomical variability.

Introduction

The sensitivity of diffusion MRI to microscopic molecular motion forms the foundation for studying the neural architecture of the brain. However, these measurements are affected by different hardware specifications (magnetic field strength, number of receiver coils, etc.) and different acquisition parameters (echo time, diffusion time, gradient strength, voxel size, number of gradient directions, etc.) (Helmer et al., 2016). Therefore, the data acquired by each scanner is substantially different even for the same subject. In fact, even if the same subject is scanned with the same hardware from the same manufacturer, the diffusion signal can still be different (Vollmar et al., 2010). This is due to differences in magnetic field inhomogeneities, sensitivity of receiver coils, the number of receiver coils used, vendor-specific MRI reconstruction algorithms and differences in acquisition parameters. Consequently, dMRI data must be harmonized prior to joint analysis.

Several studies have characterized both intra-scanner and inter-scanner variability in structural and dMRI data (Landman et al., 2011, 2007). Based on their study, Walker et al. (2013) recommend the use of physical phantoms to monitor and quickly detect any scanner-related changes in ongoing neuroimaging studies. While the use of physical phantoms is necessary, it is inadequate for capturing the regional and tissue-specific scanner differences. Further, it is non-trivial to use the scanner differences observed in physical phantoms to correct human in-vivo data, due to the complexities of biological tissue.

Existing techniques for data pooling or harmonization are based on diffusion tensor imaging (DTI) derived metrics (Salimi-Khorshidi et al., 2009; Jahanshad et al., 2013; Kochunov et al., 2014; Forsyth et al., 2014; Venkatraman et al., 2015; Jenkins et al., 2016; Pohl et al., 2016; Fortin et al., 2017). For instance, Salimi-Khorshidi et al. (2009), Jahanshad et al. (2013), Kochunov et al. (2014), Palacios et al. (2016) and Kelly et al. (2017) use a meta-analysis approach, which involves combining z-scores of a given diffusion measure (e.g. fractional anisotropy (FA)) from all sites to determine group differences. However, the subject population at each site may not be sufficient to capture the variance of the entire population, a critical requirement to ensure proper pooling and analysis of the z-scores (which depends on the variance and not just the population mean). Further, z-scores may not be the best statistic to use if the distribution of the diffusion measure in the population is not Gaussian (normal). On the other hand, Forsyth et al. (2014), Venkatraman et al. (2015) and Fortin et al. (2017) use statistical covariates to regress out the differences between sites in DTI measures such as FA, mean diffusivity (MD) or cortical thickness. Of particular note is the work of Pohl et al. (2016), where the authors use information from 3 traveling subjects to obtain a linear correction factor for scanner-related effects in FA (a different correction factor for each ROI analyzed). This method, however, has limitations when using large ROIs (such as the corticospinal tract), as the scanner-related effects are not only non-linear but also regionally varying (see Mirzaalian et al., 2016 and Fig. 2). Thus, due to the regional variability of the diffusion signal, using a single regressor for large ROIs can lead to erroneous results in the aggregated data (Mirzaalian et al., 2016; Fortin et al., 2017). Further, it is also well known that differences in the signal-to-noise ratio (SNR) of the acquisitions at each site might add to variability in the estimated dMRI parameters such as FA (Farrell et al., 2007).
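To make the meta-analysis idea concrete, the following is a minimal Python sketch of one common way to combine per-site z-scores, the sample-size-weighted Stouffer method. The choice of Stouffer weighting, the function name `combine_z_scores` and the numbers are assumptions for illustration only; the cited studies may combine their statistics differently.

```python
import math


def combine_z_scores(z_scores, sample_sizes):
    """Combine per-site z-scores with weights proportional to sqrt(n).

    This is the weighted Stouffer method, assumed here as one plausible
    instance of "combining z-scores from all sites".
    """
    if not z_scores or len(z_scores) != len(sample_sizes):
        raise ValueError("need one sample size per z-score")
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


# Hypothetical per-site z-scores for a group difference in FA.
site_z = [1.2, 2.1, 0.4]
site_n = [30, 120, 15]
combined_z = combine_z_scores(site_z, site_n)
print(f"combined z-score: {combined_z:.3f}")
```

Each site's z-score is computed against that site's own variance estimate, so a small site contributes a noisy z-score, which is the concern raised above.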

All of the methods mentioned above have to correct for scanner-specific effects in each diffusion measure of interest separately, i.e., a linear correction factor for each diffusion measure, thus making the harmonization procedure entirely model-specific (e.g. single tensor). Recently, Fortin et al. (2017) proposed a powerful and fast statistical data pooling tool that uses ComBat (a batch-effect correction tool used in genomics) for retrospective data harmonization. This method estimates an additive and a multiplicative site-effect coefficient at each voxel, thus accounting for regional scanner differences. ComBat works on the finalized parameter maps, and in practice can be applied to any parameter map (e.g. FA, MD, mean kurtosis, etc.). However, its optimization procedure assumes that the site-effect parameters follow a particular parametric prior distribution (Gaussian and inverse-gamma), which might not generalize to all scenarios or to measures derived from other models (e.g., multi-compartment models). Besides, it is not clear how the nonlinearities in the signal due to site effects propagate through the preprocessing techniques as well as the model fitting procedures.
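The idea of an additive and a multiplicative site effect at a single voxel can be sketched as a plain location and scale adjustment. This is a deliberately simplified illustration and not ComBat itself: it omits the empirical Bayes priors and the biological covariates, and the function name `location_scale_adjust` and the FA values are hypothetical.

```python
from statistics import fmean, pstdev


def location_scale_adjust(values_by_site):
    """Remove per-site shift and scale for one measure at one voxel.

    values_by_site maps a site name to the list of subject values.
    Each site is re-centred on the grand mean and rescaled to the
    pooled within-site standard deviation.
    """
    all_values = [v for vals in values_by_site.values() for v in vals]
    grand_mean = fmean(all_values)

    # Residuals after removing each site's own mean (additive effect).
    residuals = {
        site: [v - fmean(vals) for v in vals]
        for site, vals in values_by_site.items()
    }
    pooled_std = pstdev([r for res in residuals.values() for r in res])

    adjusted = {}
    for site, res in residuals.items():
        site_std = pstdev(res) or 1.0  # guard against a constant site
        # Divide out the site's scale (multiplicative effect).
        adjusted[site] = [grand_mean + pooled_std * r / site_std for r in res]
    return adjusted


# Hypothetical FA values at one voxel for two sites.
fa = {
    "site_A": [0.50, 0.52, 0.48, 0.51],
    "site_B": [0.60, 0.66, 0.57, 0.63],
}
harmonized = location_scale_adjust(fa)
for site, vals in harmonized.items():
    print(site, round(fmean(vals), 4), round(pstdev(vals), 4))
```

After the adjustment both sites share the same mean and spread at this voxel. In a real study the groups must be matched or covariates modelled, otherwise true biological differences between sites would be removed along with the scanner effect.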

In our earlier works (Mirzaalian et al., 2016, 2017), we proposed a model-free dMRI harmonization method which can be used to harmonize the “raw dMRI signal” (and not just a particular dMRI measure of interest) across sites. However, that work focused exclusively on harmonizing dMRI data across sites with similar acquisition parameters. Thus, the method worked only when the spatial resolution and b-values were the same across sites. Additionally, the earlier method was not extensively validated on a large dataset.

In this work, we build further on our existing framework and propose a model-free harmonization method that learns an efficient mapping across scanners despite differences in scanner parameters. We extensively validate our algorithm on dMRI data acquired from two different sites with different acquisition parameters. We use two independent data sets of different sizes (BWH: 70 subjects and PNC: 800 subjects) to demonstrate that our harmonization method is not affected by the sample size, as opposed to existing approaches that require an accurate estimate of the variance of the underlying population in their model (e.g. meta-analysis methods). To this end, we compute effect sizes between groups separated by age and sex. Specifically, we show that the effect sizes, whether small, medium or large, are preserved by our harmonization procedure in both small (e.g. BWH) and large (e.g. PNC) data sets. Such validation experiments are necessary to robustly demonstrate the generalizability of any harmonization procedure for use with clinical research dMRI studies.
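As an illustration of what "preserving effect sizes" means, the sketch below computes a between-group effect size before and after a site-wide transformation. It assumes Cohen's d with a pooled standard deviation as the effect-size measure, which the text above does not specify; the function name `cohens_d`, the FA values and the affine transform are all hypothetical.

```python
import math
from statistics import fmean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (fmean(group_a) - fmean(group_b)) / math.sqrt(pooled_var)


# Hypothetical FA values for an older and a younger age group at one site.
older = [0.55, 0.58, 0.60, 0.57, 0.59]
younger = [0.52, 0.54, 0.56, 0.53, 0.55]
d_before = cohens_d(older, younger)


# A toy "harmonization" that applies the same scale and shift to every
# subject of the site; a transform of this kind leaves d unchanged.
def toy_map(values):
    return [1.1 * v + 0.02 for v in values]


d_after = cohens_d(toy_map(older), toy_map(younger))
print(f"d before: {d_before:.3f}, d after: {d_after:.3f}")
```

A real harmonization mapping is not a single affine transform, so in practice the two values are compared empirically rather than guaranteed to match.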

Using two different experiments, we demonstrate that at least 16 to 18 well-matched controls from each site are needed to obtain robust harmonization results. We also compare our technique with the ComBat statistical data pooling technique (Fortin et al., 2017) and demonstrate the limitations of the latter. Finally, we discuss the limitations of the proposed technique in the Limitations section.

Section snippets

Neurodevelopmental cohort (PNC)

We used dMRI data from 884 healthy participants from the publicly available NIH repository: Philadelphia Neurodevelopmental Cohort (PNC) study (Satterthwaite et al., 2014, 2016). The dMRI data was acquired on a Siemens TIM Trio whole-body scanner, using a 32-channel head coil and a twice-refocused spin-echo (TRSE) single-shot EPI sequence with the following parameters: TR = 8100 ms, TE = 82 ms, b-value of 1000 s/mm², 7 b=0 images. dMRI data was acquired with 64 diffusion-weighted directions

Experimental setup

In this section, we describe experiments to evaluate the performance of the proposed algorithm. First, to show that the harmonization works equally well irrespective of the choice of the reference site, we will evaluate the performance of our method using two experiments. In the first experiment, we choose BWH as the reference site and PNC as the target site, whereas in the second experiment we use PNC as the reference site and BWH as the target site. Another aim of these experiments is also to

Discussion

We believe that accurate harmonization of dMRI data is of utmost importance to allow for a large-scale data-driven way to understand brain disorders. In this paper, we presented a harmonization method to retrospectively remove scanner-specific differences from the raw dMRI signal across various sites, even if acquired with different acquisition parameters. The harmonization procedure requires a well-matched set of controls across sites to learn the mapping between sites.

Acquisition parameters,

Limitations

Despite its ability to harmonize clinical dMRI data, the proposed method nevertheless has certain limitations. First, the b-value matching procedure described herein works only in the lower b-value regime (500 < b < 1500 s/mm²). Beyond this range, a non-linear technique such as the one in Rathi et al. (2014) would have to be used for adjusting the b-values across sites. Note that this limitation is not specific to our technique, and is also a limitation of other techniques such as ComBat. Interpolation

Acknowledgments

We gratefully acknowledge funding provided by the following National Institutes of Health (NIH) grants: R01MH102377, K24MH110807 (PI: Dr. Marek Kubicki), R01MH097979 (PI: Dr. Yogesh Rathi), R01HD090641, R01MH112748, U01MH109977 (PI: Dr. Sylvain Bouix), R21MH115280 (PI: Dr. Lipeng Ning).

References (47)

  • M. Asato et al.

    White matter development in adolescence: a DTI study

    Cerebr. Cortex

    (2010)
  • B.B. Avants et al.

    A reproducible evaluation of ANTs similarity metric performance in brain image registration

    Neuroimage

    (2011)
  • B.B. Avants et al.

    The optimal template effect in hippocampus studies of diseased populations

    Neuroimage

    (2010)
  • M. Descoteaux et al.

    Regularized, fast, and robust analytical q-ball imaging

    Magn. Reson. Med.

    (2007)
  • T.B. Dyrby et al.

    Interpolation of diffusion weighted imaging datasets

    Neuroimage

    (2014)
  • J. Farrell et al.

    Effects of signal-to-noise ratio on the accuracy and reproducibility of diffusion tensor imaging-derived fractional anisotropy, mean diffusivity, and principal eigenvector measurements at 1.5 T

    J. Magn. Reson. Imag.

    (2007)
  • R.H. Fick et al.

    MAPL: tissue microstructure estimation using Laplacian-regularized MAP-MRI and its application to HCP data

    Neuroimage

    (2016)
  • J.K. Forsyth et al.

    Reliability of functional magnetic resonance imaging activation during working memory in a multi-site study: analysis from the North American Prodrome Longitudinal Study

    Neuroimage

    (2014)
  • J.P. Fortin et al.

    Harmonization of multi-site diffusion tensor imaging data

    Neuroimage

    (2017)
  • R.C. Gur et al.

    Sex differences in brain gray and white matter in healthy young adults: correlations with cognitive performance

    J. Neurosci.

    (1999)
  • K.G. Helmer et al.

    Multi-site study of diffusion metric variability: characterizing the effects of site, vendor, field strength, and echo time using the histogram distance

  • N. Jahanshad et al.

    Multi-site genetic analysis of diffusion images and voxelwise heritability analysis: a pilot project of the ENIGMA-DTI working group

    Neuroimage

    (2013)
  • J. Jenkins et al.

    Harmonization of methods to facilitate reproducibility in medical data processing: applications to diffusion tensor magnetic resonance imaging

    2016 IEEE International Conference on Big Data

    (2016)
  • M. Jenkinson et al.

    BET2: MR-based estimation of brain, skull and scalp surfaces

  • J.H. Jensen et al.

    Diffusional kurtosis imaging: the quantification of non-Gaussian water diffusion by means of magnetic resonance imaging

    Magn. Reson. Med.

    (2005)
  • E. Kellner et al.

    Gibbs-ringing artifact removal based on local subvoxel-shifts

    Magn. Reson. Med.

    (2015)
  • S. Kelly et al.

    Widespread white matter microstructural differences in schizophrenia across 4322 individuals: results from the ENIGMA Schizophrenia DTI working group

    Mol. Psychiatr.

    (2017)
  • P. Kochunov et al.

    Multi-site study of additive genetic effects on fractional anisotropy of cerebral white matter: comparing meta and megaanalytical approaches for data pooling

    Neuroimage

    (2014)
  • B.A. Landman et al.

    Effects of diffusion weighting schemes on the reproducibility of DTI-derived fractional anisotropy, mean diffusivity, and principal eigenvector measurements at 1.5 T

    Neuroimage

    (2007)
  • B.A. Landman et al.

    Multi-parametric neuroimaging reproducibility: a 3-T resource study

    Neuroimage

    (2011)
  • C. Lebel et al.

    Microstructural maturation of the human brain from childhood to adulthood

    Neuroimage

    (2008)
  • J.G. Malcolm et al.

    Filtered multitensor tractography

    IEEE Trans. Med. Imag.

    (2010)
  • H. Mirzaalian et al.

    Inter-site and inter-scanner diffusion MRI data harmonization

    Neuroimage

    (2016)