February 24, 2024

Super-resolution generative adversarial networks with static T2*WI-based subject-specific learning to improve spatial difference sensitivity in fMRI activation


Adhering to the Declaration of Helsinki, informed consent was obtained in writing from all participants prior to participation. The experimental protocols, which were approved by the Institutional Review Board at the National Institutes for Quantum and Radiological Science and Technology, conformed to the safety guidelines for MRI research.

A total of 35 healthy female volunteers (mean age 26.9 ± 6.7 years) with no history of neurological disease were selected as candidates for this study. The data from five subjects were excluded for the following reasons: the image data were damaged due to a technical error (1 subject), the candidate was visually impaired and unable to perform the task appropriately (1 subject), there were severe motion artifacts (1 subject), and the candidate failed to perform the task satisfactorily for indeterminate reasons (2 subjects).

MRI data acquisition

All subjects underwent a 3T MRI scan with a MAGNETOM Verio scanner (Siemens AG; Munich, Germany). fMRI scanning was performed using a gradient-echo echo-planar imaging (GE-EPI) sequence (echo time: 25 ms, repetition time: 500 ms, flip angle: 44°, field-of-view: 1440 mm × 1440 mm, acquisition matrix: 64 × 64, slice thickness: 4 mm, slices: 30, total scans: 900) during a finger-tapping task. In addition, T2*WI were acquired using a two-dimensional (2D) rapid gradient-echo sequence (echo time: 25 ms, repetition time: 2000 ms, flip angle: 90°, field-of-view: 240 mm × 240 mm, acquisition matrix: 128 × 128 and 64 × 64, slice thickness: 4 mm, number of slices: 30). Furthermore, T1-weighted MRI images were acquired using a three-dimensional (3D) magnetization-prepared rapid gradient-echo sequence (echo time: 1.98 ms, repetition time: 2300 ms, flip angle: 9°, field-of-view: 250 mm × 250 mm, acquisition matrix: 256 × 256, slice thickness: 1 mm). Table 1 shows the parameters of the fMRI, T2*-weighted MRI, and T1-weighted MRI scans.

Table 1 Magnetic resonance imaging scan parameters.

Finger-tapping procedure

A finger-tapping task was performed during fMRI scanning. Supplementary Figure 1 outlines the task protocol, which included phases of tapping either the thumb or little finger of one hand and resting phases between each task. Prior to beginning the experiment, participants were given sufficient time to familiarize themselves with the tasks and select which hand they would use for tapping. The instructions on which finger to tap or rest were provided on a screen behind the participant’s head, and were viewed through a mirror mounted on the head coil. The projection was presented using E-prime 1.0 (Psychology Software Tools, PA, USA). Each subject was instructed to tap the cued finger, but not the adjacent fingers, at their own pace.

Functional analysis

Before functional analysis, the first 60 scans were excluded to ensure that the magnetization had reached equilibrium [15]. After coregistration of the T1WI structural data to the automated anatomical labeling (AAL) atlas [16], the functional data were coregistered to the T1WI data. The two transformations were then combined to identify the motor area in the functional data sets. In addition, linear trends in the time series were removed, and the noise level was reduced by applying a low-pass filter to each pixel. Spatial filtering was also applied using a Gaussian filter with \(\sigma =1.5\).
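As an illustration of the first and third preprocessing steps, the following minimal Python sketch discards the initial scans and removes a least-squares linear trend from a single pixel's time series. The function names are hypothetical, and the original analysis was run in MATLAB; the low-pass and Gaussian filtering steps are omitted because the text does not give the filter cutoff or kernel support.

```python
def drop_initial_scans(series, n_discard=60):
    """Discard the first n_discard time points (60 scans in the text)."""
    return list(series[n_discard:])


def remove_linear_trend(series):
    """Subtract the least-squares straight line from a 1-D time series.

    Note: this also removes the mean, since the fitted line includes
    an intercept (an assumption; the text only says "linear trends").
    """
    n = len(series)
    if n < 2:
        raise ValueError("need at least two time points")
    t_mean = (n - 1) / 2.0
    y_mean = sum(series) / n
    # Slope of the least-squares fit y = a + b * t with t = 0, 1, ..., n - 1.
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    slope = num / den
    intercept = y_mean - slope * t_mean
    return [y - (intercept + slope * t) for t, y in enumerate(series)]
```

With 900 acquired scans, discarding the first 60 leaves 840 time points per pixel for the subsequent analysis.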

After this preprocessing, functional activation maps were obtained from the image time series by correlating the signal intensity time-course of each pixel with an on/off task design convolved with a canonical hemodynamic response function. SPM12 (revision 7219) [17] was used for the analysis. The cross-correlation (CC) coefficient was calculated for each pixel using

$$CC=\frac{\overrightarrow{R_x}\cdot \overrightarrow{R_y}}{\left|\overrightarrow{R_x}\right|\left|\overrightarrow{R_y}\right|},$$

where \(\overrightarrow{R_x}\) is the reference task design and \(\overrightarrow{R_y}\) is the signal intensity time-course of the pixel [15]. All image preprocessing and functional analysis were performed in MATLAB R2018b (MathWorks, Natick, MA, USA).
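The per-pixel correlation can be sketched in a few lines of Python as a normalized dot product between the reference and the pixel time-course. This is an illustrative sketch rather than the code used in the study: the function name is hypothetical, and mean-centering both vectors before the dot product is an assumption that the text does not state explicitly.

```python
import math


def cross_correlation(reference, signal):
    """Normalized dot product of two equally long time series.

    Both series are mean-centered first (an assumption), which makes
    the result equal to the Pearson correlation coefficient.
    """
    if len(reference) != len(signal):
        raise ValueError("reference and signal must have the same length")
    n = len(reference)
    ref_mean = sum(reference) / n
    sig_mean = sum(signal) / n
    rx = [r - ref_mean for r in reference]
    ry = [s - sig_mean for s in signal]
    dot = sum(a * b for a, b in zip(rx, ry))
    norm = math.sqrt(sum(a * a for a in rx)) * math.sqrt(sum(b * b for b in ry))
    if norm == 0.0:
        return 0.0  # a flat series carries no correlation information
    return dot / norm
```

A pixel whose signal rises and falls exactly with the task design yields a value close to 1, whereas a pixel with a constant signal yields 0.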

Deep learning-based super-resolution

Figure 1 depicts an overview of the proposed method. The STSS-SRfMRI scheme includes two unique ideas: first, it uses high spatial resolution static T2*WI as the training data; second, it applies subject-specific learning. As described in the introduction, the static T2*WI were used to introduce high spatial resolution information into the training process. Also, as functional signal changes are usually quite small, subject-specific learning was used to eliminate any anatomical variation that might be artificially introduced by including T2*WI data from other subjects.

Figure 1

Overview of the Static T2*WI-based Subject-Specific Super Resolution fMRI (STSS-SRfMRI) scheme proposed in this study. The upper and lower parts correspond to the training and testing phases, respectively. In the training phase, the generator (G) was optimized to form a relationship between the low-resolution and high-resolution T2*WI. The discriminator (D) made a decision whether the input was “real” (i.e., the reference high-res T2*WI) or “fake” (i.e., the generated high-res T2*WI). G learned to generate more realistic output via feedback from D. In the testing phase, a high-resolution functional MRI (fMRI) time series was reconstructed from the low-resolution fMRI data using the optimized generator, and subsequently a high-resolution functional map was calculated based on the high-resolution fMRI.

Before training, the pixel intensity of the T2*WI training data was adjusted and scaled to match the intensity of the fMRI data. All 30 slices of the T2*WI data from each subject were used for training and validation to build a subject-specific model. The trained model was then applied to the fMRI data from the same subject.

The SRGAN used in this work was customized in several ways. Rather than using an up-sampling block in the generator (G), the low-resolution images were upscaled to a 128 × 128 matrix size using Lanczos-3 interpolation [18,19] before being input. All batch normalization layers were also removed [20]. The discriminator (D) used 10 convolutional layers to accommodate the size of the input. We trained the modified SRGAN network using an adaptive moment estimation (Adam) optimizer with an initial decay rate of 0.9, a scaling factor of 2, a patch size of 64, a batch size of 2, an initial learning rate of 0.0001, and 100,000 iterations. The training images were the 30 slices of the corresponding T2*WI data. The experiments were implemented in PyTorch 1.1.0 on Ubuntu 16.04 LTS.
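To make the pre-upscaling step concrete, the sketch below implements the standard Lanczos kernel with three lobes and uses it to double the length of a single image row. It is an illustration only: the function names, the pixel-centre alignment, the edge replication, and the weight normalization are assumptions, and the interpolation routine actually used in the study may differ in these details.

```python
import math


def lanczos_kernel(x, a=3):
    """Lanczos window: sinc(x) * sinc(x / a) for |x| < a, else 0."""
    if x == 0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def upsample_row_2x(row, a=3):
    """Upsample a 1-D row by a factor of 2 with Lanczos-a weights.

    Edge samples are replicated and the weights are normalized to sum
    to 1 (both are assumptions made for this sketch).
    """
    n = len(row)
    out = []
    for j in range(2 * n):
        # Centre of output pixel j in input-pixel coordinates.
        x = (j + 0.5) / 2.0 - 0.5
        base = math.floor(x)
        total = 0.0
        weight_sum = 0.0
        for k in range(base - a + 1, base + a + 1):
            w = lanczos_kernel(x - k, a)
            src = min(max(k, 0), n - 1)  # clamp to the valid index range
            total += w * row[src]
            weight_sum += w
        out.append(total / weight_sum)
    return out
```

For a 2D slice, the same routine would be applied first along the rows and then along the columns, turning a 64 × 64 image into a 128 × 128 image.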

Identifying the neural activation-related region

The activation maps generated from the low-resolution fMRI data (the raw map) and from the processed output of the STSS-SRfMRI scheme (the STSS-SR fMRI map) were compared based on how effectively they localized the activation region. For this purpose, the regions corresponding to the thumb and little finger activation tasks were separately identified for the raw fMRI and STSS-SR fMRI maps of each subject. First, a CC map was calculated for each input image series (i.e., the raw or STSS-SR data) for each subject and each activated finger. Second, the activation-related region in each CC map was defined as the region consisting of pixels with values equal to or above a threshold value (see Fig. 2). The threshold value was defined as the lower bound of the top 25% of the range between the minimum and maximum CC values of the map.

Figure 2

Overview of how the activation-related region was defined for each tapping task. First, the activation maps were obtained from the raw and the Static T2*WI-based Subject-Specific Super Resolution fMRI (STSS-SRfMRI) image series (top row). Second, the threshold was set at the top 25% of the range between the maximum and minimum CC values (middle row). Finally, the region consisting of pixels with values equal to or higher than the threshold value was defined as the activation-related region (bottom).
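The thresholding step outlined in this caption can be sketched as follows. The sketch reads "the top 25% between the max and minimum CC values" as a threshold located 25% of the CC range below the maximum, which is an interpretation rather than a formula taken from the text; the function names are hypothetical.

```python
def activation_threshold(cc_values, top_fraction=0.25):
    """Threshold placed top_fraction of the CC range below the maximum."""
    cc_min = min(cc_values)
    cc_max = max(cc_values)
    return cc_max - top_fraction * (cc_max - cc_min)


def activation_region(cc_map, top_fraction=0.25):
    """Boolean mask of pixels whose CC value is at or above the threshold.

    cc_map is a list of rows, each row a list of CC values.
    """
    flat = [v for row in cc_map for v in row]
    threshold = activation_threshold(flat, top_fraction)
    return [[v >= threshold for v in row] for row in cc_map]
```

For a map with CC values ranging from 0.0 to 1.0, the threshold under this reading is 0.75, and pixels with a value of exactly 0.75 are included in the region.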



The number of pixels included in the activation-related region of the raw fMRI map was compared to that of the STSS-SR fMRI map for each finger of each subject. As each pixel of the STSS-SR fMRI maps covered one quarter of the area of a pixel of the raw fMRI maps (128 × 128 versus 64 × 64 matrix over the same area), the number of pixels in the STSS-SR fMRI maps was divided by 4 before comparison.
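The factor of 4 follows from the ratio of the two matrix sizes, as the short sketch below shows. The function name is hypothetical, and the sketch assumes that both maps cover the same field of view.

```python
def area_matched_pixel_count(n_pixels, matrix_size, reference_matrix_size=64):
    """Express a pixel count in units of reference-grid pixels.

    Assumes both grids cover the same field of view, so one reference
    pixel contains (matrix_size / reference_matrix_size) ** 2 pixels.
    """
    pixels_per_reference_pixel = (matrix_size / reference_matrix_size) ** 2
    return n_pixels / pixels_per_reference_pixel
```

For example, a region of 200 pixels on the 128 × 128 grid corresponds to 50 pixels on the 64 × 64 grid.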

Independence of the extracted activated regions for the different tasks

The raw fMRI and STSS-SR fMRI maps obtained in the previous subsection were compared to determine which of them had the higher functional resolution for the thumb and little finger tasks. For this purpose, a Dice coefficient [21,22] was calculated for the extracted activation-related regions of the thumb and little finger for each subject (Fig. 3). This assessment was based on the well-known fact that the motor function areas for the thumb and little finger are not the same [23,24].

Figure 3

Definition of the Dice coefficient used in this study to assess how clearly the activated regions corresponding to the thumb and the little finger tasks were separated. The Dice coefficient was calculated for the extracted activation-related regions of the thumb (green) and little finger (blue) for each subject. The light-blue area corresponds to the overlap between the activation-related regions for the thumb and little finger.
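For two binary regions A and B, the standard Dice coefficient is 2|A ∩ B| / (|A| + |B|). The sketch below computes it for two masks given as lists of rows; it assumes that the definition shown in Fig. 3 matches this standard form, and the function name and the convention of returning 0 for two empty masks are choices made for this illustration.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice coefficient 2|A and B| / (|A| + |B|) of two binary masks."""
    flat_a = [bool(v) for row in mask_a for v in row]
    flat_b = [bool(v) for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must have the same number of pixels")
    size_a = sum(flat_a)
    size_b = sum(flat_b)
    if size_a + size_b == 0:
        return 0.0  # convention chosen here for two empty masks
    overlap = sum(a and b for a, b in zip(flat_a, flat_b))
    return 2.0 * overlap / (size_a + size_b)
```

A value of 0 means that the thumb and little finger regions do not overlap at all, and a value of 1 means that they coincide, so a lower value indicates a clearer separation of the two regions.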

Statistical analysis

The number of pixels included in each activation-related region and the Dice coefficient calculated from the raw fMRI and STSS-SR fMRI maps were statistically compared using the Wilcoxon signed-rank test (p < 0.05 was considered significant). The EZR graphical interface to R version 3.5.2 [25] was used to make these statistical comparisons.
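The rank sums underlying the Wilcoxon signed-rank test can be sketched as follows for paired per-subject values. This is not the EZR procedure used in the study: the function name is hypothetical, zero differences are dropped and tied absolute differences receive average ranks (common conventions assumed here), and the p-value computation is left to a statistics package.

```python
def signed_rank_statistic(x, y):
    """Return (W_plus, W_minus) for paired samples x and y."""
    if len(x) != len(y):
        raise ValueError("x and y must be paired")
    # Pairs with no difference are dropped (assumed convention).
    diffs = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Find the run of tied absolute differences starting at position i.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus
```

The smaller of the two rank sums is the usual test statistic; when one map gives the larger value for every subject, one of the sums is 0.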