Chapter 2 Brain Imaging Data Structure
2.1 About BIDS
Neuroimaging data are large, complex, and challenging to organize. The scanner produces a large number of files per participant, and later analyses generate even more. These data can be arranged in many different ways, and to date there is no consensus on how to organize and share them. Even researchers working in the same lab can arrange their data in different, idiosyncratic ways, so time is wasted rearranging data and rewriting scripts. Using the BIDS format saves time and reduces errors and misunderstandings. BIDS is for everyone: all users can share in the benefits of organized data, reproducible research, and data sharing.
2.2 Benefits
Keeping neuroimaging data organized on your computer system is just as important as good practice in a wet lab. There, researchers follow explicit protocols for reagent usage and storage; log dates and times of equipment usage, maintenance, and servicing; and maintain sterile glassware and tools so that chemical reactions occur as intended. These habits prevent the need to troubleshoot experimental protocols for the unintended effects of contamination, missing or expired reagents, or omitted experimental steps. Likewise, disorganized data can lead to data corruption, swapped participant images, and other mistakes. Neuroimaging analyses can take years to complete, and having to repeat them because of uncertainty about the data set is a tremendous waste of time and resources.
Other benefits of using BIDS include:
- It will be easy for another researcher to work on your data. To understand how the files are organized and formatted, they only need to refer to this document. This is especially important if you are running your own lab and anticipate more than one person working on the same data over time. By using BIDS, you will save the time otherwise spent trying to understand and reuse data acquired by a graduate student or postdoc who has already left the lab.
- There is a growing number of data analysis software packages that can understand data organized according to BIDS.
- Databases such as OpenNeuro.org, LORIS, COINS, XNAT, SciTran, and others will accept and export datasets organized according to BIDS. If you ever plan to share your data publicly (nowadays some journals require this) you can speed up the curation process by using BIDS.
- There are validation tools (also available online) that can check your dataset integrity and let you quickly spot missing values.
2.3 File Formats
The focus of BIDS is on raw (minimally processed) NIfTI data, not source data (e.g., DICOM) or derived data (e.g., post-processed images). Imaging files should be in NIfTI format, converted from DICOM using the dcm2niix program. BIDS also includes other files. Non-imaging data such as demographics and neuropsychological data are kept in tabular files (.tsv). Imaging metadata are saved as JSON files. Other metadata files include the data dictionary for tabular files, the dataset description, etc.
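As a rough illustration of the conversion step, the sketch below assembles a dcm2niix command line from Python without running it. The directory names and the helper name build_dcm2niix_command are hypothetical. The flags shown (-b y for a JSON sidecar, -z y for gzipped output, -f for the output filename, -o for the output directory) are standard dcm2niix options, but it is worth confirming them with dcm2niix -h for your installed version.

```python
import shlex
from pathlib import Path


def build_dcm2niix_command(dicom_dir, out_dir, basename):
    """Return the argument list for one dcm2niix conversion (hypothetical helper)."""
    return [
        "dcm2niix",
        "-b", "y",                 # write a JSON sidecar next to the image
        "-z", "y",                 # gzip the NIfTI output (.nii.gz)
        "-f", basename,            # output filename, without extension
        "-o", str(Path(out_dir)),  # output directory (must already exist)
        str(Path(dicom_dir)),      # input directory holding the DICOM files
    ]


# Placeholder paths for illustration only.
cmd = build_dcm2niix_command(
    "sourcedata/sub-001/T1", "STUDY_DIR/sub-001/anat", "sub-001_T1w"
)
print(shlex.join(cmd))
# To run the conversion for real: subprocess.run(cmd, check=True)
```

Building the command as a list (rather than one long string) avoids quoting problems when paths contain spaces.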
2.4 Single Session
MRI acquisitions are considered a single session when they are acquired during a continuous, uninterrupted block of time in the scanner.
2.4.1 T1w
STUDY_DIR/
|–– sub-001/
    |–– anat/
        |–– sub-001_T1w.json
        |–– sub-001_T1w.nii
|–– participants.tsv
|–– dataset_description.json
2.4.2 T1w and in-plane T2
STUDY_DIR/
|–– sub-001/
    |–– anat/
        |–– sub-001_T1w.json
        |–– sub-001_T1w.nii
        |–– sub-001_inplaneT2.json
        |–– sub-001_inplaneT2.nii
|–– participants.tsv
|–– dataset_description.json
2.4.3 T1w, in-plane T2, and DWI
STUDY_DIR/
|–– sub-001/
    |–– anat/
        |–– sub-001_T1w.json
        |–– sub-001_T1w.nii
        |–– sub-001_inplaneT2.json
        |–– sub-001_inplaneT2.nii
    |–– dwi/
        |–– sub-001_dwi.json
        |–– sub-001_dwi.nii.gz
        |–– sub-001_dwi.bvec
        |–– sub-001_dwi.bval
|–– participants.tsv
|–– dataset_description.json
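To get a feel for the single-session layout, the following sketch creates an empty skeleton of the T1w, in-plane T2, and DWI example in a temporary directory. The file list is copied from that example; the helper name create_skeleton is hypothetical, and the files it creates are empty placeholders, not real images.

```python
import tempfile
from pathlib import Path

# Relative paths taken from the T1w, in-plane T2, and DWI layout.
LAYOUT = [
    "sub-001/anat/sub-001_T1w.json",
    "sub-001/anat/sub-001_T1w.nii",
    "sub-001/anat/sub-001_inplaneT2.json",
    "sub-001/anat/sub-001_inplaneT2.nii",
    "sub-001/dwi/sub-001_dwi.json",
    "sub-001/dwi/sub-001_dwi.nii.gz",
    "sub-001/dwi/sub-001_dwi.bvec",
    "sub-001/dwi/sub-001_dwi.bval",
    "participants.tsv",
    "dataset_description.json",
]


def create_skeleton(root, relative_paths):
    """Create empty placeholder files and return the paths that now exist."""
    root = Path(root)
    for rel in relative_paths:
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


with tempfile.TemporaryDirectory() as tmp:
    created = create_skeleton(Path(tmp) / "STUDY_DIR", LAYOUT)
    print("\n".join(created))
```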
2.5 Single Session with Multiple Acquisitions
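Because two files in the same directory cannot share a name, an additional label is needed to distinguish multiple acquisitions of the same image type within a single session. The examples below use the acquisition (acq) label for this purpose. Note that the BIDS specification intends acq-<label> for acquisitions that differ in their parameters and provides a separate run-<index> key for repeated acquisitions with identical parameters.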
2.5.1 T1w
STUDY_DIR/
|–– sub-001/
    |–– anat/
        |–– sub-001_acq-test1_T1w.json
        |–– sub-001_acq-test1_T1w.nii
        |–– sub-001_acq-test2_T1w.json
        |–– sub-001_acq-test2_T1w.nii
|–– participants.tsv
|–– dataset_description.json
2.5.2 T1w and in-plane T2
STUDY_DIR/
|–– sub-001/
    |–– anat/
        |–– sub-001_acq-test1_T1w.json
        |–– sub-001_acq-test1_T1w.nii
        |–– sub-001_acq-test2_T1w.json
        |–– sub-001_acq-test2_T1w.nii
        |–– sub-001_acq-test1_inplaneT2.json
        |–– sub-001_acq-test1_inplaneT2.nii
        |–– sub-001_acq-test2_inplaneT2.json
        |–– sub-001_acq-test2_inplaneT2.nii
|–– participants.tsv
|–– dataset_description.json
2.5.3 T1w, in-plane T2, and DWI
STUDY_DIR/
|–– sub-001/
    |–– anat/
        |–– sub-001_acq-test1_T1w.json
        |–– sub-001_acq-test1_T1w.nii
        |–– sub-001_acq-test2_T1w.json
        |–– sub-001_acq-test2_T1w.nii
        |–– sub-001_acq-test1_inplaneT2.json
        |–– sub-001_acq-test1_inplaneT2.nii
        |–– sub-001_acq-test2_inplaneT2.json
        |–– sub-001_acq-test2_inplaneT2.nii
    |–– dwi/
        |–– sub-001_acq-test1_dwi.json
        |–– sub-001_acq-test1_dwi.nii.gz
        |–– sub-001_acq-test1_dwi.bvec
        |–– sub-001_acq-test1_dwi.bval
        |–– sub-001_acq-test2_dwi.json
        |–– sub-001_acq-test2_dwi.nii.gz
        |–– sub-001_acq-test2_dwi.bvec
        |–– sub-001_acq-test2_dwi.bval
|–– participants.tsv
|–– dataset_description.json
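The naming pattern in these examples can be generated rather than typed by hand. The sketch below is one possible way to assemble such names; the function bids_filename is a hypothetical helper, not part of any BIDS tool, and it covers only the sub, ses, and acq keys used in this chapter.

```python
def bids_filename(sub, suffix, ext, ses=None, acq=None):
    """Assemble sub-<label>[_ses-<label>][_acq-<label>]_<suffix><ext>."""
    parts = [f"sub-{sub}"]
    if ses is not None:
        parts.append(f"ses-{ses}")
    if acq is not None:
        parts.append(f"acq-{acq}")
    parts.append(suffix)
    return "_".join(parts) + ext


# Two T1w acquisitions in one session get distinct names via the acq label.
names = [
    bids_filename("001", "T1w", ".nii", acq=label) for label in ("test1", "test2")
]
print(names)
# ['sub-001_acq-test1_T1w.nii', 'sub-001_acq-test2_T1w.nii']
```

Without the acq label, both calls would return sub-001_T1w.nii and the second file would overwrite the first.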
2.6 Multiple Sessions
Defining multiple sessions is appropriate when the same data acquisitions are repeated on all subjects at more than one scanning visit. Sessions can be spread across days, but they can also be separated in time within a single day. Multiple sessions therefore do not necessarily imply a longitudinal study.
2.6.1 T1w
STUDY_DIR/
|–– sub-001/
    |–– ses-pre/
        |–– anat/
            |–– sub-001_ses-pre_T1w.json
            |–– sub-001_ses-pre_T1w.nii
    |–– ses-post/
        |–– anat/
            |–– sub-001_ses-post_T1w.json
            |–– sub-001_ses-post_T1w.nii
|–– participants.tsv
|–– dataset_description.json
2.6.2 T1w and in-plane T2
STUDY_DIR/
|–– sub-001/
    |–– ses-pre/
        |–– anat/
            |–– sub-001_ses-pre_T1w.json
            |–– sub-001_ses-pre_T1w.nii
            |–– sub-001_ses-pre_inplaneT2.json
            |–– sub-001_ses-pre_inplaneT2.nii
    |–– ses-post/
        |–– anat/
            |–– sub-001_ses-post_T1w.json
            |–– sub-001_ses-post_T1w.nii
            |–– sub-001_ses-post_inplaneT2.json
            |–– sub-001_ses-post_inplaneT2.nii
|–– participants.tsv
|–– dataset_description.json
2.6.3 T1w, in-plane T2, and DWI
STUDY_DIR/
|–– sub-001/
    |–– ses-pre/
        |–– anat/
            |–– sub-001_ses-pre_T1w.json
            |–– sub-001_ses-pre_T1w.nii
            |–– sub-001_ses-pre_inplaneT2.json
            |–– sub-001_ses-pre_inplaneT2.nii
        |–– dwi/
            |–– sub-001_ses-pre_dwi.json
            |–– sub-001_ses-pre_dwi.nii.gz
            |–– sub-001_ses-pre_dwi.bvec
            |–– sub-001_ses-pre_dwi.bval
    |–– ses-post/
        |–– anat/
            |–– sub-001_ses-post_T1w.json
            |–– sub-001_ses-post_T1w.nii
            |–– sub-001_ses-post_inplaneT2.json
            |–– sub-001_ses-post_inplaneT2.nii
        |–– dwi/
            |–– sub-001_ses-post_dwi.json
            |–– sub-001_ses-post_dwi.nii.gz
            |–– sub-001_ses-post_dwi.bvec
            |–– sub-001_ses-post_dwi.bval
|–– participants.tsv
|–– dataset_description.json
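Because the subject and session labels are repeated in both the folder names and the filenames, a filename alone tells you where the file belongs. The sketch below parses a filename into its parts and derives the expected folder. The regular expression and the helper names are assumptions made for illustration; they handle only the sub, ses, and acq keys shown in this chapter, not the full BIDS naming scheme.

```python
import re

# Simplified pattern: sub-<label>[_ses-<label>][_acq-<label>]_<suffix><ext>
PATTERN = re.compile(
    r"^sub-(?P<sub>[A-Za-z0-9]+)"
    r"(?:_ses-(?P<ses>[A-Za-z0-9]+))?"
    r"(?:_acq-(?P<acq>[A-Za-z0-9]+))?"
    r"_(?P<suffix>[A-Za-z0-9]+)"
    r"(?P<ext>\.[A-Za-z0-9.]+)$"
)


def parse_bids_name(filename):
    """Split a filename into its labels; raise ValueError if it does not match."""
    match = PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not a recognized BIDS-style name: {filename}")
    return {key: value for key, value in match.groupdict().items() if value is not None}


def expected_dir(filename, datatype):
    """Return the folder, relative to the study root, where the file belongs."""
    labels = parse_bids_name(filename)
    parts = [f"sub-{labels['sub']}"]
    if "ses" in labels:
        parts.append(f"ses-{labels['ses']}")
    parts.append(datatype)  # e.g. "anat" or "dwi"
    return "/".join(parts)


print(parse_bids_name("sub-001_ses-pre_dwi.nii.gz"))
print(expected_dir("sub-001_ses-post_T1w.nii", "anat"))
```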
2.7 ACAP Study
Here’s how the ACAP study should be organized.
ACAP/
|–– sub-ACAP1001/
    |–– ses-BL/
        |–– anat/
            |–– sub-ACAP1001_ses-BL_T1w.json
            |–– sub-ACAP1001_ses-BL_T1w.nii
            |–– sub-ACAP1001_ses-BL_T2w.json
            |–– sub-ACAP1001_ses-BL_T2w.nii.gz
        |–– dwi/
            |–– sub-ACAP1001_ses-BL_acq-b900_dwi.json
            |–– sub-ACAP1001_ses-BL_acq-b900_dwi.nii.gz
            |–– sub-ACAP1001_ses-BL_acq-b900_dwi.bvec
            |–– sub-ACAP1001_ses-BL_acq-b900_dwi.bval
            |–– sub-ACAP1001_ses-BL_acq-b2000_dwi.json
            |–– sub-ACAP1001_ses-BL_acq-b2000_dwi.nii.gz
            |–– sub-ACAP1001_ses-BL_acq-b2000_dwi.bvec
            |–– sub-ACAP1001_ses-BL_acq-b2000_dwi.bval
    |–– ses-FU3/
        |–– anat/
            |–– sub-ACAP1001_ses-FU3_T1w.json
            |–– sub-ACAP1001_ses-FU3_T1w.nii
            |–– sub-ACAP1001_ses-FU3_T2w.json
            |–– sub-ACAP1001_ses-FU3_T2w.nii.gz
        |–– dwi/
            |–– sub-ACAP1001_ses-FU3_acq-b900_dwi.json
            |–– sub-ACAP1001_ses-FU3_acq-b900_dwi.nii.gz
            |–– sub-ACAP1001_ses-FU3_acq-b900_dwi.bvec
            |–– sub-ACAP1001_ses-FU3_acq-b900_dwi.bval
            |–– sub-ACAP1001_ses-FU3_acq-b2000_dwi.json
            |–– sub-ACAP1001_ses-FU3_acq-b2000_dwi.nii.gz
            |–– sub-ACAP1001_ses-FU3_acq-b2000_dwi.bvec
            |–– sub-ACAP1001_ses-FU3_acq-b2000_dwi.bval
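The ACAP layout is regular enough that the full list of expected files for a participant can be generated and compared against what is actually on disk. The following is a minimal sketch under that assumption: the session labels, acquisition labels, and extensions are copied from the listing above, while the helper names expected_acap_files and missing_files are hypothetical.

```python
from itertools import product

SESSIONS = ("BL", "FU3")
ANAT_FILES = [("T1w", ".json"), ("T1w", ".nii"), ("T2w", ".json"), ("T2w", ".nii.gz")]
DWI_ACQS = ("b900", "b2000")
DWI_EXTS = (".json", ".nii.gz", ".bvec", ".bval")


def expected_acap_files(subject="ACAP1001"):
    """List every file the ACAP layout expects for one participant."""
    paths = []
    for ses in SESSIONS:
        folder = f"sub-{subject}/ses-{ses}"
        prefix = f"sub-{subject}_ses-{ses}"
        for suffix, ext in ANAT_FILES:
            paths.append(f"{folder}/anat/{prefix}_{suffix}{ext}")
        for acq, ext in product(DWI_ACQS, DWI_EXTS):
            paths.append(f"{folder}/dwi/{prefix}_acq-{acq}_dwi{ext}")
    return paths


def missing_files(found, subject="ACAP1001"):
    """Return the expected paths that are absent from the 'found' collection."""
    return sorted(set(expected_acap_files(subject)) - set(found))


expected = expected_acap_files()
print(len(expected), "files expected")
# Pretend the first file was never converted:
print(missing_files(expected[1:]))
```

In practice, found could be built from the study folder with pathlib, for example by collecting the POSIX-style relative paths of all files under ACAP/.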
2.8 Tabular Files
A TSV file is a delimited text file that uses tabs to separate values; a CSV file uses commas instead. Both are simple formats for storing tabular data (numbers and text) in plain text, with each line of the file holding one data record. Files in either format can be imported into and exported from programs that store data in tables, such as Microsoft Excel and R. If creating a .tsv file directly is daunting, converting from a comma-separated file (.csv) is easy. The only participant property required is the participant_id which should match the sub-
participant_id  study_id  group  gender  age    WASI2FSIQ
sub-001         1004      tbi    male    15.82  75
sub-002         1005      oi     male    12.88  106
sub-003         1011      oi     female  13.48  110
sub-004         1015      tbi    female  9.33   72
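The CSV-to-TSV conversion mentioned above takes only a few lines with Python's standard csv module. This is a minimal sketch: the function name csv_to_tsv is hypothetical, and the sample rows are the first two participants from the table above, written out as CSV.

```python
import csv
import io


def csv_to_tsv(csv_text):
    """Convert comma-separated text to tab-separated text."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


csv_text = (
    "participant_id,study_id,group,gender,age,WASI2FSIQ\n"
    "sub-001,1004,tbi,male,15.82,75\n"
    "sub-002,1005,oi,male,12.88,106\n"
)
tsv_text = csv_to_tsv(csv_text)
print(tsv_text)
```

To convert files rather than strings, open both with newline="" and pass the file handles to csv.reader and csv.writer in the same way.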
2.9 Metadata
The dataset description file (dataset_description.json) must contain information about the dataset and, if applicable, its open license.
{
  "BIDSVersion": "1.0.0",
  "Name": "Mild Injury Outcome Study"
}
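The description file can be written by hand or generated from a script. The sketch below writes the two fields shown above, Name and BIDSVersion, which the BIDS specification requires; the helper name write_dataset_description is hypothetical, and a temporary directory stands in for the real study folder.

```python
import json
import tempfile
from pathlib import Path

description = {
    "Name": "Mild Injury Outcome Study",
    "BIDSVersion": "1.0.0",
}


def write_dataset_description(study_dir, description):
    """Write dataset_description.json after checking the required fields."""
    missing = [key for key in ("Name", "BIDSVersion") if key not in description]
    if missing:
        raise ValueError(f"missing required field(s): {', '.join(missing)}")
    path = Path(study_dir) / "dataset_description.json"
    path.write_text(json.dumps(description, indent=2) + "\n", encoding="utf-8")
    return path


with tempfile.TemporaryDirectory() as tmp:
    path = write_dataset_description(tmp, description)
    loaded = json.loads(path.read_text(encoding="utf-8"))
    print(loaded)
```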
JSON files for the imaging data are generated automatically by the dcm2niix program and contain information about the scanner and sequence.
{
  "Modality": "MR",
  "MagneticFieldStrength": 3,
  "Manufacturer": "Siemens",
  "ManufacturersModelName": "Skyra",
  "InstitutionName": "Anonymous_Institution",
  "DeviceSerialNumber": "45603",
  "BodyPartExamined": "HEAD",
  "PatientPosition": "HFS",
  "ProcedureStepDescription": "HEAD_RESEARCH_BRAIN",
  "SoftwareVersions": "syngo_MR_E11",
  "MRAcquisitionType": "3D",
  "SeriesDescription": "SAG_IR-MPRAGE",
  "ProtocolName": "SAG_IR-MPRAGE",
  "ScanningSequence": "GR_IR",
  "SequenceVariant": "SK_SP_MP_OSP",
  "ScanOptions": "IR",
  "SequenceName": "_tfl3d1_16ns",
  "ImageType": ["ORIGINAL", "PRIMARY", "M", "ND", "NORM"],
  "SeriesNumber": 5,
  "AcquisitionTime": "19:07:19.265000",
  "AcquisitionNumber": 1,
  "SliceThickness": 1.2,
  "SAR": 0.0799092,
  "EchoTime": 0.00229,
  "RepetitionTime": 2.3,
  "InversionTime": 0.9,
  "FlipAngle": 8,
  "PartialFourier": 1,
  "BaseResolution": 256,
  "ShimSetting": [
    3993,
    -5236,
    -3657,
    0,
    -115,
    11,
    129,
    -2
  ],
  "TxRefAmp": 349.117,
  "PhaseResolution": 1,
  "PhaseOversampling": 0.15,
  "ReceiveCoilName": "Spine_32",
  "ReceiveCoilActiveElements": "HE1-4;NE1,2;SP1",
  "PulseSequenceDetails": "%SiemensSeq%_tfl",
  "ConsistencyInfo": "N4_VE11A_LATEST_20140830",
  "PercentPhaseFOV": 90.625,
  "PhaseEncodingSteps": 266,
  "AcquisitionMatrixPE": 232,
  "ReconMatrixPE": 256,
  "ParallelReductionFactorInPlane": 2,
  "PixelBandwidth": 200,
  "DwellTime": 9.8e-06,
  "ImageOrientationPatientDICOM": [
    -0.0414933,
    0.902626,
    0.428422,
    -0.00450578,
    0.428618,
    -0.903475
  ],
  "InPlanePhaseEncodingDirectionDICOM": "ROW",
  "ConversionSoftware": "dcm2niix",
  "ConversionSoftwareVersion": "v1.0.20171215 Clang9.0.0"
}
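Sidecar files like the one above are easy to read back in a script, for example to confirm sequence parameters across participants. The sketch below parses a shortened copy of the sidecar (only a handful of its fields) and reports timing values in milliseconds; the timing fields are stored in seconds. The helper name summarize_sidecar and the output keys are hypothetical.

```python
import json

# A shortened copy of the sidecar shown above; values are unchanged.
sidecar_text = """
{
  "Manufacturer": "Siemens",
  "MagneticFieldStrength": 3,
  "EchoTime": 0.00229,
  "RepetitionTime": 2.3,
  "InversionTime": 0.9,
  "FlipAngle": 8
}
"""


def summarize_sidecar(sidecar):
    """Pull out a few sequence parameters, converting seconds to milliseconds."""
    return {
        "scanner": f"{sidecar['Manufacturer']} {sidecar['MagneticFieldStrength']}T",
        "TE_ms": round(sidecar["EchoTime"] * 1000, 3),
        "TR_ms": round(sidecar["RepetitionTime"] * 1000, 3),
        "TI_ms": round(sidecar["InversionTime"] * 1000, 3),
        "flip_angle_deg": sidecar["FlipAngle"],
    }


summary = summarize_sidecar(json.loads(sidecar_text))
print(summary)
```

For a file on disk, replace json.loads(sidecar_text) with json.load applied to the opened .json file.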
2.10 Checking Compliance
You can check whether a dataset is compliant with the BIDS format using the online BIDS validator, which works only in Chrome and Firefox: http://incf.github.io/bids-validator
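Before running the validator, a quick local check can catch the most obvious problems. The sketch below is not the BIDS validator and covers only a small fraction of what it does: it looks for the two top-level files shown in the layouts of this chapter and compares the participant_id column of participants.tsv with the sub- folders. The function name quick_check and the demonstration data are hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def quick_check(study_dir):
    """Return a list of problems; an empty list means these few checks passed."""
    root = Path(study_dir)
    problems = []
    for required in ("dataset_description.json", "participants.tsv"):
        if not (root / required).is_file():
            problems.append(f"missing {required}")
    subject_dirs = {p.name for p in root.glob("sub-*") if p.is_dir()}
    tsv = root / "participants.tsv"
    if tsv.is_file():
        with tsv.open(newline="", encoding="utf-8") as handle:
            rows = csv.DictReader(handle, delimiter="\t")
            listed = {row["participant_id"] for row in rows}
        for name in sorted(subject_dirs - listed):
            problems.append(f"{name} has a folder but no row in participants.tsv")
        for name in sorted(listed - subject_dirs):
            problems.append(f"{name} is in participants.tsv but has no folder")
    return problems


# Demonstration on a deliberately inconsistent throwaway dataset.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub-001").mkdir()
    (root / "sub-002").mkdir()
    (root / "participants.tsv").write_text(
        "participant_id\tgroup\nsub-001\ttbi\nsub-003\toi\n", encoding="utf-8"
    )
    problems = quick_check(root)
    print("\n".join(problems))
```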