What to do with sh***y (clinical) data? How to apply your perfect estimator/model/method to data that contains outliers?

Project info

Title: What to do when (clinical) Diffusion Weighted Image data quality is sh***y: How to adjust for it in modeling and estimate the confidence of your model afterward?

Project lead: Viljami Sairanen (Twitter: @ViljamiSairanen)

Project collaborators: DICE Lab (PI: Alessandro Daducci, University of Verona, Italy), BABA Center (PI: Sampsa Vanhatalo, University of Helsinki, Finland)

Registered Brainhack Global 2021 Event: Brainhack Atlantis The Atlantic Ocean - Micro2Macro

Project Description:

1 Aim: Modify existing brain structure modeling algorithms and estimators to adjust for outliers in the measurements to enable their usage in clinical neuroresearch. The aim is to publish the results as a journal article.

2 Reason: Clinical data is often corrupted by outliers and artifacts from various sources (e.g. subject motion, shown in Fig. 1).

Fig. 1 Visualization of the differences between the cutting-edge research data of the Human Connectome Project and clinical patient data of a neonatal subject. A) A sagittal slice of DW-MRI data from an HCP subject and B) an axial slice from the same HCP subject are only slightly affected by noise and are therefore easy to process with any modeling algorithm. C) A sagittal slice of DW-MRI data from a clinical neonatal subject, with the red arrow pointing to two adjacent axial slices: D) a good-quality slice and E) a slice with missing data due to subject motion.

This clinical example is a very common case of suboptimal image quality in neonatal imaging projects and in other projects with uncooperative patients or subjects. Therefore, processing steps developed primarily on data of excellent quality could result in highly inaccurate connectome models when applied to clinical data.

3 How: A highly realistic ground-truth simulation of a diffusion-weighted image (DWI) set of human brain structures is shared with the participants. The simulation is based on Human Connectome Project data and is generated using MRTrix3 and the COMMIT framework. A Python script is provided to introduce outliers and a realistic noise distribution into the ground-truth dataset for algorithm evaluation. The simulated dataset will be sent directly to the participants, because it is derived quite directly from HCP data, which should not be hosted outside its original repository.
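
The project's own script is not reproduced here. As a rough illustration of the idea only, the minimal sketch below corrupts a toy 4D array in two steps. It assumes slice-wise signal dropout as the outlier model and Rician noise (the magnitude of a complex signal with Gaussian noise in both channels) as the noise model. The function names, parameter values, and the toy array are hypothetical and are not taken from the project repository.

```python
import numpy as np


def add_slice_dropouts(dwi, fraction, attenuation, rng):
    """Attenuate randomly chosen (slice, volume) pairs to mimic motion-related signal loss.

    dwi has shape (x, y, z, volumes). Returns the corrupted copy and a boolean
    mask of shape (z, volumes) that marks the corrupted slices.
    """
    corrupted = dwi.copy()
    n_slices, n_vols = dwi.shape[2], dwi.shape[3]
    mask = rng.random((n_slices, n_vols)) < fraction
    corrupted[:, :, mask] *= attenuation
    return corrupted, mask


def add_rician_noise(signal, sigma, rng):
    """Add Gaussian noise to real and imaginary channels and return the magnitude."""
    real = signal + rng.normal(0.0, sigma, size=signal.shape)
    imag = rng.normal(0.0, sigma, size=signal.shape)
    return np.sqrt(real**2 + imag**2)


rng = np.random.default_rng(0)
clean = np.full((4, 4, 6, 10), 100.0)  # toy stand-in for the ground-truth DWI set

# Dropout is applied first so that the noise floor is still present in lost slices.
dropped, outlier_mask = add_slice_dropouts(clean, fraction=0.1, attenuation=0.2, rng=rng)
noisy = add_rician_noise(dropped, sigma=2.0, rng=rng)
```

Returning the outlier mask alongside the corrupted data keeps the ground truth for the outliers available, so a robust estimator can later be scored on which measurements it down-weighted.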

Participants can either submit their own existing algorithms to be evaluated with this data or propose completely new estimation algorithms. Finally, the best algorithms will be compared using real data acquired from neonatal subjects, which contain typical motion artefacts. These data cannot be shared with the participants because of their sensitive nature. The acquisition consists of multiple reference b0 images and two shells (60 DWIs with a b-value of 750 and 74 with a b-value of 1800).
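
For estimators that operate on one shell at a time, the volumes of such an acquisition first have to be grouped by b-value. The sketch below shows one possible way to do this with a tolerance, since b-values stored with real data often deviate slightly from their nominal values. The number of b0 images (5), the tolerance, and the function name are assumptions made for illustration only.

```python
import numpy as np


def split_shells(bvals, targets=(0, 750, 1800), tol=50):
    """Map each nominal b-value to the indices of the volumes within tol of it."""
    bvals = np.asarray(bvals, dtype=float)
    return {t: np.flatnonzero(np.abs(bvals - t) <= tol) for t in targets}


n_b0 = 5  # hypothetical: the description only says "multiple" b0 images
bvals = np.concatenate([np.zeros(n_b0), np.full(60, 750.0), np.full(74, 1800.0)])

shells = split_shells(bvals)
for b, idx in shells.items():
    print(f"b={b}: {idx.size} volumes")
```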

Data to use: A highly realistic synthetic diffusion-weighted MRI dataset is shared with the participants here: GoogleDrive

Link to project repository/sources: Code resources will be shared here: https://github.com/vilsaira/brainhack2021 (the dataset itself is too large for GitHub).

Goals for Brainhack Global 2021: Evaluation of multiple robust microstructural models or DWI signal representations with the provided data. Results would be published in a peer-reviewed journal and as a conference abstract (if suitable).

Good first issues:

  1. Selecting which model estimators will be used in this project (tensors, csd, noddi, etc.).
  2. Implementing the robust augmentations in each estimator.
  3. Comparing both the original and the robust estimator with the synthetic dataset.
  4. Optional comparison with the clinical neonatal data.
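
As one possible starting point for issues 2 and 3, the sketch below augments a log-linear diffusion tensor fit with iteratively reweighted least squares (IRLS) and Huber weights, then compares it with ordinary least squares on a single simulated voxel. This is only one of many robust schemes and is not the method chosen by the project. The tensor values, the log-domain Gaussian noise, the dropout factor, and all function names are assumptions made for illustration.

```python
import numpy as np


def design_matrix(bvals, bvecs):
    """Design matrix of the log-linear tensor model ln S = ln S0 - b g^T D g."""
    gx, gy, gz = bvecs.T
    return np.column_stack([
        np.ones_like(bvals),
        -bvals * gx**2, -bvals * gy**2, -bvals * gz**2,
        -2 * bvals * gx * gy, -2 * bvals * gx * gz, -2 * bvals * gy * gz,
    ])


def fit_ols(X, log_signal):
    """Ordinary least-squares estimate of [ln S0, Dxx, Dyy, Dzz, Dxy, Dxz, Dyz]."""
    return np.linalg.lstsq(X, log_signal, rcond=None)[0]


def fit_irls(X, log_signal, n_iter=20, c=1.345):
    """IRLS with Huber weights; the residual scale is re-estimated from the MAD."""
    beta = fit_ols(X, log_signal)
    for _ in range(n_iter):
        resid = log_signal - X @ beta
        scale = 1.4826 * np.median(np.abs(resid - np.median(resid))) + 1e-12
        u = np.abs(resid) / (c * scale)
        # Huber weights: 1 for small residuals, decaying as 1/u for large ones.
        w = np.where(u <= 1.0, 1.0, 1.0 / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], log_signal * sw, rcond=None)[0]
    return beta


rng = np.random.default_rng(0)

# Hypothetical single-shell scheme: 6 b0 volumes and 60 random directions at b=750.
bvecs = rng.normal(size=(60, 3))
bvecs /= np.linalg.norm(bvecs, axis=1, keepdims=True)
bvecs = np.vstack([np.zeros((6, 3)), bvecs])
bvals = np.concatenate([np.zeros(6), np.full(60, 750.0)])
X = design_matrix(bvals, bvecs)

# Simulated voxel: anisotropic tensor, small log-domain noise, 6 dropout outliers.
true_beta = np.array([np.log(100.0), 1.7e-3, 0.3e-3, 0.3e-3, 0.0, 0.0, 0.0])
log_signal = X @ true_beta + rng.normal(0.0, 0.02, size=bvals.size)
outliers = rng.choice(np.arange(6, 66), size=6, replace=False)
log_signal[outliers] += np.log(0.2)  # signal drops to 20 % in the outlier volumes

err_ols = np.linalg.norm(fit_ols(X, log_signal)[1:] - true_beta[1:])
err_irls = np.linalg.norm(fit_irls(X, log_signal)[1:] - true_beta[1:])
print(f"OLS tensor error:  {err_ols:.2e}")
print(f"IRLS tensor error: {err_irls:.2e}")
```

The same pattern of fitting, computing residuals, and re-weighting could in principle wrap other linear signal representations. Nonlinear models such as NODDI would need a different optimisation loop.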

Skills:

OS: Any

Probably necessary: Python or Matlab, understanding of NIFTI or other neuroimage formats

Likely useful: shell scripting, MRTrix3, COMMIT

Definitely appreciated: consistent writing skills both in code and manuscript

Tools/Software/Methods to Use: Visual Studio Code, Python, Matlab, MRTrix3, COMMIT

Communication channels:

https://mattermost.brainhack.org/brainhack/channels/brainhack-micro2macro-2020

https://mattermost.brainhack.org/brainhack/channels/brainhack-micro2macro-robust-applications


Project labels #bhg:micro2macro_gbr_1
