Free manipulation of BIDS datasets according to the Inheritance Principle
Robert E. Smith; https://x.com/Lestropie
Brainhack Aus
The Inheritance Principle (“IP”) has been a feature of the Brain Imaging Data Structure (BIDS) since its inception. It describes how the metadata relevant to one specific data file may in fact originate from multiple metadata files. Encoding a common metadata field just once, and having it be deemed applicable to multiple data files, can both reduce redundancy and communicate the intrinsic hierarchical structure of the dataset. This, however, comes at the cost of considerable complexity in both the specification itself and the software responsible for interfacing with such data. Indeed, nearly a decade after its creation, many software APIs for parsing BIDS data still do not support the Inheritance Principle.
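As a rough illustration of metadata for one data file originating from multiple metadata files, the sketch below merges two dictionaries in order from least to most specific, with the more specific value taking precedence. The function name `merge_metadata`, the file names in the comments, and the field values are hypothetical, and the question of which metadata files are applicable to a given data file is deliberately ignored here.

```python
def merge_metadata(sidecars):
    """Merge metadata dictionaries ordered from least to most specific.

    Later (more specific) dictionaries override earlier ones key by key.
    """
    merged = {}
    for sidecar in sidecars:
        merged.update(sidecar)
    return merged


# Hypothetical contents of two metadata files that both apply to one data file.
dataset_level = {"TaskName": "rest", "RepetitionTime": 2.0}  # e.g. task-rest_bold.json
file_level = {"RepetitionTime": 1.5}  # e.g. sub-01/func/sub-01_task-rest_bold.json

resolved = merge_metadata([dataset_level, file_level])
print(resolved)
```

Here `TaskName` is stored once at the top of the dataset, while `RepetitionTime` is overridden closer to the data file.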
The future of the IP is at a crossroads, particularly as work progresses toward BIDS 2.0:
The goals of this project are twofold.
There is a dearth of existing software tools for managing the IP. If a dataset that exploits the IP were provided as input to a BIDS App that is naive to the IP, that App could associate metadata with data files incorrectly. Conversely, existing datasets may contain substantial metadata redundancy because there is currently no easy way to exploit the IP. A tool to automate such data manipulations could therefore be useful in the BIDS ecosystem.
Consequential decisions regarding the IP and related issues in BIDS 2.0 would be better informed if stakeholders could see the effects that proposed specifications would have on how compliant datasets appear.
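To make the metadata redundancy mentioned above concrete, here is a minimal in-memory sketch of one possible manipulation: finding the key-value pairs that are identical across a set of metadata files, so that they could be stored once at a higher level. The function name `hoist_common`, the file names, and the field values are hypothetical. Only pairs that are identical in every file are moved, so no key ends up defined at more than one level.

```python
def hoist_common(per_file):
    """Split per-file metadata into shared key-value pairs and per-file remainders."""
    dicts = list(per_file.values())
    if not dicts:
        return {}, {}
    first, rest = dicts[0], dicts[1:]
    # A pair is shared only if every other file has the same key with the same value.
    common = {
        key: value
        for key, value in first.items()
        if all(key in other and other[key] == value for other in rest)
    }
    # Whatever is not shared stays with the individual file.
    remainders = {
        name: {key: value for key, value in metadata.items() if key not in common}
        for name, metadata in per_file.items()
    }
    return common, remainders


# Hypothetical sidecar contents for two subjects.
per_file = {
    "sub-01_task-rest_bold.json": {"TaskName": "rest", "RepetitionTime": 2.0, "EchoTime": 0.030},
    "sub-02_task-rest_bold.json": {"TaskName": "rest", "RepetitionTime": 2.0, "EchoTime": 0.035},
}
common, remainders = hoist_common(per_file)
print(common)
print(remainders)
```

A real tool would additionally need to decide where in the directory hierarchy the shared file may legally be placed, which this sketch does not attempt.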
https://github.com/Lestropie/IP-freely
Primary goal: Produce one or more exemplar BIDS datasets that capture the complexity of the Inheritance Principle, and which can therefore be used to demonstrate the influence of its removal or exploitation under different rule sets: https://github.com/Lestropie/IP-freely/issues/5
Primary goal: Take a BIDS dataset that may include use of the Inheritance Principle, and create a new version of that dataset that removes all usage of the Inheritance Principle: https://github.com/Lestropie/IP-freely/issues/1
Primary goal: Take a BIDS dataset that may (but likely does not) include use of the Inheritance Principle, and create a new version of that dataset that makes maximal use of the Inheritance Principle (without overloading): https://github.com/Lestropie/IP-freely/issues/2
Secondary goals: Other issues listed at https://github.com/Lestropie/IP-freely/issues .
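For the goal of removing all usage of the Inheritance Principle, the core step is to work out, for each data file, the fully resolved metadata it would receive. The sketch below is one simplified way this might look, not the project's implementation. It assumes a metadata file applies when it sits in the data file's directory or an ancestor directory, has the same suffix, and carries no entity that conflicts with the data file; it also assumes at most one applicable file per directory. All function names are hypothetical, and the demo dataset is built in a temporary directory.

```python
import json
import tempfile
from pathlib import Path


def parse_name(path):
    """Split a BIDS-style filename into (entities, suffix)."""
    stem = path.name.split(".", 1)[0]
    *pairs, suffix = stem.split("_")
    # Chunks without a hyphen (e.g. in dataset_description.json) are ignored.
    entities = dict(pair.split("-", 1) for pair in pairs if "-" in pair)
    return entities, suffix


def applicable_sidecars(data_file, root):
    """List candidate JSON files from the dataset root down to the data file."""
    entities, suffix = parse_name(data_file)
    directories = [root]
    for part in data_file.parent.relative_to(root).parts:
        directories.append(directories[-1] / part)
    found = []
    for directory in directories:
        for candidate in sorted(directory.glob("*.json")):
            cand_entities, cand_suffix = parse_name(candidate)
            # Same suffix, and every entity of the candidate matches the data file.
            if cand_suffix == suffix and cand_entities.items() <= entities.items():
                found.append(candidate)
    return found


def resolve(data_file, root):
    """Merge applicable metadata; files deeper in the hierarchy take precedence."""
    merged = {}
    for sidecar in applicable_sidecars(data_file, root):
        merged.update(json.loads(sidecar.read_text()))
    return merged


# Demo on a throwaway dataset with hypothetical contents.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    func = root / "sub-01" / "func"
    func.mkdir(parents=True)
    (root / "dataset_description.json").write_text(json.dumps({"Name": "demo"}))
    (root / "task-rest_bold.json").write_text(
        json.dumps({"TaskName": "rest", "RepetitionTime": 2.0})
    )
    (func / "sub-01_task-rest_bold.json").write_text(json.dumps({"RepetitionTime": 1.5}))
    data_file = func / "sub-01_task-rest_bold.nii.gz"
    data_file.touch()
    resolved = resolve(data_file, root)

print(resolved)
```

Writing each resolved dictionary to a sidecar alongside its data file, and then deleting the higher-level metadata files, would yield a dataset in which nothing is inherited.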
For anyone not accustomed to open source projects, perhaps the most accessible entry point will be the construction of one or more exemplar datasets: https://github.com/Lestropie/IP-freely/issues/5
TBA
Python: Moderate
The code logic for manipulating metadata will necessitate some competence with Python dictionaries; nothing super complex, but maybe not accessible for someone who has never touched them.
Expect to make use of the pathlib module.
If any contributor has existing expertise with pybids, that may also be beneficial.
BIDS: Variable
The necessary fundamentals of BIDS are just filesystem structure (directories, entities and suffixes) and the idea of a metadata dictionary. However, creation of exemplar datasets would be facilitated by someone who:
Current status of the Inheritance Principle: https://bids-specification.readthedocs.io/en/stable/common-principles.html#the-inheritance-principle
Original context that motivated pursuit of augmentation of the Inheritance Principle: https://github.com/bids-standard/bids-bep016/issues/50
Previously proposed (but rejected) extension of the Inheritance Principle: https://github.com/bids-standard/bids-specification/pull/1003 (and other threads linked therein)
Existing repository that contains data-empty exemplar datasets for the purpose of validation: https://github.com/bids-standard/bids-examples
Discussion regarding potential modification of Inheritance Principle for BIDS 2.0: https://github.com/bids-standard/bids-2-devel/issues/65
Experience building a piece of open source scientific software from scratch
Basic algorithmic design for the various manipulations of metadata
Understanding correspondence between specification rules as described in plaintext language and programmatic implementation
Can provide an introduction to the basics of Git for any participants not already familiar with it.
Proper evaluation of the proposed tool will necessitate construction of one or more novel datasets.
Some insight may be gained from executing the tool against any existing BIDS datasets, whether public or private. Of particular interest would be any datasets known to already be utilising the Inheritance Principle.
1
Intend to set up the GitHub all-contributors bot for this new Project.
Depending on scope and eventual adoption, publication in the Journal of Open Source Software (JOSS) may be pursued.
coding_methods, data_management
0_concept_no_content
reproducible_scientific_methods
BIDS
Python
behavioral, DWI, ECG, ECOG, EEG, eye_tracking, fMRI, fNIRS, MEG, MRI, PET, TDCS, TMS
0_no_git_skills