Thanks to Tomas, I submitted an abstract to the Linux Conference Australia and was very lucky to be invited to present. This was my first Linux conference and it was a great experience: all the talks I saw were of very high quality, and I learned a lot both in the sessions and in the hallway tracks (e.g. lock-picking). Below is the talk I gave, in which I attempt to explain what “high performance science” is, which tools we use, and what I would love to see in the future. My favorite piece of feedback after the session was: “Wow – I had no idea how complex scientific data analysis can be, and I now have a much better understanding of what tools we can develop to help.”
Abstract: Today’s neuroscientific results are based on complex image analyses that involve a large variety of open source software. Unfortunately, recent years have shown that many neuroscientific findings cannot be replicated and that data sets are not yet shared widely enough to enable efficient reuse of data and alternative analyses. In this talk I want to give an overview of how the neuroscience field is moving towards open and reproducible analyses, and thereby towards better research and more reliable findings. I want to show how the field has progressed to open data formats for storing raw data acquired from MRI scanners (ISMRMRD), how processed images that can reach multiple terabytes in size are stored together with data provenance information (MINC), and how these data are analysed reproducibly using software containers (SINGULARITY) that allow an analysis to scale from a small development platform to large high performance computing systems while making it easy to manage different software versions and to reproduce analyses. Finally, I want to touch on sharing neuroscientific data in open repositories that allow the reuse and pooling of large datasets to derive new results that would be impossible without data sharing.
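To make the container idea a bit more concrete, here is a minimal sketch (not taken from the talk) of how an analysis step might be wrapped in a Singularity call so that the same command runs with the same software stack on a laptop and on an HPC node. The image name, the analysis script, and the subject IDs are hypothetical placeholders.

```python
# Minimal sketch: running one analysis step inside a Singularity container.
# "fmri_pipeline.sif", "run_first_level.py" and the subject IDs are
# hypothetical placeholders, not artifacts from the talk.
import subprocess

IMAGE = "fmri_pipeline.sif"        # hypothetical container image with the pinned software stack
SCRIPT = "run_first_level.py"      # hypothetical analysis script shipped alongside the data

def run_in_container(subject_id: str) -> None:
    """Execute one subject's analysis inside the container.

    'singularity exec' runs the given command using the container's
    software, so tool versions are fixed independently of the host.
    """
    subprocess.run(
        ["singularity", "exec", IMAGE, "python", SCRIPT, "--subject", subject_id],
        check=True,
    )

if __name__ == "__main__":
    # The same loop could be replaced by an HPC job array without changing the container call.
    for subject in ["sub-01", "sub-02"]:
        run_in_container(subject)
```

Because the host only needs Singularity itself, the identical command line can be used during development and later submitted as cluster jobs, which is what makes the scaling and version management described above practical.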
Slides:
SteffenBollmannNI