{"id":1,"date":"2020-05-27T06:46:04","date_gmt":"2020-05-27T06:46:04","guid":{"rendered":"https:\/\/mri.sbollmann.net\/?p=1"},"modified":"2021-04-28T05:38:59","modified_gmt":"2021-04-28T05:38:59","slug":"google-colab-osf","status":"publish","type":"post","link":"https:\/\/mri.sbollmann.net\/index.php\/2020\/05\/27\/google-colab-osf\/","title":{"rendered":"Building an Interactive Paper Supplement with Google Colab and the Open Science Framework"},"content":{"rendered":"\n<p>For a paper we recently submitted (Improving FLAIR SAR efficiency at 7T by adaptive tailoring of adiabatic pulse power using deep convolutional neural networks &#8211; pre-print here: <a href=\"https:\/\/arxiv.org\/abs\/1911.08118\">https:\/\/arxiv.org\/abs\/1911.08118<\/a> and published article here: <a href=\"https:\/\/doi.org\/10.1002\/mrm.28590\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/doi.org\/10.1002\/mrm.28590<\/a>) I was wondering if we can do more than just provide the source code. One problem I often see is that some people have trouble setting up the environments and executing the code, especially when it involves TensorFlow, and it often takes a few emails back and forth before things are running smoothly. Containers do help to a certain extent, but don&#8217;t solve all problems. Another problem is that not everyone has a GPU available for running deep learning training algorithms or prediction workloads, and results often differ when running on a CPU vs. a GPU (e.g. 3D convolutions in older versions of TensorFlow crash on a CPU).<\/p>\n\n\n\n<p>How good would it be to just open a fully configured environment in the browser and start playing? 
Looking at options out there, one finds things like <a rel=\"noreferrer noopener\" href=\"https:\/\/mybinder.org\/\" target=\"_blank\">Binder<\/a>, <a rel=\"noreferrer noopener\" href=\"https:\/\/codeocean.com\/\" target=\"_blank\">Code Ocean<\/a>, or <a rel=\"noreferrer noopener\" href=\"https:\/\/colab.research.google.com\/\" target=\"_blank\">Google Colab<\/a>.<\/p>\n\n\n\n<p>Google Colab stood out as working really well for the application I had in mind, but one problem quickly surfaced: it is not possible to easily package data into the notebook. One solution is to use the storage provided by the <a rel=\"noreferrer noopener\" href=\"https:\/\/osf.io\/\" target=\"_blank\">Open Science Framework<\/a>.<\/p>\n\n\n\n<p>This is what the current version looks like: the <a href=\"https:\/\/osf.io\/y5cq9\/\" target=\"_blank\" rel=\"noreferrer noopener\">OSF repository<\/a> contains the training data, the TensorFlow checkpoint to reproduce the results from the paper, and a link to the <a rel=\"noreferrer noopener\" href=\"https:\/\/colab.research.google.com\/drive\/1EOg5r30w0NtXJvTFdGhlR0ysGfaEaTpP\" target=\"_blank\">Google Colab notebook<\/a>.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/colab.research.google.com\/drive\/1EOg5r30w0NtXJvTFdGhlR0ysGfaEaTpP?usp=sharing\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"665\" src=\"https:\/\/mri.sbollmann.net\/wp-content\/uploads\/2020\/05\/image-1024x665.png\" alt=\"\" class=\"wp-image-80\" srcset=\"https:\/\/mri.sbollmann.net\/wp-content\/uploads\/2020\/05\/image-1024x665.png 1024w, https:\/\/mri.sbollmann.net\/wp-content\/uploads\/2020\/05\/image-300x195.png 300w, https:\/\/mri.sbollmann.net\/wp-content\/uploads\/2020\/05\/image-768x499.png 768w, https:\/\/mri.sbollmann.net\/wp-content\/uploads\/2020\/05\/image.png 1308w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><\/figure>\n\n\n\n<p>In the notebook we download the OSF files to the Google Colab VM 
using the command line interface provided by the <a href=\"https:\/\/github.com\/osfclient\/osfclient\" target=\"_blank\" rel=\"noreferrer noopener\">osfclient<\/a> project:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>!osf -p y5cq9 clone .<\/code><\/pre>\n\n\n\n<p>There seems to be a bug in the command line client that breaks the clone command if the GitHub add-on is activated as well, so if this step fails, try deactivating the GitHub add-on in OSF.<\/p>\n\n\n\n<p>Uploading the initial training data was also very convenient using the CLI:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>pip install osfclient\nosf init\nosf upload -r . .<\/code><\/pre>\n\n\n\n<p>When the training is done, we store the results in the user&#8217;s Google Drive:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>from google.colab import drive\nimport os\n\n# Folder in the Google Drive where the results are stored\ngoogle_drive_dir = '\/content\/drive\/My Drive\/scout2B1'\n\n# Mount the Google Drive in the Colab VM (asks for authorization)\ndrive.mount('\/content\/drive')\n\n# Create the results folder if it does not exist yet and switch to it\nos.makedirs(google_drive_dir, exist_ok=True)\nos.chdir(google_drive_dir)<\/code><\/pre>\n\n\n\n<p>What do people think about this? Is this easy enough and useful? What are others using to share results and analysis pipelines with readers? Let me know in the comments \ud83d\ude42<\/p>\n","protected":false},"excerpt":{"rendered":"<p>For a paper we recently submitted (Improving FLAIR SAR efficiency at 7T by adaptive tailoring of adiabatic pulse power using deep convolutional neural networks &#8211; pre-print here: https:\/\/arxiv.org\/abs\/1911.08118 and published article here: https:\/\/doi.org\/10.1002\/mrm.28590) I was wondering if we can do more than just provide the source code. 
One problem I [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":80,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2,28,6,27,7,4,3,5],"tags":[],"class_list":["post-1","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-code","category-data-management","category-deep-learning","category-osf","category-publishing","category-python","category-reproducibility","category-tensorflow"],"jetpack_featured_media_url":"https:\/\/mri.sbollmann.net\/wp-content\/uploads\/2020\/05\/image.png","_links":{"self":[{"href":"https:\/\/mri.sbollmann.net\/index.php\/wp-json\/wp\/v2\/posts\/1","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mri.sbollmann.net\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mri.sbollmann.net\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mri.sbollmann.net\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mri.sbollmann.net\/index.php\/wp-json\/wp\/v2\/comments?post=1"}],"version-history":[{"count":4,"href":"https:\/\/mri.sbollmann.net\/index.php\/wp-json\/wp\/v2\/posts\/1\/revisions"}],"predecessor-version":[{"id":333,"href":"https:\/\/mri.sbollmann.net\/index.php\/wp-json\/wp\/v2\/posts\/1\/revisions\/333"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/mri.sbollmann.net\/index.php\/wp-json\/wp\/v2\/media\/80"}],"wp:attachment":[{"href":"https:\/\/mri.sbollmann.net\/index.php\/wp-json\/wp\/v2\/media?parent=1"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mri.sbollmann.net\/index.php\/wp-json\/wp\/v2\/categories?post=1"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mri.sbollmann.net\/index.php\/wp-json\/wp\/v2\/tags?post=1"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}