{"id":303,"date":"2021-01-11T00:57:39","date_gmt":"2021-01-11T00:57:39","guid":{"rendered":"https:\/\/mri.sbollmann.net\/?p=303"},"modified":"2021-01-11T00:57:40","modified_gmt":"2021-01-11T00:57:40","slug":"world-wide-mirrors-and-load-balancing-for-neurodesk-containers","status":"publish","type":"post","link":"https:\/\/mri.sbollmann.net\/index.php\/2021\/01\/11\/world-wide-mirrors-and-load-balancing-for-neurodesk-containers\/","title":{"rendered":"World wide mirrors and load balancing for NeuroDesk Containers"},"content":{"rendered":"\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe loading=\"lazy\" title=\"Mirror Servers for NeuroDesk\" width=\"750\" height=\"422\" src=\"https:\/\/www.youtube.com\/embed\/dYcjDm4D1yY?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture\" allowfullscreen><\/iframe>\n<\/div><\/figure>\n\n\n\n<p>Although NeuroDesk is still a young project, we already have users outside of Australia. This led to a few problems:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>our Singularity containers are stored on Swift object storage on the Nectar Research Cloud, and users outside of Australia saw very slow downloads of the containers<\/li><li>there is only a single storage location, so when Nectar Swift storage had a few hiccups lately, our container downloads just stalled and users were confused about why nothing happened<\/li><\/ul>\n\n\n\n<p>So we tried to come up with a robust and sustainable solution to these problems. 
We first thought about setting up a load balancer or a commercial CDN, but these options would involve either more maintenance or high costs.<\/p>\n\n\n\n<p>After testing a few things we settled on an interesting setup that might also be useful for others: we use multiple object stores distributed over Australia, Europe and the US, and we do the load balancing on the client side by using aria2 to download our container files.<\/p>\n\n\n\n<p>I hadn&#8217;t used aria2 before, but it&#8217;s basically curl on speed: it can download a file from multiple sources at the same time:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>aria2c source_1 ... source_n<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>aria2c https:\/\/swift.rc.nectar.org.au:8888\/v1\/AUTH_d6165cc7b52841659ce8644df1884d5e\/singularityImages\/$container https:\/\/objectstorage.us-ashburn-1.oraclecloud.com\/n\/nrrir2sdpmdp\/b\/neurodesk\/o\/$container https:\/\/objectstorage.eu-zurich-1.oraclecloud.com\/n\/nrrir2sdpmdp\/b\/neurodesk\/o\/$container<\/code><\/pre>\n\n\n\n<p>If one of the sources is not available, it just skips it \ud83d\ude42 Also, it automatically pulls more of the file from the faster mirrors, and surprisingly the fastest one is not always the Australian mirror for me :p<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"260\" src=\"https:\/\/mri.sbollmann.net\/wp-content\/uploads\/2021\/01\/image-19-1024x260.png\" alt=\"\" class=\"wp-image-306\" srcset=\"https:\/\/mri.sbollmann.net\/wp-content\/uploads\/2021\/01\/image-19-1024x260.png 1024w, https:\/\/mri.sbollmann.net\/wp-content\/uploads\/2021\/01\/image-19-300x76.png 300w, https:\/\/mri.sbollmann.net\/wp-content\/uploads\/2021\/01\/image-19-768x195.png 768w, https:\/\/mri.sbollmann.net\/wp-content\/uploads\/2021\/01\/image-19-1536x390.png 1536w, https:\/\/mri.sbollmann.net\/wp-content\/uploads\/2021\/01\/image-19-2048x520.png 2048w\" sizes=\"auto, 
(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>We use github actions to automatically build and upload the containers to the different buckets: <a href=\"https:\/\/github.com\/NeuroDesk\/neurodesk\/blob\/master\/.github\/workflows\/neurodesk.yml\" target=\"_blank\" rel=\"noreferrer noopener\">neurodesk\/neurodesk.yml at master \u00b7 NeuroDesk\/neurodesk (github.com)<\/a><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>  upload_containers_simg:\r\n    runs-on: ubuntu-18.04\r\n    steps:\r\n    - uses: actions\/checkout@v2\r\n    - uses: actions\/setup-python@v2\r\n      with:\r\n        python-version: 3.8\r\n    - name : Check if singularity cache files exist in oracle cloud and swift storage and build if not there\r\n      run: \/bin\/bash .github\/workflows\/upload_containers_simg.sh<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>#!\/usr\/bin\/env bash\r\n# set -e\r\n\r\necho \"checking if containers are built\"\r\n\r\n#creating logfile with available containers\r\npython3 neurodesk\/write_log.py\r\n\r\n# remove empty lines\r\nsed -i '\/^$\/d' log.txt\r\n\r\n# remove square brackets\r\nsed -i 's\/&#91;]&#91;]\/\/g' log.txt\r\n\r\n# remove spaces around\r\nsed -i -e 's\/^&#91; \\t]*\/\/' -e 's\/&#91; \\t]*$\/\/' log.txt\r\n\r\n# replace spaces with underscores\r\nsed -i 's\/ \/_\/g' log.txt\r\n\r\necho \"$GITHUB_TOKEN\" | docker login docker.pkg.github.com -u $GITHUB_ACTOR --password-stdin\r\necho \"$DOCKERHUB_PASSWORD\" | docker login -u $DOCKERHUB_USERNAME --password-stdin\r\n\r\nwhile IFS= read -r IMAGENAME_BUILDDATE\r\ndo\r\n    IMAGENAME=\"$(cut -d'_' -f1,2 &lt;&lt;&lt; ${IMAGENAME_BUILDDATE})\"\r\n    BUILDDATE=\"$(cut -d'_' -f3 &lt;&lt;&lt; ${IMAGENAME_BUILDDATE})\"\r\n    echo \"&#91;DEBUG] IMAGENAME: $IMAGENAME\"\r\n    echo \"&#91;DEBUG] BUILDDATE: $BUILDDATE\"\r\n\r\n    # Oracle Ashburn (with cloud mirror to Zurich)\r\n    if curl --output \/dev\/null --silent --head --fail 
\"https:\/\/objectstorage.us-ashburn-1.oraclecloud.com\/n\/nrrir2sdpmdp\/b\/neurodesk\/o\/${IMAGENAME_BUILDDATE}.simg\"; then\r\n        echo \"&#91;DEBUG] ${IMAGENAME_BUILDDATE}.simg exists in ashburn oracle cloud\"\r\n    else\r\n        # check if there is enough free disk space on the runner:\r\n        FREE=`df -k --output=avail \"$PWD\" | tail -n1`   # df -k not df -h\r\n        echo \"&#91;DEBUG] This runner has ${FREE} free disk space\"\r\n        if &#91;&#91; $FREE -lt 30485760 ]]; then               # 30G = 10*1024*1024k\r\n            echo \"&#91;DEBUG] This runner has not enough free disk space .. cleaning up!\"\r\n            bash .github\/workflows\/free-up-space.sh\r\n        fi;\r\n\r\n        if &#91; -n \"$singularity_setup_done\" ]; then\r\n            echo \"Setup already done. Skipping.\"\r\n        else\r\n            #setup singularity 2.6.1 from neurodebian\r\n            wget -O- http:\/\/neuro.debian.net\/lists\/bionic.us-nh.full | sudo tee \/etc\/apt\/sources.list.d\/neurodebian.sources.list\r\n            sudo apt-key adv --recv-keys --keyserver hkp:\/\/pool.sks-keyservers.net:80 0xA5D32F012649A5A9\r\n            sudo apt-get update\r\n            sudo apt-get install singularity-container\r\n\r\n            export IMAGE_HOME=\"\/home\/runner\"\r\n            export singularity_setup_done=\"true\"\r\n        fi\r\n\r\n        echo \"&#91;DEBUG] singularity building docker:\/\/vnmd\/$IMAGENAME:$BUILDDATE\"\r\n        sudo singularity build \"$IMAGE_HOME\/${IMAGENAME_BUILDDATE}.simg\"  docker:\/\/vnmd\/$IMAGENAME:$BUILDDATE\r\n\r\n        echo \"&#91;DEBUG] Attempting upload to Oracle ...\"\r\n        curl -v -X PUT -u ${ORACLE_USER} --upload-file $IMAGE_HOME\/${IMAGENAME_BUILDDATE}.simg $ORACLE_NEURODESK_BUCKET\r\n\r\n        if curl --output \/dev\/null --silent --head --fail \"https:\/\/objectstorage.us-ashburn-1.oraclecloud.com\/n\/nrrir2sdpmdp\/b\/neurodesk\/o\/${IMAGENAME_BUILDDATE}.simg\"; then\r\n            echo 
\"${IMAGENAME_BUILDDATE}.simg was freshly build and exists now :)\"\r\n        else\r\n            echo \"${IMAGENAME_BUILDDATE}.simg does not exist yet. Something is WRONG\"\r\n            exit 2\r\n        fi\r\n    fi\r\n\r\n    # Nectar Swift\r\n    if curl --output \/dev\/null --silent --head --fail \"https:\/\/swift.rc.nectar.org.au:8888\/v1\/AUTH_d6165cc7b52841659ce8644df1884d5e\/singularityImages\/${IMAGENAME_BUILDDATE}.simg\"; then\r\n        echo \"&#91;DEBUG] ${IMAGENAME_BUILDDATE}.simg exists in swift storage\"\r\n    else\r\n        echo \"&#91;DEBUG] ${IMAGENAME_BUILDDATE}.simg does not exist yet in nectar swift - uploading it there as well!\"\r\n        # check if there is enough free disk space on the runner:\r\n        FREE=`df -k --output=avail \"$PWD\" | tail -n1`   # df -k not df -h\r\n        echo \"&#91;DEBUG] This runner has ${FREE} free disk space\"\r\n        if &#91;&#91; $FREE -lt 10485760 ]]; then               # 10G = 10*1024*1024k\r\n            echo \"&#91;DEBUG] This runner has not enough free disk space .. cleaning up!\"\r\n            bash .github\/workflows\/free-up-space.sh\r\n        fi;\r\n\r\n        if &#91; -n \"$swift_setup_done\" ]; then\r\n            echo \"Setup already done. 
Skipping.\"\r\n        else\r\n            echo \"&#91;DEBUG] Configure for SWIFT storage\"\r\n            sudo pip3 install setuptools\r\n            sudo pip3 install wheel\r\n            sudo pip3 install python-swiftclient python-keystoneclient\r\n            export OS_AUTH_URL=https:\/\/keystone.rc.nectar.org.au:5000\/v3\/\r\n            export OS_AUTH_TYPE=v3applicationcredential\r\n            export OS_PROJECT_NAME=\"CAI_Container_Builder\"\r\n            export OS_USER_DOMAIN_NAME=\"Default\"\r\n            export OS_REGION_NAME=\"Melbourne\"\r\n\r\n            export IMAGE_HOME=\"\/home\/runner\"\r\n            export swift_setup_done=\"true\"\r\n        fi\r\n\r\n\r\n        # only download the image if it is not already on the runner\r\n        if &#91;&#91; ! -f \"$IMAGE_HOME\/${IMAGENAME_BUILDDATE}.simg\" ]]; then\r\n            echo \"&#91;DEBUG] ${IMAGENAME_BUILDDATE}.simg does not exist locally - pulling it from oracle cloud!\"\r\n            curl -X GET \"https:\/\/objectstorage.eu-zurich-1.oraclecloud.com\/n\/nrrir2sdpmdp\/b\/neurodesk\/o\/${IMAGENAME_BUILDDATE}.simg\" -o \"$IMAGE_HOME\/${IMAGENAME_BUILDDATE}.simg\"\r\n        fi\r\n\r\n        echo \"&#91;DEBUG] Attempting upload to nectar swift ...\"\r\n        cd \"$IMAGE_HOME\"\r\n        swift upload singularityImages \"${IMAGENAME_BUILDDATE}.simg\" --segment-size 1073741824\r\n\r\n        if curl --output \/dev\/null --silent --head --fail \"https:\/\/swift.rc.nectar.org.au:8888\/v1\/AUTH_d6165cc7b52841659ce8644df1884d5e\/singularityImages\/${IMAGENAME_BUILDDATE}.simg\"; then\r\n            echo \"&#91;DEBUG] ${IMAGENAME_BUILDDATE}.simg was freshly uploaded and exists now. Cleaning UP! :)\"\r\n            rm \"${IMAGENAME_BUILDDATE}.simg\"\r\n            sudo rm -rf \/root\/.singularity\/docker\r\n            df -h\r\n        else\r\n            echo \"&#91;DEBUG] ${IMAGENAME_BUILDDATE}.simg does not exist yet. 
Something is WRONG\"\r\n            exit 2\r\n        fi\r\n    fi\r\ndone &lt; log.txt<\/code><\/pre>\n\n\n\n<p>Maybe this is useful for others who face a similar problem \ud83d\ude42 Let me know if this was helpful or if you found an even better solution.<\/p>\n\n\n\n<p>In terms of cost: I use Oracle Cloud for the object storage buckets in Europe and the US and currently have 84 GB in there, costing about 17 cents per day \ud83d\ude42 (compared to a CDN service costing 13 cents per GB transferred)<\/p>\n\n\n\n<p>Cheers<\/p>\n\n\n\n<p>Steffen<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Although NeuroDesk is still a young project we already have users outside of Australia. This led to a few problems: our singularity containers are stored on Swift object storage on the Nectar Research Cloud and users outside of Australia saw very slow downloads of the containers there is only a [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":306,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[25,29,3],"tags":[],"class_list":["post-303","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-containers","category-neurodesk","category-reproducibility"],"jetpack_featured_media_url":"https:\/\/mri.sbollmann.net\/wp-content\/uploads\/2021\/01\/image-19.png","_links":{"self":[{"href":"https:\/\/mri.sbollmann.net\/index.php\/wp-json\/wp\/v2\/posts\/303","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mri.sbollmann.net\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mri.sbollmann.net\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mri.sbollmann.net\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mri.sbollmann.net\/index.php\/wp-json\/wp\/v2\/comments?post=303"}],"version-history":[{"count":1,"href":"https:\/\/mri.sbollmann.ne
t\/index.php\/wp-json\/wp\/v2\/posts\/303\/revisions"}],"predecessor-version":[{"id":307,"href":"https:\/\/mri.sbollmann.net\/index.php\/wp-json\/wp\/v2\/posts\/303\/revisions\/307"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/mri.sbollmann.net\/index.php\/wp-json\/wp\/v2\/media\/306"}],"wp:attachment":[{"href":"https:\/\/mri.sbollmann.net\/index.php\/wp-json\/wp\/v2\/media?parent=303"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mri.sbollmann.net\/index.php\/wp-json\/wp\/v2\/categories?post=303"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mri.sbollmann.net\/index.php\/wp-json\/wp\/v2\/tags?post=303"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}