When using a High Performance Computing System you also might have run into the problem of how to get data in and out of this system.

We are very lucky in Australia to have CloudStor, which is Aarnets hosted OwnCloud version with 1TB of storage available for every researcher and we can use it to move data between different computing systems.

The only problem is that on an HPC you can’t open a browser and you also can’t install the owncloud client – so what other options exist?

There is a wonderful tool called rclone, which interfaces with all cloud storage providers out there 🙂

You can find a detailed description how to connect rclone to CloudStor, so let’s go through this step by step and set it up together:

Connect to your favorite HPC and install rclone on there. The official install instructions for rclone are a bit misleading as we will not be able to run as sudo or use their install script without sudo or install rpm or apt packages – so if you try any of the official instructions on an HPC you will see errors similar to this:

Let’s install it in our home directory 🙂

cd
mkdir tools
cd tools
wget https://downloads.rclone.org/rclone-current-linux-amd64.zip 
unzip rclone-current-linux-amd64.zip 
rm rclone-current-linux-amd64.zip 
cd rclone-*
echo "export PATH=\$PATH:$PWD" >> ~/.bashrc
source ~/.bashrc
rclone config

Create a new remote: n

Provide a name for the remote: CloudStor

For the “Storage” option choose: webdav

As “url” set: https://cloudstor.aarnet.edu.au/plus/remote.php/webdav/

As “vendor” set OwnCloud: 2

Set your CloudStor username after generating an access token:

Choose to type in your own password: y

Enter the Password / Token from the CloudStor App passwords page and confirm it again:

Leave blank the bearer_token: <hit Enter>

No advanced config necessary: <hit Enter>

accept the configuration: <hit Enter>

Quit the config: q

Now we can download data to the HPC easily:

rclone copy --progress --transfers 8 CloudStor:/raw-data-for-science-paper .

or upload data to CloudStor:

rclone copy --progress --transfers 8 . CloudStor:/hpc-data-processed

If you need to upload lots of data, there is a very nice wrapper script from AARNET that tweaks a few settings and checks if all files have been transferred: https://github.com/AARNet/copyToCloudstor

git clone https://github.com/AARNet/copyToCloudstor 
cd copyToCloudstor
./copyToCloudstor.sh ~/scratch/data/ CloudStor:/all-my-important-HPC-data-for-nature-paper

0 Comments

Leave a Reply

Avatar placeholder

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.