Backing up to the cloud for cheap

Motivation

In 2013 I was setting up an Ubuntu dual-boot environment, accidentally selected the wrong partition, and watched it immediately disappear from the partition table, taking 300 GB of data with it. While I could have recovered most of it with a tool like Recuva, I instead swallowed my pride and accepted this as a lesson: you need a backup. Having had external drives fail on me before, I have longed ever since for a more disaster-proof solution that would give me the flexibility I needed.

Backblaze's personal backup plan was my first attempt, but it requires you to reattach external drives every 30 days, which, given my nomadic lifestyle and the offsite nature of the backup drive, was an unrealistic burden. On top of that, the pricing is per device rather than per gigabyte.

So I found a different solution.

1. Pick a cloud provider

If you already have a provider with a high storage cap, such as Google Drive, OneDrive, or Dropbox, or you don’t mind paying for AWS, then that’s fine: wherever the rest of this tutorial says Backblaze, substitute your own provider.

I ran the numbers, and for a low rate of new data (think 10 GB a month), Backblaze B2 is cheaper than even AWS Glacier once you factor in the hefty retrieval fee you pay to get your data back out of Glacier. Interestingly, that fee seems to exist because Glacier data is stored on tape, and a physical robot has to fetch the tape and put it in the reader.

Let me show you:

500 GB of data to back up
10 GB of new data per month
no deleting and no regular downloading, except once after a disaster

Rate used: $0.005 per GB, applied both to a month of storage and to a download
500 * 0.005 = $2.50 per month
10 * 0.005 = $0.05 per month
for one year: 2.55 * 12 = $30.60
cost of a full download one year in (500 + 12 * 10 = 620 GB): 620 * 0.005 = $3.10

total cost: $33.70
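
The estimate above can be reproduced with a few lines of shell, which makes it easy to plug in your own numbers. This is only a sketch of the same flat approximation used in the text, not of how any provider actually bills; the function name backup_cost and its argument order are made up for illustration.

```shell
#!/bin/sh
# Rough cost model from the text: one flat rate per GB, charged for each
# month of storage and once more for a full download after a disaster.
# Arguments (hypothetical order): initial_gb monthly_gb months rate_per_gb
backup_cost() {
    awk -v initial="$1" -v monthly="$2" -v months="$3" -v rate="$4" 'BEGIN {
        per_month = (initial + monthly) * rate   # simplified flat monthly bill
        stored    = initial + monthly * months   # GB held at the end
        restore   = stored * rate                # one full download
        printf "%.2f\n", per_month * months + restore
    }'
}

# Numbers from the text: 500 GB, 10 GB per month, 12 months, $0.005 per GB
backup_cost 500 10 12 0.005
```

With the figures from the text this prints 33.70. A real bill would grow month by month as the stored data grows, so treat the result as a ballpark.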

Backblaze claims that the same would cost $142 with AWS, $122 with Azure, and $136 with GCP, which seems realistic given my own calculations.

No brainer.

2. Set up rclone

rclone is an open source tool written in Go that is essentially rsync (a Unix tool for syncing filesystems) but for cloud providers. It supports many of them, even unexpected ones like Google Photos. rclone will take care of synchronizing our local filesystem with the one in the cloud, uploading only files that are new or have changed. This makes backups really fast.

For macOS, assuming you already have brew, simply run brew install rclone in the terminal.
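
Before going further it is worth confirming the install worked. A minimal check, assuming a POSIX shell; have_tool is a hypothetical helper name, while rclone version is a regular rclone subcommand.

```shell
#!/bin/sh
# Returns 0 when the named program is found on PATH, non-zero otherwise.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool rclone; then
    rclone version    # prints the installed rclone version
else
    echo "rclone not found on PATH; install it first" >&2
fi
```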

3. Connect your cloud

rclone needs a valid token for each cloud service so that it can interact with its file system. On some cloud systems you will first need to create a bucket or the equivalent storage block. rclone works with the concept of “remotes”, and some of them can be stacked on top of each other, so we will create one remote for our bucket, and then a “secret” remote that sits on top of it and handles the encryption. To get there, let’s first create a Backblaze account, a bucket, and an application key.

  1. Create an account at Backblaze.
  2. In the left menu, click Buckets and create a bucket with a random name. I used 1Password to generate a passphrase-style name, like “horse-cookie-serendipity-pool”. Make sure this bucket is Private, encryption is disabled (since we will be using rclone to encrypt), and object lock is disabled.
  3. Go to App Keys and add a new application key. For extra safety, restrict it to the bucket you just created. Ensure you authorise reading & writing, and leave the prefix and duration blank. Note the keyID and applicationKey, and remember that these are sensitive.
  4. The keys will be saved on our local filesystem, in ~/.config/rclone/rclone.conf. To keep them secure, run rclone config, then enter s and then a to set a config encryption password. Use a long, secure password, as this config will contain the keys to the kingdom. I used 1Password.
  5. We are going to create an rclone remote called “remote”. Run rclone config and then follow this:
No remotes found - make a new one
n) New remote
q) Quit config
n/q> n
name> remote
Type of storage to configure.
Choose a number from below, or type in your own value
[snip]
XX / Backblaze B2
   \ "b2"
[snip]
Storage> b2

Now comes the Backblaze specific part. Careful not to confuse your keyID and your applicationKey.

Account ID or Application Key ID
account> -- THIS IS YOUR keyID --
Application Key
key> -- THIS IS YOUR applicationKey --
Endpoint for the service - leave blank normally.
endpoint>
Remote config
--------------------
[remote]
account = 123456789abc
key = 0123456789abcdef0123456789abcdef0123456789
endpoint =
--------------------
y) Yes this is OK
e) Edit this remote
d) Delete this remote
y/e/d> y
  6. Great, now check that you can see your empty bucket: rclone ls remote:name-of-your-bucket. If there’s an error, make sure you didn’t confuse the keyID and the applicationKey.
  7. Let’s now create the encrypted remote, called “secret”, by running rclone config again or, if you’re still inside rclone config, by adding a new remote.
n) New remote
s) Set configuration password
q) Quit config
n/s/q> n
name> secret
Type of storage to configure.
Enter a string value. Press Enter for the default ("").
Choose a number from below, or type in your own value
[snip]
XX / Encrypt/Decrypt a remote
   \ "crypt"
[snip]
Storage> crypt
** See help for crypt backend at: https://rclone.org/crypt/ **

Remote to encrypt/decrypt.
Normally should contain a ':' and a path, eg "myremote:path/to/dir",
"myremote:bucket" or maybe "myremote:" (not recommended).
Enter a string value. Press Enter for the default ("").

Now enter the name of your bucket, prefixed with the name of the first remote:

remote> remote:name-of-your-bucket
How to encrypt the filenames.
Enter a string value. Press Enter for the default ("standard").
Choose a number from below, or type in your own value
 1 / Encrypt the filenames see the docs for the details.
   \ "standard"
 2 / Very simple filename obfuscation.
   \ "obfuscate"
 3 / Don't encrypt the file names.  Adds a ".bin" extension only.
   \ "off"

Here you must decide if you want to encrypt file names. As my filenames are not very sensitive, I chose “off”. Your choice.

filename_encryption>
Option to either encrypt directory names or leave them intact.

NB If filename_encryption is "off" then this option will do nothing.
Enter a boolean value (true or false). Press Enter for the default ("true").
Choose a number from below, or type in your own value
 1 / Encrypt directory names.
   \ "true"
 2 / Don't encrypt directory names, leave them intact.
   \ "false"
directory_name_encryption>
Password or pass phrase for encryption.

Choose a secure password. I generated one in 1Password, but you may choose to let rclone generate one for you.

y) Yes type in my own password
g) Generate random password
y/g> y
Enter the password:
password:
Confirm the password:
password:
Password or pass phrase for salt. Optional but recommended.
Should be different to the previous password.
y) Yes type in my own password
g) Generate random password
n) No leave this optional password blank (default)
y/g/n> g

Choose a secure password and do not lose it or you can never recover your files!

Password strength in bits.
64 is just about memorable
128 is secure
1024 is the maximum
Bits> 128
Your password is: JAsJvRcgR-_veXNfy_sGmQ # DON'T LOSE THIS
Use this password? Please note that an obscured version of this
password (and not the password itself) will be stored under your
configuration file, so keep this generated password in a safe place.
y) Yes (default)
n) No
y/n>
Edit advanced config? (y/n)
y) Yes
n) No (default)
y/n>
Remote config
--------------------
[secret]
type = crypt
remote = remote:name-of-your-bucket
password = *** ENCRYPTED ***
password2 = *** ENCRYPTED ***
--------------------
y) Yes this is OK (default)
e) Edit this remote
d) Delete this remote
y/e/d>
  8. Ready for takeoff 🚀 Let’s start the backup process:
rclone sync -P --stats-log-level "ERROR" --skip-links --exclude ".DS_Store" --exclude ".Trashes" /Volumes/MY-DRIVE/ secret:/Backups/

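You can also wrap the command so that it keeps running if the terminal session ends, and so that errors are written to a log file. Note that setsid is a Linux utility and is not installed on macOS by default; nohup is a common substitute there.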

setsid [command] --log-file rclone.log &>/dev/null

You can exclude more folders or file types by repeating the --exclude flag. If you’re not ready to start the upload, run the command with the --dry-run flag first.
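
To avoid retyping the long command, the flags can be kept in a small shell function. This is a sketch, not part of rclone: run_backup and the RCLONE override variable are names I made up, and the flags are the ones from the command above. Any extra arguments after the source and destination are passed straight through, so --dry-run or more --exclude flags can be added per run.

```shell
#!/bin/sh
# Hypothetical helper around the sync command from the text.
# Usage: run_backup SOURCE DEST [extra rclone flags...]
# Set RCLONE=echo to print the arguments instead of running rclone.
run_backup() {
    src="$1"
    dest="$2"
    shift 2
    # Rebuild the argument list: fixed flags, then extra flags, then paths.
    set -- sync -P --stats-log-level ERROR --skip-links \
        --exclude ".DS_Store" --exclude ".Trashes" \
        "$@" "$src" "$dest"
    "${RCLONE:-rclone}" "$@"
}
```

For example, run_backup /Volumes/MY-DRIVE/ secret:/Backups/ --dry-run previews what would be uploaded, and dropping --dry-run starts the real backup.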

What to do in emergency

You can copy your rclone config to a secure place such as 1Password. If you then need your files on another machine, for example, place the config in the correct location and run rclone sync secret:/Backups/ /path/to/local/folder to download your files from the cloud to your local file system. If you don’t keep the rclone config, you will need to set up your remotes all over again, with the same encryption password and salt.
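
One thing to keep in mind when restoring: rclone sync makes the destination match the source, which includes deleting files in the destination that are not in the source. If the target folder may already contain files, a more cautious sketch uses rclone copy, which only adds and updates. The names restore_backup, RCLONE, and BACKUP_REMOTE below are hypothetical; the remote path defaults to the one used in this tutorial.

```shell
#!/bin/sh
# Hypothetical helper: download the backup into TARGET without deleting
# anything already there (copy, unlike sync, never removes files).
# Set RCLONE=echo to print the arguments instead of running rclone.
restore_backup() {
    target="$1"
    mkdir -p "$target" || return 1
    "${RCLONE:-rclone}" copy -P "${BACKUP_REMOTE:-secret:/Backups/}" "$target"
}
```

Calling restore_backup /path/to/local/folder then pulls everything down. If you are unsure where the config should live on the new machine, rclone config file prints the path rclone is reading from.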

Fin

Thanks for reading. I hope this was useful to someone, and if you ever have any questions you’re welcome to ping me.

Last modified 2021.03.10