
The Backup Principle

Discussing my backup infrastructure using rsync and git-annex.

I recently visited a very good friend, and her story of losing data on her external hard drive reminded me of my own data and how important it is to back up personal files regularly. It has been a while since my last data loss, about seven years. I learned from that experience and created a backup ritual that fits my needs. Over the years I've refined how I back up and store my data. I'm not a big fan of freely available cloud storage services like Dropbox, and I'm even suspicious of professional services. Trusting strangers isn't easy when personal data, your virtual life, is involved.

I want to keep it simple and stick to open standards, so I bought a NAS with a Linux environment. In my opinion, encrypting data is also an important principle, as protection against data theft and surveillance. Synchronization is done with rsync, which is easy to use manually on the command line. Those of you who prefer a graphical user interface should try Grsync. The remote transfer between machines is encrypted by combining rsync with ssh. I back up my data manually at least once a week, usually on the weekend while housekeeping. It's liberating to know that your data is replicated and that a single hardware failure won't cost you your data.
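To make the rsync-over-ssh step a bit more concrete, here is a minimal sketch in Python that only assembles the command line. The function name, the host `nas.local`, the user `me` and the paths are hypothetical placeholders, not my actual setup; the flags (`-a`, `-z`, `-e ssh`, `--dry-run`) are standard rsync options.

```python
import shlex
import subprocess


def build_rsync_command(source, host, dest, user=None, dry_run=False):
    """Assemble an rsync command that copies `source` to `host:dest` over ssh.

    All names and paths passed in are illustrative assumptions.
    """
    remote = f"{user}@{host}" if user else host
    cmd = [
        "rsync",
        "-a",         # archive mode: recurse, keep permissions and timestamps
        "-z",         # compress data during the transfer
        "-e", "ssh",  # use ssh as the transport, so the transfer is encrypted
    ]
    if dry_run:
        cmd.append("--dry-run")  # only report what would be transferred
    cmd += [source, f"{remote}:{dest}"]
    return cmd


def run_backup(cmd):
    """Run the assembled command; raises CalledProcessError if rsync fails."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical example values; print the command instead of running it.
    command = build_rsync_command(
        "/home/me/documents/", "nas.local", "/backup/documents/",
        user="me", dry_run=True,
    )
    print(shlex.join(command))
```

Running the script only prints the command, so you can inspect it first. Doing a pass with `dry_run=True` before the real transfer is a cheap way to check that source and destination are what you expect.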

Like other parts of my digital infrastructure, I try to continuously improve my backup procedure. Last month I tried git-annex, a version control system for large binary files built on top of the well-known distributed version control system git. Its main advantage over rsync is version control, which lets you check out a particular snapshot of your changes. git-annex and its dependencies aren't packaged for ARM architectures, so unfortunately I can't use it on my NAS. But if you run a standard x86 machine, I recommend giving it a try. I'm looking forward to refining my backup procedure with upcoming free software and open standards. I'll keep you informed.

You might think of this text as yet another "back up your data" blog post on the web, but I hope this simple reminder prompts some thoughts of your own about backing up your data. It's better to have a backup and never need it than to need one badly and not have it.
