I currently have an Ubuntu 12.04 LTS box serving as a network repository for daily workstation backups for our HQ office. Throughout the day the Win7 workstations use Cobian Backup 11 to back up their user folders to their respective Samba shares on this server. At night this server rsyncs all of the Samba folders' contents to an offsite server in another branch. This setup is working well, with one exception: the backups aren't encrypted. I don't like the idea of having the whole office's unencrypted documents gathered together in one place (technically, two places). Cobian has an option for encrypting the backups, which is great, but rsyncing encrypted files doesn't work so well with rsync's delta algorithm. So, for my purposes, this setup is about 90% there, but not 100% of what I need. My question is: How do others handle this sort of situation? Does anyone store numerous encrypted backups offsite? If so, how do you get them there?
A little more info: the backups represent about 300 GB, but rsync only transfers about 5 GB of delta data per night.
Any advice is appreciated, thanks.
Duplicity seems to be a popular solution for this. I've not used it personally, but the docs claim it can back up directories and encrypt them with a GPG key or passphrase.
http://duplicity.nongnu.org/duplicity.1.html
http://manpages.ubuntu.com/manpages/hardy/man1/duplicity.1.html
I think the most common solution is to use on-the-fly encryption of the backup file-system. Tools like dm-crypt (part of the Linux kernel for quite a while) and TrueCrypt allow you to mount an encrypted file-system, just like you mount a HDD partition but protected by an encryption key. The physical data itself remains encrypted all the time, but it gets transparently encrypted and decrypted as the user (with permission) reads or writes to the file system. From the perspective of the user, after the authentication has passed, it just looks like a normal file system (i.e., it is a kernel-space device mapper). So, obviously, backup software like rsync can run on top of it too. It is quite trivial to set this up and all the tools involved are standard-issue Linux tools; here are some very simple instructions. Here's another.
As for the transmission, rsync can use SSH as a remote shell to the backup machine, and all communications are encrypted, so this is a non-issue.
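To make the rsync-over-SSH step concrete, here is a minimal sketch of what the nightly transfer could look like. The source path, the remote user and host, and the destination path are all hypothetical placeholders, and the `run` helper is only there so the command can be previewed without executing it.

```shell
# Hypothetical values (assumptions, not from this thread): adjust to your layout.
SRC_DIR="${SRC_DIR:-/srv/samba/}"
DEST="${DEST:-backup@offsite.example.com:/mnt/backup/}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

nightly_sync() {
    # -a preserves permissions and timestamps, -z compresses data in transit,
    # -e ssh sends the whole transfer through an SSH connection.
    run rsync -az -e ssh "$SRC_DIR" "$DEST"
}
```

Calling `nightly_sync` from a cron job would run the transfer; setting `DRY_RUN=1` first only prints the command, which is handy for checking the paths.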
Samba shares can also run on top of an encrypted device.
The basic procedure in both cases (rsync or a Linux-based Samba server) is the same. After you have set up an encrypted partition (instructions in the links given), you put the few commands that mount the encrypted drive into the .bashrc file and the few commands that unmount it into the .bash_logout file, and you do so for each user account that might log in to the backup machine (I assume you only have a few of those). At this point, whenever there is a remote login to the backup machine, the drive is mounted, and it is unmounted when no one uses it. For example, if you issue a command to send over the latest deltas with rsync, the rsync program will ssh to the backup machine, triggering the mounting of the encrypted device; rsync will then do its work on that device (not even aware that it is encrypted) and log out, leaving the encrypted drive unmounted and the data safe. This way, the data can be secure all the way, since it can be stored in encrypted Samba drives, then collected by rsync, encrypted in SSH transmissions, and then stored in an encrypted drive at the destination. It would be hard to do any better, except maybe by also encrypting your workstations' user folders.
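As a rough sketch of the "few commands" mentioned above, assuming a LUKS volume managed with cryptsetup: the device, mapper name, mount point, and key file below are hypothetical, and the sketch also assumes the account is allowed to run these commands through sudo without a prompt. Which start-up and logout files actually get read depends on how the shell is started for a given kind of session, so this is worth testing for each login method you rely on.

```shell
# Hypothetical names (assumptions): device, mapper name, mount point, key file.
CRYPT_DEV="${CRYPT_DEV:-/dev/sdb1}"
CRYPT_NAME="${CRYPT_NAME:-backup_crypt}"
MOUNT_POINT="${MOUNT_POINT:-/mnt/backup}"
KEY_FILE="${KEY_FILE:-/root/backup.key}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# For ~/.bashrc: unlock and mount the volume unless it is already mounted.
mount_backup() {
    if ! grep -qs " $MOUNT_POINT " /proc/mounts; then
        run sudo cryptsetup luksOpen --key-file "$KEY_FILE" "$CRYPT_DEV" "$CRYPT_NAME" &&
        run sudo mount "/dev/mapper/$CRYPT_NAME" "$MOUNT_POINT"
    fi
}

# For ~/.bash_logout: unmount the volume, then lock it again.
umount_backup() {
    run sudo umount "$MOUNT_POINT" &&
    run sudo cryptsetup luksClose "$CRYPT_NAME"
}
```

If another session is still using the drive, the unmount fails and the volume is left open, which is the behavior you want while a long rsync run is in progress.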
What CimmerianX suggested is another option, which is basically to create the backup, put it into a compressed archive, and then encrypt the whole backup with GPG (or another method). Duplicity does just that, but in a more convenient all-in-one process, and it stores archives of deltas instead of multiple copies of the whole thing (which would be impractical). I tend to find that on-the-fly encryption is more flexible because the encryption occurs under the hood (in a device mapper) and you can do whatever you want (rsync, Samba, version control, etc.) on top of it.
I've looked into Duplicity a bit and it looks like a good solution for a setup with only one "hop" (workstation to backup server). For my situation, though, that wouldn't help with encrypting the data from the workstation to the local backup server--the data would still always reside there unencrypted. Good suggestion, though--it should work great for Linux server backups.
@mike_2000_17
That sounds like as good an option as I'm going to get. I think the fundamental nature of encryption simply doesn't play well with delta-copy algorithms. I've got a question about your approach, though. You say that the filesystem is mounted when there is a remote login. Does that apply to Samba logins (does the .bashrc file get read during a Samba login...)?
does that apply to samba logins (does the .bashrc file get read during a samba login...)?
Yes.
OK, that's good to know. Another thing I'm not clear on: when the filesystem is mounted (by a specific user's Samba login, for example), it's mounted and decrypted for any and all users, right? So if my rsync process takes 5 hours to complete, the filesystem will be decrypted for any users/processes to access during those 5 hours? ...or am I thinking of it incorrectly?
You have to understand something about security. User privileges are always a single point of failure. If somebody comes in (remotely or physically) and obtains privileges that he shouldn't have, then there really isn't much you can do about it. Encryption is not going to help with that problem at all.
You need to organize user accounts smartly, protect user-credentials from being disclosed or easily obtained. That's the true recipe for security.
Each workstation (presumably operated by some person) has a user account for that person. That personal user account should have read-write privileges only to its own Samba mirror (to where it uploads daily changes), and it should not have any access to the other Samba directories (backups for other users). Then, on the Samba server machine, you have a high-privilege user (super-user) that has read-write privileges to all of the Samba directories. You can then do the rsync backup to the other remote machine through that super-user account in a nightly cron job, and the encrypted communication prevents an attacker from eavesdropping.
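On the server side, the per-user isolation described above could start from something like the following sketch. The function name and the directory layout are hypothetical, and on a real server you would also set the owner of each directory to the matching account, which requires root.

```shell
# Create a backup directory that only its owner can enter, read, or write.
# base_dir and user are hypothetical parameters chosen for this sketch.
make_user_share() {
    base_dir=$1
    user=$2
    # On the real server you would also run, as root:
    #   chown "$user" "$base_dir/$user"
    mkdir -p "$base_dir/$user" &&
    chmod 700 "$base_dir/$user"
}
```

For example, `make_user_share /srv/samba alice` creates /srv/samba/alice with mode 700, so other ordinary accounts cannot list or read it, while root (the account running the nightly rsync) still can.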
If an "attacker" somehow obtains the credentials (user-name and password) of one of the workstation's user accounts, all he'll have access to is that user's files (which is the minimal damage you can expect). If he somehow obtains the super-user credentials for one of the backup machines, then there's nothing you can do, he can grab whatever data he wants, encryption is not going to change that, regardless of the scheme you employ.
Encrypting the hard-drive is really about preventing physical attacks (i.e., it's the last line of defense). In other words, if someone obtains physical access to one of the backup machines (i.e., he literally walks into the room where the computer is when no one's around to guard it), then user-credentials do not matter anymore. There are about a dozen easy ways to hack your machine; for example, he could boot the computer from a live CD, he could boot in recovery mode and override the root password, he could open the computer and grab the hard-drive, etc. That's where encryption matters. If the hard-drive is encrypted, he's not gonna be able to steal the data inside it (well, maybe if he brings the HDD back home and works on it for a good while he might crack it, by brute force or otherwise).
So, it really doesn't matter that while the encrypted drive is mounted any user can access it like a normal drive, because that is the domain of user-privilege security. Your security infrastructure is only as solid as its weakest link. If your user accounts have too much privilege and/or your people are too careless about giving away their credentials, using weak passwords, or leaving their workstations unlocked and unattended, then that's the weakest link, and that's not a domain where encryption can help you.
The ideal solution was to have the workstation backup software encrypt the data before it gets onto the backup server. In this situation, that encryption would protect the data in nearly every situation possible (short of an attacker gaining access to the workstation prior to backup, or trying to brute-force the encryption keys). Since the backup server would never see the unencrypted data, compromised user credentials or even physical access wouldn't give anyone anything other than a pile of gibberish data. In that scenario, my weakest link (considering the goal is to protect the data) was the encryption (not a very weak link at all). But, as we already know, I can't practically rsync that encrypted data. So, your solution sounds like my best option, but it changes my weakest link. And I'm now just trying to make sure I understand what that link is--and as you've mentioned, it's the user accounts. Now that I'm clear on that, I know where I need to focus.