Tarsnap offsite backups

Look at that.

root@halleck:~# tarsnap --list-archives -v | sort -k2 | head -n1
M2009-09        2009-09-01 05:24:01

Having been using the Tarsnap backup service for five years kind of deserves a blog post? Let me share a few of the reason why I keep using Tarsnap for offsite backups, and why you too might want to.

  • Encryption happens locally, allowing one to worry less about the files being included.
  • Data deduplication on a block level, allowing for every kept backup to be a full snapshot.
  • Client uses basically the same command line option as regular tar, making Tarsnap familiar as well as scriptable.
  • While not open source, the Tarsnap client source code is fully available, coupled with a serious bug bounty program.
  • Server side Tarsnap uses Amazon S3 for storage, which has a proven record for durability.
  • IPv6 support.

Then there are these potential issues, which may or may not be an issue for you.

  • Restoring files tend to be a bit slow. That limitation can partially be worked around by restoring separate folders in parallel.
  • You can only use the Tarsnap backup client together with the Tarsnap backup service. Backing up to your own infrastructure is not supported.
  • Payment can only be made a head of time, and if your account runs out of money there is a relatively short window before your backups are deleted. Personally I can keep track of that, but it makes me hesitant to recommend Tarsnap in any work environment.

For more info, see the pages Tarsnap design principles and Tarsnap general usage.

In case you start using Tarsnap, then feel free to take a look at my tarsnap-helpers git repository. It contains a couple of monitoring scripts as well as some bash completions.

Gitolite mirror monitoring

I use gitolite to host my personal git repositories. Being the way I am I also do the whole mirroring dance.

Of course, mirroring requires monitoring. Preferable without granting ones Icinga more access than necessary. Hence these Python snippets.

For some actual specifics, let me direct you towards the README file.

Requering both an SSH key and a YubiKey

As of OpenSSH 6.2 there is the configuration option AuthenticationMethods, allowing for the requirement of more than one authentication method. For me the obvious combination here is requiring both regular ssh key auth as well as a physical YubiKey, both which need to succeed.

This post is a short description of my personal setup, focusing more on the how than on the whys.

In addition to the obvious requirement of having a YubiKey my setup depends on the following:

  • Running at least OpenSSH 6.2, which is provided by default as of Ubuntu 13.10. Debian wise it might be helpful to know that the Wheezy backports currently contains OpenSSH 6.5.
  • The Yubico PAM module. Assuming recent enough Debian/Ubuntu that module can be found in the libpam-yubico package.
  • An API key from https://upgrade.yubico.com/getapikey/.

Here we have the relevant part of sshd_config, only enforcing the additional requirement for selected users.

### /etc/ssh/sshd_config
ChallengeResponseAuthentication no
PasswordAuthentication no
UsePAM yes
Match Group yubiusers
      PasswordAuthentication yes
      AuthenticationMethods publickey,password

Then there is the part about having PAM threat ssh passwords as YubiKey OTPs. Given Debian style /etc/pam.d/ I am modifying /etc/pam.d/sshd to replace the include of /etc/pam.d/common-auth with an include of my own custom /etc/pam.d/yubi-auth.

### /etc/pam.d/sshd
# @include common-auth
@include yubi-auth
### /etc/pam.d/yubi-auth
auth    required        pam_yubico.so mode=client id=NNN key=sEcREt authfile=/etc/yubimap

(No, the /etc/pam.d/yubi-auth file isn’t globally readable.)

In a more general manner the PAM config change is about replacing the auth … pam_unix.so line with an auth … pam_yubico.so line.

The specified /etc/yubimap holds the mapping between usernames and YubiKeys.

### /etc/yubimap

Finally, the result.

andreas@corrino:~$ ssh halleck.arrakis.se
Authenticated with partial success.
andreas@halleck.arrakis.se's password:

Moving to IPv6

My blog has yet again moved; this time to Frobbit. It being 2014 and all I figured that it was about time that the blog became reachable over IPv6.

rsyslog loggly remote

Shipping your logs to a central server is usually a good thing to do. For a large number of servers it provides a better overview, and no matter the numbers of servers a secondary log location can be helpful in figuring out why something bad happened to a server.

My two VPS nodes are now using loggly as a remote (TLS) syslog server. I’m even allowed do that for free, as long as I don’t upload more than 200 MB of logs per day, nor want the log data to be retained for more than a week. Not that I would mind paying a bit for a longer retention period. It’s just that their pricing feels a bit steep, given that I currently log less than a megabyte per day.

(Yes, I do realize that I’m not their obvious target audience.)

Already running rsyslog I decided to follow loggly’s rsyslog instructions, which did a pretty good job of explaining the additional configuration needed. The one thing I did miss in those instruction were a discussion on queue setting, which very much will matter when loggly’s servers for one reason or another becomes unavailable. By default rsyslog only queues a limited number of entries in memory, so for additional resilience I explicitly enabled a disk assisted queue, based on the rsyslog reliable forwarding guide.

Want to test the queuing? Just put appropriate iptables rules in place, and then speed up time by using logger(1) to pipe lots and lots of entries into syslog.

All in all, the following loggly specific configuration seems to do the trick for me.

# /etc/rsyslog.d/loggly.conf
$DefaultNetstreamDriverCAFile /etc/ssl/loggly/loggly_full.crt
$DefaultNetstreamDriverCertFile /etc/ssl/loggly/dummy-halleck.crt
$DefaultNetstreamDriverKeyFile /etc/ssl/loggly/dummy-halleck.key

$ActionSendStreamDriver gtls
$ActionSendStreamDriverMode 1
$ActionSendStreamDriverAuthMode x509/name
$ActionSendStreamDriverPermittedPeer *.loggly.com

$ActionQueueType LinkedList
$ActionQueueFileName loggly
$ActionResumeRetryCount -1
$ActionQueueSaveOnShutdown on

*.* @@logs.loggly.com:<assigned port>

If you are running Ubuntu, and Launchpad bug #1075901 have yet to be fully fixed, you might manually need to chown syslog:syslog /var/spool/rsyslog/. While you are at it, also install the needed rsyslog-gnutls package.

Worth mentioning is that loggly provides the option of archiving your logs to a S3 bucket. Given my modest log volumes the cost of doing that ought to be pleasantly close to zero.

Will be very interesting to see how reliable this solution turns out to be. The plan is to setup some semi-automated testing, and hopefully have some results to share in a follow-up post, say in a month or two.

YubiKey NEO and Ubuntu

My Christmas gift to myself this year turned out to be a YubiKey NEO.

The new feature I myself find most interesting is that the NEO can act as an OpenPGP smartcard. While there is a pretty good introduction in the Yubico blog post YubiKey NEO and OpenPGP I ran into some obstacles getting things running under Ubuntu.

First of all it doesn’t seem like the version of the yubikey-personalization  (1.7.0) included in Ubuntu 12.10 recognizes the YubiKey NEO. Without spending to much time on debugging that issue was solved by upgrading to the current yubikey-personalization version, using the Yubico PPA.

Then there was the matter of getting the device permissions right, allowing my non-root user to use/modify the NEO more actively than just having it act as a keyboard (HID), spitting out one time passwords. Turns out that the /lib/udev/rules.d/70-yubikey.rules provided by the current yubikey-personalization (1.11.1) only matches the ATTRS{idProduct} “0010”, which doesn’t apply to the NEO. I solved that by copying the 70-yubikey.rules to /etc/udev/rules.d/, modifying it to instead match ATTRS{idProduct} against “0010|0111”. According to the add udev rules for YubiKey NEO bug report it probably doesn’t hurt to also through the 0110 id into the mix.

Finally I had the fun experience of running into a limitation in the gnome-keyring’s capacity to act as gnupg-agent (Launchpad bug #884856). Any attempt to have GnuPG interact with the NEO smartcard, while using the gnome-keyring gnupg-agent, resulted in a “selecting openpgp failed: unknown command” error. Not finding any cleaner configuration option I resorted to simply removing /etc/xdg/autostart/gnome-keyring-gpg.desktop, resulting in gnome-keyring no longer hijacking the GPG_AGENT_INFO environment variable, instead letting the real gnupg-agent do its thing.

Now I only need to decide to what extent to actually use the OpenPGP smartcard feature. Yet, that’s a whole different blog post.

Easy IPv4+IPv6 Nagios monitoring using check_v46

Not feeling ready to give up on IPv4 quite yet? In that case you most likely want your Nagios to probe your services on both their IPv4- as well as their IPv6 addresses.

Looking into how how to handle that duplication in a sane manner I stumbled over the rather convenient check_v46 plugin wrapper. Assuming the actual check being run provides the -4/-6 options check_v46 can automatically, based on a hostname lookup, test using IPv4 and/or IPv6, and then return the worst result. See below for a trivial example, as well a matching example response.

define command{
       command_name    dual_check_http
       command_line    /usr/local/nagios/check_v46 -H '$HOSTNAME$' /usr/lib/nagios/plugins/check_http
CRITICAL: IPv6/halleck.arrakis.se OK, IPv4/halleck.arrakis.se CRITICAL

Do note that there is also the option of manually feeding check_v46 IPv4 and IPv6 addresses. See the plugin –help for the actual details. Also note that the check_v46 wrapper does not appear to work with the Nagios embedded Perl.

Of course, a more perfect solution probably requires Nagios itself to be more IPv4 vs IPv6 aware. For example, in the case that a host (or a datacenter) temporarily becomes unavailable over IPv6, it might then be more helpful if the service checks focused primarily on the IPv4 results, instead of either going full ballistic or completely silent. Yet, as long as good enough is good enough, the check_v46 wrapper is definitely an easy win.

Fully using apt-get download

Occasionally I need to download a Debian package or two. While I could find a download link using packages.debian.org / packages.ubuntu.com I really do prefer using apt-get download. In addition to the general pleasantness of using a command line tool the main benefit really is that apt automatically will verify checksums and gpg signatures.

For me the most typical usage scenario is that I want to download a Debian package from a different release than the one I happen to run on my workstation. Instead of putting additional entries in /etc/apt/sources.list, and hence having to deal with apt pinning as well as it making my regular apt-get update runs slower, I find it much more convenient to setup a separate apt environment.

First there is the basic directory structure.

$ mkdir -p ~/.cache/apt/{cache,lists}
$ mkdir -p ~/.config/apt/{apt.conf.d,preferences.d,trusted.gpg.d}
$ touch ~/.cache/apt/status
$ ln -s /usr/share/keyrings/debian-archive-keyring.gpg ~/.config/apt/trusted.gpg.d/
$ ln -s /usr/share/keyrings/ubuntu-archive-keyring.gpg ~/.config/apt/trusted.gpg.d/

(For an Ubuntu system the /usr/share/keyrings/debian-archive-keyring.gpg keyring is provided by the debian-archive-keyring package.)

Then there is the creation of the files ~/.config/apt/downloader.conf and ~/.config/apt/sources.list. They should contain something like the following.

## ~/.config/apt/downloader.conf
Dir::Cache "/home/USERNAME/.cache/apt/cache";
Dir::Etc "/home/USERNAME/.config/apt";
Dir::State::Lists "/home/USERNAME/.cache/apt/lists";
Dir::State::status "/home/USERNAME/.cache/apt/status";
## ~/.config/apt/sources.list
# Debian 6.0 (Squeeze)
deb http://ftp.us.debian.org/debian/ squeeze main contrib non-free
deb http://ftp.us.debian.org/debian/ squeeze-updates main non-free
deb http://security.debian.org/ squeeze/updates main contrib non-free

# Debian 6.0 (Squeeze) Backports
deb http://backports.debian.org/debian-backports squeeze-backports main contrib non-free

# Debian 7.0 (Wheezy)
deb http://ftp.us.debian.org/debian/ wheezy main
deb http://security.debian.org/ wheezy/updates main

# Debian Unstable (Sid)
deb http://ftp.us.debian.org/debian/ sid main

# Ubuntu 12.04 (Precise)
deb http://us.archive.ubuntu.com/ubuntu/ precise main restricted universe multiverse
deb http://us.archive.ubuntu.com/ubuntu/ precise-updates main restricted universe multiverse
deb http://security.ubuntu.com/ubuntu precise-security main restricted universe multiverse

# Ubuntu 12.10 (Quantal)
deb http://us.archive.ubuntu.com/ubuntu/ quantal main restricted universe multiverse
deb http://us.archive.ubuntu.com/ubuntu/ quantal-updates main restricted universe multiverse
deb http://security.ubuntu.com/ubuntu quantal-security main restricted universe multiverse

Given the just described setup, apt-get download can now download packages from any release/codename defined in ~/.config/apt/sources.list.

$ APT_CONFIG=~/.config/apt/downloader.conf apt-get update
$ APT_CONFIG=~/.config/apt/downloader.conf apt-get download git/squeeze-backports
Get:1 Downloading git 1: [6557 kB]
Fetched 6557 kB in 2s (2512 kB/s)
$ APT_CONFIG=~/.config/apt/downloader.conf apt-get download git/precise
Get:1 Downloading git 1: [6087 kB]
Fetched 6087 kB in 3s (1525 kB/s)

Do note that apt-get download was introduced in apt 0.8.11. For Debian that translates into Wheezy (7.0) and for Ubuntu that would be as of Natty (11.04). The main difference between apt-get download and apt-get –download-only install is that the later also does dependency resolution.

My bastardized Masterless Puppet

I am currently using Puppet to control my laptop as well as my two VPS nodes. That is not exactly the scale where I feel the need to have a puppet master running. Especially not since I am not overly keen on the idea of giving an external machine control over my laptop.

That being said, I still want some central location from where my nodes can fetch the latest recipes, allowing me the freedom to push updated recipes even if a node don’t happen to be online at the time. I just don’t want to spend any actual resources on this central location, nor having to trust it more than necessary.

At first my recipes didn’t contain any secrets and I got away with pulling updated recipes from a (public) github repository. The only overhead was the need to have my puppet cron script verify that HEAD contained a valid gpg signed tag.

Now my puppet recipes do depend on secrets. These shouldn’t be available neither to the central location nor to the wrong node. That bringing us to my current homegrown, slightly bastardized, solution.

The current central location for my puppet recipes is a cheap web host. To it I am uploading gpg encrypted tarballs. These tarballs are individually generated as well as encrypted with each nodes own gpg key. For further details, see the included Makefile below.

	apt-get moo

locally: manifests/$(shell facter hostname).pp
	puppet apply --confdir . --ssldir /etc/puppet/ssl ./manifests/$(shell facter hostname).pp

	tarsnap --configfile ./.tarsnaprc -c -f "$(shell date +%s)" .

manifests/%.pp: manifests/defaults.inc manifests/%.inc
	cat $^ > $@

	find -regextype posix-egrep -regex ".+.(pp|inc)" | xargs puppet parser validate
	find -name "*.erb" | xargs ./tools/validaterb.sh

exported/%.tar: manifests/%.pp validate
	tar cf $@ manifests/$*.pp modules/ secrets/common/ secrets/$*/

exported/%.tar.gpg: exported/%.tar
	gpg --batch --yes --recipient puppet@$*.arrakis.se --encrypt $<

exported/%.tar.gpg.sig: exported/%.tar.gpg
	gpg --batch --yes --detach-sign $<

upload-%: exported/%.tar.gpg exported/%.tar.gpg.sig
	scp -o BatchMode=yes exported/$*.tar.gpg.sig andol_andolpuppet@ssh.phx.nearlyfreespeech.net:/home/public/
	scp -o BatchMode=yes exported/$*.tar.gpg andol_andolpuppet@ssh.phx.nearlyfreespeech.net:/home/public/

hosts := halleck hawat leto
deploy: $(addprefix upload-, $(hosts))

.PHONY: default locally backup deploy validate

…and here is the download script running on the nodes. In addition to doing the gpg stuff the script also handles ETags for the http download.


tarball="$(facter hostname).tar"

bailout () {
    rm -rf "$workdir"
    [ -n "$2" ] && echo "$2"
    exit $1

umask 0027

if [ -f "$etagfile" ]; then
    curretag=$(head -n1 "$etagfile" | grep -Ei "^[0-9a-f-]+$")

workdir=$(mktemp --directory)
cd $workdir || exit 1

curl --silent --show-error 
    --netrc-file "$netrcfile" 
    --header "If-None-Match: "$curretag"" 
    --dump-header "$gpghead" --remote-name 

if grep -Eq "^HTTP/1.1 304" "$gpghead"; then
    bailout 0
elif grep -Eq "^HTTP/1.1 200" "$gpghead"; then
    newetag=$(sed -nre "s/^ETag: "([0-9a-f-]+)"s*$/1/pi" "$gpghead")
    [ -n "$newetag" ] && echo "$newetag" > "$etagfile"
    bailout 0 "Failed to get expected HTTP response."

curl --silent --show-error 
    --netrc-file "$netrcfile" --remote-name 

gpgv --keyring /usr/local/etc/puppet/gnupg/trustedkeys.gpg "$sigfile" 2> /dev/null
if [ $? -ne 0 ]; then
    bailout 0 "Signature verification failed."

export GNUPGHOME=/usr/local/etc/puppet/gnupg
gpg --quiet --batch "$gpgball" 2> /dev/null
if [ $? -ne 0 ]; then
    bailout 0 "Decryption failed."

tar --no-same-owner --no-same-permissions -xf "$tarball"
if [ $? -ne 0 ]; then
    bailout 0 "tar extract failed."

rsync --archive --delete --chmod=o-rxw,g-w 
    manifests modules secrets /usr/local/etc/puppet/

if [ $? -ne 0 ]; then
    echo beef > "$etagfile"
    bailout 1 "rsync update failed."

rm -rf "$workdir"

Of course, this approach involves a bit more work while setting up Puppet on a new node. So while I feel that it is a good fit for my current situation it isn’t anything I would use in a larger environment. Also, with a larger amount of nodes there are puppet master features, such as reporting and storeconfigs, being potentially more valuable.


For some reason I decided it was a good idea to register the domain coloncolonone.net. Currently it is only used to serve a static html page, proclaiming the following message.

There’s no place like ::1

Any ideas on other, possible slightly more creative, ways to use the domain name?