Monday, May 10, 2010

Backing up JIRA and Confluence taking advantage of ZFS snapshots

If you're running an instance of JIRA or Confluence (or many other software packages, for that matter) you probably want to make sure that your data is properly and regularly backed up. If you've got some experience with JIRA or Confluence you have surely noticed the bundled XML backup facility: a scheduled backup service that uses it even runs by default in your instances.

The effectiveness of such a backup facility depends on the size of your installation, but the rule of thumb is that it's a mechanism that does not scale well as the amount of data stored in your instances grows. In fact, XML backup was designed for small-scale installations and is not a recommended backup strategy for larger-scale deployments.

In the case of JIRA I still run automated XML backups, since they do not include attachments. As far as Confluence is concerned, I always disable the automated XML backup and rely on a native database backup plus a backup of the attachment storage. The database backup must be performed with the native database tools, such as pg_dump for PostgreSQL. How you back up your instance's attachments depends on the type of storage in use. If you're storing your attachments in the database, they will be backed up automatically with your database backup. If you store your attachments in a file system, as is the case for both JIRA and Confluence default installations, there are plenty of tools out there to get the job done, such as tar, pax, cpio and rsync (to name just a few). Each of these has advantages and drawbacks and I won't go into a detailed discussion: suffice it to say that none can beat a Solaris ZFS-based JIRA or Confluence installation.
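As an illustration of the native database backup mentioned above, here is a minimal sketch for PostgreSQL. The backup directory and database name are hypothetical placeholders, and the helper functions are my own naming rather than anything shipped with JIRA or Confluence; it assumes pg_dump can connect with the credentials of the user running it.

```shell
#!/bin/sh
# Sketch only: dump one PostgreSQL database to a dated file.
# Directory and database names passed in are hypothetical placeholders.

# Build the dump file name, e.g. /backups/confluence-2010-05-10.dump
dump_file() {
    # $1 = backup directory, $2 = database name, $3 = date label
    printf '%s/%s-%s.dump\n' "$1" "$2" "$3"
}

# Dump the database in pg_dump's custom format (-Fc), which is compressed
# by default and is restored with pg_restore. Prints the file name on success.
backup_database() {
    # $1 = backup directory, $2 = database name
    target=$(dump_file "$1" "$2" "$(date +%Y-%m-%d)")
    pg_dump -Fc -f "$target" "$2" && printf '%s\n' "$target"
}
```

A scheduled job could then call something like `backup_database /backups confluence`, adjusting both arguments to your own installation.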

Since the inception of ZFS I've been taking advantage of its features more and more often, and snapshots are a killer feature that will considerably ease your administration duties. Whenever I install a new instance in a Solaris Zone, I set up ZFS file systems for hosting both the database files and the JIRA or Confluence home directories:

# zfs create -p my-pool/my/db/files
# zfs create -p my-pool/jira/or/confluence/home

Taking a snapshot of a ZFS file system is a one-liner:

# zfs snapshot file-system-name@snapshot-name
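If you take snapshots on a schedule, it helps to give them predictable names. The following is a small sketch under my own assumptions: the `backup-<date>` naming scheme and the function names are illustrative, not a ZFS convention.

```shell
#!/bin/sh
# Sketch only: take a snapshot labelled with a date.
# The "backup-" prefix is an arbitrary naming choice.

# Build a snapshot name such as my-pool/jira/or/confluence/home@backup-2010-05-10
snapshot_name() {
    # $1 = file system name, $2 = label (typically a date)
    printf '%s@backup-%s\n' "$1" "$2"
}

# Take the snapshot, using today's date as the label unless one is given.
# Prints the snapshot name on success.
take_snapshot() {
    # $1 = file system name, $2 = optional label
    snap=$(snapshot_name "$1" "${2:-$(date +%Y-%m-%d)}")
    zfs snapshot "$snap" && printf '%s\n' "$snap"
}
```

For example, `take_snapshot my-pool/jira/or/confluence/home` could be run from cron, and `zfs list -t snapshot` will show the snapshots taken so far.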

In an instant your snapshot will be done and you will be able to send it to another device for permanent storage. ZFS snapshots, combined (or not...) with another tool such as rsync, greatly simplify backing up your files and also let you maintain a cheap history of changes (in terms of storage overhead) in case you need to roll back your file systems, and hence the data stored in your application.
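Sending a snapshot to another device boils down to piping `zfs send` into `zfs receive`. Here is a hedged sketch: the function names are mine, the pool and file system names you pass in are placeholders, and it assumes the destination pool is imported on the same host (for a remote host you would put ssh in the middle of the pipe).

```shell
#!/bin/sh
# Sketch only: replicate snapshots into a second pool on the same host.

# Send a full copy of one snapshot into the destination file system.
replicate_snapshot() {
    # $1 = snapshot (file-system@name), $2 = destination file system
    zfs send "$1" | zfs receive "$2"
}

# Send only the changes between an older and a newer snapshot.
# The destination must already hold the older snapshot.
replicate_increment() {
    # $1 = older snapshot, $2 = newer snapshot, $3 = destination file system
    zfs send -i "$1" "$2" | zfs receive "$3"
}
```

Rolling back is just as short: `zfs rollback file-system-name@snapshot-name` returns the file system to that snapshot. By default it only accepts the most recent snapshot, so going further back means discarding the later ones (the `-r` option).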

Bear in mind that, to recover a single file from a sent snapshot when your original pool has crashed, you will need to receive the snapshot (zfs receive) into another pool before its files become accessible. That's why I still rely on a scheduled rsync backup alongside ZFS snapshots, just in case, although at a much lower frequency than in the pre-ZFS era.
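For the rsync side, a scheduled job can be as small as the sketch below. The function name is my own and the source and destination paths are whatever fits your installation; it assumes rsync is installed on both ends when the destination is remote.

```shell
#!/bin/sh
# Sketch only: mirror a home directory to a backup location with rsync.

mirror_home() {
    # $1 = source directory, $2 = destination (local path or host:path)
    # -a preserves permissions, ownership and timestamps where possible;
    # --delete removes files from the destination that no longer exist in the source.
    # The trailing slash on the source makes rsync copy the directory's
    # contents rather than the directory itself.
    rsync -a --delete "${1%/}/" "${2%/}/"
}
```

Since `--delete` makes the destination an exact mirror, files removed by mistake disappear from the copy on the next run; that is one more reason to keep the snapshots around as history.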
