Backing up WordPress with git

To back up my WordPress install, I used to do a mysqldump of the DB, bzip2 it, separately tar and bzip2 the install directory, and rsync the timestamped files to my backup server. This worked well and was a simple enough setup. A couple of days ago, however, I decided to update WordPress and all extensions and noticed that some of my manual tweaks were gone.

Now, I have to say that the install/upgrade system of WordPress is one of the simplest and most trouble-free I've used in both commercial and free software, so this is not a complaint about that system. Clearly I manually tweaked files, which means I inherited the responsibility of migrating those tweaks forward. And even though I had the backups to determine the changes I had to re-apply, the process was annoying: untar the backup and manually run diffs between live and backup. That got me thinking about a different upgrade strategy. The task reminded me too much of how I deal with code changes, so why wasn't I treating this like code?

A blog is (mostly) an append-only system

Sure, there's some editing, but most of it is already revisioned, which makes it append-only. That makes the full set of WordPress assets, including the DB dump, an ideal candidate for revision control. Most of the time, the only changes are additions, and when files are actually changed, that represents an update of an existing system and should be tracked with change history. If the live install were under revision control, you could just run the upgrade, do a diff to see the local changes, tweak or revert them one at a time, then commit the finished state.
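As a rough illustration of that upgrade workflow, here is a minimal sketch in POSIX sh. It assumes the install is already a git repository and that the commands run from its top directory; the function names are made up for this example and are not part of git or WordPress.

```shell
#!/bin/sh
# Sketch only: run these from the top of the repository.
# All function names are made up for this example.

# 1. Record the pre-upgrade state so the upgrade becomes its own diff.
snapshot_before_upgrade() {
    git add -A
    # Commit only when something is actually staged.
    git diff --cached --quiet || git commit -q -m "Before upgrade"
}

# 2. After running the upgrade: list tracked files that now differ from
#    the snapshot. Use plain "git diff" to read the changes themselves.
changed_by_upgrade() {
    git diff --name-only HEAD
}

# 3. Bring back the pre-upgrade version of a single file.
restore_file() {
    git checkout HEAD -- "$1"
}

# 4. Commit the finished state.
commit_upgrade() {
    git add -A
    git commit -q -m "After upgrade"
}
```

Note that restore_file throws away everything the upgrade did to that file. When only part of a file should be reverted, `git checkout -p` lets you pick individual hunks instead.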

Setting up git and importing old revisions

My hierarchy looks like this:

 .../iloggable/
              /wordpress
              /backup

Inside backup, I kept the tar'ed up copies of the wordpress directory with a date suffix, as well as dated, bzipped mysqldumps of the DB. I generally deleted older backups after the rsync, since on the live server I only cared about having the most recent backup.

The first thing I did was create a mirror of this hierarchy. I unpacked an old WordPress tar as the wordpress directory and bunzipped the DB archive from the same date to iloggable.sql, the only file in backup; for the git repo I no longer needed the other backups, only the most current database dump. I then ran git init inside iloggable. I also created a .gitignore containing wordpress/wp-content/cache to avoid capturing the cache data.
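The repository setup can be sketched as follows. This assumes the mirror directory already holds the unpacked wordpress tree and backup/iloggable.sql; init_mirror_repo and the commit message are made up for illustration.

```shell
#!/bin/sh
# Sketch only. Assumes the mirror directory already holds an unpacked
# wordpress/ tree and backup/iloggable.sql. init_mirror_repo is a made-up name.
init_mirror_repo() (
    cd "$1" || exit 1
    git init -q
    # Keep the cache out of the history.
    echo "wordpress/wp-content/cache" > .gitignore
    git add -A
    git commit -q -m "Import oldest backup"
)
```

The function body runs in a subshell (parentheses instead of braces), so the cd does not change the caller's working directory.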

I added all files and committed this state. I then unpacked the subsequent archives, copied each on top of the mirror hierarchy and added/committed the files in succession. At this point I had a single git repo of the backup history I had. Now I could review previous states via git log and git diff.
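Replaying the later backups might look like the sketch below. It assumes each backup has already been unpacked into its own directory with the same layout as the mirror; import_snapshot, its arguments, and the commit message format are hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumes each later backup has already been unpacked into its
# own directory with the same layout as the mirror (wordpress/ plus
# backup/iloggable.sql). import_snapshot and its arguments are made-up names.
import_snapshot() (
    mirror=$1     # the git repository built from the oldest backup
    snapshot=$2   # directory holding one unpacked later backup
    label=$3      # commit message suffix, e.g. the date of that backup

    # Copy the snapshot on top of the mirror. Files that exist only in the
    # mirror are left in place, so deletions between backups are not recorded.
    cp -R "$snapshot/." "$mirror/" || exit 1

    cd "$mirror" || exit 1
    git add -A
    # Skip the commit when this snapshot is identical to the previous one.
    git diff --cached --quiet || git commit -q -m "Backup $label"
)

# Usage, oldest backup first (the paths are placeholders):
#   import_snapshot /path/to/mirror/iloggable /path/to/unpacked/backup "label"
```

Because the copy is additive, a file deleted on the live site between two backups stays in the repository. If that matters, empty the mirror's working tree (everything except .git) before each copy.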

Turning the live copy into the git repo

Finally, I copied the .git directory into the live hierarchy. Running git status there immediately showed me the changes on production since the last backup. I deleted the old backup archives and again added/committed the resulting "changes". That gave me a git repo of the current state, which I pushed to my backup server.
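A minimal sketch of this step, under a few assumptions: adopt_live_copy and its arguments are made-up names, the remote is called origin, and a repository already exists on the backup server to push to. One detail the sketch adds: since only .git is copied, the live tree has no .gitignore yet, so the committed one is restored before anything is staged.

```shell
#!/bin/sh
# Sketch only. adopt_live_copy and its arguments are made-up names.
adopt_live_copy() (
    mirror=$1   # repository built from the old backups
    live=$2     # the production hierarchy (wordpress/ and backup/)
    remote=$3   # URL or path of a repository on the backup server

    # The history lives entirely in .git, so copying it is enough.
    cp -R "$mirror/.git" "$live/" || exit 1

    cd "$live" || exit 1
    # The live tree has no .gitignore yet; restore the committed one, if any.
    git checkout HEAD -- .gitignore 2>/dev/null || true

    git status --short            # what changed since the last backup
    git add -A
    git diff --cached --quiet || git commit -q -m "Current production state"

    git remote add origin "$remote"
    git push -q origin HEAD
)
```

Pushing HEAD sends whatever branch is checked out, which avoids hard-coding a branch name.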

Now my backup script just overwrites the current mysqldump, then does a git commit -a -m "$timestamp" (double quotes, so the shell expands the variable) and a git push. From now on, as I do tweaks or upgrade WordPress, I can do a git commit before and after and have an exact change history for the change.
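A backup script along these lines could look like the sketch below. It is not the original script: DB_NAME, dump_db and backup_site are made-up names, database credentials are assumed to come from a MySQL option file such as ~/.my.cnf, and the remote is assumed to be called origin. It also stages with git add -A rather than relying on commit -a alone, so that newly uploaded files are included.

```shell
#!/bin/sh
# Sketch only. DB_NAME, dump_db and backup_site are made-up names, and
# credentials are assumed to come from a MySQL option file such as ~/.my.cnf.

# Write the database dump to stdout.
dump_db() {
    mysqldump "$DB_NAME"
}

backup_site() (
    cd "$1" || exit 1     # top of the live hierarchy

    # Dump to a temporary file first so a failed dump cannot clobber the
    # previous one.
    dump_db > backup/iloggable.sql.tmp || exit 1
    mv backup/iloggable.sql.tmp backup/iloggable.sql

    timestamp=$(date '+%Y-%m-%d %H:%M:%S')

    # "git add -A" also picks up new files such as uploads; "git commit -a"
    # alone would only record changes to files git already tracks.
    git add -A
    git diff --cached --quiet && exit 0    # nothing changed since last run

    # Double quotes, so the shell expands $timestamp.
    git commit -q -m "$timestamp"
    git push -q origin HEAD
)

# Usage from cron or by hand (path and database name are placeholders):
#   DB_NAME=wordpress backup_site /path/to/iloggable
```

Keeping the dump command in its own function makes it easy to swap in different mysqldump options without touching the git steps.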