I am a programmer and architect (the kind that writes code) with a focus on testing and open source; I maintain the PHPUnit_Selenium project. I believe programming is one of the hardest and most beautiful jobs in the world. Giorgio is a DZone MVB and is not an employee of DZone and has posted 637 posts at DZone.

Git backups, and no, it's not just about pushing

05.18.2011

Git is a backup system in itself: for example, you can version folders of .txt files containing TODO lists. Since Git versions those files just as it does code, it can bring them back after an accidental deletion or modification.

Yet if you do not regularly push your commits, a failure of the drive containing the repository may cost you all your work. You can put the repository in Dropbox or a similar service, but I don't trust that approach: Dropbox syncs the files in .git independently from the rest and from one another, so it may break the repository, temporarily or for good. Besides, I only want to snapshot a backup at specific points in time, not keep my connection busy with constant mirroring.

A note before beginning: Git is not a great backup tool for binary data; text works a lot better (it's like code). This article covers the backup of code and textual content.

Push is not a backup

Pushing to origin is not a complete backup: for example, it may leave out branches. Often it is not even an option, as you may want to back up changes that you are not ready to push yet. It's only in the open source world that backing up corresponds to publishing online.

However, thanks to decentralization there are some simple solutions, which involve creating repositories other than origin:

git clone /path/to/working/copy backup_copy # creates the backup
cd backup_copy
git pull # updates the backup from origin, which here is the working copy
# you can choose which branches to pull via the local configuration of the backup copy (git config)

The inverse solution, pushing from the working copy instead of pulling from the backup, is also possible:

git init --bare /path/to/backup/repo # creates the backup; bare, so that it accepts pushes to any branch
git remote add backup /path/to/backup/repo # run this in the working copy; a remote URL works too
git push backup master # master usually, but you can also list multiple branches
git push --all backup # an alternative that pushes all branches

All these commands, and the ones that follow, are ordinary shell commands: it's easy to put them in a script and schedule it with cron, anacron or whatever you want. The Force (well, Unix) is powerful in you.
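As a minimal sketch of such a script, assuming bash and a reasonably recent Git (for the -C option): the function name backup_repo and both path arguments are hypothetical, not something from my own setup. It creates the bare backup repository on the first run, then pushes every branch and every tag to it.

```shell
#!/usr/bin/env bash
# Sketch only: backup_repo and its two arguments are illustrative names.
# Pass absolute paths, because git -C changes directory before pushing.
backup_repo() {
    local work_dir="$1" backup_dir="$2"
    # First run: create the bare repository that will receive the pushes.
    if [ ! -d "$backup_dir" ]; then
        git init --quiet --bare "$backup_dir" || return 1
    fi
    # Push all branches, then all tags, using the path as an ad-hoc remote.
    git -C "$work_dir" push --quiet --all "$backup_dir" &&
    git -C "$work_dir" push --quiet --tags "$backup_dir"
}
```

A script that defines this function and ends with a call such as backup_repo /path/to/working/copy /path/to/backup/repo.git could then be scheduled from a crontab entry. The two pushes are kept separate because --all covers branches only.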

git bundle

git bundle is another command that may be used for backup purposes. It will create a single file containing all the refs you need to export from your local repository. It's often used for carrying commits around on USB keys or other media when no network connection is available.

For one branch, it's simple. This command will create a myrepo.bundle file.

git bundle create myrepo.bundle master

For more branches or tags, it's still simple:

git bundle create myrepo.bundle master other_branch

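Restoring the content of the bundle is a single command. Inside an empty repo, type:

git pull myrepo.bundle master

(git bundle unbundle myrepo.bundle only unpacks the objects and prints the refs contained in the bundle; it does not create any branch.)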

If instead you do not have a repo and just want to recreate the old one:

git clone myrepo.bundle -b master myrepo_folder

In emergency situations, bundle comes in handy. But my issue with that command is that I always forget something when I use it: for example, in my tutorial repository I had a lot of tags, but bundle did not include them by default (you have to list every reference, as with master other_branch above, or pass --all to include every branch and tag).
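To avoid forgetting a ref, a bundle can be created with --all and then checked. The following is only a sketch: bundle_all and its arguments are hypothetical names, and the bundle path should be absolute because git -C changes directory first.

```shell
#!/usr/bin/env bash
# Sketch only: bundle_all and its two arguments are illustrative names.
bundle_all() {
    local repo_dir="$1" bundle_file="$2"
    # --all selects every ref, so branches and tags need not be listed by hand.
    git -C "$repo_dir" bundle create "$bundle_file" --all || return 1
    # verify checks that the bundle is valid and can be applied to this repository.
    git -C "$repo_dir" bundle verify "$bundle_file"
}
```

Afterwards, git bundle list-heads on the resulting file shows which refs actually made it into the bundle, tags included.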

Tarballs

An alternative is just to archive the repository in a tar.gz or tar.bz2 file.

tar -cf repository.tar repository/
gzip repository.tar # or bzip2 repository.tar

After that, you can use scp or even rsync (though I don't think it will speed things up much) to put repository.tar.gz on another medium.

The archive is larger in this case, since it also contains the checked-out working copy. But you don't have to learn new commands: apart from the size and the lack of incremental updates, this solution works fine.
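Since there are no incremental updates, each snapshot has to be a separate file. One way to keep them apart, sketched here under the assumption that a timestamp in the file name is enough, is the following function; tar_backup and its arguments are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch only: tar_backup and its two arguments are illustrative names.
tar_backup() {
    local repo_dir="$1" dest_dir="$2"
    local name stamp
    name=$(basename "$repo_dir")
    stamp=$(date +%Y%m%d-%H%M%S)
    # -C switches to the parent folder, so the archive stores relative paths only;
    # -z compresses with gzip in the same step.
    tar -czf "$dest_dir/$name-$stamp.tar.gz" -C "$(dirname "$repo_dir")" "$name"
}
```

Calling tar_backup /path/to/repository /path/to/backups would leave a file named repository-, followed by the date and time, and .tar.gz in the destination folder.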

Bare repositories

You can use

git clone --bare repository/ backup_folder/

to create a bare copy of the repository as a backup. A bare repository does not maintain a checked-out working tree, and so saves space and transfer time.

This method can be used in conjunction with the pull/push or the tarball method.
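To combine a bare backup with periodic pulling, one option is a variation on the command above: git clone --mirror instead of --bare. A mirror is also bare, but it is configured to fetch every ref from its source, so git remote update can refresh it later. This is a sketch of that variation, not the exact command shown above; mirror_backup and its arguments are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch only: mirror_backup and its two arguments are illustrative names.
mirror_backup() {
    local source_repo="$1" backup_dir="$2"
    if [ -d "$backup_dir" ]; then
        # Later runs: fetch new commits and refs, and drop refs deleted in the source.
        git -C "$backup_dir" remote update --prune
    else
        # First run: a bare clone that copies all branches and tags.
        git clone --quiet --mirror "$source_repo" "$backup_dir"
    fi
}
```

Note that --prune makes the backup follow deletions too: a branch removed from the source disappears from the mirror on the next run. Drop that option if the backup should keep it.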

For restoring the backup:

git clone backup_folder/ new_repository/

will recreate the original situation in new_repository. In all these cases the new folders are created automatically. I wouldn't advise simply copying the folder: on other backup filesystems (like a USB key's vfat), permissions, ownership and other metadata are often lost.

Conclusion

So now you have some alternatives for backing up your repositories, or transporting them, without setting up a server like Gitosis or going through the publicly available GitHub. In fact, I researched these techniques for transporting my tutorial code to phpDay 2011 and the Dutch PHP Conference, and they have worked pretty well.

Published at DZone with permission of Giorgio Sironi, author and DZone MVB.
