art taylor

 

Subversion Hax0ry

Sometimes it's good to think twice, but those who know me know I just consider that a waste of time. Measure once, cut thirty times, that's my motto.

I was pondering the problem I was having last week, and realized that I could solve it with some disgusting and brittle hackery.

According to the doco, Subversion repository copies are very inexpensive, basically being copy-on-write links, like those short-lived CD-based overlay filesystems.1 Tags, therefore, are especially lightweight: a copy takes a small, constant amount of space in the repository no matter how many files are in the "tag". Remember, these aren't tags, but rather, someone's answer to "How would we implement tags if we only had a filesystem?" It's the equivalent of a symbolic link farm, but more efficient.

This, combined with the retarded svn merge concept (which they admit now to be nothing more than "diff and apply") allowed me to cobble something together, using the second-nastiest shell scripting I've ever done. Witness the following diagram.

It's basically the same as the previous diagram, with one modification. As Branch 1 is created, a snapshot is made of Trunk at the same time. I might pick a name like [branch]-latest-trunk-merge-point but it could be anything.
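Not the actual script, just a minimal sketch of the branch-plus-snapshot step, written in Python so the commands can be inspected before anything is run. The repository layout (trunk, branches/NAME, tags/NAME-latest-trunk-merge-point) and the function name are assumptions for illustration. Both copies are pinned to the same Trunk revision with -r so that the branch and its snapshot really do start from the same point.

```python
def branch_with_snapshot_cmds(repo, branch, trunk_rev):
    """Build the two `svn copy` commands that create a branch and its
    trailing Trunk snapshot from the same Trunk revision.

    The URL layout below is a hypothetical convention, not something
    Subversion requires.
    """
    trunk = f"{repo}/trunk"
    branch_url = f"{repo}/branches/{branch}"
    tag_url = f"{repo}/tags/{branch}-latest-trunk-merge-point"
    rev = str(trunk_rev)
    return [
        ["svn", "copy", "-r", rev, trunk, branch_url,
         "-m", f"Create branch {branch} from trunk r{rev}"],
        ["svn", "copy", "-r", rev, trunk, tag_url,
         "-m", f"Snapshot trunk r{rev} as merge point for {branch}"],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them.
    for cmd in branch_with_snapshot_cmds(
            "svn+ssh://svn.example.com/repos", "Branch1", 120):
        print(" ".join(cmd))
```

Each list can be handed to subprocess.call once you trust it.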

As the developer working on Branch 2 (reduced in detail as it is here merely for illustrative purposes) merges up, he creates Trunk version t1.1, necessitating a merge down to Branch 1. Just doing a merge would create all kinds of havoc, possibly overwriting or deleting files added or modified on Branch 1. Of course, one could just svn revert, but it's the well-known pain in the butt that the Subversion developers just wave away with "look at svn log and see where things happened, then use the revision numbers!"

With our trailing Trunk tag, though, we can take the snapshot made at the last merge-down point (in this case, branch creation), diff it against the current HEAD of Trunk (yes, I realize the cranio-posterial reference here is not coincidental), and apply that diff to Branch 1.
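As a rough sketch, the merge down is a single two-URL svn merge: the snapshot is the "from" side, Trunk HEAD is the "to" side, and the Branch 1 working copy is the target. The repository layout and the function name are assumptions for illustration; only the svn merge SOURCE1 SOURCE2 WCPATH form itself comes from Subversion.

```python
def merge_down_cmd(repo, branch, branch_wc):
    """Build the `svn merge` that applies diff(snapshot, trunk HEAD)
    to the branch working copy.

    Assumes the hypothetical layout trunk/ and
    tags/<branch>-latest-trunk-merge-point under `repo`.
    """
    snapshot = f"{repo}/tags/{branch}-latest-trunk-merge-point"
    trunk = f"{repo}/trunk"
    # Order matters: the changes go FROM the snapshot TO trunk HEAD.
    return ["svn", "merge", snapshot, trunk, branch_wc]


if __name__ == "__main__":
    print(" ".join(merge_down_cmd(
        "svn+ssh://svn.example.com/repos", "Branch1", "branches/Branch1")))
```

Swapping the two URLs would apply the reverse diff and back the Trunk changes out of the branch, which is exactly the sort of surprise this whole scheme is meant to avoid.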

Here's a sticky bit, though. Assuming all is well and there are no conflicts (or even if there are), we can't just do a straight svn commit. We'd leave the Branch 1 Trunk snapshot behind in a stale state, which would cause us to include older changes (which cannot be guaranteed to apply idempotently) in the next merge down. So we have to use a "special" commit. (I am aware of the euphemisms for "retarded" that are popping up in the context of my solution.) This commit-from-trunk-merge needs to do an svn commit, and, if successful, delete the old tag and recreate it from the HEAD of Trunk.
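A minimal sketch of that commit-from-trunk-merge, assuming the same hypothetical repository layout as the snapshot name above. The function name and the injectable run parameter are made up for illustration; run defaults to subprocess.call so the sequence can be dry-run with a fake. It stops at the first failing step, so a failed commit leaves the old tag alone.

```python
import subprocess


def commit_from_trunk_merge(repo, branch, branch_wc, run=subprocess.call):
    """Commit a merge-down, then move the trailing Trunk snapshot.

    Returns how many of the three steps completed (3 means success).
    `run` takes an argv list and returns an exit status.
    """
    tag = f"{repo}/tags/{branch}-latest-trunk-merge-point"
    trunk = f"{repo}/trunk"
    steps = [
        ["svn", "commit", "-m", f"Merge trunk down to {branch}", branch_wc],
        ["svn", "delete", "-m", f"Drop stale merge point for {branch}", tag],
        ["svn", "copy", "-m", f"Reset merge point for {branch} to trunk HEAD",
         trunk, tag],
    ]
    for completed, cmd in enumerate(steps):
        if run(cmd) != 0:
            return completed  # stop before touching anything further
    return len(steps)


if __name__ == "__main__":
    # Dry run: print each command and pretend it succeeded.
    def show(cmd):
        print(" ".join(cmd))
        return 0

    commit_from_trunk_merge(
        "svn+ssh://svn.example.com/repos", "Branch1", "branches/Branch1",
        run=show)
```

Like the description above, this recreates the tag from whatever Trunk HEAD is at commit time, so it quietly assumes nobody has committed to Trunk since the merge was done.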

When it's time to merge Branch 1 up to Trunk, another diff is taken, capturing just the changes needed to take the most recent version of Trunk to Branch 1 at version b1.4.
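The merge up can be sketched the same way, with the sources flipped and a Trunk working copy as the target. This assumes Branch 1 has just been merged down and committed, so the only differences between Trunk HEAD and the branch are the branch's own work. The repository layout and the function name are assumptions for illustration.

```python
def merge_up_cmd(repo, branch, trunk_wc):
    """Build the `svn merge` that applies diff(trunk HEAD, branch HEAD)
    to a Trunk working copy.

    Assumes the hypothetical layout trunk/ and branches/<branch>
    under `repo`.
    """
    trunk = f"{repo}/trunk"
    branch_url = f"{repo}/branches/{branch}"
    return ["svn", "merge", trunk, branch_url, trunk_wc]


if __name__ == "__main__":
    print(" ".join(merge_up_cmd(
        "svn+ssh://svn.example.com/repos", "Branch1", "trunk")))
```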

All of this relies on a bit of discipline, which complicated the script significantly. I had to use the first form of the svn merge command (svn merge svn+ssh://svn.example.com/repos/tags/Branch2-trunk-tag svn+ssh://svn.example.com/repos/tags/trunk branches/Branch2) for performance's sake, relying on the remote svnserve process. I had to check to make sure that the working copies were all in sync with the repository before attempting merges in either direction, although I suspect svn would have caught me with this anyway. (Developers should be merging down from Trunk frequently anyway, so I don't feel bad about this.) I have to make sure all the developers use the correct merge and commit statements to keep the repository intact, and I will probably have to resort to locking the repository during these operations, a thing I don't really look forward to doing.
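The in-sync check can be sketched as a small filter over the text printed by svn status -u (without -v or -q). The assumption is that this output lists only items that are locally changed, unversioned, or out of date, followed by a "Status against revision:" line, so anything other than that last line means the working copy is not ready to merge. The function name is made up, and this is deliberately conservative: unversioned files and externals will also count as "not in sync".

```python
def working_copy_in_sync(status_output):
    """Return True if `svn status -u` output shows nothing but the
    trailing 'Status against revision:' line.

    `status_output` is the captured text of the command; running svn
    itself is left to the caller.
    """
    for line in status_output.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith("Status against revision:"):
            continue  # summary line, always present with -u
        return False  # any item line means local or remote changes
    return True


if __name__ == "__main__":
    clean = "Status against revision:    120\n"
    dirty = "M       src/foo.c\nStatus against revision:    120\n"
    print(working_copy_in_sync(clean), working_copy_in_sync(dirty))
```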

I think I can support about 4-6 developers using this mechanism, more if they understand version control and branching. Beyond that, I think I'm going to have to be very aggressive in breaking up projects into multiple modules and befouling the world with yet another SOA.

The underlying problem is that, really, you can apply the diff of any two sets of files to any other set of files, as long as the GNU diff can be applied to whatever target you have in mind. Sure, this is sweet and powerful in some fanciful situation, but for the common case of a branching project, it's like a gun with a hair trigger and a secret firecracker set in the handle to give you a surprise.


1 Holy crap, this didn't die.
