Why not to use Subversion


I’m a big Subversion fan, and have been since it was first released. Or, I was, until recently, when I discovered that Subversion is fundamentally broken.
Some background: I’ve been using Subversion since 2000, when a project I was working on for the Forest Service chose to convert a CVS repository to it. Since then, I’ve been running a Subversion server that hosts a number of open source projects I work on, and that has also hosted a couple of projects that I don’t work on. By and large, Subversion has been adequate, and stable enough, and I’ve been able to work my way out of any DB wedges or failed upgrades that I’ve gotten myself into. I’ve never used it on a large-scale, multi-user, multi-branch project, but I’ve taken comfort from the fact that projects such as KDE do use SVN.
Recently, however, I’ve been using Subversion at work. We switched to SVN from CVS (there’s a whole story behind that) while at the same time the rest of the company was consolidating on ClearCase. I’m not going to go into why Subversion is superior to ClearCase; I’m still convinced that it is, and I strongly believe that you have to be insane to use ClearCase in the Enterprise. But I now believe that, while ClearCase should be avoided like the plague, Subversion shouldn’t be used either.
The reason is that Subversion’s merging mechanism is fundamentally, structurally, and architecturally broken. Subversion simply does not perform merges correctly.

  1. Subversion’s --reintegrate is useless if you use any sort of cherry-picking. Since I’ve never encountered a situation where cherry-picking wasn’t leveraged at some point or other, --reintegrate is always useless.
  2. Subversion performs merges changeset by changeset, which means that you will always perform more conflict resolution than you need to.
  3. Subversion gets very confused, very easily, about tree merges. This is mostly because of (2).
As an illustration of how … well, no other adjective really captures the scale of the issue, so forgive me … of how fucked up (2) is, consider a series of changes from A to P on branch 1, and a similar – yet unrelated – series of changes from A to P on branch 2. Now merge branch 2 into branch 1. Even though P is the same on both branches, Subversion merges incrementally, so you end up resolving a lot of conflicts that are, ultimately, entirely unnecessary.
While you’re considering this, also consider that other version control systems, such as Mercurial, Bazaar, and Git, do not have this problem.
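To make the A-to-P scenario concrete, here is a toy model in Python. It is a sketch under heavy assumptions, not Subversion’s actual merge code: a “file” is a single value, a branch is just the list of values it passed through, and every function name is made up for this illustration. It only shows why replaying the other branch’s changesets one at a time can raise conflicts that a single merge of the two end states would not.

```python
# Toy model only: a "file" is a single value, and each branch is the list of
# values it passed through. This is NOT Subversion's real merge algorithm.

def three_way_merge(base, ours, theirs):
    """Return (merged_value, conflicted) for one three-way merge."""
    if ours == theirs:      # both sides agree
        return ours, False
    if ours == base:        # only their side changed
        return theirs, False
    if theirs == base:      # only our side changed
        return ours, False
    return ours, True       # both changed, differently: conflict (keep ours)


def conflicts_merging_end_states(ours_history, theirs_history):
    """Merge only the branch tips against the common ancestor."""
    base = ours_history[0]
    _, conflicted = three_way_merge(base, ours_history[-1], theirs_history[-1])
    return int(conflicted)


def conflicts_merging_incrementally(ours_history, theirs_history):
    """Replay every changeset of the other branch onto our tip, one by one."""
    current = ours_history[-1]
    conflicts = 0
    for old, new in zip(theirs_history, theirs_history[1:]):
        current, conflicted = three_way_merge(old, current, new)
        conflicts += int(conflicted)
    return conflicts


# Both branches start at A and end at P, via unrelated intermediate states.
branch_1 = ["A", "B1", "C1", "P"]
branch_2 = ["A", "B2", "C2", "P"]

print(conflicts_merging_end_states(branch_1, branch_2))     # 0
print(conflicts_merging_incrementally(branch_1, branch_2))  # 2
```

In this model the end-state merge sees that both tips are P and has nothing to resolve, while the incremental replay trips over every intermediate changeset that does not match the current tip. The longer the two histories, the more of these pointless conflicts you get.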
Finally, Subversion’s whole merge-tracking mechanism is really buggy, which makes it a pain in the ass to use. And I don’t mean “buggy in the edge cases,” I mean “buggy in that you’re always going to hit this bug in any non-trivial merge.” There are work-arounds for the bugs, usually, but you only get to these work-arounds after running into one of the bugs, spending some time finding what the work-around is, and then re-starting your merge. You can spend hours running around in circles, hitting a bug, finding a work-around, restarting your merge, resolving conflicts, hitting a bug, finding a work-around, restarting your merge, resolving the same conflicts, getting a little further, hitting another bug, and so on. Hours.
So, I’m ditching Subversion. It just isn’t worth it any more, when there are better revision control systems out there that do handle merging properly, and have other advantages such as distributability. If you never want to merge branches, then SVN is fine… but if you’re doing only single-developer development (which is about the only place you’re likely to entirely avoid branching and merging), then there are other reasons why you should consider a DVCS, anyway.
Mercurial, Bazaar, and Git are easier to set up, transport, and share with untrusted peers. They’re less effort to maintain – admittedly, only by a tiny bit; I’ve never found SVN to take much administration, except in the bad-old BDB days. And they do merging correctly. Plus, most of them have nice features (sometimes as plug-ins), like block-level commits, cherry picking that doesn’t break their versions of re-integrate (grumble), flexible rebasing, nice graphical views of the merge history (which can be really useful), patch queue/shelving support, integrated bisecting, and so on.
And a workspace with full history usually doesn’t consume significantly more space than a single SVN check-out. A Mercurial repository conversion (full history) of my work SVN repository consumes 0.99 GB, versus 0.3 GB for a single checked-out workspace in SVN. Two SVN check-outs take up 0.6 GB; two HG check-outs (using hg clone) consume 1.12 GB. Three SVN, 0.9 GB. Three HG, 1.26 GB. Each additional hg clone adds roughly 0.13 to 0.14 GB, against 0.3 GB for each additional SVN check-out. You can see where this is going. And this is an unusual repository, with a large number of binaries stored in the repository.
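To put a number on where this is going, here is a back-of-the-envelope sketch in Python using only the figures above. The constant and function names are made up for illustration, and the straight-line extrapolation past three workspaces is an assumption, not something that was measured.

```python
# Figures from the measurements above, in GB. Linear extrapolation beyond
# three workspaces is an assumption, not something that was measured.
SVN_PER_CHECKOUT = 0.3
HG_FIRST_CLONE = 0.99
HG_THREE_CLONES = 1.26
HG_PER_EXTRA_CLONE = (HG_THREE_CLONES - HG_FIRST_CLONE) / 2


def svn_usage(workspaces):
    """Estimated disk use for a number of SVN check-outs."""
    return SVN_PER_CHECKOUT * workspaces


def hg_usage(workspaces):
    """Estimated disk use for a number of hg clones with full history."""
    return HG_FIRST_CLONE + HG_PER_EXTRA_CLONE * (workspaces - 1)


def break_even_workspaces(limit=100):
    """Smallest workspace count at which Mercurial uses less disk, or None."""
    for n in range(1, limit + 1):
        if hg_usage(n) < svn_usage(n):
            return n
    return None


print(break_even_workspaces())  # 6
```

Under that assumption, Mercurial comes out ahead from the sixth workspace onward, full history included.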
When I use Mercurial, I miss some Subversion features, such as cheap copies and branches. Some people hate it, but I love the fact that tags and branches in Subversion are merely COW links. It solves a lot of problems in a very elegant way. Mercurial has its own quirks – they all do – but at least merging isn’t broken like it is in Subversion.
The original title of this was “Why Subversion shouldn’t be used for the enterprise,” but then I realized that there’s no reason to use it at all. It would be a fine choice for the enterprise if its merging were done properly and weren’t full of bugs. The enterprise is really the only place where Subversion’s feature set makes it superior to DVCSes, but the enterprise is also the one place where merging really has to be as trouble-free and correct as possible.

Copyright © Sean Elliott Russell
