A situation in which a DVCS could help us out a lot: Adam’s work on re-architecting and reorganizing our domain layer.
First, a little background. At work we use Subversion, a CVCS (centralized version control system). I geek out a little (a lot) on source control, and I use DVCS’s (distributed version control systems, both Git and Mercurial) for personal projects.
Back to the re-architecture project. As you can tell, Adam’s making some seriously disruptive code changes. They’re necessary, but they’re disruptive. He’s been checking in all his changes to our [Subversion] development branch, where everyone else is also doing development on their own features. We’re three weeks into the project.
I was thinking today: what would happen if we said, “Hey, this re-architecture project is not worth the risk. Let’s drop it for now and come back to it later.” Or maybe, “We’re going about this the wrong way. We need to basically undo what we did and start over.”
The entire development team would probably simultaneously crap their pants and look at each other with embarrassed looks on their faces. We didn’t think of that. Now what do we do? Go through and pick out Adam’s re-architecture changes one-by-one and undo them? Or maybe we should start from the version before he started committing his re-architecting changes and apply all the NON-re-architecting changes back in one-by-one? Ugh. Every possible approach to this sounds awfully tedious and extremely error-prone. I don’t trust him to be able to do that safely, do you? Of course not, you don’t even know him. We’ve comprehensively backed ourselves into a corner. We can get out, yes, but not without a lot of work and pain.
Hindsight is 20/20
What would’ve been better is for him to have created a “feature branch” for the re-architecture work, and to have worked in that branch until the re-architecting was complete. Then – and only then – he would merge his changes into the main development branch. This way, if we decided to abort the re-architecture project, he could just throw away the branch (or just stop working on it for now) and go back to using the main development branch, which would have been kept clean of all those risky re-architecting changes.
Let’s Think this Through
Let’s say we created that nice feature branch for him to work with. He goes along, quietly humming to himself, happily pulling out the proverbial rug from beneath the application code that relies on it, and fixing all the build errors (or not) and committing changes…all in a nice, isolated feature branch. But then how does he ensure a successful merge if and when we complete that re-architecture project? He’s now made major changes to most of our core objects, and others are referencing those objects and even making changes to the very same objects. If after 6 weeks he decides to try to merge all his changes back in one go, it’s going to be a Merge From Hell, as you can imagine.
The ideal case is that Adam works in his feature branch, and continually (every morning perhaps) merges changes from the development branch into his feature branch. This way he can resolve any conflicts soon after the conflicting changes are committed, keeping his feature branch in a state that allows him to merge it into the main development branch at just about any time with minimal effort. It also helps him make the right decisions when resolving those sometimes-nasty manual merges. He can confer with the necessary developers to help resolve any conflicts while he and that developer both have those pieces of code fresh in their minds. Not to mention, they have time to fix the merge conflict right because they’re not up against a deadline.
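In Mercurial, that morning routine might look something like the sketch below. It’s only an illustration: the helper name sync_feature_branch is made up, and it assumes the development branch is Mercurial’s default branch and the feature branch is called re-architect, so substitute your own names.

```shell
# Hypothetical helper: merge the development branch into a feature branch.
# Usage: sync_feature_branch <feature-branch> [development-branch]
sync_feature_branch() {
    feature="$1"
    dev="${2:-default}"   # assumption: development happens on the "default" branch

    hg pull &&                    # fetch new changesets from the shared repository
    hg update "$feature" &&       # switch the working copy to the feature branch
    hg merge "$dev" &&            # merge development in; the chain stops here on conflicts
    hg commit -m "Merge $dev into $feature"
}
```

Running `sync_feature_branch re-architect` from the working copy does the whole thing. If the merge hits conflicts, `hg merge` exits non-zero, so nothing gets committed until the conflicts have been resolved by hand.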
This method also avoids the aforementioned Merge From Hell which could easily take several long, tedious, mind-boggling 12-hour days to complete. And any time you have a tired, frustrated developer merging changes which involve others’ work he’s not sure about, you’re bound to run into problems. You might not end up with build errors, mind you, but you will almost certainly have issues with behavior of the system. Sometimes that button that says “Yes, overwrite the other developer’s changes with my current working copy” looks tempting when you’ve been staring at diffs and merges all day, it’s 11pm, and your sole source of nourishment today (or lack thereof) has been coffee.
So, About the Greener Grass on the Other Side?
What does this have to do with DVCS vs. CVCS, you say? Nothing so far, but here’s where it comes in. In order to work on a feature branch in Subversion, Adam would need to make a branch of the entire code base. The branch itself is a cheap copy on the server, but checking it out (or switching a working copy over to it) takes a while. Then, those daily merges need to happen. Those merges will incur about the same overhead in either kind of version control system (VCS), I would say. That Final Merge – you know, the one where the feature is complete and ready to be merged into the development branch – is not handled so well by Subversion. Subversion has strange ways of “remembering” which changes were already merged into a branch: it records them in svn:mergeinfo property settings on folders. It’s cluttered and messy, and sometimes just plain doesn’t work. I’ve had problems with these properties in the past where Subversion wouldn’t even let me commit the changes it made to the svn:mergeinfo properties due to some sort of corruption issue, so I regrettably had to manually remove them. In any case, it’s weird and unreliable, and causes much weeping and gnashing of teeth.
DVCS’s handle continuous merging much better because every commit is identified by a unique hash and records which commits it was built on, and that makes it easy to determine whether a change has already been merged into a particular branch. This means that when the DVCS merges two branches, it doesn’t re-apply changes that were already merged, and it is generally better at automatically resolving merge conflicts because of the way changes are tracked.
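You can even ask Mercurial the “has this been merged yet?” question directly. As a rough sketch (the helper name unmerged_changes is hypothetical, and default and re-architect are assumed branch names), a revset can list the changesets reachable from one branch but not yet from another:

```shell
# Hypothetical helper: list changesets on <source> not yet merged into <target>.
# Usage: unmerged_changes <source-branch> <target-branch>
unmerged_changes() {
    source_branch="$1"
    target_branch="$2"

    # ancestors(X) is every changeset reachable from X. Subtracting the target's
    # ancestors leaves only the changesets the target has not yet seen.
    # The names are quoted inside the revset so a hyphen is not read as "minus".
    hg log -r "ancestors('$source_branch') - ancestors('$target_branch')" \
        --template '{node|short} {desc|firstline}\n'
}
```

For example, `unmerged_changes default re-architect` shows what the next morning merge would bring in. An empty result means the feature branch is already up to date with development.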
Good Habits
Being good with version control systems in general is partly about forming good habits and avoiding bad habits: always add meaningful, concise comments when committing changes; make a tag whenever you do a deployment; learn how to handle merge conflicts (no, pressing “Resolve Conflict” does not resolve the conflict!!!), etc. The good habit to form here is to always create a feature branch when developing new features.
This keeps feature-related commits isolated until they are ready for general consumption. It also gives you the flexibility to abort or pause development on a feature without the risk of deploying unfinished code.
Good Habits Must Be Convenient
It’s arguable that this whole “Always Create a Feature Branch” thing is a good habit no matter what version control system (VCS) you use. However, DVCS’s make this an extremely inexpensive operation, while Subversion (and other CVCS’s) present a hurdle when creating feature branches. The right way to do things has to be convenient.
Have you ever signed up for a gym membership that wasn’t on your way to and from work? If you have, you probably went a couple of times, but it soon became too inconvenient to go to the gym. It was too out-of-the-way, and you just didn’t have time to work out, and so your abs retreated further into your gut (you’re convinced they’re still there somewhere). You knew in your mind that going to the gym was the right thing to do, but you just didn’t have time. The real problem is that it was too inconvenient.
Subversion is like that gym across town. It allows you to do the right thing (create feature branches), sure – but it’s not convenient. In Subversion, you have to go through a whole process of creating a branch, which happens on the server. Then, you have to either switch your working copy over to that new branch, or check out the branch somewhere on your disk. It’s a process which is seemingly reserved for version control purists.
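For comparison, here is roughly what that process looks like on the command line. This is a sketch, not our actual setup: it assumes the conventional trunk/branches repository layout with development on trunk, Subversion 1.6 or later (for the ^/ repository-root shorthand), and a made-up helper name.

```shell
# Hypothetical helper: create a feature branch in Subversion and switch to it.
# Usage: svn_feature_branch <branch-name>   (run inside an existing working copy)
svn_feature_branch() {
    name="$1"

    # Step 1: create the branch on the server. Copying URL to URL commits
    # straight to the repository, which is why a log message is required.
    svn copy "^/trunk" "^/branches/$name" -m "Create $name feature branch" || return 1

    # Step 2: point the current working copy at the new branch.
    # (The alternative is a separate "svn checkout" of the branch URL.)
    svn switch "^/branches/$name"
}
```

Calling `svn_feature_branch re-architect` wraps both steps, but each one is still a round trip to the server.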
In Mercurial, here’s what you would do from your working copy directory to create a feature branch:
hg branch re-architect
That’s it! And it’s INSTANT. (Strictly speaking, this just names the branch; it comes into existence with your next commit.) You won’t have time to even toggle back to your browser, let alone read that latest Onion article, as you would when waiting for that new branch to check out on your disk when using Subversion. Even in projects with thousands of files, a DVCS makes a new branch in no time at all. The point is that DVCS’s make feature branches convenient, and that’s one of the biggest reasons why I like DVCS’s. I know that once I get my team over the hurdle of switching from Subversion to Mercurial (the time is coming), I can train them to make feature branches because it’s convenient. This habit will make us more agile, and more adaptable to a constantly changing, increasingly competitive environment.