The grass is greener on the DVCS side of the fence

A situation in which a DVCS could help us out a lot: Adam’s work on re-architecting and reorganizing our domain layer.

First, a little background. We use Subversion, a CVCS (centralized version control system), at work. I geek out a little (a lot) on source control. I use DVCS’s (distributed version control systems, both Git and Mercurial) for personal projects, but I’m still using Subversion at work.

Back to the re-architecture project. As you can tell, Adam’s making some seriously disruptive code changes. They’re necessary, but they’re disruptive. He’s been checking in all his changes to our [Subversion] development branch, where everyone else is also doing development on their own features. We’re three weeks into the project.

I was thinking today: what would happen if we said, “Hey, this re-architecture project is not worth the risk. Let’s drop it for now and come back to it later.” Or maybe, “We’re going about this the wrong way. We need to basically undo what we did and start over.”

The entire development team would probably simultaneously crap our pants and look at each other with embarrassed looks on our faces. We didn’t think of that. Now what do we do? Go through and pick out Adam’s re-architecture changes one-by-one and undo them? Or maybe we should start from the version before he started committing his re-architecting changes and apply all NON-re-architecting changes back in one-by-one? Ugh. Every possible approach to this sounds awfully tedious and extremely error-prone. I don’t trust him to be able to do that safely, do you? Of course not, you don’t even know him. We’ve comprehensively backed ourselves into a corner. We can get out, yes, but not without a lot of work and pain.

Hindsight is 20/20

What would’ve been better is for him to create a “feature branch” for the re-architecture work, and work with that branch until his re-architecting work was complete. Then – and only then – he would merge his changes into the main development branch. This way, if we decided to abort that re-architecture project, he could just throw away the branch (or just stop working on it for now), and go back to using the main development branch, which would have been kept clean from all those risky re-architecting changes.

Let’s Think this Through

Let’s say we created that nice feature branch for him to work with. He goes along, quietly humming to himself, happily pulling out the proverbial rug from beneath the application code that relies on it, and fixing all the build errors (or not) and committing changes…all in a nice, isolated feature branch. But then how does he ensure a successful merge if and when we complete that re-architecture project? He’s now made major changes to most of our core objects, and others are referencing those objects and even making changes to the very same objects. If after 6 weeks he decides to try to check in all his changes, it’s going to be a Merge From Hell, as you can imagine.

The ideal case is that Adam works in his feature branch, and continually (every morning perhaps) merges changes from the development branch into his feature branch. This way he can make sure he resolves any conflicts soon after the conflicting changes are committed, keeping his feature branch in a state that allows him to merge his branch into the main development branch at just about any time with minimal effort. This ensures that he makes the right decisions when resolving those sometimes-nasty manual merges. He can confer with the necessary developers to help resolve any conflicts while he and that developer both have those pieces of code fresh in their minds. Not to mention, they have time to fix the merge conflict right because they’re not up against a deadline.

This method also avoids the aforementioned Merge From Hell which could easily take several long, tedious, mind-boggling 12-hour days to complete. And any time you have a tired, frustrated developer merging changes which involve others’ work he’s not sure about, you’re bound to run into problems. You might not end up with build errors, mind you, but you will almost certainly have issues with behavior of the system. Sometimes that button that says “Yes, overwrite the other developer’s changes with my current working copy” looks tempting when you’ve been staring at diffs and merges all day, it’s 11pm, and your sole source of nourishment today (or lack thereof) has been coffee.

So, About the Greener Grass on the Other Side?

What does this have to do with DVCS vs. CVCS, you say? Well, nothing so far. Until now. In order to work in a feature branch in Subversion, Adam would need to make a branch of the entire code base. This itself takes a while. Then, those daily merges need to happen. Those merges will incur about the same overhead in both version control systems (VCS’s), I would say. That Final Merge – you know, the one where the feature is complete and ready to be merged into the development branch – is not handled so well by Subversion. Subversion has strange ways of “remembering” which changes were already merged into a branch. It “remembers” the changes which were merged into a branch using svn:mergeinfo property settings on folders. It’s cluttered and messy, and sometimes just plain doesn’t work. I’ve had problems with these properties in the past where Subversion won’t even let me commit the changes it made to the svn:mergeinfo properties due to some sort of corruption issue, so I regrettably had to manually remove them. In any case, it’s weird and unreliable, and causes much weeping and gnashing of teeth.

DVCS’s handle continuous merging much better because every commit is given a unique hash and records the commit (or, for a merge, the commits) it was based on, and that makes it easy to determine if a change has already been merged into a particular branch. This means that when the DVCS merges two branches, it doesn’t duplicate merges, and is generally better about automatically resolving any merge conflicts because of the way changes are tracked.
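To make the “already merged?” check concrete, here is a minimal sketch in Python. It is a toy model, not how Git or Mercurial are actually implemented: the `Repo` class, the `commit_id` helper, and the commit messages are all made up for illustration. The only idea it borrows from real DVCS’s is that each commit has a hash and remembers its parents, so “has this change been merged into that branch?” becomes “is this commit reachable from that branch’s head?”

```python
import hashlib


def commit_id(parents, message):
    """Derive a short ID from the parent IDs and the message (simplified)."""
    payload = "|".join(parents) + "|" + message
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()[:12]


class Repo:
    """Toy history graph: every commit ID maps to the IDs of its parents."""

    def __init__(self):
        self.parents = {}

    def commit(self, message, *parents):
        cid = commit_id(parents, message)
        self.parents[cid] = tuple(parents)
        return cid

    def ancestors(self, head):
        """Return every commit reachable from head, including head itself."""
        seen, stack = set(), [head]
        while stack:
            cid = stack.pop()
            if cid not in seen:
                seen.add(cid)
                stack.extend(self.parents[cid])
        return seen

    def is_merged(self, change, head):
        """A change is merged into a branch if it is an ancestor of its head."""
        return change in self.ancestors(head)


repo = Repo()
base = repo.commit("initial import")
dev1 = repo.commit("dev: fix report bug", base)
feat1 = repo.commit("feature: split domain objects", base)
# The daily merge: a commit with two parents (feature head and dev head).
feat2 = repo.commit("feature: merge dev", feat1, dev1)
dev2 = repo.commit("dev: add export button", dev1)

print(repo.is_merged(dev1, feat2))  # True: picked up by the daily merge
print(repo.is_merged(dev2, feat2))  # False: committed after the merge
# What tomorrow's merge still has to bring over:
to_merge = repo.ancestors(dev2) - repo.ancestors(feat2)
```

Nothing here needs a side-channel like svn:mergeinfo: the record of what was merged is the history graph itself.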

Good Habits

Being good with version control systems in general is partly about forming good habits and avoiding bad habits: always add meaningful, concise comments when committing changes; make a tag whenever you do a deployment; learn how to handle merge conflicts (no, pressing “Resolve Conflict” does not resolve the conflict!!!), etc. The good habit to form here is to always create a feature branch when developing new features.

This keeps feature-related commits isolated until they are ready for general consumption. It also gives you the flexibility to abort or pause development on a feature without the risk of deploying unfinished code.

Good Habits Must Be Convenient

It’s arguable that this whole “Always Create a Feature Branch” thing is a good habit no matter what version control system (VCS) you use. However, DVCS’s make this an extremely inexpensive operation, while Subversion (and other CVCS’s) present a hurdle when creating feature branches. The right way to do things has to be convenient.

Have you ever signed up for a gym membership that wasn’t on your way to and from work? If you have, you probably went a couple of times, but it soon became too inconvenient to go to the gym. It was too out-of-the-way, and you just didn’t have time to work out, and so your abs retreated further into your gut (you’re convinced they’re still there somewhere). You knew in your mind that going to the gym was the right thing to do, but you just didn’t have time. The real problem is that it was too inconvenient.

Subversion is like that gym across town. It allows you to do the right thing (create feature branches), sure – but it’s not convenient. In Subversion, you have to go through a whole process of creating a branch, which happens on the server. Then, you have to either switch your working copy over to that new branch, or check out the branch somewhere on your disk. It’s a process which is seemingly reserved for version control purists.

In Mercurial, here’s what you would do from your working copy directory to create a feature branch:

hg branch re-architect

That’s it! And it’s INSTANT. You won’t have time to even toggle back to your browser, let alone read that latest Onion article, as you would when waiting for that new branch to check out on your disk when using Subversion. Even projects with thousands of files will take no time at all for a DVCS to make a new branch. The point is that DVCS’s make feature branches convenient, and that’s one of the biggest reasons why I like DVCS’s. I know that once I get my team to overcome the hurdle which is the switch from Subversion to Mercurial (the time is coming), I can train them to make feature branches because it’s convenient. This habit will make us more agile, and more adaptable to a constantly changing, increasingly competitive environment.
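Why is it instant no matter how big the project is? Here is a rough sketch in Python, under a big simplifying assumption: a branch is modeled as nothing more than a name pointing at an existing commit, which is roughly how Git does it. Mercurial’s named branches work differently (the branch name is recorded in the commits themselves), but the effect is the same in that no files get copied. `TinyRepo`, the commit ID `c1`, and the file names are all hypothetical.

```python
class TinyRepo:
    """Toy model: snapshots are stored once, branches are just labels."""

    def __init__(self, files):
        self.snapshots = {"c1": dict(files)}  # commit ID -> file contents
        self.branches = {"default": "c1"}     # branch name -> commit ID

    def create_branch(self, name, from_branch="default"):
        # No files are copied: the new name points at the existing commit.
        self.branches[name] = self.branches[from_branch]


# Five thousand files, and the branch still costs one dictionary entry.
repo = TinyRepo({f"src/file{i}.txt": "contents" for i in range(5000)})
repo.create_branch("re-architect")
```

The cost of `create_branch` does not depend on how many files the snapshot holds, which is the whole point.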

The Right Thing to Do isn’t always The Right Thing to Do

The current way of doing things leaves us wide open to a SQL injection attack. -Andrew

I work with this guy who is called Andrew. He tends to be very thorough, and has a valuable, but often annoying habit of finding subtle inconsistencies in our software. Oftentimes he laughs, I turn to see him smiling ear-to-ear, and he proclaims with great pride that he’s found a potential bug in the software. He derives great amusement from this every time. He’s been known to laugh, remove his headphones, and declare, “Haha! I found a bug! I don’t see how this could ever have worked!!” And it gets under my skin every time, because I’m the one who probably wrote that piece of code that’s giving him some twisted form of pleasure, and I’ve already predetermined that somehow, someone else broke it.

Personal egos aside, these are scary words to hear. In this particular instance, he’s pointed out an area of our software in which we could be susceptible to a SQL injection attack.
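For anyone who hasn’t run into one, here is a generic illustration of a SQL injection hole in Python, using the standard `sqlite3` module. To be clear, this is not our code and not the spot Andrew found: the `users` table, the function names, and the input string are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("adam", "dev"), ("andrew", "dev"), ("root", "admin")],
)


def find_users_unsafe(name):
    # Vulnerable: user input is pasted straight into the SQL text.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def find_users_safe(name):
    # Parameterized: the value is passed separately from the SQL text,
    # so it can never be interpreted as SQL.
    query = "SELECT name FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


malicious = "nobody' OR '1'='1"
leaked = find_users_unsafe(malicious)   # matches every row in the table
blocked = find_users_safe(malicious)    # matches nothing
```

The unsafe version turns the input into part of the query, so the `OR '1'='1'` clause matches every user. The safe version treats the whole string as a (nonexistent) name.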

Is It Worth It?

We take pride in our software, and something like this hits me in the center of my idealistic core. Alarms are going off in my head, and my heart rate increases. My initial thought is, “Let’s drop everything and fix it.” But then I take a step back. This vulnerability exists in one place, in a very specific, very unlikely scenario. And, we’re also talking about a very secure application that is exposed only to users with proper credentials. And, in order to fix this vulnerability, we’d have to spend a week rewriting a bunch of pieces of code just to close up this one tiny hole. We ask ourselves, “Is it worth it?”

I tend to be an idealist about the design of software I work on. My team and I work in a very fast-paced environment, and we often come across areas of the software which we believe should work differently. We discuss the Right Thing to Do and come to a conclusion of a much grander, much more flexible design. But then we always have to step back and ask ourselves, “Is it worth it?”

Software that Works

At the end of the day, what really matters is that we deliver software that works. Security, of course, is of paramount importance. Maintainability and beauty of the code are important and necessary, but in my experience, most of the time we just don’t have time to redesign and rewrite things to be as they “should” be. We’d be making a better code base, yes, but do our customers really notice or even care? No. If our team spent 6 months making all the code “Right”, and then made the software available to our customers, why would they upgrade? They would ask “What are the new features?” We could only say, “Well, we rewrote the back-end of a couple of our reports, re-factored a bunch of business logic classes, and changed a bunch of our icons to use these things called ‘sprites’ instead.” Big deal. Nobody’s going to upgrade the software if there are no measurable benefits. That would be 6 months wasted. 6 months during which the competition gets ahead of us.

I would say there are different levels of “acceptability” as follows:

  • Unacceptable
  • Acceptable
  • Good
  • Perfect

The Law of Diminishing Returns is in full effect here. The key is to define which pieces of the software fall in the “Unacceptable” category, and address those problems the best you can, while still adding customer-perceived value to the software. It’s definitely a tricky thing to do. It can go against your idealistic tendencies. It can cause heated arguments among product managers and developers. You have to remember that if all you do is go around re-factoring everything to make the code more beautiful, you’ll never get anywhere.