Monday, January 15, 2007

Convention is better than cure

In a pragmatic world, programmers working in a team adopt a common coding convention because it reduces the amount of mental acrobatics that members of the team have to perform when maintaining each other's code. In an ideal world however, it doesn't matter what coding style you adopt, because it doesn't make a tangible difference to maintaining the code base. Everyone just mindlessly uses whatever convention was drilled into him by well-meaning but short-sighted professors, with no regard for the nuances of good style.

It appears therefore that I'm living in an ideal world - which ironically, is far from ideal. I was the only one working on the project for the first year of its development, and so naturally all the code exhibited a single style - bar minor differences where my style changed over the course of that year (I was still developing my C++ knowledge to a large extent), and even those minor differences I was dilligent about reconciling back to The One Style.

In that first year, I managed to write a large portion of the system - which I'd inherited in the crudest of forms (a couple of hundred lines of shaky (C++)-- code) from a colleague who has since gone on to hopefully better things. However, about 18 months ago, my department finished it's other major project and the entire crew was put full force onto my project, and I had to contend with three other people maintaining my beautiful code.

What made matters worse was that my boss was busy implementing a separate piece of the system (the money-maker component, you could say) from a Delphi spike implementation (almost line for line) - basically independently of the application it was going to have to fit into. I didn't think much of it at the time, but looking back, I'd take continuous integration over the nightmare that resulted from that integration any day - the interface between the two components still feels like a house of cards. His coding conventions were obviously different to mine, but because we weren't even working in the same repository, let alone the same solution, neither of us realised the headache it was going to cause.

So now we're at the point where the system is nearly two and a half years old, and convention is basically non-existent. The particular style element that's got a bee in my bonnet of late is the declaration of pointer variables, and specifically, whether the * should accompany the type, or the variable name. I'll give you an example:

  1. int* myInt = &someInt;
  2. int *myInt = &someOtherInt;

As silly as it sounds, this one character difference has caused a bunch of problems in our team and for our project. Obviously this convention clash hasn't caused bugs - semantically the two statements are equivalent, but we have had problems related to day-to-day feature addition and maintenance programming. For example, when I need to fix a bug in a class, and I want to browse subversion's history for the class's header file, I see that the file has about 20 different changesets and it takes about 10 minutes just to isolate which of those changesets don't alternatively have me changing the convention within the file to style 1 and one of my colleagues changing it to style 2.

This is the kind of crap that can derail your train of thought and kill your productivity for the rest of the day. What makes matters worse is that the "tidying up" is usually done by the programmer in exactly this situation - when they come across the file looking for a bug. This means, when the bug is fixed and the code is checked in, the "tidying up" code is hidden amongst all the bug-fixes, or more commonly and much more dangerously, the bug fix (which is important) is hidden amongst all the "tidying up" (which isn't) - that is, there's a bad signal to noise ratio.

I could go into a 3 page discussion on why style 1 is better than style 2 (which it clearly is), but let's just say for now that if I were asked to join a team working on a code-base with 250K lines of code, and every line used style 2, I'd hopefully be pragmatic (or extreme) about it and just adopt style 2 while working on that project - a common convention makes it easier for everybody.

Or alternatively, being the stick in the mud that I am, I'd write a little script to report all the places where the one instance was used, and I'd either manually go and change all of them (I'm also a sucker for punishment), or write another script to do that. In fact, that's just what I did do to our code today. I threw together a little regular expression (first using Windows' findstr command, but quickly switching to a ruby script) which would tell me all the places in the code where style 2 was used, so I could change them all to style 1.

What I should have done first I guess was write the script and run it for both combinations - so that I could figure out which style was used more. That's what I should have done. What I actually did was assume that my preferred style was more widespread (given that myself and two of the other three maintainers use it exclusively, and the fact that I'd written so much of the code in the first place), and so I jumped straight in and started changing the bad style to the good.

To cut a long story short, about six hours (this is all in my spare time, how sad is that?) and a number of exceptional cases(1) later, my ruby script was reporting that there were no more occurrences of the offending pattern. Feeling very satisfied with myself, I integrated the ruby script into our nant build script and set the build to fail if a violation of the convention occurs again. Luckily, I'd had the foresight to make these changes in a separate working copy of the code, and so I still had the original code to refer to. I made some modifications to the script and ran it to count how many instances there were of style 1 and style 2 on the original code, and the results very gratifying. Approximately 1800 instances of style 1, and 300 instances of style 2 - which just proved my suspicions nicely.

So all that's left to do is check the code into the repository tomorrow when I get back to work, and also the rather distasteful task of informing my team that there is now a strict convention regarding the placement of the * when declaring pointer types, and any instances of going against the convention will cause the build to fail (I believe there's a joke aboyt aryan variables and my coding-naziness hiding in there).

I've toyed with the idea of rather hooking the script into subversion's pre-commit hooks, so that no bad code even gets into the repository, and the checks don't add onto our already straining build time, but unfortunately I haven't had the time to play with such an idea. I can see great benefits though if a coding convention is enforced at this level across all our projects in all the languages we use (mostly C# lately), so perhaps I'll spend some more spare time over the coming weeks identifying little violations like this to build up a comprehensive library of checks and integrating them one by one. I've already installed a pre-commit hook to check that comments always accompany checkins, but even that just gets subverted by people typing in crappy, useless comments.

Thankfully though, it doesn't appear that the script is going to have any major effect on our build-time, which Riaan and I reduced by 15 minutes to 30 minutes total last week (I know, I know, 10 minute build :( ). The script runs on my laptop in about 1.6 seconds, so I think we're OK. For now, I'll just have to wait and see whether this has a positive or negative influence - and the next item on my hitlist is curly braces - both ones missing from single-line conditionals, and ones hiding on the right hand side of the expression, instead of on their own line below. Hopefully that won't meet with too much resistance either.

1. The exception cases included instances where a variable or iterator was being dereferenced, or where multiplication was taking place, or where a function call in a dll was declared.

No comments: