Tuesday, October 9, 2007

Sweeping comments under the rug

In an ideal world, you would never have to deal with source code where there are more lines of code commented out than there are actually being compiled (I'm not talking about informational comments here, I'm talking about commented out lines of code). The harsh reality however, is that there is at least one such source tree in exactly this condition, and it's the one that I've been trying (not single-handedly) to replace for the last three years.

As far as I can see, a code base can only get into a state like this if the following are true:

  • Poor or no source control. We need to change or remove this code, but we're not sure if it's going to work, so rather than delete it and have to re-code it again later, we're going to just comment it out until we're sure the new solution works. (Yeah, right). The right answer is to just delete the code you're not executing, because there's always undo, and for longer-term undo's the code is still in your repository and you can always go back and get it out from there.
  • A culture of ugly source code. Broken windows, mise en place or entropy, call it what you will, but bad code begets much worse code. Keeping code clean is a constant uphill battle against multiple coding conventions, dead or commented-out code, poor design choices, magic numbers and any number of other code smells, but more than anything I think it keeps technical debt low and productivity high.

So now we've examined the problem, let's consider two potential solutions.

  1. Go into your editor's options page and alter the colour scheme so that comments show up as a very light gray over a white background, making them almost invisible, unless the text is highlighted. I've got a coworker who does this, and it seems like completely the wrong "solution" to the problem. It doesn't benefit anyone else, and it means in some cases that he's only got 7 or 8 lines of usable space onscreen at a time. Any useful comments he may actually want to put into the code will either require him to strain his eyes, rely on guesswork or very accurate typing, or will require him to select the text after typing it to see that what he typed is what he got. Don't do this. It's clever, but not smart.
  2. DELETED!!. Do this. Delete all the unused source code you can (especially if it was active at some point in the past in the repository), and also delete as many "helpful" comments as is pragmatically possible. If something is unclear, refactor it to make it clearer, don't try to make excuses for your first attempt at banging the code out. Nobody's perfect, but what differentiates readable code from unreadable code is that the programmer who wrote the readable code first wrote unreadable code, realized how confusing it was, and then refactored it. The other programmer didn't realize (in which case he is either na├»ve or trying to write fortran programs in the language) or he did realize but was too lazy or unprofessional to tidy up after himself.

As a final rant, I'd also like to request the collective effort of every .NET programmer on the planet to ensure the swift and timely death of triple-slash documentation comments. You wouldn't believe how many times I've sat in abject frustration while going through someone else's source code where there are more triple-slash comments than lines of code. They also either go out of date very quickly, or reduce the amount of useful refactoring done, because the two are effectively mutually exclusive.

What makes this even more horrific is that the triple-slash comment is rarely anything other than the return type and the name of the property or method. Having triple-slash comments (and bothering to maintain them when you're on a refactoring binge is one thing), but having generic tool-generated ones that just duplicate what the compiler understands is something even more insane.

Yes, I get that you can collapse them, but then you're left with all these ugly /*...*/ blocks all over the place. And not all of us use Visual Studio for everything - there are alternative text editors out there.

Once again this is an area where pragmatic naming conventions can really make this problem go away. I can sort of understand the need to document the usage of method parameters when your framework almost requires every variable to start with lpsz, as in lpszMyLongPointerToANullTerminatedString, but in .NET with it's reflective capabilities, powerful IDE and generally good naming conventions, triple-slash comments just add to the cruft that I have to wade through to fix your bug.

A better middle-ground approach for those who produce API's which must be consumed by others with no access to the source appears to be Sandcastle. According to the DNR podcast I listened to today, Microsoft programmers also baulked at the idea of documentation writers all up in their code, and so Sandcastle supports the notion of keeping the documentation tied to, but in a separate file to the code. Apparently you can link an external documentation file to the code in Sandcastle, and when you compile the HTML help, it will issue all the relevant warnings as though the comments were in the source file. Good luck keeping the two in synch and up to date, but if you have to produce MSDN-style documentation, I think Sandcastle's probably going to be your best bet.

In conclusion, I'd just like to reiterate that I encourage all programmers to think very carefully about the negative effects of adding any kind of comment whatsoever to your code. Write those comments which are pragmatic and essential, but leave everything else up to good programming style.