Saturday, December 22, 2007

Can I quit sleeping like I quit coffee and alcohol?

In an ideal world there would be no sleep. Everybody would be awake all of the time - we would be like gods - surely if god knows everything you do then he hasn't got time for sleep, unless his name is also Bruce and he gets a daily summary in his inbox.

All of you people (all maybe 1 or 2 of you) are probably sitting there thinking to yourselves "Hey, I like my sleep. I wouldn't trade it in for anything". But you're missing the point. In the utopia I'm describing, you wouldn't have any of those feelings of emptiness at the loss of something vital - something that you've always had, because it would never have been there to begin with.

I'll give you an example. In 20 years or so your kids won't be able to imagine a time when it wasn't possible to communicate with a friend in under 30 seconds, no matter where in the world that friend is. Your kids will grow up in a world where cellphones and the internet are ubiquitous, and actually probably obsolete.

My mom likes to tell stories about how in those days they used to listen to test match cricket (no Twenty20 back then, or even ODIs) on the radio, while building jigsaw puzzles. There was no television back then. She still remembers the test picture that was broadcast in the months leading up to actual programs being launched. They used to watch the test picture! My gran remembers this too.

For you and me, television is a basic human right. It's something that's been there ever since we first started yelling our opinions into this world, and it's not going away any time soon. For the older generations this isn't true, and this difference in opinions is blatantly obvious to me. My gran, for example, cannot bear the thought of the television not being on, regardless of the time of day or the quality of the programming.

I suspect this stems from remembering a worse time - a time when you didn't have a choice as to whether or not you wanted to watch TV. Either there was no TV, or it was prohibitively expensive, and in the latter case you probably buckled down, worked hard and saved up enough money to buy a set and the ridiculous "license" that goes with it, so you could enjoy your basic human rights.

As a quick aside, I wonder if the Americans were to re-examine their constitution and tack on a few more amendments if the majority of them might not choose to scratch out "the right to bear arms" and replace it with "the right to watch TV"? Looking at their average physique I think they might, despite all those NRA crackpots. I wonder what would happen if you asked them to vote on it - if they could only have one of those phrases in their constitution - which one would come out. I guess one of the phrases would end up with more votes and they'd end up putting the other phrase into the amendment.

Now, far be it from me to stand here and point fingers without examining my own tendencies. I'm no different in mechanics when it comes to this sort of behaviour - it's just the details that vary. For me, I personally get a mental equivalent of "the shakes" if I'm away from an internet connection for more than a day or two. It's the same thing. I can remember a time when I didn't have an internet connection. I can also remember a time when I did have one, but it was prohibitively expensive. Finally it's gotten to the point where it's still prohibitively expensive, at least where I live, but I fork out for it anyway, because now I can't imagine living without it.

But anyway, I digress slightly. Back to that ideal world. As always, there's no such thing as Utopia, so I guess we'll just have to find a way to hack what we have got a little bit. In the real world, most people sleep between 6 and 9 hours a day - or rather a night. They do this in one long phase and are therefore said to have a monophasic sleeping pattern. These people are, by and large, content with this status quo - they never question whether there's any other way to structure their sleep.

However, thanks to the glorious age of the internet, a few square-peg-in-a-round-hole kind of people have experimented with other sleeping schedules, including polyphasic sleep. These people have found that it's possible to function normally (or even better than "normally") on as little as 2 or 3 hours of sleep a day, spread out over 20-25 minute naps every four to six hours, depending on the details of the actual schedule.

To cut a long story short, I first discovered this idea over 2 years ago, and decided I'd like to give it a try. Unfortunately, at the time I was less than a year into my first job out of university, and this is the sort of thing that takes a month or so to adjust to, so I haven't had an opportunity to experiment until now. I've saved up basically all my leave over the last 3 years and now, over the December holiday period, I'm going to try and go polyphasic.

I think I'll try and post occasional updates here to journal my progress, but I don't really want to do the whole cliched blog-your-polyphasic-experiment thing. With that out of the way, let me launch straight into a status report (there's nothing like a little hypocrisy on zero sleep) :

  • I'm now starting what I consider to be day 2 - it's almost 8 AM and I haven't been to bed since I woke up almost 24 hours ago. I did have a 20 minute nap around 5:20 this morning, but I don't think it was very restful. Surprisingly, I didn't struggle to wake up after it, but that may partly be down to a cool mp3 I found which I think will help me drop off more easily, and will then also wake me up after the right amount of time.
  • I did sort of ease into the experiment by staying up late (3 AM) the previous 2 days before my all-nighter, and waking up relatively early (9 AM, which is my usual waking time) - so I have been running short on sleep for a couple of days already.

For now, I have nothing more to report, so I'm going to wrap this entry up, because it's time to try and take a nap again. Incidentally, I think I'm going to try naps at 4, 8 and 12 AM and PM - so feel free to not phone or sms me at these times - at least until I get an alarm mp3 player that isn't my cellphone :). Enjoy hibernating.

Friday, November 30, 2007

Experimenting with IronRuby

In an ideal world, the last 2 months would not have blown me away in a hurricane of work and study, and I would have had some time to play with IronRuby. Unfortunately as you by now know, I don't live in an ideal world. :(

Anyway, enough feeling sorry for myself and on with the experimentation. I resumed tonight (after updating my working copy from the server) in my efforts to get some simple http programming going in IronRuby. I've spent the entire week in Juval Lowy's WCF Masterclass and although http is just the tip of the iceberg in terms of WCF, watching an expert in action has inspired me somewhat. A large part of this inspiration stems from the absolutely phenomenal amount of extremely high-quality material (including free source code!!) that Juval makes available not only to attendees, but to the public in general at IDesign.

So, first things first, I took a quick look in the file rss.rb in my oddly named "projects\rss" folder to try and remember what I'd been working on the last time. Inside I found the following:


require 'net/http'

h = Net::HTTP.new('www.pragmaticprogrammer.com', 80)
resp, data = h.get('/index.html', nil)
if resp.message == "OK"
    data.scan(/<img src="(.*?)"/) { |x| puts x }
end

Oh yes, that's right! I'd pulled this piece of code almost directly (maybe directly?) out of Ruby's Pickaxe book - the .chm version which installs with MRI. This was supposed to be my glorious start on the road to understand the inner workings of RSS - hence the (for now) irrelevant folder and file names.
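The scan call is the part doing the real work in that snippet. Here is a minimal sketch of just that step, run against a hard-coded HTML string instead of a live request (the sample markup and the helper name image_sources are made up for illustration, they are not from the Pickaxe book):

```ruby
# Collect the src attribute of every <img> tag in a chunk of HTML.
# With a single capture group, String#scan returns one one-element
# array per match, so flatten gives a plain list of strings.
def image_sources(html)
  html.scan(/<img src="(.*?)"/).flatten
end

sample = '<p><img src="a.png"> some text <img src="b/c.gif"></p>'
puts image_sources(sample)
```

The non-greedy (.*?) matters here: a greedy (.*) would run on to the last double quote on the line and swallow everything between the two tags.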

I remember I was trying to get it to run under IronRuby and I also remember that the IronRuby parser had some trouble with parsing the net/http file. At the time, I spent about 3 hours tracing through the debugger just trying to figure out which line in net/http the problem was on and eventually I narrowed it down to a couple of methods with the (for me) unusual syntax of:


...
variable = dictionary_lookup["key"] or return nil
...

It turned out that at the time, the parser couldn't handle this (the or return nil part), so I dutifully wrote a test case which reproduced the problem without requiring all 2300 lines of net/http.rb:


def fun(input)
    a = input == 'test' or return nil
    return a
end
puts fun('not_test')

Unfortunately, at this point I had no idea where to even begin to look to sort this bug out, so I simply submitted the test case to the mailing list, got snowed under at work and forgot about the bug.
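For anyone else who finds the syntax unusual: or binds more loosely than =, so the assignment happens first and the return only runs when the assigned value is nil or false. A small sketch of that behaviour as I understand it under MRI (the method names and the symbols returned are just for illustration):

```ruby
# `or` has lower precedence than `=`, so the first line of the body parses as
#   (value = table[key]) or (return :missing)
def lookup(table, key)
  value = table[key] or return :missing
  value
end

# `||` binds tighter than `=`, so this one parses as
#   value = (table[key] || :fallback)
def lookup_with_fallback(table, key)
  value = table[key] || :fallback
  value
end

puts lookup({ 'key' => 42 }, 'key').inspect
puts lookup({}, 'key').inspect
puts lookup_with_fallback({}, 'key').inspect
```

The first form bails out of the whole method early, which is why net/http uses it for guard-style lookups. The second just substitutes a default and carries on.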

In between, a few months have passed and I remember seeing an email on the list a few weeks ago saying that this problem had been resolved, so tonight I decided to try my luck again. After spending half an hour or so trying to puzzle my way through the solution configuration (rake compile config=release is no good if you want to step through the debugger) I eventually managed to get the interactive console running in the debugger. A quick require 'net/http' later and I was once again up to my elbows in exceptions:


System.NotImplementedException: Statement transformation not implemented: WhileLoopStatement
  at Ruby.Compiler.Ast.Statement.Transform(AstGenerator gen) in D:\Projects\IronRuby\trunk\src\ironruby\Compiler\Ast\Statements\Statement.cs:line 32
  ...

Unfortunately, I'd rolled back all the code that I'd put in to print the lines of ruby code as they were being parsed to track down the parser errors (so that I didn't get conflicts while updating from the repository), so again it took a little fiddling with the debugger to puzzle out exactly where the problem was. Apparently my previous bug has been sorted out, but that's just allowed the compiler to progress a little further through the file before stumbling on something else. It looks like the code for while loop statements (not expressions) hasn't been written yet, and that's causing the compiler to throw a NotImplementedException while trying to understand the request method on class HTTP which has the following code:


def request(...)
...
    begin
        res = HTTPResponse.read_new(@socket)
    end while res.kind_of?(HTTPContinue)
...
end

It looks like the code is throwing an exception in the base class (there's no overriding implementation of Transform in WhileLoopStatement yet), so my first shot at getting past this problem was just to override the method and return null. That seemed to help a little, because I no longer saw the same exception, but it was just the illusion of progress because a short while later I got an assertion about an object that cannot be null.
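For reference, begin ... end while is Ruby's version of a do-while loop: the body always runs once before the condition is checked. That is exactly what the request method wants, since it has to read at least one response. A rough sketch of the same shape, using a made-up Response struct and an array standing in for the socket (neither is from net/http):

```ruby
# Stand-in for a parsed HTTP response; only the status code matters here.
Response = Struct.new(:code)

# Keep reading until a response arrives that is not "100 Continue".
# The body of begin ... end while runs once before the test is evaluated.
def read_final_response(pending)
  begin
    res = pending.shift
  end while res.code == 100
  res
end

queue = [Response.new(100), Response.new(100), Response.new(200)]
puts read_final_response(queue).code
```

Writing it as a plain while loop instead would mean initialising res before the loop or repeating the read, which is presumably why the library author went with the trailing form.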

A little bit of searching through the solution led me to the WhileLoopExpression class in Ruby/Compiler/Ast/Expressions. This class already has an implementation in it, so I figured I'd try and just copy/paste (shudder) it into the WhileLoopStatement class and see what happened. After resolving a few compiler issues (I had to instantiate a List<Statement> object and put the single Statement object that I had into the list before passing this list through to gen.TransformStatements) I was able to compile and start the interactive console up again. Another quick require 'net/http' and I seem to be past the issue, but now I get:


System.NotImplementedException: TODO
  at Ruby.Compiler.Ast.SuperCall.TransformRead(AstGenerator gen) in D:\Projects\IronRuby\trunk\src\ironruby\Compiler\Ast\Expressions\SuperCall.cs:line 42

Hmm, it seems something in the file is trying to call the superclass's implementation of whatever it's overriding. Two minutes looking into SuperCall.cs and I realize that I'm not going to get so lucky with this one. I don't know enough about the internal workings of the compiler (or ruby itself for that matter), so I try a different tack. I open http.rb and I search through it for the word "super". It turns out that it's only called from 2 places in the file, and it doesn't look like the compiler will break if I just comment those two lines out. (Note: At this point I've made a copy of net/http.rb in case I ever actually want to use it for anything - the copy sits in a net folder inside of where Visual Studio builds my interactive console to).
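For context, super in Ruby calls the same-named method one step up the ancestor chain, so commenting it out silently drops whatever the parent class would have done. A toy sketch of the idea (the class names are invented and have nothing to do with the real classes in net/http):

```ruby
class BaseRequest
  # Headers every request gets.
  def headers
    { 'Accept' => '*/*' }
  end
end

class JsonRequest < BaseRequest
  # Bare `super` re-sends the current arguments (none here) to
  # BaseRequest#headers, and the result is then extended.
  def headers
    super.merge('Content-Type' => 'application/json')
  end
end

puts JsonRequest.new.headers.inspect
```

In other words, my commented-out copy of http.rb will load, but anything that relied on those two super calls is now quietly skipping the parent's behaviour.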

Back to rbx and I type my single line of ruby once again and... Success! Well, sort of. It looks like the compiler is done with net/http - it's managed to get all the way through a 2300 line ruby library - not bad for pre-alpha software. Unfortunately, net/http doesn't live in a bubble - it requires net/protocol, which I haven't yet copied to my working directory. This causes rbx to throw an exception:


Ruby.Runtime.LoadError: Could not load file or assembly 'net\/protocol' or one of its dependencies. The given assembly name or codebase was invalid. (Exception from HRESULT: 0x80131047) ---> System.IO.FileLoadException: Could not load file or assembly 'net\/protocol' or one of its dependencies. The given assembly name or codebase was invalid. (Exception from HRESULT: 0x80131047)
File name: 'net\/protocol'

Apart from the HRESULT (COM -> *shudder*) this seems like a fairly easy problem to solve. I copy protocol.rb from the same place I got http.rb and try again, and now I'm met with the exception:


Ruby.Runtime.LoadError: Could not load file or assembly 'socket' or one of its dependencies. The system cannot find the file specified. ---> System.IO.FileNotFoundException: Could not load file or assembly 'socket' or one of its dependencies. The system cannot find the file specified.
File name: 'socket'

"This is going to be easy" I think to myself as I alt-tab back to the ruby libraries folder. "What's taking Microsoft so long?" I think to myself as I start to search for socket.rb. Unfortunately this thought is cut short when I realize that there doesn't seem to be a socket.rb file anywhere in my ruby libraries folder. Fearing the worst, and already regretting my hasty MS-bashing, I widen my search to just look for files with socket in them. This turns up socket.c => uh-oh. It looks like the socket library hasn't been implemented in ruby itself but rather in C (for reasons of speed, cross-platform...ity and the convenience of working closer to the hardware I presume). "So this is what they've been talking about on the list - re-implementing the native C libraries" I mutter.

After giving it some thought while writing this article, it looks like my RSS reader is going to have to wait for a while - at least until somebody adds enough socket support to IronRuby to allow me to do some http communication. Unfortunately, the only code I've been able to add so far is the copy/pasting I did in WhileLoopStatement, and from what I understand, Microsoft is not taking community contributions to the compiler - only to the libraries.

I guess I'll fire off an email to the list anyway with the code in it, in case it is actually correct and they do actually accept it, and even if they don't perhaps I can get some feedback as to what I should have done differently so that I'll know better for next time. In the meantime, I guess I can get started implementing some of the socket functionality - it doesn't look like anybody else has offered to tackle this yet. At least there, I've got more chance of having my work accepted, and it also seems like a smaller chunk to bite off than an entire http library.

Wish me luck!

Tuesday, October 9, 2007

Sweeping comments under the rug

In an ideal world, you would never have to deal with source code where there are more lines of code commented out than there are actually being compiled (I'm not talking about informational comments here, I'm talking about commented out lines of code). The harsh reality, however, is that there is at least one such source tree in exactly this condition, and it's the one that I've been trying (not single-handedly) to replace for the last three years.

As far as I can see, a code base can only get into a state like this if the following are true:

  • Poor or no source control. We need to change or remove this code, but we're not sure if it's going to work, so rather than delete it and have to re-code it again later, we're going to just comment it out until we're sure the new solution works. (Yeah, right). The right answer is to just delete the code you're not executing, because there's always undo, and for longer-term undos the code is still in your repository and you can always go back and get it out from there.
  • A culture of ugly source code. Broken windows, mise en place or entropy, call it what you will, but bad code begets much worse code. Keeping code clean is a constant uphill battle against multiple coding conventions, dead or commented-out code, poor design choices, magic numbers and any number of other code smells, but more than anything I think it keeps technical debt low and productivity high.

So now we've examined the problem, let's consider two potential solutions.

  1. Go into your editor's options page and alter the colour scheme so that comments show up as a very light gray over a white background, making them almost invisible, unless the text is highlighted. I've got a coworker who does this, and it seems like completely the wrong "solution" to the problem. It doesn't benefit anyone else, and it means in some cases that he's only got 7 or 8 lines of usable space onscreen at a time. Any useful comments he may actually want to put into the code will either require him to strain his eyes, rely on guesswork or very accurate typing, or will require him to select the text after typing it to see that what he typed is what he got. Don't do this. It's clever, but not smart.
  2. DELETED!!. Do this. Delete all the unused source code you can (especially if it was active at some point in the past in the repository), and also delete as many "helpful" comments as is pragmatically possible. If something is unclear, refactor it to make it clearer, don't try to make excuses for your first attempt at banging the code out. Nobody's perfect, but what differentiates readable code from unreadable code is that the programmer who wrote the readable code first wrote unreadable code, realized how confusing it was, and then refactored it. The other programmer didn't realize (in which case he is either naïve or trying to write fortran programs in the language) or he did realize but was too lazy or unprofessional to tidy up after himself.

As a final rant, I'd also like to request the collective effort of every .NET programmer on the planet to ensure the swift and timely death of triple-slash documentation comments. You wouldn't believe how many times I've sat in abject frustration while going through someone else's source code where there are more triple-slash comments than lines of code. They also either go out of date very quickly, or reduce the amount of useful refactoring done, because the two are effectively mutually exclusive.

What makes this even more horrific is that the triple-slash comment is rarely anything other than the return type and the name of the property or method. Having triple-slash comments (and bothering to maintain them when you're on a refactoring binge) is one thing, but having generic tool-generated ones that just duplicate what the compiler understands is something even more insane.

Yes, I get that you can collapse them, but then you're left with all these ugly /*...*/ blocks all over the place. And not all of us use Visual Studio for everything - there are alternative text editors out there.

Once again this is an area where pragmatic naming conventions can really make this problem go away. I can sort of understand the need to document the usage of method parameters when your framework almost requires every variable to start with lpsz, as in lpszMyLongPointerToANullTerminatedString, but in .NET with its reflective capabilities, powerful IDE and generally good naming conventions, triple-slash comments just add to the cruft that I have to wade through to fix your bug.

A better middle-ground approach for those who produce APIs which must be consumed by others with no access to the source appears to be Sandcastle. According to the DNR podcast I listened to today, Microsoft programmers also baulked at the idea of documentation writers all up in their code, and so Sandcastle supports the notion of keeping the documentation tied to, but in a separate file to the code. Apparently you can link an external documentation file to the code in Sandcastle, and when you compile the HTML help, it will issue all the relevant warnings as though the comments were in the source file. Good luck keeping the two in sync and up to date, but if you have to produce MSDN-style documentation, I think Sandcastle's probably going to be your best bet.

In conclusion, I'd just like to reiterate that I encourage all programmers to think very carefully about the negative effects of adding any kind of comment whatsoever to your code. Write those comments which are pragmatic and essential, but leave everything else up to good programming style.

Thursday, September 13, 2007

Contributing to IronRuby

In an ideal world, John Lam would email me personally begging me to write him an efficient YAML parser using the DLR, because Microsoft just doesn't have the expertise to pull this off internally. In reality, he's obviously more than competent, and based on the IronRuby mailing list activity in just the first couple of days it looks like he's going to have no shortage of people willing to step in and help out getting IronRuby off the ground. He's posted a Getting Started guide for contributions to the IronRuby project, in which he says his most pressing need is for C# implementations of a YAML parser and zlib.

I've been toying with the idea of contributing to an open source project in my spare time (for me, spare time only exists in my delusional, ideal world by the way - not the real world), and I've been trying to decide which project is worthy of all the free time I'm going to have after I go polyphasic at the end of the year. Here's a brief list of the best candidates, as well as the reasons why I don't think the relationship would work out:

  • NHibernate - Appears to be stable now and I'm not sure if I want to do lots of maintenance coding. Also, with LINQ coming, I'm not sure where NHibernate's future lies. Lastly, although I do use it on occasion, I'm not nearly proficient enough to know how/where to contribute.
  • Boost - Although I am part C++ programmer, I'm mostly (98% at this point) focused on Windows development, and as such I'm worried I'd be writing non-portable code, which they'd naturally reject. Plus, the complexities of hardcore template meta-programming, and the issues involved in compatibility between different compilers scare the hell out of me. Just looking at the boost source code (we use a couple of the libraries) is enough to discourage one from contributing.
  • Subversion - For me, source control is a tool, not a domain. All my professional development is focused on financial software, and while it's true that I couldn't get by day-to-day without subversion, the mechanics of source control just aren't that interesting to me. Plus again there are platform issues, and while C is notably easier than boost-level C++, it's still not that pleasant to work in for a .NET junkie.
  • Mono - Not enough Linux experience. I would say this is sort of my holy grail, my Everest if you will. Perhaps one day I will be fit to join the ranks of contributors to the Mono project. All Hail Miguel!

Since starting to work in the real world and thereby having all day internet access, I've been exposed to the wonders of RSS and more recently to programming.reddit.com. One of the first things you learn when reading the likes of Martin Fowler and Uncle Bob is that the language that all the cool kids are using these days is Ruby.

Similarly, when listening to .NET podcasts and screencasts, one of the first things you learn is that Microsoft is aware of the rise of Ruby, and has launched a Dr. Evil (or Pinky and the Brain maybe) -style plan to take over the world by having .NET run in all modern browsers on all modern desktop platforms, with John Lam as the Number 2 Man. ("This is my Number 2 man. His name... is Number 2.")

Coupled to these two nuggets is Microsoft's recent stance reversal in the areas of open/closed source code and source code licensing, and together this trinity of events has presented me with what I believe to be the ideal target for my over-eager programming zeal. So, I'm going to explore the possibility of contributing to the IronRuby project.

This is of course easier said than done - John's contribution guide and request list is rather threadbare at this point (to be fair, the project isn't even two weeks old yet from a public point of view). With only two items on the wishlist (the YAML and zlib bits), and seemingly a plethora of people with more time and ruby knowledge than myself eager to contribute, it looks like I'll have to wait around in the cold until things mature a little more. At least, this is what I thought after my first read through of the Getting Started Guide.

However, when I read it through a second time, an innocuous little sentence jumped out at me and gave me a really good idea: "We're also looking for bug reports and bug fixes." I realized that I don't have to be an idiom wielding ruby wizard to contribute, and I also don't have to have tons of free time on my hands. All I need to do is write ruby programs and make sure that they work in IronRuby. Surely at this point in the implementation it wouldn't be hard for me to write something (albeit perhaps not in idiomatic ruby), that runs under MRI but fails to run under IronRuby?

I've already got a few pet-project ideas (a simple RSS reader, a Sudoku game and an HTML screen-scraper for my country's satellite TV guide (for which the service provider doesn't provide an XMLTV service for fear of losing sales of their proprietary PVR system)) which I think I could pull off without too much effort in ruby, so for now my strategy is going to consist of making a start on one of these pet projects in MRI and making it work in IronRuby and on Moonlight as well.

I do also have a fallback strategy for helping with those parts of the code which I may not be allowed to contribute to at the present. Martin Fowler appears to have started a rumour a while back that John and his team may not be allowed (from a corporate legal policy point of view) to look directly into the source code of MRI (for fear of being blinded I guess) when implementing IronRuby. This will naturally make it difficult for them to make a compliant runtime.

One possible remedy I see to this situation is for those of us who have no such restrictions to write a suite of unit tests for the core developers - a set of guide rails (no pun intended) which they can use to flesh out their implementation. That is - if I can write a unit test which exercises some portion of MRI's standard libraries (which I can do easily as I have access to the source) and which passes under MRI but fails under IronRuby, then I can pass that test along to Microsoft and have them write the code to make the tests pass.

By doing this, they aren't exposed to legal issues of looking at the source directly (there's a geeky solar eclipse metaphor in here somewhere I'm sure), and the community develops a set of unit tests for core library functionality which will be useful not only for IronRuby, but also for MRI (for backwards-compatibility reasons), for JRuby, and for anyone else who wants to write or fork a ruby implementation.
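A guide-rail test of this kind could be as small as the sketch below. It sticks to core Ruby so that it doesn't depend on a test framework the implementation may not be able to load yet. The CHECKS table and run_checks are hypothetical names of my own, and the expected values are simply what MRI returns for these core String and Array methods:

```ruby
# Each entry: a description, a block exercising some library behaviour,
# and the value MRI produces for it.
CHECKS = [
  ['String#upcase',  lambda { 'ruby'.upcase },       'RUBY'],
  ['String#reverse', lambda { 'iron'.reverse },      'nori'],
  ['Array#uniq',     lambda { [1, 1, 2].uniq },      [1, 2]],
  ['String#scan',    lambda { 'a1b22'.scan(/\d+/) }, ['1', '22']]
]

# Runs every check and returns a list of human-readable failures.
# An exception counts as a failure rather than aborting the whole run.
def run_checks(checks)
  failures = []
  checks.each do |name, body, expected|
    begin
      actual = body.call
      unless actual == expected
        failures << "#{name}: expected #{expected.inspect}, got #{actual.inspect}"
      end
    rescue StandardError, NotImplementedError => e
      failures << "#{name}: raised #{e.class}"
    end
  end
  failures
end

failures = run_checks(CHECKS)
puts failures.empty? ? 'all checks passed' : failures
```

Anything that passes under MRI but shows up in the failures list under another implementation is a ready-made bug report, and nobody had to read a line of MRI's source to act on it.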

In all, I think the future looks bright for Microsoft and Open Source and myself. Now if I could only get around to sleeping less than 3 hours a day...

Wednesday, March 28, 2007

Mentoring

In an ideal world, mentoring would be easy. You would already know ahead of time where your pupil's skills were most lacking, and you'd immediately know the best way of building those skills - making the tiny tears in his mental muscles every day that cause them to grow and strengthen. Just to remind you, this is all happening in that ideal world I live in - the one in my head.

Back out here in the real world however, it turns out that mentoring isn't quite as easy as all that. There are the inevitable shortcomings of the pupil that just don't seem to take to the exercises, but thankfully those incidents have been few and far between (mostly to do with low-level stuff that they just don't teach kids these days). There's also the issue of the shortcomings of the mentor, but let's just gloss over that for a minute shall we? Then there's the fact that he appears to pick up a lot of the things that I believe are important in software development (basically anything in Uncle Bob's PPP book) without any trouble, which is leaving me struggling to find enough work for him to do.

Before I proceed, a quick disclaimer: At the present moment, I've been a mentor for around one month, and I've only had to mentor one individual and he's come to work for us straight out of university - i.e. he's got no previous professional software development experience as far as I know. This is important for a couple of reasons:

  1. I'm acting as the customer. The product we're building is as close to shrink-wrapped software as we as a company get, and at present we don't have it running at any clients, so I'm acting as the customer, and more importantly as the business expert until we get a stable, feature-complete version to take out to clients for further input. This means that I get to shoot down any suggestion I don't like by providing a valid (but probably incomprehensible to him) business reason as to why we can't do it. (I try not to abuse this power too much MUAHAHA).
  2. I've recently been promoted to the position of architect for our entire product line going forward, a bold and risky move on the part of my employer if I do say so myself :). There are all sorts of issues here - enough for at least a separate article, but as far as this topic goes, it means that the program he's working on is my baby (my first baby), and its success will be the feather in my cap that proves that my company's confidence in me is not misplaced. In other words, I desperately need this system to score big time.

Given my role as architect / coach, it appears that more of this mentoring work lies in my near future, but I suspect the circumstances for the next pupil will be remarkably different. Firstly, the person I'll be mentoring next has actually been in professional software development a lot longer than I have - my two-and-a-bit years' worth hardly seems enough to actually be an architect or write about mentoring, whereas I suspect he's been with the company on and off (mostly on) for close to 10 years now. That being said, 10 years ago there was no automated testing or continuous integration at my company - there wasn't even object orientation, and most developers worked in a proprietary Pascal-like language over a Btrieve database which is still driving a whole bunch of our product lines today. Don't even think about there being any web-based development - we're still trying to catch up in this area in 2007.

Another obstacle is going to be the fact that this time round I'm coming in with almost no business expertise - this system is only very loosely tied to the project I've been on for the last two years. My new pupil on the other hand wrote almost the entire old system that we've been asked to replace. This is scary for me, because I like to have as many facts as possible, and I worry that I'm going to make some terrible decision down the road because I didn't know enough about the business. I'm trying to counteract this risk with agile development, where almost by definition no decision is made before as much is known about it as possible, but that means investing a lot of time and effort into training TDD into the pupil. (Not that I'm bemoaning that time or effort - it's fun seeing the spark in someone's mind when they realize there's a better way, but it is time consuming, and there may be some resistance to the approach that I'll need to try and work around / dissolve.)

As far as my own shortcomings with regard to being a mentor (and you thought I'd just skip the self-flagellation), it turns out that one of my deficiencies is an inability to communicate verbally when it comes to source code. That is, I find myself trying to explain how we're going to do something, but after a two minute whiteboard explanation when I'm still met with a blank stare, I just resort to typing the code that's burning a hole in my brain into the editor and hitting compile myself, with the understudy watching on (probably in disbelief). This usually takes less than the two minutes that I wasted on an explanation, which leads me to wonder why I'm even wasting time mentoring other people when I could be cranking out code :).

It's a problem in the other direction as well, in that I have a hard time visualising other people's code ideas in my head. I can't count the number of times where someone has explained a problem with their code to me, and I just can't line the pieces up in my head - I always resort to saying "I can't really make a comment without actually seeing the code". This leads me to wonder if I'm unique or at least in the minority in this regard, or if this is a fairly common occurrence.

On the whole, as far as mentoring goes (remember I'm doing this in between writing code for the projects I'm personally still responsible for, and architecting our new systems going forward), I guess things could be going much worse. It's useful to have a lackey to do the odd job that you just don't get around to (like changing our server apps from console applications to windows services), and it's encouraging to think that maybe in a couple of years more people I work with will know about the Single Responsibility Principle (or even polymorphism would be nice), than those that don't. Also I'm enjoying the teaching aspects of the job, but that could just be because he's a pretty sharp learner. We'll just have to see how things go with pupil number two, which should start this week still if I can find the time for it.

Monday, March 5, 2007

Go blow a fish

In an ideal world, everybody would be an expert on everything (not just my girlfriend), and we wouldn't need to stand on the shoulders of others to see over the next hill. However, back here in the real world, this isn't the case - some people just know more about some stuff than other people.

Now, usually I'd be the last person to admit that I don't know something about something, but when it comes to the security of applications which are responsible for the transferral of huge sums of money every day, I'm more than willing to step back and say to someone else "It's OK. You can field this one."

Take my boss for instance. Sometime late last year, around September I should imagine, he implemented the security layer for our financially-oriented system. The basic plan was as follows:

  1. We know which encryption scheme we want to use - blowfish. We know this because our old system, which we're sort of replacing, uses blowfish and we haven't had any complaints with it.
  2. Let's find some source code on the net that does "blowfish encryption", bend it to our will, wrap it up in a dll, and call it from our code.
  3. <Insert typical pot-at-the-end-of-the-rainbow step 3 here>. Something like "Drink coffee" or the underpants gnomes "Profit" would be appropriate.

So everything proceeded according to plan, and soon enough we had a workable encryption solution, endorsed by our client's Microsoft consultant, and my words of caution were dismissed as negativity. These words of caution went something along the lines of "we don't know anything about security or encryption, so we have no way of verifying whether this algorithm we've spot welded in will be reproducible by anyone else".

A little bit about our architecture: Our system is client-server, but the API to the server is both proprietary and open - proprietary in the sense of it's not HTTP, or BitTorrent, or SMTP or any other common protocol - and open in the sense that our API is documented in a publicly available document and we've just written a reference implementation for our client to pass on to their clients.

So, amidst the congratulations at securing a multi-million dollar a day application in under a week, my protests and cautions were drowned out with comments like "Blowfish is a standard algorithm". Turns out though, now that we're almost into external testing (where these fabled "other developers" will have a chance to execute and test the code that they've written to our API on our test servers), all of a sudden none of them are getting the expected outputs when they run their inputs through their blowfish encryption algorithms.

So we were at a fork in the road today: We could either

  • drop the blowfish (inadvisable at this late stage in the game, but still a possibility) in favour of something built into .NET
  • give everyone else our blowfish implementation (in dll or C++ source form), and hope that none of them want to write a client in Java (this was standard practice on older systems)
  • change our code to the new standard blowfish implementation, which we've managed to find on another website.

Hmm, that last one didn't come out nearly sarcastic-sounding enough, considering that's the approach we've gone with, and the 20 extra heartbeats I'm generating every minute since the decision was made.

Something just seems very wrong to me here. My boss is insistent that encryption is just something you do - you get some source code from somewhere (hopefully you're allowed to copy and paste it, but we won't go into that here), you build it into a dll, write a test program that makes sure "Hello world" comes out as "@#$@9asdm,we*" on the other end, and finally make sure that "@#$@9asdm,we*" can be decrypted back to "Hello world" again and then you're good to go.

But we have no way of knowing. We have no way of knowing how big our key is supposed to be, we have no way of knowing if anyone else is actually going to be able to encrypt anything in a form we can understand - we don't know anything about security because we write financial software.
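To make the gap in that "Hello world" test concrete, here is a minimal sketch in C++. It deliberately does not use Blowfish: `toyCipherA` and `toyCipherB` are hypothetical stand-ins (a plain XOR, not secure in any way), and the only difference between them is an assumed disagreement about the order in which the key bytes are read. I'm not claiming that is what went wrong in our code - it's just one plausible kind of mismatch. Each one passes its own round-trip test, yet neither can read the other's output.

```cpp
#include <cstddef>
#include <string>

// Toy stand-in, NOT Blowfish and NOT secure: XOR every byte with the key.
// XOR is its own inverse, so the same function encrypts and decrypts.
// Assumes the key is not empty.
std::string toyCipherA(const std::string& data, const std::string& key) {
    std::string out = data;
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] ^= key[i % key.size()];
    }
    return out;
}

// Hypothetical "other implementation": the same idea, except it reads the
// key bytes in the opposite order.
std::string toyCipherB(const std::string& data, const std::string& key) {
    const std::string reversedKey(key.rbegin(), key.rend());
    return toyCipherA(data, reversedKey);
}
```

Running "Hello world" through `toyCipherA` twice gives "Hello world" back, and the same goes for `toyCipherB`, so both would be declared "good to go". Encrypt with one and decrypt with the other, though, and you get garbage. The only test that catches this compares the ciphertext against expected values published by someone other than yourself.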

At the moment our key is 8 bytes long - so I presume this is "64-bit encryption", but I don't really have any way of knowing - other than spending some time researching this stuff - time I really don't have right now. We are planning on increasing this once everyone's comfortably able to encrypt and decrypt the data though.

So for now, we've spent another day patching our blowfish algorithm to work the same way our API users' implementations do. I guess this means we need to email the two other developers who've told us that they can't get blowfish working - the ones we already gave copies of our (broken) code to - and send them new code.

It just seems so unprofessional to me. We should have just conceded that we knew nothing about encryption in the first place, and at least tried to use something that didn't come off of Joe Bloggs's website, but rather from some reputable source like boost, or the .NET BCL. It's a common theme I see all the time here at my work - leak-plugging: "Ooh, the dam wall is leaking here, let me just stick my finger in it until the tide goes out".

This type of thinking can do us in.

I'm all for open source, and we use it quite a lot, but this is security, and your security is paramount, especially in finance.

Monday, January 15, 2007

Convention is better than cure

In a pragmatic world, programmers working in a team adopt a common coding convention because it reduces the amount of mental acrobatics that members of the team have to perform when maintaining each other's code. In an ideal world however, it doesn't matter what coding style you adopt, because it doesn't make a tangible difference to maintaining the code base. Everyone just mindlessly uses whatever convention was drilled into him by well-meaning but short-sighted professors, with no regard for the nuances of good style.

It appears therefore that I'm living in an ideal world - which ironically, is far from ideal. I was the only one working on the project for the first year of its development, and so naturally all the code exhibited a single style - bar minor differences where my style changed over the course of that year (I was still developing my C++ knowledge to a large extent), and even those minor differences I was diligent about reconciling back to The One Style.

In that first year, I managed to write a large portion of the system - which I'd inherited in the crudest of forms (a couple of hundred lines of shaky (C++)-- code) from a colleague who has since gone on to hopefully better things. However, about 18 months ago, my department finished its other major project and the entire crew was put full force onto my project, and I had to contend with three other people maintaining my beautiful code.

What made matters worse was that my boss was busy implementing a separate piece of the system (the money-maker component, you could say) from a Delphi spike implementation (almost line for line) - basically independently of the application it was going to have to fit into. I didn't think much of it at the time, but looking back, I'd take continuous integration over the nightmare that resulted from that integration any day - the interface between the two components still feels like a house of cards. His coding conventions were obviously different to mine, but because we weren't even working in the same repository, let alone the same solution, neither of us realised the headache it was going to cause.

So now we're at the point where the system is nearly two and a half years old, and convention is basically non-existent. The particular style element that's put a bee in my bonnet of late is the declaration of pointer variables, and specifically, whether the * should accompany the type, or the variable name. I'll give you an example:

  1. int* myInt = &someInt;
  2. int *myInt = &someOtherInt;

As silly as it sounds, this one character difference has caused a bunch of problems in our team and for our project. Obviously this convention clash hasn't caused bugs - semantically the two statements are equivalent - but we have had problems related to day-to-day feature addition and maintenance programming. For example, when I need to fix a bug in a class, and I want to browse subversion's history for the class's header file, I see that the file has about 20 different changesets and it takes about 10 minutes just to isolate which of those changesets aren't simply me changing the convention within the file to style 1 and one of my colleagues changing it back to style 2.

This is the kind of crap that can derail your train of thought and kill your productivity for the rest of the day. What makes matters worse is that the "tidying up" is usually done by the programmer in exactly this situation - when they come across the file looking for a bug. This means, when the bug is fixed and the code is checked in, the "tidying up" code is hidden amongst all the bug-fixes, or more commonly and much more dangerously, the bug fix (which is important) is hidden amongst all the "tidying up" (which isn't) - that is, there's a bad signal to noise ratio.

I could go into a 3 page discussion on why style 1 is better than style 2 (which it clearly is), but let's just say for now that if I were asked to join a team working on a code-base with 250K lines of code, and every line used style 2, I'd hopefully be pragmatic (or extreme) about it and just adopt style 2 while working on that project - a common convention makes it easier for everybody.

Or alternatively, being the stick in the mud that I am, I'd write a little script to report all the places where the one instance was used, and I'd either manually go and change all of them (I'm also a sucker for punishment), or write another script to do that. In fact, that's just what I did do to our code today. I threw together a little regular expression (first using Windows' findstr command, but quickly switching to a ruby script) which would tell me all the places in the code where style 2 was used, so I could change them all to style 1.

What I should have done first I guess was write the script and run it for both combinations - so that I could figure out which style was used more. That's what I should have done. What I actually did was assume that my preferred style was more widespread (given that two of the other three maintainers and I use it exclusively, and the fact that I'd written so much of the code in the first place), and so I jumped straight in and started changing the bad style to the good.

To cut a long story short, about six hours (this is all in my spare time, how sad is that?) and a number of exceptional cases(1) later, my ruby script was reporting that there were no more occurrences of the offending pattern. Feeling very satisfied with myself, I integrated the ruby script into our nant build script and set the build to fail if a violation of the convention occurs again. Luckily, I'd had the foresight to make these changes in a separate working copy of the code, and so I still had the original code to refer to. I made some modifications to the script and ran it to count how many instances there were of style 1 and style 2 on the original code, and the results were very gratifying. Approximately 1800 instances of style 1, and 300 instances of style 2 - which just proved my suspicions nicely.
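For the curious, the counting idea can be sketched roughly as below. This is not the ruby script itself - it's a C++ approximation, and the names `StyleCounts`, `isKeyword`, `matchesStyle` and `countPointerStyles` are made up for illustration. The regular expressions are a simple heuristic: the small keyword list only filters out the most obvious dereferences (like `return *it;`), and things like multiplication written as `a *b` will still be miscounted, which is exactly the sort of exceptional case that ate up the six hours.

```cpp
#include <regex>
#include <set>
#include <string>
#include <vector>

struct StyleCounts {
    int style1 = 0;  // lines that look like: int* p
    int style2 = 0;  // lines that look like: int *p
};

// Words that commonly precede a dereference but are not type names.
// An illustrative, incomplete list.
static bool isKeyword(const std::string& word) {
    static const std::set<std::string> keywords = {
        "return", "delete", "case", "else", "throw"};
    return keywords.count(word) > 0;
}

// True if the pattern matches somewhere on the line and the word captured
// in front of the '*' is not one of the keywords above.
static bool matchesStyle(const std::string& line, const std::regex& pattern) {
    for (std::sregex_iterator it(line.begin(), line.end(), pattern), end;
         it != end; ++it) {
        if (!isKeyword((*it)[1].str())) return true;
    }
    return false;
}

// Counts how many lines appear to use each pointer declaration style.
StyleCounts countPointerStyles(const std::vector<std::string>& lines) {
    // Style 1: word, star(s) touching the word, whitespace, identifier.
    static const std::regex style1(R"(\b([A-Za-z_]\w*)\*+\s+[A-Za-z_]\w*)");
    // Style 2: word, whitespace, star(s) touching the identifier.
    static const std::regex style2(R"(\b([A-Za-z_]\w*)\s+\*+[A-Za-z_]\w*)");
    StyleCounts counts;
    for (const std::string& line : lines) {
        if (matchesStyle(line, style1)) ++counts.style1;
        if (matchesStyle(line, style2)) ++counts.style2;
    }
    return counts;
}
```

Feeding it the two example declarations from earlier plus a `return *it;` line gives one hit for each style and ignores the dereference. A build check would then just fail whenever `style2` is non-zero.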

So all that's left to do is check the code into the repository tomorrow when I get back to work, and also the rather distasteful task of informing my team that there is now a strict convention regarding the placement of the * when declaring pointer types, and any instances of going against the convention will cause the build to fail (I believe there's a joke about aryan variables and my coding-naziness hiding in there).

I've toyed with the idea of rather hooking the script into subversion's pre-commit hooks, so that no bad code even gets into the repository, and the checks don't add onto our already straining build time, but unfortunately I haven't had the time to play with such an idea. I can see great benefits though if a coding convention is enforced at this level across all our projects in all the languages we use (mostly C# lately), so perhaps I'll spend some more spare time over the coming weeks identifying little violations like this to build up a comprehensive library of checks and integrating them one by one. I've already installed a pre-commit hook to check that comments always accompany checkins, but even that just gets subverted by people typing in crappy, useless comments.

Thankfully though, it doesn't appear that the script is going to have any major effect on our build-time, which Riaan and I reduced by 15 minutes to 30 minutes total last week (I know, I know, 10 minute build :( ). The script runs on my laptop in about 1.6 seconds, so I think we're OK. For now, I'll just have to wait and see whether this has a positive or negative influence - and the next item on my hitlist is curly braces - both ones missing from single-line conditionals, and ones hiding on the right hand side of the expression, instead of on their own line below. Hopefully that won't meet with too much resistance either.

1. The exception cases included instances where a variable or iterator was being dereferenced, or where multiplication was taking place, or where a function call in a dll was declared.

Wednesday, January 3, 2007

The key to fixing problems quickly

In an ideal world, problems are solved instantaneously, or don't even arise in the first place. Here in the real world though, the key to fixing problems quickly is having enough information. I'm particularly frustrated with developers who, when an error occurs and they need to inform the user, come up with a really useless error message, something about as useful as

"error MSB3152: The install location for prerequisites has not been set to 'component vendor's web site' and the file 'DotNetFX\instmsia.exe' in item '.NET Framework 2.0' can not be located on disk. See Help for more information."

See help for more information.

To cap it all off, the online help is about as useful as matches to an astronaut. It actually says: "To correct this error - Determine whether the file exists on disk". The file does indeed exist on disk. It exists on disks all over the world. It even exists on both disks in the machine in question. It just doesn't appear to exist in the place where the program is looking for it.

I wouldn't need to see help for more information if the stupid program would just tell me where it was looking for the files. This is happening on my build server - the machine I'm setting up to monitor our subversion repository and republish this particular program every time someone checks something into the source tree. Up until now, this has been a manual process triggered by my colleague Karin using an existing NAnt script which I threw together a couple of months ago. What this basically means is that all the hard work has been done already, I just need to hook up the monitor part (CruiseControl.NET) to the source control part (Subversion) and the build automation part (NAnt) and the ClickOnce technology (MSBuild). But somewhere along the line, the build machine is configured differently to Karin's machine, and MSBuild can't find the prerequisite files.

No problem, easy to fix, right? I'll just copy the files off Karin's machine onto the build server, presumably to the same path (they're just install files for the .NET Framework 2.0 after all), and away we go. Not so. Apparently there's some other place on this particular machine that those files need to be for MSBuild to pick them up - which in theory I'm fine with. There could be any one of a number of reasons for this - the source files and the MSBuild files are on different drive letters, Windows Server 2003 vs. Windows XP, etc, and I don't really care where the files are stored on the build server, as long as I get my automated build published every time someone changes something.

What does bug me is that when MSBuild fails inside of my NAnt task, it fails with that wonderful error message that I copy/pasted above. This is literally a 5 minute fix if they just put all the locations where they search for the files into the error message. Then, all I have to do is go and paste the files into one of those locations and I'm good to go. Or, at least then I can start to hunt around and see where those locations are stored and try and figure out why they're different on the build server and Karin's machine. Instead, I've spent the last two hours moving files all over my hard drives, trying to get MSBuild and ClickOnce to pick them up, to no avail - and I'm still no closer to a solution than I was when I started.
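Here's roughly what I mean, as a minimal C++ sketch. I have no idea how MSBuild actually searches for prerequisites, so `locateFile` and `notFoundMessage` are hypothetical helpers and the search directories are whatever the caller passes in. The point is only that the code doing the search already holds the list of locations, so putting them into the message costs almost nothing.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Builds an error message that names every location that was searched.
std::string notFoundMessage(const std::string& fileName,
                            const std::vector<std::string>& searchedDirs) {
    std::string message = "Could not locate '" + fileName + "'. Searched:";
    for (const std::string& dir : searchedDirs) {
        message += "\n  " + dir;
    }
    return message;
}

// Returns the first path under searchDirs where fileName can be opened,
// or throws std::runtime_error listing everywhere it looked.
std::string locateFile(const std::string& fileName,
                       const std::vector<std::string>& searchDirs) {
    for (const std::string& dir : searchDirs) {
        const std::string candidate = dir + "/" + fileName;
        std::ifstream probe(candidate.c_str());
        if (probe.good()) return candidate;
    }
    throw std::runtime_error(notFoundMessage(fileName, searchDirs));
}
```

With a message like that, the fix is to drop the file into one of the listed directories, or at least to start asking why the list differs between two machines.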

So my advice to programmers: Don't be scared to put as much information as you have into error messages - especially stack traces if you have them - but any other useful information as well. At least if you provide a decent message, then a competent user (which I know probably only accounts for about 3% of all your users) can at least do basic troubleshooting himself, without having to either:

  • Trouble your tech support with a problem that they will probably be no better equipped to solve than he is (because tech support also might not know where this particular configuration is looking for files).
  • Spend hour after frustrating hour scouring the web for solutions - this is NOT a great way to build customer relations and loyalty.

As an aside (and proof that I'm not overly anti-Microsoft), the system we're currently replacing (that we wrote) has an error message which pops up when something goes wrong on login and it looks like this:

"INCORRECT PASSWORD NO RIGHTS OR DATE"

This is an exact quote, even down to SHOUTING AT OUR USERS in ALL-CAPS. What this excellent piece of information basically means is either you typed the password in incorrectly, you don't have rights to send in a login message, or the date you passed through in your login message doesn't correspond with the server's date. Why all three error conditions are reported together I'll never know, but clearly providing too much information is almost as bad as providing too little (at least the problem can be accurately troubleshot).

Every time this error comes up in our test environment, we need to open the management system and check the password, check the rights and then log into the server (which is Novell, not Windows) and check that the server's date is the same as the client's date. Why not, as the developer who wrote this code, just take 5 minutes out of your busy schedule to return three different error codes (don't even get me started on error codes vs. exceptions {I'm pro-exceptions}, but this is an old C system), for the three different errors? That simple, once-off 5 minute task would have saved 5 minutes of troubleshooting for probably 200 logins over the course of the last 2 years, which equates to around 1,000 minutes or almost 17 hours of time saved.
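A sketch of what those three separate results could look like is below, in C++ rather than the C of the old system. None of it comes from the real code: `LoginError`, `LoginAttempt`, `checkLogin` and `describe` are invented for illustration, and the order in which the checks run is an assumption.

```cpp
#include <string>

// One distinct result per failure, instead of a single combined message.
enum class LoginError { None, IncorrectPassword, NoRights, DateMismatch };

// Hypothetical summary of what the server knows about a login message.
struct LoginAttempt {
    bool passwordMatches;
    bool hasLoginRights;
    std::string clientDate;  // date sent in the login message
};

// Reports the first check that fails (the order is an assumption).
LoginError checkLogin(const LoginAttempt& attempt,
                      const std::string& serverDate) {
    if (!attempt.passwordMatches) return LoginError::IncorrectPassword;
    if (!attempt.hasLoginRights) return LoginError::NoRights;
    if (attempt.clientDate != serverDate) return LoginError::DateMismatch;
    return LoginError::None;
}

// Turns each result into a message the user can actually act on.
std::string describe(LoginError error) {
    switch (error) {
        case LoginError::None:              return "Login succeeded.";
        case LoginError::IncorrectPassword: return "Incorrect password.";
        case LoginError::NoRights:          return "This user has no rights to log in.";
        case LoginError::DateMismatch:      return "The date in the login message does not match the server's date.";
    }
    return "Unknown login error.";
}
```

With this in place, a tester who sees the date message goes straight to the server clock and doesn't need to open the management system at all.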

I attribute such techniques partly to laziness, but also partly to something I come across all the time - thinking strictly in terms of very short-term goals. "I have to get this login routine finished today" as opposed to "This needs to be as easy to use and troubleshoot as possible, because every single person who logs into the system in its entire lifetime is going to have to pass these checks, and if their login fails, it's going to frustrate every single one of them."

However, a discussion of short-term goals - of thinking too narrowly - is a discussion for another day.