Tuesday 22 November 2011

The power of scripting

Interesting problem last night: I wanted to find out how far it was
between a bunch of places. I had the location post codes buried away
in a stack of form letters in pdf form. So convert those to text,
open in emacs and extract the post codes. I couldn't get multi-line
replace-regexp to work, re-builder successfully highlighted the
line but replace-regexp failed to match the same regex. Oh well, I
just want to hack this out so keyboard macros did the job.

A quick google later and I came up with the static maps API
so threw the post codes at that to get back a JSON object with all the
distances. Then using json.el in ielm, convert that to a lisp
structure, pick that apart to extract the relevant fields and run my
calculations over that, plotting the results using R in org-mode.

We have some fantastic tools at our disposal: this is a fun game.

So, back to the multi-line regex that fails. I've got an input file in
the form:

a
1
b

a
2
b

and want to match the paragraph a..b Using re-builder I can construct
a regex that matches:

"a\\(.*
?\\)*?*b"

But using that regex in replace-regex results in no matches.
Digging in to this it turns out that, because I'm using replace-regexp
interactively the escaping rules are slightly different: I only need a
single backslash before the brackets.

When is a solution good enough? The keyboard macro approach did the
job quickly and easily. I then spent longer trying to debug the
multi-line regexp than it would have taken me to solve the original
problem by hand. But I learnt something in the process so surely that
is time well invested. Besides it was fun!

Friday 18 November 2011

On inheriting a code base

I've recently been given access to a new code base.  It represents a
reasonably successful product but there's something not quite right.
I spent a while browsing through the code trying to get a feel
of things.  The problem is that the code is adequate rather than
elegant. 

There's a fair amount of code duplication.  Not out and out cut and
paste (though there's a bit of that too), but similar functionality
acting on different variables or calling with slightly different
parameters.  It's all things that could be factored out but haven't
been.

Then there's the code style.  It's all incredibly defensive with lots
of checks for null values which are dutifully logged then the
application proceeds on its merry way.  So who's to know that
something went wrong?

Comments - well no not really. There's almost a complete absence of
comments, nothing to explain the "why" of what's going on.
So what's the reason for this?  The team who wrote the code are a
bright bunch and they've been doing this stuff for a while.  There's a
lot of  the other trappings of good code development --- unit tests,
continuous integration servers --- but the raw stuff of our craft,
the code, just isn't up to scratch.

I asked about code reviews.  They're on the list of things to set up
when the team have the time.  Unsurprisingly there's never time as
more features are requested.  And anyway who wants to go into a two
hour meeting to have their code picked apart by some smart alec who
thinks they can do better.

So what needs to be done to change this?  The goal must be to make the
developers proud of their code, happy to show it to all and sundry and
keen to receive feedback so that they can improve it. But at the
moment every one is sitting in their own cubical, beavering away on
the latest task passed down by the management. 

So here's what we're going to try:

- Pick a file that typifies the problem and set the team a challenge.
  What's the most elegant way we could write this code?
- Discuss what we mean by "elegant".  Obtain team agreement that these
  are our values.
- Institute light weight buddy code reviews of all check ins

A side thought: why is it that code reviewers always pick out the
negative?  There are a lot of studies singing the praises of positive
enforcement so why don't we try to praise the good bits rather than
pick holes in the bad?