Just A Summary

Piers Cawley Practices Punditry

A quick Javascript formatting tip

Posted by Piers Cawley Wed, 16 Apr 2008 09:01:00 GMT

IE’s a pain. The particular pain I want to write about is its pickiness about Javascript object literals. Consider the following Javascript object:

{ success: function () {...},
  failure: function () {...},
}

If you’re used to programming in Perl or Ruby, that trailing comma’s perfectly fine. In fact, leaving it there is sometimes considered good practice because it makes it easy to extend the hash: just add a new row and leave another trailing comma.

The trouble is, it’s not strictly legal to do that in Javascript. Pretty much every implementation out there will allow it though.

Except IE.

So, I’ve taken a leaf out of Damian Conway’s Perl Best Practices and started writing my object literals as:

{ success: function () {...}
, failure: function () {...} 
}

By sticking the comma at the beginning of the line, I’m never going to make an object that breaks IE, and adding a new line to the hash is straightforward too. Just stick the cursor in front of the }, type my leading comma, space, attribute name, and hit return when I’m finished.

I’ve also started using the same practice pretty much everywhere else that I’ve got a comma separated list of things:

var foo
  , bar
  , baz
  ;
$.each( anArray
      , function () { ... }
      );

It looks weird at first, but trust me, it grows on you.

Update

In the comments, I make reference to tweaking Steve Yegge’s excellent js2-mode to handle leading comma style a little more gracefully. Since then, I’ve made it work and attached a diff to this issue on the project’s issue tracker.

Code is data, and it always has been

Posted by Piers Cawley Mon, 07 Apr 2008 05:12:00 GMT

I’m just back from the first Scotland on Rails conference, and a jolly fine conference it was too. Much kudos is due to Alan, Graeme, Abdel and Paul. It was hard to believe that this was the first conference these guys have run and I think all my fellow delegates hope it won’t be the last. As I said in the Pub on Saturday night, I’d had a talk proposal knocked back and, in that situation, it’s terribly easy to find yourself sitting in a session thinking “Bloody hell, my talk would have been better than this!”, but not at this conference.

A phrase that cropped up a couple of times was the old saw that “Data == Code” – it’s very hard to avoid the idea once you start talking about code generation or Domain Specific Pidgins, parsing… I first came across the idea in The Structure And Interpretation of Computer Programs where it’s elegantly phrased as “Data is just dumb code, and code is just smart data”. Traditionally, the idea seems to be attributed to John McCarthy, the inventor of Lisp. But it’s older than that. Way older than that. The idea is actually older than Computer Science. It lies at the core of Turing’s original paper On Computable Numbers, with an Application to the Entscheidungsproblem, in which he invents computer science on the way to proving that it’s impossible to decide algorithmically whether a given statement of arithmetic is true or false.

In the course of the paper, Turing posits what has become known as the Halting Problem:

Given a description of a program and a finite input, decide whether the program finishes running or will run forever, given that input.

Turing’s proof runs something like this:

Suppose we have a subroutine halts?(code,data) which solves the halting problem. Let’s use that to write something like:

def counter_example(code)
  # halts? is the hypothetical subroutine we are assuming exists
  if halts?(code, code)
    loop { }   # spin forever
  else
    return
  end
end

counter_example(STDIN.read)

and ask the question “What happens when we run counter_example.rb < counter_example.rb”? If halts? reckons that counter_example would halt, given itself as input, then counter_example will enter an infinite loop, but if halts? reckons that it would enter an infinite loop, then it would halt. Which is a contradiction. Which means that there can be no subroutine halts?, which means that maths is hard enough to be interesting and occasionally undecidable.

Look at how the proof works – it’s built around the idea that code can be treated as data. In fact, you could say that the Turing Machine looks like it does because Turing was working backwards from this core idea to describe a sufficiently powerful machine that could obviously treat its own description as data. That certainly seems plausible when you compare the clarity of his proof that the halting problem is undecidable (given the idea of the universal Turing machine) with the contortions required to make mathematics swallow its own tail in similar fashion so that Gödel could prove his Incompleteness Theorem.

So, if you want to know who the idea that code is data is due to, the answer (as is so often the case in our field) is Turing.

Postscript

Incidentally, Turing is also responsible for the first ever bug – his original implementation of a Universal Turing Machine has a couple, one of which is probably a dumb typo (which even I could spot when I read the paper). Another is more subtle, but still fixable. Somewhat delightfully, a young grad student (Donald W Davies, who invented packet switching) spotted these bugs and told Turing:

I … found a number of quite bad programming errors, in effect, in the specification of the machine that he had written down, and I had worked out how to overcome these. I went along to tell him and I was rather cock-a-hoop … I thought he would say ‘Oh fine, I’ll send along an addendum.’ But in fact he was very annoyed, and pointed out furiously that really it didn’t matter, the thing was right in principle, and altogether I found him extremely touchy on this subject.

Nearly fifty years later Davies wrote and published a debugged version of the code, which you can find in The Essential Turing. One lesson to draw from the above is that getting annoyed at people pointing out trivial bugs in example code is also at least as old as computer science. Rather splendidly, there’s also a story of the chap who wrote the first ever assembler getting a serious telling off from Turing because the computer’s time was too valuable to waste it on converting symbols into machine code when humans were perfectly capable of doing it themselves. Who knows, maybe Turing’s contention was actually true back in the days of the Manchester Baby…

Javascript scoping makes my head hurt

Posted by Piers Cawley Thu, 20 Mar 2008 16:41:00 GMT

Who came up with the Javascript scoping rules? What were they smoking? Here’s some Noddy Perl that demonstrates what I’m on about:

my @subs; 
for my $i (0..4) {
  push @subs, sub { $i }
}

print $subs[0]->(); # => 0;

Here’s the equivalent (or what I thought should be the equivalent) in Javascript:

var subs = [];
for (var i in [0,1,2,3,4]) {
  subs[i] = function () {
    return i;
  }
}
alert(subs[0]()); // => 4

What’s going on? In Perl, $i is scoped to the for block. Essentially, each time through the loop, a new variable is created, so the generated closures all refer to different $is. In Javascript, i is scoped to the for loop’s containing function. Each of the generated closures refers to the same i. Which means that, to get the same effect as the Perl code, you must write:

var subs = [];
for (var shared_i in [0,1,2,3,4]) {
  (function (i) {
    subs[i] = function () {
      return i;
    };
  })(shared_i);
}
subs[0]() // => 0

Dodgy Ruby scoping

I had initially planned to write the example “How it should work” code in Ruby, but it turns out that Ruby’s for has the same problem:

subs = [];
for i in 0..4
  subs << lambda { i }
end
subs[0].call # => 4

Which is one reason why sensible Ruby programmers don’t use for. If I were writing the snippet in ‘real’ Ruby, I’d write:

subs = (0..4).collect { |i|
  lambda { i }
}
subs[0].call # => 0
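
To see the difference across the whole list rather than just the first element, here is a small sketch that puts the two styles side by side. The variable names shared and fresh are mine, purely for illustration:

```ruby
# Closures built inside `for` all close over the same `i`...
shared = []
for i in 0..4
  shared << lambda { i }
end

# ...while a block parameter is a fresh variable on every iteration.
fresh = (0..4).collect { |n| lambda { n } }

p shared.map { |s| s.call }  # => [4, 4, 4, 4, 4]
p fresh.map { |s| s.call }   # => [0, 1, 2, 3, 4]
```

Every closure in the first list sees the value i was left with when the loop finished, which is exactly the behaviour the Javascript version shows.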

My conclusion

Javascript is weird. Okay, so you already know this. In so many ways it’s a lovely language, but it does have some annoyingly odd corners.

Joined up thinking: why your resources want links

Posted by Piers Cawley Fri, 22 Feb 2008 08:26:00 GMT

Remember the good old days? The days before Google? The days before Altavista? The days when a 14k4bps modem was fast? Did I say good old days?

In those days, the web had to be discoverable ‘cos it sure as hell wasn’t searchable. The big, big enabling technology of the web was the humble <a href='http://somewhereelse.com'>Go somewhere else</a>. Placing the links right there in the body of the document turned out to be exactly the right thing to do.

And it continues to be the right thing to do. Consider the two pieces of YAML below the fold.

Here’s something close to what I’ve been generating at work (We generate JSON, but YAML’s easier to write out by hand):


location: http://site/users/pdcawley/tunes/99
name: Bill Norrie
creator:
  name: Piers Cawley
  location: http://site/users/pdcawley
duration: 360
image: http://s3.amazonaws.com/...
stream: http://s3.amazonaws.com/...

Here’s that same snippet, recast in the URL free style that’s common in ‘RESTful’ APIs:


id: 99
name: Bill Norrie
creator:
  name: Piers Cawley
  id: pdcawley
duration: 360
image: http://s3.amazonaws.com/...
stream: http://s3.amazonaws.com/...

Which of those would you rather have served up to your API client? Obviously, the one with the URLs because then your client has to know so much less. If resources carry links within their body they’re discoverable, just like anything else on the web. A client only needs to know about a few resource directories and maybe a mechanism for searching, and most of the rest should follow from the RESTful principles[1].

If you wanted to write a client to make use of resources in the second form, you’d need to know that a tune’s URL is of the form /users/creator.id/tunes/tune_id, that user urls have the form /users/creator.id and so on. You’d curse the API designer who designed that URL scheme. You’d curse again when, having written the Ruby API, you sat down to write the Javascript version (modern sites need their AJAX, right?) and had to teach it all about the URL scheme all over again.
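
A rough sketch of what that costs a client, using plain Ruby hashes to stand in for the two parsed representations above. The method names and the base argument are hypothetical; the URL templates are the ones described in the previous paragraph:

```ruby
# The two representations from above, trimmed down to the interesting fields.
linked = {
  "location" => "http://site/users/pdcawley/tunes/99",
  "creator"  => { "name" => "Piers Cawley", "location" => "http://site/users/pdcawley" }
}
id_only = {
  "id"      => 99,
  "creator" => { "name" => "Piers Cawley", "id" => "pdcawley" }
}

# With links, the client just reads the URL out of the representation.
def creator_url_from_links(tune)
  tune["creator"]["location"]
end

# With bare IDs, every client has to hard-code the server's URL scheme
# (and the host it lives on), in every language a client is written in.
def creator_url_from_ids(tune, base = "http://site")
  "#{base}/users/#{tune["creator"]["id"]}"
end

def tune_url_from_ids(tune, base = "http://site")
  "#{creator_url_from_ids(tune, base)}/tunes/#{tune["id"]}"
end

puts creator_url_from_links(linked)  # http://site/users/pdcawley
puts tune_url_from_ids(id_only)      # http://site/users/pdcawley/tunes/99
```

Both routes arrive at the same URLs, but only the second one breaks when the server’s URL scheme changes.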

Don’t Repeat Yourself

If there’s one thing worse than violating the DRY principle, it’s forcing other people to do it. Structuring your API’s representations around IDs rather than URIs is doing just that. The complexity that you dodge by punting URL generation to the user is multiplied by the number of API client implementations (and, arguably, the number of clients that don’t get written ‘cos your API’s a PITA).

It’s worse than that though. You already have the code for mapping between resources and URLs. If you’re using Ruby on Rails, your mapper is bidirectional too[2]. Generating URLs within your representations should be a snap[3]. So, what are you waiting for? Start writing Joined Up APIs.

1 For values of ‘the rest’ that are concerned with the how of the API. You’re still going to have to document the what and the why. You can get a long way down that road with careful use of <link rel="ServiceDoc" href="/servicedocs/model" /> though.

2 Catch me in the right mood and I can rant for ages about the awfulness of RoR’s routing system, but the bidirectional nature of its routes trumps almost all my complaints. It’s like Jamie Zawinski’s line that Java doesn’t have free. Everything else about the routing system can suck almost as hard as it likes, but two way routing is a win.

3 Not quite the snap it could be in Rails because Rails is of the opinion that URL generation is something the controller does. Which is fine as far as it goes until you find yourself writing Tune#to_json without a controller in sight and you can’t change to_json’s signature because it’s a standard method. And you just want to cry. Model.include ActionController::UrlWriter is wrong, but so tempting… In fact I’ve succumbed, just to get the test to pass. I’m in the process of refactoring by introducing a UrlPolicy singleton that will do all that stuff and at least isolate the bits where my models get to play like controllers.

Martin Fowler's big mouthful

Posted by Piers Cawley Fri, 18 Jan 2008 23:19:00 GMT

Martin Fowler is writing a book about Domain Specific Languages and, because you could never accuse Martin of a lack of ambition, he’s trying to write it in a reasonably (implementation) language agnostic fashion.

It’s fairly easy to write an implementation language agnostic book about old school DSLs, what used to be called little languages – there’s a fairly well established literature and theory to do with lexing, parsing and interpreting. These are all about algorithms, and algorithms are implementation language neutral by their very nature.

Where Martin has his work cut out for him is trying to talk about what he calls ‘internal DSLs’ and what I’ve been calling ‘pidgins’. These are the sorts of languages where you don’t write a lexer or parser but instead build a family of objects, methods, functions or whatever other bits and pieces your host language provides in order to create a part of your program that, while it is directly interpreted by the host language, feels like it’s written in some new dialect.

The Lisp family of languages can be said to be all about this. A good ‘bottom up’ lisp programmer will shape a language to fit the problem space, essentially building a new lisp which makes it easy to solve the problem at hand. Lisp’s minimal syntax, powerful macros and the way it blurs the boundary between code and data really support this style.

Once you move from Lisp to more ‘syntaxy’ languages, things get hairier. As Martin himself says

Another issue with book code is to beware of using obscure features of the language, where obscure means for my general reader rather than even someone fluent in the language I’m using. [...] this is much harder for a DSL book. Internal DSLs tend to rely on abusing the native syntax in order to get readability. Much of this abuse involves quirky corners of the language. Again I have to balance showing readable DSL code against wallowing in quirk.

He’s dead right. When I’m thinking about writing a pidgin in Ruby for instance, my first thought is usually to start with some kind of tabula rasa object which I can use to instance_eval a block. That lets me start to shape my language by lexically scoping the change:


in_pidgin do
  ...
end

But, though it’s easy to illustrate what I’d do with my tabula rasa, the implementation is somewhat tricky, and the tricks needed are unique to Ruby.
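
To give a flavour of the trick, here is a minimal sketch and not the implementation I’d actually ship. It assumes a Ruby that has BasicObject (1.9 or later) to play the part of the tabula rasa; the Pidgin class, its __calls__ accessor and the words used inside the block are all made up for illustration:

```ruby
# A nearly blank object: BasicObject has almost no methods of its own,
# so practically every bare word in the block ends up in method_missing.
class Pidgin < BasicObject
  def initialize
    @calls = []
  end

  # Record each unknown word and its arguments instead of raising.
  def method_missing(name, *args)
    @calls << [name, args]
    self
  end

  def __calls__
    @calls
  end
end

# Evaluate the block with a fresh Pidgin as self, then hand back
# whatever the block "said".
def in_pidgin(&block)
  pidgin = Pidgin.new
  pidgin.instance_eval(&block)
  pidgin.__calls__
end

calls = in_pidgin do
  column :title, "text"
  label "Title"
end
p calls  # => [[:column, [:title, "text"]], [:label, ["Title"]]]
```

The block is still plain Ruby, parsed by Ruby, but inside it self has changed, so column and label mean whatever the pidgin decides they mean. A real pidgin would do something more useful than recording the calls.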

That sort of construct’s not really available to someone trying to write a pidgin in Java or Perl. In Perl, there are other odd corners of the language that can be abused to good effect. Dynamic scoping can let you ‘inject’ methods into a block even though there’s no Perl equivalent to instance_eval, or you can do some quite staggering things with the otherwise really annoying Perl function prototypes. For instance, here’s part of a Jifty definition of a persistent object:


column title => 
       type is 'text',
       label is 'Title',
       default is 'Untitled post';

column body => 
       type is 'text',
       label is 'Content',
       render_as 'Textarea';

Doesn’t look much like Perl does it? But it’s parsed and executed by perl with no source filters or eval STRING in sight. And there’s no unsightly :symbols scattered about the place either come to that.

These things all work by making the language do something unexpected, and generally, the way to do that is by knowing your host language inside out and playing with it. One of Damian Conway’s more inspired moments in recent years was List::Maker, in which the good doctor managed to find a corner of Perl where he could wedge a proper old school, complete with full on parser to build the AST, Little Language right in the heart of Perl without it looking like he was taking a plain old string and interpreting it. So, having found this odd little corner, he proceeded to implement a remarkably neat tool for building complex lists that are beyond the capabilities of Perl’s .. operator.


@odds   = <1..100 : N % 2 != 0 >;

@primes = <3,5..99 : is_prime(N) >;

@available = <1..$max : !$allocated{N} >;

You may not think that’s all that sexy, but, and trust me on this, it’s just gorgeous. Yet more proof that Damian Conway is an (evil) genius.

Frankly, once you’ve seen the best of the pidgins available in Perl, some of the highly praised ‘DSLs’ in Ruby start to look a bit ordinary. Ruby makes a great deal of stuff that a pidgin breeder needs to do really easy. In Perl it’s often rather hard with a huge amount of hoopage to deal with. But some of the things that are hard in Perl are impossible in Ruby.

Anyhoo… coming back to my point. I do find myself wondering if Martin’s bitten off more than he can chew in attempting to write a book that covers implementing pidgins without getting bogged down in the nitty gritty of individual languages. The problem he’s facing is that different languages don’t just have different quirks, they have different idioms too. What reads naturally in the context of a Ruby program will read very weirdly in, say, Java or a Lisp. Any patterns of implementation beyond broad (but important) strokes like “Play to your host language’s strengths” will surely end up as language specific patterns. Designing and implementing a good pidgin is hard. Doing it effectively means getting down and dirty with your host language and its runtime structures. And that’s not the sort of thing you can cover effectively in a language agnostic book.

Martin, if you’re reading this, good luck. I think you’re going to need it. I look forward to being proved wrong.

Getting to grips with Javascript

Posted by Piers Cawley Fri, 18 Jan 2008 08:30:00 GMT

I’ve been busily adding AJAX features to the work website, and I got bored of writing Form handlers. I got especially bored of attaching similar form handlers to lots of different forms on a page, so I came up with something I could attach to document.body and then plug in handlers for different form types as I wrote them.

So, I wrote FormSender and set up my event handler like so:

FormSender.onSubmit = function (e) {
  if (canDispatch(e)) {
    YAHOO.util.Event.stopEvent(e);
    YAHOO.util.Connect.setForm(e.target);
    YAHOO.util.Connect.initHeader('Accept', 'application/javascript, application/xml');
    YAHOO.util.Connect.asyncRequest(e.target.method.toUpperCase(), e.target.action,
                                    callbackFor(e));
  }
};

jQuery(document.body).each(function () {
  YAHOO.util.Event.addListener(this,  "submit", FormSender.onSubmit);
});

I decided to mark any Ajax dispatchable forms using a class of ‘ajax’, the idea being that it would be a simple thing to check jQuery(e.target).hasClass('ajax'), but there was a snag. We had two sorts of forms on our pages, forms built using form_for(..., :class => 'ajax') and forms built using button_to(..., :class => 'ajax'), and they attached their classes in different places. In the form_for case, the class was on the form tag, but in the button_to case, it was on the generated form’s submit field. One option would be to monkey patch button_to, or roll my own ajax_button_to, but I ended up writing canDispatch like so:

function canDispatch(e) {
  return jQuery(e.target).find(':submit').andSelf().hasClass('ajax');
}

This uses jQuery to build a list of the form, and its submit button, and then checks to see if any member of that list has the class ‘ajax’.

So, we can now tell if the source of a submit event is a form we should be doing AJAX dispatch with. The next trick is to work out what needs to be done with the results of sending the form. One option is the Prototype trick of simply evaluating the returned javascript, but it often makes sense to keep the behaviour clientside and just have the server return a datastructure. I decided that the way to do this would be by adding a second class to a form which used a non-default handler, and then keep a hash of callback constructors keyed by class. This made callbackFor look like:

function callbackFor(e) {
  var candidates = candidateClasses(e.target);
  for (var i = 0; i < candidates.length; i++) {
    if (FormSender.callbacks[candidates[i]]) {
      return new FormSender.callbacks[candidates[i]](e);
    }
  }
  return new FormSender.callbacks.ajax(e);
}

candidateClasses is, again, a little more complex than I’d like, by virtue of the differences between button_to and form_for, but still reasonably straightforward, thanks to jQuery:

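function candidateClasses(element) {
  // The expression has to start on the same line as `return`; otherwise
  // automatic semicolon insertion makes the function return undefined.
  return jQuery(element).find(':submit').andSelf()
    .filter('.ajax').attr('className')
      .replace(/ajax/, '').trim().split(/ +/);
}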

jQuery gets the form and its submit button, then selects the tag that has the ‘ajax’ class and pulls out the full className string. The replace gets rid of ‘ajax’, trim chops any useless whitespace off either end, and split(/ +/) turns it into an array of classnames. The replace -> trim -> split pipeline has the feel of something that must already exist in some DOM interface somewhere, but I’m not sure where, so I rolled my own.

Once we have a list of classes it’s easy to just cycle through the candidates until we find one that matches a callback constructor, falling back to the default where nothing matches.

For completeness, I’ll show you my current default handler, which I expect to be extending to deal with a couple more media types and, in the case of the failure handler, more failure statuses.

FormSender.callbacks.ajax = function (e) {
    var form = e.target;
    this.scope = form;
};

FormSender.callbacks.ajax.prototype.success = function (o) {
    switch (o.getResponseHeader['Content-Type'].replace(/;.*/, '')) {
    case 'application/javascript':
    case 'application/x-javascript':
    case 'text/javascript':
        eval(o.responseText);
        break;
    default:
        YAHOO.log("Can't handle AJAX response of type " + o.getResponseHeader['Content-Type']);
    }
};

FormSender.callbacks.ajax.prototype.failure = function (o) {
    switch (o.status) {
    case 401:
        Authenticator.loginThenSubmit(this);
        break;
    default:
        switch (o.getResponseHeader['Content-Type'].replace(/;.*/, '')) {
        case 'application/javascript':
        case 'application/x-javascript':
        case 'text/javascript':
            eval(o.responseText);
            break;
        default:
            YAHOO.log("Can't handle AJAX failure response of type " + o.getResponseHeader['Content-Type']);
        }
    }
};

You’ll notice a reference to Authenticator.loginThenSubmit in the 401 handler, but that’s something I’ll save for another day.

A note on namespacing

Although I’ve been showing the various FormSender helper functions as if they were in the global namespace, in the real code they’re wrapped in a function call:

var FormSender = (function () {
  var candidateClasses = function (element) {...};
  var callbackFor = function (e) {...};
  ...
  var onSubmit = function (e) {...};
  return {onSubmit: onSubmit, callbacks: {}};
})();

I love the (function () {...})() pattern – it’s a great way of keeping your paws out of the global namespace until you really, really need to.

FormSender Benefits

Aside from the obvious benefit of drastically reducing the number of onSubmit event handlers registered with the browser, I found that using FormSender has simplified some of my response handlers. For instance, one form would get a chunk of html back from the server and would use that to replace the div that contained the form. But the new div also contained a form that needed to have Ajax behaviour, so a chunk of the handler code was concerned with reregistering onSubmit handlers for the new form (or forms). No fun. By switching to a single, body level, form handler, that problem simply disappears – so long as the new forms have the right class, they automatically get the appropriate behaviour. Result.

Obviously, FormSender is unobtrusive javascript, which is nice, and its pluggable nature means it’s easy to extend just by writing new response handlers and registering them with the FormSender object.

Future Directions

One obvious extension to FormSender is to pull out the meat of the onSubmit method into the callback object to allow for forms that don’t simply send themselves to the server. Another is to wrap my head around the workings of Javascript’s object model to make it easy to build handlers that don’t duplicate the behaviour of the default handler through the medium of copy and pasting code…

Your comments please?

I’m still very new to Javascript as a programming language and I’m sure I’m doing plenty of boneheaded things here. Please let me know if there are things I can do to improve this, or point me at any libraries that already cover this ground.

Comprehensible sorting in Ruby

Posted by Piers Cawley Sun, 16 Dec 2007 09:00:51 GMT

Here’s a problem I first came across when I was about 13 and helping do the stock check at the family firm. The parts department kept all their various spare parts in racks of parts bins. Each bin was ‘numbered’ with an alphanumeric id. We had printouts of all the bin numbers along with their expected contents and we’d go along the racks counting the bins’ contents and checking them off against the print out. What confused me at the time was the way the printouts were organized. Instead of the obvious ordering, “A1, A2, A3, ..., A99”, the lists were ordered like “A1, A10, A11, ..., A2, A20, A21, ...”. After a bit of thought I realised that the computer was sorting the numeric bits of the bin numbers as if they were just sequences of strange letters. A bit more thought made me realise why, post computerisation, people were starting to use bin numbers like “A01, A02, ...”. Computers were more important than people so, in order to make sorting things easier, just add spurious leading 0s to make the number field a fixed width and Robert’s your parent’s brother.

27 years later and computers are still crap at sorting things in a sensible fashion. Back before Moore’s Law was really kicking in, I suppose it was excusable, but surely we’ve moved past that now.

Over on the labnotes blog, there’s an example of some ruby code that attempts to do ‘human’ sorting:

module Enumerable
  def sensible_sort
    sort_by { |key| key.split(/(\d+)/).map { |v| v =~ /\d/ ? v.to_i : v } }
  end
end

It’s okay, as far as it goes. It certainly solves the parts bin problem I outlined above, but it’s not ideal. For example, you might expect ['-1', '1', '1.02', '1.1'].sensible_sort to leave the order unchanged, but what you actually get is ‘1, 1.1, 1.02, -1’: the minus sign is treated as text and the digits after the decimal point are compared as whole numbers. Not ideal. Let’s rewrite sensible sort as

module Enumerable
  def sensible_sort
    sort_by {|k| k.split(/([-+]?\d+(?:\.\d+)?(?:[-+]?[eE]\d+)?)/).map {|v| Float(v) rescue v}}
  end
end

That ugly regular expression should match a far wider selection of string representations of numbers. Certainly our ‘bad’ list is now sorted correctly.

But what about “a-1”, “a-2”. Using the implementation above, they’d get sorted as “a-2, a-1”, which can’t be right, can it? Let’s extend it a bit more and make sure we only worry about the ’+’ and ’-’ if they’re at the beginning of a line or preceded by whitespace.

module Enumerable
  def sensible_sort
    sort_by {|k| k.to_s.split(/((?:(?:^|\s)[-+])?\d+(?:\.\d+)?(?:[eE]\d+)?)/ms).map {|v| Float(v) rescue v}}
  end
end

And that works fine, until you find that “B” sorts before “a”. Let’s catch that as well:

module Enumerable
  def sensible_sort
    sort_by {|k| k.to_s.split(/((?:(?:^|\s)[-+])?\d+(?:\.\d+)?(?:[eE]\d+)?)/ms).map {|v| Float(v) rescue v.downcase}}
  end
end

Yay!

Oh, wait a minute, what about version numbers? How should we sort, say “perl 5.8.0” and “perl 5.10.0”? The 5.8.0 form should definitely come first… Hmm…

How about

module Enumerable
  def sensible_sort
   sort_by {|k| k.to_s.split(/((?:(?:^|\s)[-+])?\d+(?:\.\d+?(?:[eE]\d+)?(?:$|(?![eE\.])))?)/ms).map {|v| Float(v) rescue v.downcase}}
  end
end

How far down does this thing go?

I just noticed that ”.1” sorts after “1”. Time for another tweak…

module Enumerable
  def sensible_sort
    sort_by {|k| k.to_s.split(/((?:(?:^|\s)[-+])?(?:\.\d+|\d+(?:\.\d+?(?:[eE]\d+)?(?:$|(?![eE\.])))?))/ms).map {|v| Float(v) rescue v.downcase}}
  end
end

but that doesn’t work with version numbers like ”.8.2”, ”.10.2”...

Time passes… Thorin sits down and sings about gold

I was planning on giving an extension of the regex that caught this issue as well, but I’m afraid I’ve stumped myself – I can’t do it with a single regular expression unless I can use a fixed width lookbehind assertion, but they’re only available in Perl. Of course, it’s still possible to fix it, but doing so will take more thought than I have available to me at this time on a Sunday morning. And all this is before we get onto making sure that “1/2” sorts between “0” and “1”. And phone numbers. After all, “01915551238” is ‘obviously’ the same as “0191 555 1238” and “0191 555-1238”, so they should end up next to each other in the sorted list.
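
One way to make the thing less fragile is to stop asking a single regular expression to do everything: split the string into digit runs and everything else, then tag each piece so that numbers are only ever compared with numbers. The sketch below is deliberately modest. It handles the parts bins and the version numbers, but it treats every digit run as a whole number, so it makes no attempt at signs, decimals, fractions or phone numbers. The name natural_key is mine, not something from a library:

```ruby
# Build a sort key of [tag, value] pairs. Digit runs get tag 0 and are
# compared as integers; everything else gets tag 1 and is compared as
# downcased text. Because the tag is compared first, an Integer is never
# compared with a String.
def natural_key(string)
  string.to_s.downcase.scan(/\d+|\D+/).map do |token|
    token =~ /\A\d+\z/ ? [0, token.to_i] : [1, token]
  end
end

bins = %w[A10 A2 a1 A21 B1]
p bins.sort_by { |b| natural_key(b) }
# => ["a1", "A2", "A10", "A21", "B1"]

versions = ["perl 5.10.0", "perl 5.8.0"]
p versions.sort_by { |v| natural_key(v) }
# => ["perl 5.8.0", "perl 5.10.0"]
```

The trade-off is the one from the start of this post: with this key, “1.1” sorts before “1.02”, because the digits after the dot are compared as the integers 1 and 2. Getting decimals and version numbers right at the same time still needs the extra thought.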

It looks like this is a ‘three pipe problem’ after all. I shall probably return to this…

Can I get a witness?

Posted by Piers Cawley Mon, 19 Nov 2007 19:17:21 GMT

Worrying about test coverage when you’re doing Test- or Behaviour-driven development is like worrying about the price of fish in Zimbabwe when you’re flying a kite.

Your tests are there to help you discover your interface and to provide you with an ongoing stream of small bugs to fix. If you write them cleanly, and keep them well factored (you are refactoring your tests, aren’t you?) they will help to document your intent too. Ensuring that every code path is exercised might be intellectually satisfying but that satisfaction costs time, and time is money. And that’s before you start worrying about your code’s malleability. Cover the happy path and the edge cases you know how to deal with and move on. If you’ve screwed something up, it will get found during acceptance testing (or out in the wild) and you can write a few more tests to isolate the problem, fix it, and move on.

If you deliberately add a test that passes without requiring you to write another line of code, ask yourself why you bothered with it. The test isn’t isolating a bug or specifying new behaviour. If you’re lucky, it merely confirms something you already know, and if you’re unlucky, you just introduced a bug in your test suite. Better to move on and either pull a new bug off the queue, or turn a feature request into a new bug – “Feature X doesn’t work!” – fix it, refactor and move on to the next. Further down the road, you may discover that the feature you were about to exhaustively test doesn’t even need to be there, or maybe it works differently than you expected. Aren’t you glad you didn’t worry about 100% coverage then? Maybe you will need to revisit the tests and cover more cases in the future. But future you knows more about the problem domain and can do a better job of working out what the behaviour needs to be. And if future you is likely to do a worse job than you right now, you may have bigger problems than the code to worry about.

Today's Noun is: Reticence

Posted by Piers Cawley Thu, 27 Sep 2007 23:30:29 GMT

What does the OED say reticence is?

Reticence: Maintenance of silence; avoidance of saying too much or of speaking freely; disposition to say little.

Pretty straightforward. When I chose reticence as one of my five nouns for programmers it was another reminder that objects are not the same as datastructures. Well designed objects keep their cards (instance variables) close to their chest. Client code tells objects what to do, it doesn’t ask them to kiss and tell. In Smalltalk Best Practice Patterns, (you don’t have a copy yet? Are you mad?) Kent Beck recommends that you put your accessor methods in the private protocol unless you have a very good reason for indicating that the accessors should be used by clients by putting the accessors in, say, the accessing protocol or some other, more suitably named, protocol. (In Smalltalk, all methods are public, but you can and should organize them into protocols/categories, either by choosing some existing protocol, or coming up with a new protocol name). In less flexible OO languages, you should probably at least mark your accessors as protected unless you have the aforementioned very good reason.

What about parameter objects then?

One obvious ‘exception’ to the rule of reticence is the parameter object. Say you have an unwieldy method that takes a bunch of arguments and fills the screen with its implementation. Being a conscientious programmer, you want to apply the ‘composed method’ pattern to the method so you’ll still end up with a screen’s worth of code, possibly more, but the individual methods will be much more focussed in what they achieve. What stops you is that bunch of parameters which are used throughout the method. So, you introduce a parameter object. Bundle up the parameters into a single object, replace each parameter variable in the body of the code with an accessor call on the parameter object, and have at it. You can extract methods easily and you’ll only ever have to pass a single object.

It’s tempting not to bother creating a ‘real’ object, it’s very easy to just use an options hash, a la Ruby on Rails. And, because the options hash is a common pattern through your code, you can start adding helper methods like with_options, and you’ll get an awful lot of leverage out of it.

However, what’s often missed in discussions of the parameter object pattern is that it’s a waystation, not a destination. It may be handy, but if you persist in treating it as a datastructure, you’re missing out on the good stuff. Now you have a parameter object, you can start to move behaviour onto the parameter object and before you know it you’ll end up with a real object doing real, testable (mockable?), work.
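As a rough sketch of where that journey might end up, here is a paging example. PageRequest and fetch_page are hypothetical names, not taken from any real codebase. The options hash is still accepted at the edge, but the arithmetic that callers used to do by picking values out of it has moved onto the object.

```ruby
# Hypothetical names throughout: PageRequest and fetch_page are illustrative only.
class PageRequest
  def initialize(options = {})
    @page     = options.fetch(:page, 1)
    @per_page = options.fetch(:per_page, 10)
  end

  # Behaviour that used to live in the caller, digging values out of the hash.
  def slice(items)
    items[offset, @per_page] || []
  end

  def last_page?(items)
    offset + @per_page >= items.size
  end

  private

  # Nobody outside needs to know how the offset is worked out.
  def offset
    (@page - 1) * @per_page
  end
end

# The method that once took a bunch of paging arguments now takes one object
# and tells it what to do.
def fetch_page(items, request)
  request.slice(items)
end
```

Notice that PageRequest has no public accessors at all, and that slice and last_page? can be tested without going anywhere near fetch_page.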

Homework

Find a method where you’re using an options hash. Try using that hash to build a parameter object and then apply the principle of reticence. What happens to it? Where does the behaviour go? If, like me, you’ve come to OO from procedural coding styles, making your objects reticent is not a natural thing to do, it can feel extremely odd. But the more you practice it, the better you’ll get at thinking in a truly object oriented fashion. Sometimes an options hash really is the way to go, but not as often as you’d think.

Try it, maybe you’ll be converted.

My head hurts 3

Posted by Piers Cawley Sun, 23 Sep 2007 19:09:24 GMT

During DHH’s keynote at RailsConf Europe it was apparent that there’s a great deal to like in edge rails, so I thought I’d have a crack at getting Typo up on it.

Ow.

I’d expected the pain points to be related to routing, but it seems that the Rails routing system is approaching the level of the Excel calculation engine – nobody dares touch it for fear of breaking things, so Typo’s custom routes seemed to work quite happily. There were a few things that have been deprecated, pluginized or moved out of the set of modules that’s automatically included when you do a rake rails:freeze:edge, but they were pretty easy to sort – the deprecation messages are a good deal more informative now than they were last time I went deprecation squashing. There’s a surprising amount of stuff that’s been removed without any deprecation warnings though, which isn’t very sporting. DHH said there would likely be a 1.2.4 release (possibly a day before 2.0) with a bunch more deprecation warnings covering everything that’s actually going away, so if you’re thinking of moving a maturish app to Rails 2.0 it might make sense to wait for 1.2.4, install that, squash warnings, and move on up to 2.0.

The real pain comes from themes. Typo’s themes rely on Rails internals working in a particular way, but they don’t work like that any more. In theory, the internals appear to be more theme friendly, related to allowing plugins to include views. The problem is that it’s possible to change Typo’s theme without restarting the server, and the new themish internals don’t expect anything to change until the server’s restarted.

So, I’ve been playing with plugins. The most promising approach appears to be that of the themer plugin, which gets pretty close to doing what we need, and does it in a way that seems like it should work with both 1.2.3 and Edge Rails. It does appear to be making some radically different assumptions about the structure of the themes directory, but the basic framework is good and I should be able to make things work by making our current theme object conform to Themer::Base’s interface and duck type my way to the sunny uplands of Edge Rails compatibility.

Which will be nice.

I like the themer approach a lot. Instead of monkeying about in the guts of rails, it monkeys about in front of Rails. It overrides render so that you can pass it a theme/lookup object. If it sees a lookup object, it uses that to rewrite the rest of the render arguments into a form that will render the right thing using the standard implementation of render. In a work project I’ve taken a similar approach to handling polymorphic routes for things like:
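To make the ‘in front of Rails’ idea concrete, here is a toy version in plain Ruby. It is not the themer plugin’s code or its API: Renderer stands in for the framework, Theme is a made-up lookup object, and I’ve used Module#prepend to do the wrapping so the sketch runs without Rails.

```ruby
# A stand-in for the framework's renderer; this is not Rails' real render.
class Renderer
  def render(options)
    "rendering #{options[:file]}"
  end
end

# Hypothetical lookup object: it knows where a theme keeps its templates.
class Theme
  def initialize(name)
    @name = name
  end

  def path_for(template)
    "themes/#{@name}/views/#{template}"
  end
end

# Sits in front of render: rewrites the arguments, then defers to the original.
module ThemedRender
  def render(theme_or_options, options = nil)
    # No lookup object? Behave exactly like the standard render.
    return super(theme_or_options) unless theme_or_options.is_a?(Theme)

    # Otherwise rewrite the arguments into a form the standard render understands.
    super(options.merge(file: theme_or_options.path_for(options[:file])))
  end
end

Renderer.prepend(ThemedRender)
```

Because the lookup object is passed in on every call, nothing is cached at startup, so in this sketch swapping themes is just a matter of passing a different Theme.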

map.resources :pictures do |pics|
  pics.resources :comments
end

map.resources :users do |users|
  users.resources :comments
end

I ended up with a to_params method defined on my Comment model, and stuck an extended url_for in front of the default Rails version, which looks something like:

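def url_for_with_to_params(*arguments)
  if arguments[0].respond_to?(:to_params)
    # with_options merges the object's params into the options of the wrapped call
    with_options(arguments.shift.to_params) do |mapper|
      mapper.url_for_without_to_params(*arguments)
    end
  else
    # Nothing special about the first argument, so defer to the standard url_for
    url_for_without_to_params(*arguments)
  end
end
# Renames the original to url_for_without_to_params and installs ours as url_for
alias_method_chain :url_for, :to_params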

Which is so much neater than the last time I attacked this particular problem (see the acts_as_resource plugin).
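For anyone wondering what to_params might return, here is a self-contained toy. Everything in it is a stand-in: UrlWriter is not Rails’ url_for, the shape of the hash is my assumption for the purposes of the sketch, and I’ve used Module#prepend and Hash#merge in place of alias_method_chain and with_options so that it runs without ActiveSupport.

```ruby
# Stand-ins only: these are not real models, and UrlWriter is not Rails' url_for.
Picture = Struct.new(:id)
User    = Struct.new(:id)

class Comment
  def initialize(id, parent)
    @id = id
    @parent = parent
  end

  # Assumed shape: the routing options that identify this comment beneath
  # whichever kind of parent owns it.
  def to_params
    parent_key = :"#{@parent.class.name.downcase}_id"
    { :controller => "comments", :id => @id, parent_key => @parent.id }
  end
end

class UrlWriter
  # Toy url_for: just shows which options it was handed.
  def url_for(options)
    options.sort_by { |key, _| key.to_s }.map { |key, value| "#{key}=#{value}" }.join("&")
  end
end

module UrlForWithToParams
  def url_for(*arguments)
    return super unless arguments.first.respond_to?(:to_params)

    # Use the object's params as defaults; explicit options win.
    defaults = arguments.shift.to_params
    super(defaults.merge(arguments.first || {}))
  end
end

UrlWriter.prepend(UrlForWithToParams)
```

The caller never has to know whether a comment hangs off a picture or a user; it hands over the comment and lets it describe itself.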

One of the nice things about Rails is that, although it’s opinionated and somewhat liberal with the syntactic vinegar for things the core team don’t think are the Right Way, they’re pretty good at leaving the door open for people like me who have other opinions. Both the themer plugin and my as yet unpluginized extension of url_for work by using existing capabilities in new ways and, because those capabilities are documented, we can expect them to continue to work over multiple versions of Rails. Plugins that achieve similar effects by monkeying with Rails’s internal interfaces are hostages to fortune. Internal interfaces are free to change at any time, even between point releases, so a plugin can be left high and dry with surprising rapidity. Just ask the Rails Engines folk.
