Just A Summary

Piers Cawley Practices Punditry

Baby's first screencast

Posted by Piers Cawley Fri, 14 Mar 2008 19:29:00 GMT

If you follow the Ruby blogs, you will probably have seen a bunch of programmers attempting to do something akin to Haskell’s Maybe, or the Objective-C style message-eating nil.

Generally, by about the 3rd time you’ve written

if foo.nil? ? nil : foo.bar 
  ...
end

you’re getting pretty tired of it. Especially when foo is a variable you’ve had to introduce solely to hold a value while you check that it’s not nil. The pain really kicks in when you really want to call foo.bar.baz. You can end up writing monstrosities like (tmp = foo.nil? ? nil : foo.bar).nil? ? nil : tmp.baz (actually, if you were to write that in production code, you probably have bigger problems). One option is to just define NilClass#method_missing to behave like its Objective C equivalent, but I’ve never quite had the nerve to find out how that might work. I wanted to write

if maybe { foo.bar.baz }
  ...
end

and have nil behave like an Objective C nil for the duration of the block, but no longer. So I wrote it. Then I thought about how to present it. I wrote the thing test first using rspec and the whole thing just flowed, but writing up a test first development process for a blog entry is painful, so I’ve made a very rough (but blessedly short) screencast of the process instead.

That’s a slightly reduced thumbnail, the movie is substantially more readable. The bottom pane of the window is the output of autotest rerunning the spec every time either the spec or the implementation changes. The top pane alternates between the specs and the implementation. Generally, every time I edit the specs, a test starts failing and every time I edit the implementation it starts passing again. In the (any) real coding run, there were of course false starts, but generally the specs kept me pretty straight.

A word or two of warning: This is a completely unedited, silent, screen cast, there are typos, backtrackings and other embarrassments. I stopped recording once I’d got 4 tests passing, but this is far from release quality (it’s perfectly usable if you know its limitations, but it’s not entirely robust).

Please let me know what you think of this. I’m aiming to make a more polished version, complete with voice over and it would be good to know which bits are confusing and need addressing in more detail in the voice over.
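For anyone who can't watch the movie, here's a minimal sketch of one way a maybe block could work. It is not the code from the screencast: the Maybe module and its depth counter are names made up for illustration, and the whole thing assumes a single thread, because it patches NilClass globally while the block runs.

```ruby
# Hypothetical helper: counts how many maybe blocks are currently open,
# so that a nested call doesn't remove the patch too early.
module Maybe
  @depth = 0

  class << self
    attr_accessor :depth
  end
end

def maybe
  Maybe.depth += 1
  if Maybe.depth == 1
    # Outermost block: make nil swallow any message it doesn't understand.
    NilClass.class_eval do
      def method_missing(*_args, &_block)
        nil
      end
    end
  end
  yield
ensure
  Maybe.depth -= 1
  # Leaving the outermost block: restore nil's normal behaviour.
  NilClass.send(:remove_method, :method_missing) if Maybe.depth.zero?
end

Foo = Struct.new(:bar)

# Foo#bar returns nil here, and nil.baz quietly returns nil inside the block.
maybe { Foo.new(nil).bar.baz }
```

Outside the block, nil.baz raises NoMethodError as usual. Anything else that runs inside the block sees the patched nil too, which is one of the limitations to keep in mind.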

I am not a rock star

Posted by Piers Cawley Sat, 23 Feb 2008 06:13:00 GMT

I am not a rock star. I am a computer programmer. I think I’m quite a good one.

You are not a rock star either.

387,000 matches to that query. Can we all just… I don’t know… grow up please?

Mutter… grumble… chunter… I’m 40 you know!

Updates

I have it on reliable authority that James O’Kelly is a Ruby on Rails Rockstar that would make a great addition to any team!

Typo 5 is out - and more on the future

Posted by Piers Cawley Sun, 30 Dec 2007 15:10:18 GMT

Right, we’ve cut a Typo 5 gem and it’s on rubyforge and heading to various mirrors I hope. Frédéric’s writing the release notification which will be appearing on Typosphere Real Soon Now.

It’s been a surprisingly tricky process – we’re now requiring Rails 2.0.2 because the workings of view_paths have changed in a way which means we can’t quite make themes with Rails 2.0 and 2.0.2 and working with the edge seems like the more sensible proposition. If you’re on the bleeding edge, you should find that you get the right Rails via svn:externals anyway.

Typo futures

Meanwhile, I’ve been playing with stet and I’ve come to the conclusion that, although there’s mileage to be had in a radically slimmed down approach to the way Typo works, I’m better off simply removing the misfeatures from Typo and building from there – there’s a surprising amount of stuff that needs to be done in a competent blogging engine that Typo gets right – starting again would be throwing the baby out with the bathwater I think.

However, this does mean that if you’re following the Typo SVN trunk, you’ll be seeing a reduction in features in the short term. We’ll be copying the current trunk to a 5-0-stable branch before we start with the featurectomies though, so if you’re just after bugfixes, you’ll be better off there.

Multiblogging

We’re aiming to have multiblogging in the next release, but we’re rethinking the how of it. Right now, the ‘Blog’ object adds a bunch of complexity to code that would be much happier simply assuming that it has the database to itself. So we’re going to look at switching to a database per blog approach, that way our core code can pretty much forget about the complexities of multiblogging, and (at least initially) anyone who wants multiblogging can get there by monkeying with configuration files – of course, we intend to add a web based admin interface once things settle down and we know how things are going to work.

Caching

Caching is always a bugbear in any Typo installation. Because we want to be installable on the widest possible range of hosts, we can’t rely on the presence of handy tools like ‘memcached’. Also, some of our users are operating under some fairly severe memory and process constraints, so it makes sense to have the webserver serve static files as much as possible. Meanwhile, tools like Evan Weaver’s Interlock are pointing the way towards seriously effective fragment caching. I shall be looking into implementing something that conforms to the Interlock interface, but which can use an arbitrary cache backing store for fragments and maintain a full page cache. It’ll be interesting to find out if this is doable…

Atom Publishing Protocol

ActionWebService is going to go away – it’s already in the ousted branch of the rails SVN repository, and including it in Typo to support the various different admin APIs is getting painful. So, we’re going to preempt it. We won’t be getting rid of the various XMLRPC APIs until the pain becomes too great, but we are going to be concentrating on implementing, and strongly favouring, the Atom Publishing Protocol.

Feeds for everything

In particular, we’ll be adding atom feeds for all sorts of administrative data as a means of enabling people to write external tools for, say, spam protection, comment moderation and notification tasks. Right now, there’s a great deal of computation happening on the server side every time someone, say, comments on a post – in the kind of resource limited environments some people are running Typo in, that’s too much work. Switching to a feed + APP approach should help enormously with resource utilization.

Speaking of resources…

Using the server to render article previews is… suboptimal. Expect to see a javascript based preview system akin to the one I use for comments here.

Rails 2.0 and the Future of Typo

Posted by Piers Cawley Sun, 16 Dec 2007 16:35:00 GMT

So, if you’ve been watching the Typo tree, you’ll see there’s been a fair amount of activity on it since Rails 2.0 got released. There’s a new default theme replacing the rather creaky ‘azure’, and a fair amount of work on getting our code compatible with the current state of Rails. As we work on this, it becomes apparent that Typo’s code is getting horribly brittle. I have said before that there’s been several places where we’ve zigged before Rails zagged, and we’re paying the price for that. It doesn’t help that our test coverage is distinctly ropy either – and I’m probably guiltier than most for letting things get into that state.

So, our goal is to get what we have cleaned up and working with Rails 2 before releasing Typo 5.0. Once that’s done, that line of code will go into maintenance mode – there are still plenty of bugs to fix and documentation to write, but I’m afraid that extending that base is becoming too much of a chore.

Which is why I have a new path in my local svk repository, //stet. I’m using this for experimental development of a new, slimmed down blogging engine that will be, first and foremost, a capable Atom Publishing Protocol host. Things like spam processing will be removed from the core of the application, but we’ll provide a suite of webservice clients that will consume the ‘unmoderated feedback’ webfeed and use APP to either approve or delete the feedback as appropriate.

Theming (at least initially) will probably be confined to Javascript and CSS changes, and I’m even thinking of exposing the sidebars as Atom collections – certainly I expect that, in the first cut, sidebars will be static – if you want content that looks dynamic you’ll have to do it via javascript.

My initial goal is to slim things down as far as I possibly can – I want to build a blogging engine that can cope with the tight memory constraints of shared hosting by off loading much of the heavy lifting to client boxes. After all, I have far more processing capability available to me on the laptop I’m typing this on than the slice of Site5’s hosting infrastructure that’s actually running the blog. By making things small and static, I also hope to wring good performance numbers out of the tool as well – expect aggressive page caching at the very least.

Another important goal is easy migration of Typo databases. I expect to be writing models and controllers from the ground up, but converting the database should just be a matter of running a migration.

Experimental

Of course, stet’s currently very experimental – about the only things actually written so far are a couple of routing plugins which should help radically simplify our routes.rb (expect an article’s url to change from /articles/2007/12/16/rails-20-and-the-future-of-typo to /article/2007/12/16/rails-20-and-the-future-of-typo, but with a redirect in place to cater for the old style urls). I may have grandish plans for the thing, but I could equally discover that I’m off up a blind alley, in which case you can expect me to return to the current Typo codebase with a few more lessons learned.

ActiveResource?

I remain unconvinced by ActiveResource as a technology. I agree with the authors of RESTful Webservices – good webservices are joined up. They take full advantage of what could be described as the defining technology of the world wide web, the URL based hyperlink to knit resources together in a discoverable fashion. An ActiveResource based webservice may well be a good HTTP citizen, but it’s still not really ‘webby’ enough for my taste. Which means the Atom Publishing Protocol will remain my friend for most of the things I hope to do with stet. It may be harder to write a good APP server, but I’m convinced that it’s a much better interface for clients, and you should always favour ease of use over ease of implementation. If nothing else, we’re aiming to have more users than developers. Many more.

Comprehensible sorting in Ruby

Posted by Piers Cawley Sun, 16 Dec 2007 09:00:51 GMT

Here’s a problem I first came across when I was about 13 and helping do the stock check at the family firm. The parts department kept all their various spare parts in racks of parts bins. Each bin was ‘numbered’ with an alphanumeric id. We had printouts of all the bin numbers along with their expected contents and we’d go along the racks counting the bins’ contents and checking them off against the printout. What confused me at the time was the way the printouts were organized. Instead of the obvious ordering, “A1, A2, A3, ..., A99”, the lists were ordered like “A1, A10, A11, ..., A2, A20, A21, ...”. After a bit of thought I realised that the computer was sorting the numeric bits of the bin numbers as if they were just sequences of strange letters. A bit more thought made me realise why, post computerisation, people were starting to use bin numbers like “A01, A02, ...”. Computers were more important than people so, in order to make sorting things easier, just add spurious leading 0s to make the number field a fixed width and Robert’s your parent’s brother.
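To make the printout's behaviour concrete, here's a small sketch (the bin ids are invented for illustration). A plain string sort compares character by character, so "A10" lands before "A2"; zero padding is the workaround people ended up adopting.

```ruby
bins = %w[A1 A2 A3 A10 A11 A20 A21]

# Plain string comparison goes character by character:
# "A10" < "A2" because "1" < "2" at the second character.
printout_order = bins.sort

# The workaround: pad the numeric part to a fixed width before sorting.
padded_order = bins.map { |id| id.sub(/\d+/) { |digits| digits.rjust(2, "0") } }.sort

puts printout_order.join(", ")  # A1, A10, A11, A2, A20, A21, A3
puts padded_order.join(", ")    # A01, A02, A03, A10, A11, A20, A21
```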

27 years later and computers are still crap at sorting things in a sensible fashion. Back before Moore’s Law was really kicking in, I suppose it was excusable, but surely we’ve moved past that now.

Over on the labnotes blog, there’s an example of some ruby code that attempts to do ‘human’ sorting:

module Enumerable
  def sensible_sort
    sort_by { |key| key.split(/(\d+)/).map { |v| v =~ /\d/ ? v.to_i : v } }
  end
end

It’s okay, as far as it goes. It certainly solves the parts bin problem I outlined above, but it’s not ideal. For example, you might expect ['-1', '1', '1.02', '1.1'].sensible_sort to leave the order unchanged, but what you actually get is ‘1, 1.1, 1.02, -1’: the minus sign is treated as text, and the digits after the decimal point are compared as the integers 1 and 2. Let’s rewrite sensible_sort as

module Enumerable
  def sensible_sort
    sort_by {|k| k.split(/([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)/).map {|v| Float(v) rescue v}}
  end
end

That ugly regular expression should match a far wider selection of string representations of numbers. Certainly our ‘bad’ list is now sorted correctly.
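As a quick check, here's a sketch that runs both versions over the 'bad' list. They're written as lambdas with made-up names (digits_only and with_floats) purely so that both can live in one script, and the second reads the exponent as [eE][-+]?\d+, which is an assumption about what the pattern is meant to accept.

```ruby
# Version 1: only runs of digits are treated as numbers.
digits_only = lambda do |list|
  list.sort_by { |key| key.split(/(\d+)/).map { |v| v =~ /\d/ ? v.to_i : v } }
end

# Version 2: signed decimals (and exponents) are converted with Float.
with_floats = lambda do |list|
  list.sort_by do |key|
    key.split(/([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)/).map { |v| Float(v) rescue v }
  end
end

bad = %w[1.1 -1 1.02 1]

p digits_only.call(bad).last  # "-1": the minus sign is treated as text, so it sorts last
p with_floats.call(bad)       # ["-1", "1", "1.02", "1.1"]
```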

But what about “a-1”, “a-2”? Using the implementation above, they’d get sorted as “a-2, a-1”, which can’t be right, can it? Let’s extend it a bit more and make sure we only worry about the ‘+’ and ‘-’ if they’re at the beginning of a line or preceded by whitespace.

module Enumerable
  def sensible_sort
    sort_by {|k| k.to_s.split(/((?:(?:^|\s)[-+])?\d+(?:\.\d+)?(?:[eE]\d+)?)/).map {|v| Float(v) rescue v}}
  end
end

And that works fine, until you find that “B” sorts before “a”. Let’s catch that as well:

module Enumerable
  def sensible_sort
    sort_by {|k| k.to_s.split(/((?:(?:^|\s)[-+])?\d+(?:\.\d+)?(?:[eE]\d+)?)/).map {|v| Float(v) rescue v.downcase}}
  end
end

Yay!

Oh, wait a minute, what about version numbers? How should we sort, say “perl 5.8.0” and “perl 5.10.0”? The 5.8.0 form should definitely come first… Hmm…
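The trouble is easy to reproduce. In this sketch the sort key is the pattern from the previous version, wrapped in a method with a made-up name (float_key). Gem::Version, which ships with RubyGems, is used only as an independent check of the order we want, not as part of sensible_sort.

```ruby
require "rubygems"

# Hypothetical helper wrapping the sort key from the previous version.
def float_key(string)
  string.split(/((?:(?:^|\s)[-+])?\d+(?:\.\d+)?(?:[eE]\d+)?)/)
        .map { |v| Float(v) rescue v.downcase }
end

versions = ["perl 5.8.0", "perl 5.10.0"]

p float_key("perl 5.8.0")   # ["perl ", 5.8, ".", 0.0]
p float_key("perl 5.10.0")  # ["perl ", 5.1, ".", 0.0], because "5.10" has become 5.1

by_float_key = versions.sort_by { |v| float_key(v) }
by_version   = versions.sort_by { |v| Gem::Version.new(v[/\d+(\.\d+)*/]) }

p by_float_key  # ["perl 5.10.0", "perl 5.8.0"]
p by_version    # ["perl 5.8.0", "perl 5.10.0"]
```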

How about

module Enumerable
  def sensible_sort
    sort_by {|k| k.to_s.split(/((?:(?:^|\s)[-+])?\d+(?:\.\d+?(?:[eE]\d+)?(?:$|(?![eE\.])))?)/).map {|v| Float(v) rescue v.downcase}}
  end
end

How far down does this thing go?

I just noticed that “.1” sorts after “1”. Time for another tweak…

module Enumerable
  def sensible_sort
    sort_by {|k| k.to_s.split(/((?:(?:^|\s)[-+])?(?:\.\d+|\d+(?:\.\d+?(?:[eE]\d+)?(?:$|(?![eE\.])))?))/).map {|v| Float(v) rescue v.downcase}}
  end
end

but that doesn’t work with version numbers like “.8.2”, “.10.2”...

Time passes… Thorin sits down and sings about gold

I was planning on giving an extension of the regex that caught this issue as well, but I’m afraid I’ve stumped myself – I can’t do it with a single regular expression unless I can use a fixed width lookbehind assertion, and Ruby 1.8’s regular expressions don’t support those (Perl’s do). Of course, it’s still possible to fix it, but doing so will take more thought than I have available to me at this time on a Sunday morning. And all this is before we get onto making sure that “1/2” sorts between “0” and “1”. And phone numbers. After all, “01915551238” is ‘obviously’ the same as “0191 555 1238” and “0191 555-1238”, so they should end up next to each other in the sorted list.

It looks like this is a ‘three pipe problem’ after all. I shall probably return to this…

Rails tip: Side effect filters

Posted by Piers Cawley Mon, 08 Oct 2007 12:13:00 GMT

Some bugs are easy to overlook. One that has a habit of catching me out is a Rails filter that returns false occasionally when it’s being evaluated purely for its side effects. Here’s how I’ve started working round the issue:

def side_effect_filter
  return if some_conditions_not_met?
  ...
ensure
  return true
end

What happens here is that the ensure catches any return and returns true instead. The catch is that if something throws an uncaught exception anywhere, it too gets caught by the ensure and true is returned. Which may not be what you were looking for. Here’s how to fix that issue:

def side_effect_filter
  error = nil

  return if some_conditions_not_met?
  ...
rescue Exception => error
ensure
  raise error if error
  return true
end

This catches the exception in a rescue and stashes it in the error variable, then the ensure checks to see if an exception was thrown and rethrows it, otherwise, it just returns true. Which is bulletproof, but ugly. Let’s wrap the ugliness up in a method:

def self.side_effect(method, &block)
  define_method(method) do
    error = nil
    begin
      instance_eval(&block)
    rescue LocalJumpError # catches an explicit return
    rescue Exception => error
    ensure
      raise error if error
      return true
    end
  end
end

side_effect :side_effect_filter do
  return if some_conditions_not_met?
  ...
end

Again, not pretty inside, but all we actually care about anywhere else is that the interface is good and does what it’s supposed to do. Encapsulated ugliness has its own beauty. Especially if you get the interface right.
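For reference, here's a self-contained sketch of the same idea outside Rails, so it can be run as a plain script. The SideEffect module and the Recorder class are invented for the example; in a real application the macro would live wherever your filters are declared.

```ruby
# Hypothetical module: extend a class with it to get the side_effect macro.
module SideEffect
  def side_effect(name, &block)
    define_method(name) do
      error = nil
      begin
        instance_eval(&block)
      rescue LocalJumpError # an explicit return inside the block ends up here
      rescue Exception => error
      ensure
        raise error if error
        return true
      end
    end
  end
end

# Stand-in for a controller or model; not a Rails class.
class Recorder
  extend SideEffect
  attr_reader :log

  def initialize(skip)
    @skip = skip
    @log = []
  end

  side_effect :remember do
    return if @skip
    @log << :remembered
    false # would halt a filter chain if it leaked out
  end

  side_effect :explode do
    raise ArgumentError, "still propagates"
  end
end

Recorder.new(true).remember   # true, even though the block returned early
Recorder.new(false).remember  # true, even though the block's last value is false
```

Calling explode on a Recorder still raises the ArgumentError, so real failures aren't hidden behind the forced true.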

Homework

This should pluginize quite nicely, just install the method in ActionController::Base and ActiveRecord::Base and you have a very useful tool, but I’m still not sure that the method name is right, so I’m holding off on it. If someone were to come up with a bulletproof name and release a plugin, that would be wonderful though.

Updates

Fixed a scoping issue in the encapsulated version of the code. Replaced yield with instance_eval(&block)

My head hurts

Posted by Piers Cawley Sun, 23 Sep 2007 19:09:24 GMT

During DHH’s keynote at RailsConf Europe it was apparent that there’s a great deal to like in edge rails, so I thought I’d have a crack at getting Typo up on it.

Ow.

I’d expected the pain points to be related to routing, but it seems that the Rails routing system is approaching the level of the Excel calculation engine – nobody dares touch it for fear of breaking things, so Typo’s custom routes seemed to work quite happily. There were a few things that have been deprecated, pluginized or moved out of the set of modules that’s automatically included when you do a rake rails:freeze:edge, but they were pretty easy to sort – the deprecation messages are a good deal more informative now than they were last time I went deprecation squashing. There’s a surprising amount of stuff that’s been removed without any deprecation warnings though, which isn’t very sporting. DHH said there would likely be a 1.2.4 release (possibly a day before 2.0) with a bunch more deprecation warnings covering everything that’s actually going away, so if you’re thinking of moving a maturish app to Rails 2.0 it might make sense to wait for 1.2.4, install that, squash warnings, and move on up to 2.0.

The real pain comes from themes. Typo’s themes rely on Rails internals working in a particular way, but they don’t work like that any more. In theory, the internals appear to be more theme friendly, related to allowing plugins to include views. The problem is, that it’s possible to change Typo’s theme without restarting the server, and the new themish internals don’t expect anything to change until the server’s restarted.

So, I’ve been playing with plugins. The most promising approach appears to be that of the themer plugin, which gets pretty close to doing what we need, and does it in a way that seems like it should work with both 1.2.3 and Edge Rails. It does appear to be making some radically different assumptions about the structure of the themes directory, but the basic framework is good and I should be able to make things work by making our current theme object conform to Themer::Base’s interface and duck type my way to the sunny uplands of Edge Rails compatibility.

Which will be nice.

I like the themer approach a lot. Instead of monkeying about in the guts of rails, it monkeys about in front of Rails. It overrides render so that you can pass it a theme/lookup object. If it sees a lookup object, it uses that to rewrite the rest of the render arguments into a form that will render the right thing using the standard implementation of render. In a work project I’ve taken a similar approach to handling polymorphic routes for things like:

map.resources :pictures do |pics|
  pics.resources :comments
end

map.resources :users do |users|
  users.resources :comments
end

I ended up with a to_params method defined on my Comment model, and stuck an extended url_for in front of the default Rails version, which looks something like:

def url_for_with_to_params(*arguments)
  if arguments[0].respond_to?(:to_params)
    with_options(arguments.shift.to_params) do |mapper|
      mapper.url_for_without_to_params(*arguments)
    end
  else
    url_for_without_to_params(*arguments)
  end
end
alias_method_chain :url_for, :to_params

Which is so much neater than the last time I attacked this particular problem (see the acts_as_resource plugin).

One of the nice things about Rails is that, although it’s opinionated and somewhat liberal with the syntactic vinegar for things the core team don’t think are the Right Way, they’re pretty good at leaving the door open for people like me who have other opinions. Both the themer plugin and my as yet unpluginized extension of url_for work by using existing capabilities in new ways and, because those capabilities are documented, we can expect them to continue to work over multiple versions of Rails. Plugins that achieve similar effects by monkeying with Rails’s internal interfaces are hostages to fortune. Internal interfaces are free to change at any time, even between point releases, so a plugin can be left high and dry with surprising rapidity. Just ask the Rails Engines folk.

A tiny ruby niggle

Posted by Piers Cawley Sun, 09 Sep 2007 07:51:00 GMT

You know what? I’m starting to miss compulsory semicolons as statement terminators in Ruby.

“What?” I hear you say. “But not needing semicolons is one of Ruby’s cardinal virtues! Are you mad?”

I don’t think so, but maybe you’ll disagree after I explain further.

Here’s a piece of code that I might write if semicolons were the only way of terminating a statement:

Category.should_receive(:find_by_permalink)
  .with('foo')
  .and_return(mock_category);

Or how about a complex find query

def find_tags_for(tag_maker, order = 'count')
  klass = tag_maker.class
  find :all
    , :select => 'tags.*, count(tags.id) count'
    , :group => Tag.sql_grouping
    , :joins => 
        "LEFT JOIN taggings ON "
      + "      tags.id = taggings.tag_id "
      + "LEFT JOIN bookmarks ON "
      + "      bookmarks.id = taggings.taggable_id "
      + "  AND taggings.taggable_type = 'Bookmark' "
      + "LEFT JOIN #{klass.table_name} ON "
      + "      #{klass.table_name}.id = bookmarks.#{klass.to_s.underscore}_id"
    , :conditions => conditions_for(tag_maker)
    , :order => (order == 'count') 
        ? 'count(tags.id) desc, tags.name' 
        : "tags.name"
    , :readonly => true
    ;
end

I first came across the idea of the leading comma in Damian Conway’s excellent Perl Best Practices. The idea is that, by leading with the comma it’s very easy to add a new argument to an argument list or hash specification without having to remember to stick a comma on the end of the preceding line if it was at the end, and also, the leading comma makes it very plain that the line is a continuation of its predecessor in some way.

To make the examples work in Ruby, you have to add a \ to the end of each line that has a continuation, so the first example has to be written:

Category.should_receive(:find_by_permalink) \
  .with('foo')                              \
  .and_return(mock_category);

Lining up the \s helps to stop them disappearing, but it’s an awful faff.
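For completeness, here's a sketch of the continuation forms Ruby will accept. The backslash and the trailing dot work in any version; the leading-dot form assumes Ruby 1.9.1 or later, where a line that starts with a dot continues the previous expression. The word list is just filler.

```ruby
words = %w[pear Apple fig]

# Backslash continuation: works in every Ruby version.
with_backslash = words.map { |w| w.downcase } \
  .sort                                       \
  .join(", ")

# Trailing-dot continuation: the expression is visibly incomplete,
# so the parser keeps reading onto the next line.
with_trailing_dot = words.map { |w| w.downcase }.
  sort.
  join(", ")

# Leading-dot continuation: assumes Ruby 1.9.1 or later.
with_leading_dot = words
  .map { |w| w.downcase }
  .sort
  .join(", ")

puts with_backslash  # apple, fig, pear
```

All three produce the same string; none of them helps with the leading-comma argument lists, which still need the backslash.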

What tends to happen (in the rails source especially) is that ruby programmers simply don’t break their lines up. A quick search of the rails source finds plenty of lines more than 160 characters long.

Of course, some will argue that it doesn’t matter, that the old 80 column limit is a silly hangover from the days of steam when the only way to interact with your code was through an 80 column, green phosphor terminal. They have a point. An arbitrary line limit is silly, and we should get over it, especially in source code. However, unless you’re going to go around with every window open to its maximum width, lines will wrap, and they won’t do it nicely, or respect the indentation conventions of your language. Long lines are murder in diffs too, finding the point of difference is so much easier when your eye doesn’t have to scan an epic line.

It’s a shame there’s no way of forcing ruby’s parser to require semicolons as statement terminators for those programmers like me who think that the restriction that a statement must end with a semicolon is worth the freedom to break lines where we like without needing to escape every line break. It’s a shame too that popular tools like Textmate are so clumsy when it comes to dealing with line breaks. I would attempt to hold Emacs up as a paragon in this respect, but its Ruby mode tends to get a wee bit bemused once you start breaking lines, so that’s no good.

Domino theory

It’s amazing how far reaching seemingly simple language design decisions can be, isn’t it? Just getting rid of the need to terminate statements with a semicolon has an enormous effect on the way code in Ruby looks. I’m just not sure that it looks better.

Maybe Smalltalk got it really right – they chose to use the most valuable syntactic character of all, the space, to denote sending a message. That freed up the . for use as a statement (sentence?) terminator. Then that freed up ; for use in one of Smalltalk’s most distinctive patterns – the cascade. Where a Rails programmer might write:

form_for(@comment) do |f|
  f.input(:author)
  f.input(:title)
  f.input(:body)
end

A Smalltalk programmer might eliminate the need for a temporary variable by doing:

Comment>>printOn: html

  (html formFor: self)
    input: #author;
    input: #title;
    input: #body.

All those input: ... messages get sent to the result of html formFor: self. Once you get the hang of it, it’s a really sweet bit of syntax.

Incidentally, there’s been some discussion on the squeak mailing lists of a companion to the cascade, which would use a ;; as a sort of ‘pipe’. The idea is to be able to replace code like:

((self collect: [:each | each wordCount])
    inject: 0 into: [:total :each | total + each])
        printOn: aStream.

with

self collect: [:each | each wordCount]
    ;; inject: 0 into: [:total :each | total + each]
        ;; printOn: aStream.

(NB: Please ignore what those code snippets do, because that’s gruesome. Concentrate on how they do it).

Nobody’s quite proposed going as far as Haskell does with its Monads, which can be thought of as a magical land where the meaning of the semicolon changes according to what sort of Monad you’re in. (In an IO monad for instance, the semicolon imposes an evaluation order. In some other monad, the semicolon could just as easily denote a backtracking point). Then again, there’s nothing to stop the dedicated Smalltalker implementing something Monadish – every Smalltalk class can specify how its methods should be compiled after all…

In conclusion…

I’m not sure I’ve got a real conclusion for all this. I’m mostly musing. However, I do think it’s useful to think carefully about restrictions and what they free us to do as programmers. Lispers will wax lyrical about the way that their language’s pared down syntax lets them do amazing things with macros. Smalltalkers will defend to the death the idea that the only way to do anything is to send messages to objects. Pythonistas love their syntactic whitespace. Haskellers love their static typing (admittedly, they have an incredibly flexible notation for expressing type that leaves most other programming languages standing).

And any English speaker with ears will know that a poem like Dylan Thomas’s Do Not Go Gentle Into That Good Night gains much of its power from its form, the villanelle, one of the most restricted forms of poetry there is. Two lines repeating through the poem and a staggering number of rhymes to find:

Do not go gentle into that good night,
Old age should burn and rave at close of day;
Rage, rage against the dying of the light.

Though wise men at their end know dark is right,
Because their words had forked no lightning they
Do not go gentle into that good night.

Good men, the last wave by, crying how bright
Their frail deeds might have danced in a green bay,
Rage, rage against the dying of the light.

Wild men who caught and sang the sun in flight,
And learn, too late, they grieved it on its way,
Do not go gentle into that good night.

Grave men, near death, who see with blinding sight
Blind eyes could blaze like meteors and be gay,
Rage, rage against the dying of the light.

And you, my father, there on the sad height,
Curse, bless me now with your fierce tears, I pray.
Do not go gentle into that good night.
Rage, rage against the dying of the light.

If that’s not making a virtue of a restriction, I don’t know what is.

Today's noun is: Reification

Posted by Piers Cawley Wed, 22 Aug 2007 23:22:00 GMT

Reification: The mental conversion of a person or abstract concept into a thing. Also, depersonalization, esp. such as Marx thought was due to capitalist industrialization in which the worker is considered as the quantifiable labour factor in production or as a commodity. – OED

In the sense that the OED has it, I’m not what you could call a fan of reification. At work, we have a rule that anybody who starts talking about ‘resources’ when they mean ‘people’ gets a (verbal) slap.

However, in OO circles (or maybe just in my head), reification is a good thing. It’s the process of taking something abstract and turning it into a ‘real’ object. Usually, the word gets used for big things like turning an intractable method into an object as a step on the way to refactoring that method. I tend to use it in a slightly broader sense. For me, reification is the process of turning something (a method or a data structure usually) into a full blown object with its own behaviour.

Back when I was working on Pixie (a cunning, but weird, object persistence tool written in Perl) we had a data structure which was used for keeping track of managed objects. It started life as a hash. Everything was fine at first, but over time we ended up with more and more code being repeated across the codebase that was concerned with manipulating the cache hash. So, we replaced the hash with a new object and pulled all the repeated code into methods on that object, which gave us cleaner code to extend, and a strong feeling that we should have turned the cache into an object much earlier in the game. (By leaving it so long, we had a lot more code to move about, some of it in fairly obscure places; tracking down the last bit took a while.)

Data structures like hashes and arrays are really useful in languages that have them. The catch is, they have this habit of acquiring code. When this starts to happen, it’s time to reify – to replace the hash with a task specific object. In Ruby, it’s easy enough to inherit from Hash, but Hash comes with a pile of methods that probably aren’t relevant to your particular need. Generally it’s better to delegate. The first cut doesn’t have to be that complicated, just decorate the hash with a new class and initialize an instance of the class at the point where you had just made the hash.

Once that’s done, you can go through your code and move the bits that treat the hash as a data structure onto your new class. As you gather all the common behaviour to the new class, you’ll start to see places where you can improve code quality by merging common behaviours, replacing complex conditionals with polymorphism (you’ll probably have to introduce a factory method if you do that) and pulling hash keys out into instance variables.
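As a sketch of that first cut in Ruby: the class below decorates a hash by delegation rather than inheriting from Hash. ManagedObjects, its method names and the oid keys are all hypothetical, chosen to echo the Pixie story rather than to reproduce Pixie's actual code.

```ruby
require "forwardable"

# Hypothetical task-specific wrapper around what used to be a bare hash.
class ManagedObjects
  extend Forwardable

  # Expose only the handful of Hash methods callers genuinely need.
  def_delegators :@objects, :size, :empty?, :key?

  def initialize
    @objects = {}
  end

  # Behaviour that used to be repeated wherever the hash was touched
  # now lives in one place.
  def remember(oid, object)
    @objects[oid] = object
  end

  # Return the cached object, or build it with the block and cache it.
  def fetch_or_load(oid)
    @objects.fetch(oid) { remember(oid, yield(oid)) }
  end

  def forget(oid)
    @objects.delete(oid)
  end
end

cache = ManagedObjects.new                          # was: cache = {}
cache.fetch_or_load(42) { |oid| "object #{oid}" }   # loads and remembers
cache.fetch_or_load(42) { raise "never called: already cached" }
```

Because callers only ever see the wrapper, later steps such as swapping hash keys for instance variables can happen without touching them again.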

Stalled reification

Reifying your data structure isn’t an end in itself, it’s a step along the way as you refactor your code.

There’s an example of a stalled reification to be found in ActionController::Resources. Consider the implementations of map_resource and map_singleton_resource, which are the worker methods used whenever you do a map.resource or map.resources in your routes.rb.

def map_resource(entities, options = {}, &block)
  resource = Resource.new(entities, options)

  with_options :controller => resource.controller do |map|
    map_collection_actions(map, resource)
    map_default_collection_actions(map, resource)
    map_new_actions(map, resource)
    map_member_actions(map, resource)

    if block_given?
      with_options(:path_prefix => resource.nesting_path_prefix, &block)
    end
  end
end

def map_singleton_resource(entities, options = {}, &block)
  resource = SingletonResource.new(entities, options)

  with_options :controller => resource.controller do |map|
    map_collection_actions(map, resource)
    map_default_singleton_actions(map, resource)
    map_new_actions(map, resource)
    map_member_actions(map, resource)

    if block_given?
      with_options(:path_prefix => resource.nesting_path_prefix, &block)
    end
  end
end

There’s a lot of repetition there. The only differences are the class of the resource object and the name of the second method called in the with_options block. If we take a look at the helper methods, things start to look even fishier. Here’s map_collection_actions, for example:

def map_collection_actions(map, resource)
  resource.collection_methods.each do |method, actions|
    actions.each do |action|
      action_options = action_options_for(action, resource, method)
      map.named_route("#{resource.name_prefix}#{action}_#{resource.plural}", "#{resource.path};#{action}", action_options)
      map.named_route("formatted_#{resource.name_prefix}#{action}_#{resource.plural}", "#{resource.path}.:format;#{action}", action_options)
    end
  end
end

resource.collection_methods.each? Let’s see what happens if we move the various map_foo_actions methods onto ActionController::Resources::Resource. While we’re about it, we can rename map_default_collection_actions to map_default_actions on Resource, and map_default_singleton_actions to map_default_actions on SingletonResource, which inherits from Resource. map_collection_actions becomes:

def map_collection_actions(map)
  collection_methods.each do |method, actions|
    actions.each do |action|
      map.with_options(action_options_for(action, method)) do |m|
        m.named_route("#{name_prefix}#{action}_#{plural}",
                      "#{path};#{action}")
        m.named_route("formatted_#{name_prefix}#{action}_#{plural}",
                      "#{path}.:format;#{action}")
      end
    end
  end
end

(we move action_options_for onto resource as well, of course).

Once we’ve moved the various mapping helpers onto the resource classes, we can revisit map_resource and map_singleton_resource


def map_resource(entities, options={}, &block)
  resource = Resource.new(entities, options)

  with_options(:controller => resource.controller) do |map|
    resource.map_collection_actions(map)
    resource.map_default_actions(map)
    resource.map_new_actions(map)
    resource.map_member_actions(map)
  end

  if block_given?
    with_options(:path_prefix => resource.nesting_path_prefix, &block)
  end
end

def map_singleton_resource(entities, options={}, &block)
  resource = SingletonResource.new(entities, options)

  with_options(:controller => resource.controller) do |map|
    resource.map_collection_actions(map)
    resource.map_default_actions(map)
    resource.map_new_actions(map)
    resource.map_member_actions(map)
  end

  if block_given?
    with_options(:path_prefix => resource.nesting_path_prefix, &block)
  end
end

And now we no longer have two method bodies that look very similar apart from the resource class; we have two methods that look identical apart from the resource class. So, if we pull out the common bits and put them onto Resource, like so:


class ActionController::Resources::Resource
  def install_routes_in(map, &block)
    map.with_options(:controller => controller) do |m|
      map_collection_actions(m)
      map_default_actions(m)
      map_new_actions(m)
      map_member_actions(m)
    end

    if block_given?
      map.with_options(:path_prefix => nesting_path_prefix, &block)
    end
  end
end

Then map_resource and map_singleton_resource become


def map_resource(entities, options = {}, &block)
  # Pass the block along so nested resources still get mapped.
  Resource.new(entities, options).install_routes_in(self, &block)
end

def map_singleton_resource(entities, options = {}, &block)
  SingletonResource.new(entities, options).install_routes_in(self, &block)
end
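To see the shape of this refactoring outside Rails, here is a toy sketch. RecordingMap, ToyResource and ToySingletonResource are all invented for the illustration and the routes they produce are simplified; none of this is the real Rails code. The point is only that the sequence lives in one place and the subclass overrides the steps that differ.

```ruby
# Toy stand-in for the route mapper: it just records what it is asked to do.
class RecordingMap
  attr_reader :routes

  def initialize
    @routes = []
  end

  def named_route(name, path)
    @routes << [name, path]
  end
end

class ToyResource
  def initialize(name)
    @name = name
  end

  # The shared sequence: written once, used by every kind of resource.
  def install_routes_in(map)
    map_default_actions(map)
    map_new_actions(map)
  end

  def map_default_actions(map)
    map.named_route(plural, "/#{plural}")
    map.named_route(@name, "/#{plural}/:id")
  end

  def map_new_actions(map)
    map.named_route("new_#{@name}", "/#{plural}/new")
  end

  def plural
    "#{@name}s"
  end
end

# A singleton resource only overrides the steps that differ.
class ToySingletonResource < ToyResource
  def map_default_actions(map)
    map.named_route(@name, "/#{@name}")
  end

  def map_new_actions(map)
    map.named_route("new_#{@name}", "/#{@name}/new")
  end
end

map = RecordingMap.new
ToyResource.new("post").install_routes_in(map)
ToySingletonResource.new("account").install_routes_in(map)
```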

Where’s the benefit?

Apart from making actionpack’s lib/action_controller/resources.rb a bit shorter (a laudable result in itself), where’s the benefit here?

From my own experience of implementing datestamped_resource, a routing plugin that we use in Typo, it makes the life of anyone writing a resource-like routing helper for Rails a great deal easier. With datestamped_resource I ended up subclassing ActionController::Resources::Resource, doing the refactoring I’ve outlined here, but leaving the original Rails methods where they were and just implementing the ‘moved’ methods on DatestampedResource (well, not quite: map_collection_actions is pretty different from the default Resource implementation, but the other actions are pretty much the same).

In another project I’m working on, I’m trying to retain meaningful URLs with (potentially) deep resource nesting, and it’d be really handy to have an inflected_resource route helper. The problem with using a meaningful to_param on your models is avoiding permalinks that share a name with your actions. You could set up validations so that, say, ‘new’ is an illegal permalink, but it’s clumsy.

However, if you arrange things so that your URLs are inflected, you can always tell that a URL that begins /resource/new will be a particular resource, with the permalink ‘new’, and /resources/new will be the virtual new resource.

If the resource system is factored as I outlined, this is almost trivial. You can introduce an InflectedResource subclass of Resource:


class InflectedResource < Resource
  def member_path
    @member_path ||= "#{path_prefix}/#{singular}/:id"
  end
end

and you’re pretty much done. Admittedly, something like that (plus a small amount of copy and paste) would work with the current system, but then we’re looking at 3 substantially identical methods in ActionController::Resources and if it wasn’t time to refactor before, it’d definitely be time to refactor then.

Conclusions

Reification shouldn’t be something you do every day, but nor should it be something you do once a flood. Take a look at some of your projects and some of the places where you’re using hashes. Are those really hashes, or would they benefit from having some behaviour of their own? You can track down stalled reification by looking for anaemic classes; classes which have a lot of accessors but very little behaviour. Once you’ve found an anaemic class, look for all the places that instances of it get used. Try moving some of the client code into methods on your anaemic class. Do that a few times and you’ll end up with a real object.
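A tiny sketch of that last step, using a made-up LineItem class (nothing here comes from a real project): the first definition is the anaemic version, all accessors and no behaviour, and the reopened class shows a calculation that used to be done by client code moving onto the object.

```ruby
# Anaemic: two accessors and nothing else. Client code everywhere
# ends up writing item.unit_price * item.quantity for itself.
LineItem = Struct.new(:unit_price, :quantity)

# After moving the client code: the object answers the question directly.
class LineItem
  def total
    unit_price * quantity
  end
end

item = LineItem.new(3, 4)
item.total
```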

If you’re fussy about never putting HTML in your models, you could end up with a mediating builder/presenter object as well, but until you start wanting to render the same structured info in different formats, I’d suggest biting the bullet and living with HTML in the model as a lesser evil than structural code. Your mileage may vary.

A cunning (evil?) trick with Ruby

Posted by Piers Cawley Thu, 16 Aug 2007 06:28:00 GMT

One of the handy tools that Ruby makes available to us Domain Specific Pidgin builders is instance_eval. With instance_eval you can take a block that was created in one context, with all its lovely closed over local variables, and evaluate it in the context of an instance of an arbitrary object. Which means that any references to instance variables and calls to methods are made as if your block were a 0 argument method of that class. It’s really potent, but at the same time, a little frustrating.

Frustrating? Why frustrating?

Well, it would be really cool if you could call instance_eval with a block that took some arguments. That way, you could inject some values into the block from still another scope. (Yes, it’s arcane, but in the places where it would be handy it would be really handy).

I just worked out how to do it:

def my_instance_eval(*args, &block)
  return instance_eval(*args,&block) unless block_given? && !block.arity.zero?
  self.class.send(:define_method, :__, &block)
  returning(__(*args)) do
    self.class.send(:remove_method, :__)
  end
end

There’s problems with that though, the most obvious one being that there’s no guarantee that __ won’t already exist as a method or that our block won’t call __. Here’s a safer, if scarier option:

def my_instance_eval(*args, &block)
  return instance_eval(*args,&block) unless block_given? && !block.arity.zero?
  old_method = (self.class.instance_method(:__) rescue nil)
  self.class.send(:define_method, :__, &block)
  block_method = self.class.instance_method(:__)
  if old_method
    self.class.send(:define_method, :__, old_method)
  else
    self.class.send(:remove_method, :__)
  end
  block_method.bind(self).call(*args)
end

This should work even in the face of being called with something along the lines of object.my_instance_eval(10) {|v| v + __}. Of course, there would be no point in calling like that. You’d only really need it when you want to do something like object.my_instance_eval(10,&a_block_param).

In the project I’m currently working on, where the need for this arose, I shall probably extract the body of my_instance_eval to Proc#to_unbound_method. That way, instead of hanging onto Proc objects I can hang onto UnboundMethod objects and avoid repeatedly shuffling methods.

But… how does it work?

I hope the code’s reasonably obvious. However…

The essence of it is that real methods get to take arguments, so what we need is some way of turning our block into a real method without altering the behaviour/interface of our object. Ruby does allow us to create UnboundMethod objects, which can be thought of as anonymous methods. To use them you have to bind them to a specific instance and then call them, which is the last thing we do in my_instance_eval. The intervening code is what turns our Proc into an UnboundMethod. First, we stash the old __ method, or a nil if there wasn’t one. Then we define a new __ using our block, and immediately use instance_method to get it back as an UnboundMethod. Then we either replace the old definition of __ or simply remove the new one, restoring the original behaviour of our class. Then we bind our new anonymous method to self and call it with our arguments. Easy.
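The steps can be watched one at a time on a toy class. Account and the temporary method name __tmp are invented for this sketch, and it skips the save-and-restore of any existing method, since we know the toy class has no __tmp to begin with.

```ruby
class Account
  def initialize(balance)
    @balance = balance
  end
end

# A block that takes an argument and refers to an instance variable.
block = proc { |amount| @balance + amount }

# 1. Define a temporary method from the block.
Account.send(:define_method, :__tmp, &block)

# 2. Get it back as an UnboundMethod.
unbound = Account.instance_method(:__tmp)

# 3. Remove the temporary method, leaving the class as we found it.
Account.send(:remove_method, :__tmp)

# 4. Bind the anonymous method to an instance and call it with arguments.
account = Account.new(100)
result = unbound.bind(account).call(25)
```

The UnboundMethod carries on working after the temporary method has gone, which is why it can be held onto and rebound to other instances of the same class.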

Exercises for the interested reader

  • What happens if our class has method_removed or method_added implemented? Can we ‘hide’ from them? How?
  • What does Proc#to_unbound_method look like? Does it need to take an argument?
