WTF?
A few days ago, Obie Fernandez commented on the advert found in the current Linux Journal which features a photograph of an attractive woman with the slogan “Don’t feel bad, Our servers won’t go down on you either.” Obie threw together a mock ad for a document management company which showed a bunch of latinos being harassed by a US Customs and Immigration patrol with the slogan “Don’t worry. We’ll keep all your documents in order.” which made his point rather well, I thought.
Then I read the comments.
A couple of respondents thought Obie’s mock ad was genuine and a good, funny ad. There were the usual cries of “Dude! Political Correctness gone mad” and some belittling of a female respondent who had explained how the Linux Journal ad made her feel. Laughing at someone and telling her she “totally missed the joke” is such a sensitive response to someone’s hurt, don’t you think?
Another commenter wondered why there wasn’t an “obsession… with schemes to get men into teaching or nursing”, and wondered if it was because women were less interested in computers, a question which Obie felt was worth dealing with in a followup post. The response being, essentially, that there are several schemes aimed at improving the gender balance in those professions; the traffic is not all one way.
Now, if you’ve been reading this blog for any length of time, you’ll know my opinion: If women in most western cultures are indeed less interested in computers than men, then that’s a purely cultural thing. It’s a bloody embarrassment that our culture is that fucked up. Thankfully, the people who’ve commented here when I’ve sounded off haven’t descended to the fatuity of Obie’s respondents. The second response to Obie’s followup post closes with:
Anyway, on a personal not [sic] after seeing your picture I wonder in which western movie you played the bad guy? Please give me a hint :)
If you want to know why there’s a dearth of women and minorities working in IT, I reckon you’ll find all the evidence you need right there.
A cunning (evil?) trick with Ruby
One of the handy tools that Ruby makes available to us Domain Specific Pidgin builders is instance_eval. With instance_eval you can take a block that was created in one context, with all its lovely closed over local variables, and evaluate it in the context of an instance of an arbitrary object. Which means that any references to instance variables and calls to methods are made as if your block were a 0 argument method of that class. It’s really potent, but at the same time, a little frustrating.
Frustrating? Why frustrating?
Well, it would be really cool if you could call `instance_eval` with a block that took some arguments. That way, you could inject some values into the block from still another scope. (Yes, it’s arcane, but in the places where it would be handy it would be really handy).
I just worked out how to do it:

    def my_instance_eval(*args, &block)
      # No block, or a block that takes no arguments: plain instance_eval will do.
      return instance_eval(*args, &block) unless block_given? && !block.arity.zero?
      self.class.send(:define_method, :__, &block)
      # returning comes from ActiveSupport: it runs the block, then returns its argument.
      returning(__(*args)) do
        self.class.send(:remove_method, :__)
      end
    end

There are problems with that though, the most obvious one being that there’s no guarantee that __ won’t already exist as a method or that our block won’t call __. Here’s a safer, if scarier option:

    def my_instance_eval(*args, &block)
      return instance_eval(*args, &block) unless block_given? && !block.arity.zero?
      # Stash any existing __ so it can be put back afterwards.
      old_method = (self.class.instance_method(:__) rescue nil)
      # Turn the block into a method, then grab it back as an UnboundMethod.
      self.class.send(:define_method, :__, &block)
      block_method = self.class.instance_method(:__)
      # Restore the class to the state we found it in.
      if old_method
        self.class.send(:define_method, :__, old_method)
      else
        self.class.send(:remove_method, :__)
      end
      block_method.bind(self).call(*args)
    end

This should work even in the face of being called with something along the lines of object.my_instance_eval(10) {|v| v + __}. Of course, there would be no point in calling it like that. You’d only really need it when you want to do something like object.my_instance_eval(10,&a_block_param).
In the project I’m currently working on, where the need for this arose, I shall probably extract the body of my_instance_eval to Proc#to_unbound_method. That way, instead of hanging onto Proc objects I can hang onto UnboundMethod objects and avoid repeatedly shuffling methods.
But… how does it work?
I hope the code’s reasonably obvious. However…
The essence of it is that real methods get to take arguments, so what we need is some way of turning our block into a real method without altering the behaviour/interface of our object. Ruby does allow us to create UnboundMethod objects, which can be thought of as anonymous methods. To use them you have to bind them to a specific instance and then call them, which is the last thing we do in my_instance_eval. The intervening code is what turns our Proc into an UnboundMethod. First, we stash the old __ method, or a nil if there wasn’t one. Then we define a new __ using our block, and immediately use instance_method to get it back as an UnboundMethod. Then we either replace the old definition of __ or simply remove the new one, restoring the original behaviour of our class. Then we bind our new anonymous method to self and call it with our arguments. Easy.
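To make the walkthrough concrete, here is a minimal, self-contained sketch of the safer version that runs without ActiveSupport. The Account class and its @balance variable are made up purely for illustration. It also shows Object#instance_exec, which Ruby 1.8.7 and 1.9 onwards provide and which covers the same ground without any method shuffling.

```ruby
# Hypothetical class, used only to have something with an instance variable.
class Account
  def initialize(balance)
    @balance = balance
  end

  # Evaluate a block that takes arguments as if it were a method of self.
  def my_instance_eval(*args, &block)
    return instance_eval(*args, &block) unless block_given? && !block.arity.zero?
    klass = self.class
    old_method = (klass.instance_method(:__) rescue nil)  # stash any existing __
    klass.send(:define_method, :__, &block)               # block becomes a real method
    block_method = klass.instance_method(:__)             # ...grabbed as an UnboundMethod
    if old_method
      klass.send(:define_method, :__, old_method)         # put the original back
    else
      klass.send(:remove_method, :__)                     # or leave no trace
    end
    block_method.bind(self).call(*args)
  end
end

account = Account.new(5)

# The block sees the argument we inject *and* the object's instance variables.
via_trick = account.my_instance_eval(10) { |v| v + @balance }

# The built-in equivalent in later Rubies.
via_exec = account.instance_exec(10) { |v| v + @balance }

puts via_trick  # 15
puts via_exec   # 15
```

If instance_exec is available it is probably the better choice; the trick above is mostly interesting for showing how a Proc can be turned into an UnboundMethod.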
Exercises for the interested reader
- What happens if our class has `method_removed` or `method_added` implemented? Can we ‘hide’ from them? How?
- What does `Proc#to_unbound_method` look like? Does it need to take an argument?
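One possible answer to the second exercise, offered as a sketch rather than the definitive implementation: the temporary method name and the Point class are invented for illustration. It suggests that the method does need an argument, because an UnboundMethod has to belong to some class.

```ruby
class Proc
  # Hypothetical helper: turn this proc into an UnboundMethod of klass.
  # The temporary name is made up and assumed not to clash with real methods.
  def to_unbound_method(klass)
    temp_name = :"__proc_as_method_#{object_id}"
    klass.send(:define_method, temp_name, &self)
    unbound = klass.instance_method(temp_name)
    klass.send(:remove_method, temp_name)
    unbound
  end
end

# Hypothetical class to bind against.
class Point
  def initialize(x)
    @x = x
  end
end

shift = proc { |dx| @x + dx }.to_unbound_method(Point)

# The UnboundMethod can be kept around and bound to any Point later on.
puts shift.bind(Point.new(1)).call(2)   # 3
puts shift.bind(Point.new(10)).call(5)  # 15
```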
Domain Specific Pidgin
So, I’m busily writing an article about implementing an embedded little language in Ruby. It’s not something that’s going to need an entirely new parser; it borrows Ruby’s grammar/syntax but does some pretty language-like things to the semantics and ends up feeling far more like a declarative language than the usual Ruby imperative OO style.
Because I tend towards chromatic’s view of many Ruby programmers’ ability to cry “Wolf!” (sorry, “DSL!”), I don’t want to claim that it’s a full blown Domain Specific Language, but it’s sufficiently language-like that ‘API’ doesn’t seem to fit as a description either.
Then it hit me… it’s a pidgin.
A pidgin can be thought of as a mashup of two languages, taking vocabulary from both its parents and its grammar (usually simplified) from one parent. Historically, pidgins have arisen to help with trade and colonization; grammars have tended to be lifted and simplified from the ‘native’ language and then spiked with words from the colonizing language with a leavening of native words where they make sense. All quite politically incorrect nowadays, but they served their purpose. Pidgins are, by their nature, domain specific; fine if you wanted to talk trade or order your coolies about, but not what you’d write poetry in. Poetry tended to get written in the creoles that evolved from some pidgins. A creole is a general purpose language, with a grammar of its own; they seem to evolve from pidgin, getting invented by the kids of parents who speak that pidgin.
In my little language, much of the vocabulary is pinched from my problem domain’s language, and the grammar and some terminology are lifted from Ruby. Casting the problem domain as the colonial power and Ruby as the native language, it’s obvious that I’ve invented a pidgin language.
Let’s embrace that term. We’re not writing domain specific languages, we’re writing pidgins. ActiveRecord’s family of class methods isn’t a DSL, it’s database pidgin. RSpec is a testing pidgin, Parsec is a parsing pidgin, so are any number of APIs that make their host language feel like a new one.
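As a rough illustration of what such a pidgin looks like in practice, here is a minimal sketch. The Recipe class and its ingredient and step vocabulary are entirely hypothetical; the grammar is plain Ruby, and instance_eval is what lets the block speak the domain's words.

```ruby
# Hypothetical recipe pidgin: Ruby's grammar, the kitchen's vocabulary.
class Recipe
  attr_reader :name, :ingredients, :steps

  # Build a recipe by evaluating the block in the context of a new instance,
  # so bare calls like `ingredient` and `step` land on that instance.
  def self.describe(name, &block)
    recipe = new(name)
    recipe.instance_eval(&block)
    recipe
  end

  def initialize(name)
    @name = name
    @ingredients = []
    @steps = []
  end

  def ingredient(item, quantity)
    @ingredients << [item, quantity]
  end

  def step(text)
    @steps << text
  end
end

toast = Recipe.describe "Toast" do
  ingredient "bread", "2 slices"
  ingredient "butter", "a knob"
  step "Toast the bread"
  step "Butter it"
end

puts toast.ingredients.size  # 2
puts toast.steps.last        # Butter it
```

Nothing here needs a parser of its own, which is the distinction the term is trying to capture.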
Updates
In the comments, Aristotle Pagaltzis points out that Parse::RecDescent isn’t a parsing pidgin because it uses a full blown parser (itself, obviously) to parse any grammar declarations.
Monads
I’ve been following Adam Turoff’s excellent Haskell tutorial and he’s just reached the part where he explains Monads.
To listen to a lot of people, Monads are the bit of Haskell that breaks their brains. As they’re usually described, Monads are the part of Haskell that allow you to write code that has side effects. You know, stuff like reading a file or generating a random number.
What most of the tutorials I’ve read don’t do is explain why Monads let you use side effects. Or, to put it another way, why you can’t successfully use side effects in Haskell without them.
Monads wrap up the concept of a sequence of actions in something that looks, from outside the Monad, like any other datatype. It clicked for me when I remembered something from a course on Set Theory when I was an undergraduate mathematician. The thing about sets is, they’re not ordered. {a, b} is the same set as {b, a}. However, when you’re busily constructing all the entities that get used in mathematics from the Zermelo/Fraenkel axioms, you eventually need to build ordered pairs of numbers. Coordinate systems, for example, make heavy use of ordered pairs.
So, if you’ve got sets, which are not ordered, and you need to build a set which can be interpreted as an ordered pair, how do you proceed?
The standard method is to represent the ordered pair, (x,y) as { {x}, {x, y} }. Look at the Wikipedia article if you want the gory details.
So, an ordered pair is just a handy notation for a slightly more complicated, unordered, set.
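The encoding is easy to play with using Ruby's standard Set class. This is only a sketch: the helper names ordered_pair, first_of and second_of are made up for illustration.

```ruby
require 'set'

# The ordered pair (x, y) encoded as the unordered set {{x}, {x, y}}.
def ordered_pair(x, y)
  Set[Set[x], Set[x, y]]
end

# The first component is the one element common to every member set.
def first_of(pair)
  pair.map(&:to_a).reduce(:&).first
end

# The second component is whatever else appears, or the first again for (x, x).
def second_of(pair)
  x = first_of(pair)
  rest = pair.map(&:to_a).reduce(:|) - [x]
  rest.empty? ? x : rest.first
end

puts Set[1, 2] == Set[2, 1]                    # true: plain sets have no order
puts ordered_pair(1, 2) == ordered_pair(2, 1)  # false: the encoding remembers order
puts first_of(ordered_pair(1, 2))              # 1
puts second_of(ordered_pair(1, 2))             # 2
```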
Monads are a generalization of this principle; in a sense, the details of how they work are irrelevant (in the same way that the innards of an ordered pair are irrelevant), what’s important is that they provide a box in which ordered execution can happen. And ordered execution is what you need if you want to write code that uses side effects or works with impure ‘functions’ that don’t always return the same value given the same input.
Monads can seem so mindbending because most of us come from a programming background where ordered execution is all there is. In pretty much every mainstream language, the idea that the programmer doesn’t control the order in which code is evaluated seems utterly outlandish. Monads look weird, then, because we’ve never thought of the problem they solve as a problem in the first place. It’s just how programs are.
Maybe it’s time we tried to let go of that assumption. Or maybe I’ve completely misunderstood monads.
It’s probably the latter.
Updates
Someone commenting on this post at reddit.com points out that monads aren’t solely used for wrapping sequential processing; they can be used to wrap pretty much every model of computation you could come up with. Which, I must admit, hadn’t quite clicked with me, despite realising that Parsec, a Haskell parser combinator library, was monadic but wasn’t really about sequential processing because the Parser monad also handled backtracking.
So, it seems that I’ve incompletely misunderstood monads.
Doing the fixture thing
Fixtures suck! Mocks rock! Don’t you dare let your tests touch the database!
Well… yes… I suppose. Except, mocking can be a complete pain in the arse too (made slightly less of a pain in the arse if you use the null object options) – it’s awfully easy to end up with huge long setup methods that spend all their time faking out a mock object and about two lines testing what you need.
I’m sure someone’ll be along to argue that this is evidence of lazy design on my part. Well, “Yah boo sucks!” to the lot of ‘em. Fixtures can be exceedingly useful.
Spooky action at a distance
The biggest problem I have with Rails’ current implementation of fixtures is… oh, where to start… it’s probably the action at a distance aspect of them. Your test is here, but your fixture is defined some way over there, in a random collection of yaml files.
Then there’s the problem of remembering which fixtures you need to have loaded for a particular test, and it all starts getting horrible very quickly.
One way of addressing this is to use fixture scenarios and fixture scenario builder, which works around the problem of remembering what to load and means you don’t have to write your fixtures in YAML. What it doesn’t work around is the action at a distance issue (but I’m betting it wouldn’t be too hard to repurpose the fixture scenario builder to let you declare the fixtures you need for a particular test/spec in the same place as you do the testing).
At work though, we came up with another way of making it easy to build your fixtures right in the test file. Our current approach is still a little bit clunky, and it’s nowhere near the point where I can turn it into a library, but I think it’s worth discussing anyway.
Exemplars
The question to ask is, what do you use a fixture for? Most of the time, what a test needs is a mostly generic instance of your class which will pass the validations, and which has maybe a couple of attributes set to particular values.
Let’s say you have a user class. As is common with such things, your user has a username, an email address and a password. As is so often the case, the usernames and email addresses must be unique, and the password must not be blank. Let’s say you’re working on a tagging system (isn’t everyone). Here’s the sort of specification you might write:

    context "A taggable object" do
      setup do
        # makes a taggable object and a couple of users
      end

      it "Should aggregate taggings" do
        @first_user.tag(@taggable, "tag")
        @second_user.tag(@taggable, "tag")
        @taggable.save!; @taggable.reload
        @taggable.should have(1).tags
      end
    end

In this context, the only thing you need from those user objects is that they behave like users, and are distinct and valid. You could simply set up a couple of users in your users.yml fixture, but the approach we took at work went something like this:

    class User
      class << self
        @@exemplar_count = 0

        # Build (but don't save) a valid, unique user; overrides win over defaults.
        def exemplar(overrides = {})
          @@exemplar_count += 1
          # with_options comes from ActiveSupport; it merges the defaults
          # into the hash passed to maker.new.
          with_options(:username => "user#{@@exemplar_count}",
                       :email => "user#{@@exemplar_count}",
                       :password => "fredisabadpassword") do |maker|
            maker.new(overrides)
          end
        end

        def create_exemplar(overrides = {})
          returning(exemplar(overrides)) {|user| user.save}
        end

        def create_exemplar!(overrides = {})
          returning(exemplar(overrides)) {|user| user.save!}
        end
      end
    end

This lets us write setup code like:

    setup do
      @first_user = User.create_exemplar!
      @second_user = User.create_exemplar!
      @taggable = ...
    end

In tests where you need an exemplar with a specific property, you can write Model.exemplar(:tested_attribute => specific_value) – the way the overrides work means you only have to describe the ‘interesting’ bits and the obscuring dust involved in simply building a valid object is swept under the carpet.
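For anyone who wants to try the idea outside Rails, here is a minimal plain-Ruby sketch. This User is a stand-in with no ActiveRecord behind it, the @example.com addresses are an assumption made so the defaults look like real emails, and a class-level instance variable replaces the class variable used above.

```ruby
# Plain-Ruby stand-in for a model class (hypothetical; no Rails required).
class User
  attr_reader :username, :email, :password

  @exemplar_count = 0

  class << self
    # Build a generic, unique user; anything in overrides wins over the defaults.
    def exemplar(overrides = {})
      @exemplar_count += 1
      defaults = {
        :username => "user#{@exemplar_count}",
        :email    => "user#{@exemplar_count}@example.com",
        :password => "fredisabadpassword"
      }
      new(defaults.merge(overrides))
    end
  end

  def initialize(attributes)
    @username = attributes[:username]
    @email    = attributes[:email]
    @password = attributes[:password]
  end
end

generic = User.exemplar
special = User.exemplar(:username => "piers")

puts generic.username  # user1
puts special.username  # piers
puts special.email     # user2@example.com
```

Only the attribute under test appears in the call; everything else needed to make the object plausible stays out of sight.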
Homework
- If you’re familiar with the Object Mother pattern, this might seem a little familiar, with the wrinkle that, instead of having a factory class, we just push the exemplar builder directly onto the model class.
- If you start implementing exemplars yourself, you’ll probably spot a good deal of repetitive coding. I’ve not extracted a library yet because I’ve not quite come up with an interface that I like. Can you come up with a good way of doing it? Can you implement it?
- What did I miss?
Deja vu all over again
Back when I was still programming Perl, one of the common mistakes that you’d see when people wrote lazily initialized object accessors was code like:

    sub get_content {
        my($self) = @_;
        $self->{content} ||= 'default content';
    }

Code written like this would trundle along quite happily, until you had an attribute where, say, the empty string or 0 were legal values for the attribute. The problems were especially painful when the default value wasn’t something that perl treated as false. The correct way of writing that code would be:

    sub get_content {
        my($self) = @_;
        unless (exists($self->{content})) {
            $self->{content} = 'default content'
        }
        $self->{content}
    }

Which is substantially uglier, but safe.
Safety’s important, especially in building block code like accessor methods. An accessor method that works 99.99% of the time is like a compiler that produces correct code 99.99% of the time – useless.
Why déjà vu?
Recently, the usually spot on Jay Fields wrote up the lazy initialization pattern for Ruby. I’m not entirely sure that I agree with his motivation for the pattern, but I am concerned by his suggested code transformation. He suggests writing your lazy initialization as:

    def content
      @content ||= []
    end

Does that look familiar? This is subject to exactly the same potential bug as the perl code above. Admittedly, the number of possible ‘bad’ values is reduced to nil and false, but it only takes one. Here’s the fix:

    def content
      unless instance_variable_defined? :@content
        @content = []
      end
      return @content
    end

This code is guaranteed to work in all circumstances.
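To see the difference bite, here is a small sketch using a made-up Feature class whose flag defaults to true but may legitimately be set to false.

```ruby
# Hypothetical class: a flag that defaults to true but can be switched off.
class Feature
  attr_writer :enabled

  # Buggy: a deliberately stored false (or nil) is treated as "not set yet".
  def enabled_with_or_equals
    @enabled ||= true
  end

  # Safe: only apply the default if the variable has never been assigned.
  def enabled_with_defined_check
    @enabled = true unless instance_variable_defined?(:@enabled)
    @enabled
  end
end

buggy = Feature.new
buggy.enabled = false
puts buggy.enabled_with_or_equals     # true, the explicit false has been lost

safe = Feature.new
safe.enabled = false
puts safe.enabled_with_defined_check  # false, as intended
```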
Going a little bit further…
As Mark Jason Dominus has argued persuasively elsewhere, patterns are a sign of weakness in a programming language, so how can we go about incorporating this boilerplate code into our language?¹
How about something like this (as yet untested):

    class Module
      # Aliased directly on Module (not its singleton class) so that every
      # class and module picks up the new attr_reader.
      alias_method :attr_reader_without_default_block, :attr_reader

      def attr_reader_with_default_block(*args, &default_block)
        unless block_given?
          return attr_reader_without_default_block(*args)
        end
        unless args.size == 1
          raise ArgumentError, "Expected 1 argument"
        end
        var_name = "@#{args.first}"
        define_method(args.first) do
          unless instance_variable_defined?(var_name)
            instance_variable_set(var_name, default_block.call(self))
          end
          # Swap in a plain reader on this instance's singleton class.
          (class << self; self; end).send(:attr_reader_without_default_block, args.first)
          return send(args.first)
        end
      end

      alias_method :attr_reader, :attr_reader_with_default_block
    end

This code is a little more complex than the boilerplate code. When the generated method is called, possibly initializing the attribute, the (class << self; self; end).send(:attr_reader_without_default_block, args.first) part replaces the instance’s accessor with the default attr_reader implementation and calls that instead. This is arguably premature optimization, but it’s not all that evil…
Assuming I’ve not screwed anything up, that should allow you to write:

    class Article
      # The parentheses matter: without them Ruby tries to attach the braces to :content.
      attr_reader(:content) { 'default content' }
    end

and have your content lazily initialized. Extending this to let attr_writer take a block too is a reasonably obvious next step.
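If redefining attr_reader for every class feels too invasive, the same idea can be sketched as an opt-in module. LazyReader and lazy_reader are illustrative names, not an existing API, and this variant skips the per-instance accessor swap.

```ruby
# Hypothetical opt-in module; classes that want lazy readers extend it.
module LazyReader
  def lazy_reader(name, &default_block)
    var_name = "@#{name}"
    define_method(name) do
      # Only run the default block if the variable has never been assigned,
      # so explicit nil and false values are respected.
      unless instance_variable_defined?(var_name)
        instance_variable_set(var_name, default_block.call(self))
      end
      instance_variable_get(var_name)
    end
  end
end

class Article
  extend LazyReader

  lazy_reader(:content) { 'default content' }
  lazy_reader(:published) { false }
end

article = Article.new
puts article.content    # default content
puts article.published  # false
```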
Extending this lazy initialization approach to work with ActiveRecord based classes is probably the next step after that. Making it work right probably involves a little bit of fossicking around in the workings of ActiveRecord, but it’s far from impossible.
1 I tend to take the Smalltalkish view that it’s pointless to separate language from library, especially in a dynamic language. A sufficiently expressive language lets you blur the boundary between them.
Holiday Reading
Mmm… back from Scotland with a chunk of reading done:
Harry Potter And The Deathly Hallows. Mmm… top notch stuff. Wraps up the series perfectly.
The Book Thief. Wow! Seriously… wow. Whatever you do, don’t read the last chapter of this in public. I was a wreck. Beautiful. Sad. Really Sad. Life affirming stuff. Read it.
The Complete Polysyllabic Spree. Nick Hornby writes exceedingly good litcrit/columns.
Compilers: Principles, Techniques and Tools aka The Dragon Book. Good, crunchy computer science full of stuff I’ve skirted around learning for ages now. I’m still far from finished with this book. It does feel old fashioned though, and a little light on techniques for implementing highly dynamic languages.
Programming Language Pragmatics. More seriously crunchy computer science; I’ve only really skimmed the surface of this one, but I’m liking it a lot so far.
The Amazing Adventures of Kavalier & Clay. Another wow. What’s not to like? Golden Age Comic book history. Close up magic and escape artistry. Nazis! Citizen Kane! Astonishing, page turning, storytelling. No wonder it won a Pulitzer. It might well make you cry too (though not as much as The Book Thief)
The Essential Turing. You know how everyone who knows anything about computing tells you that Alan Turing was a genius? They’re right. I knew they were right before I read On Computable Numbers, with an Application to the Entscheidungsproblem (the first paper in this book and the only one I’ve read so far), but the paper drove home how much of a genius he was. The way he bootstraps from the idea of the Turing Machine to a Universal Turing Machine is just beautiful – one minute he’s describing the basic workings of a seemingly simple machine and a few conventions he intends to use and then within a few pages he’s implementing a Universal Turing Machine and describing it using the kind of Higher Order Functions that get ‘metaprogrammers’ so excited whenever they come across a language like Ruby. Great, great stuff.
Off on holiday
The bags are packed, there’s a pile of reading matter (mostly the classics, Turing’s papers, The Dragon Book…), the iPod is charged and we’re off to Scotland for a week with no connectivity.
See you in a week. Don’t wreck the joint.
There’s Good Clever, and Bad Clever
Have you noticed the difference between Good Clever and Bad Clever?
For instance, I recently spent a couple of hours working out how to make a Genre model which acts_as_nested_list work in such a way that when you ask one of the trunk genres for its tracks it finds all the tracks associated directly with the trunk or with any of the genre’s sub genres. It’s made doubly complicated by the fact that the genre is related to its tracks through a relationship model, and triply complicated by the fact that we’re going to want to be able to search within the basic set of results (because, usually, we don’t want 300 tracks on a page…)
The solution I ended up with runs to about two emacs screenfuls and it’s ugly as hell. We’re having to declare custom ‘select_sql’ for getting the basic collection, and then extending the association with a custom ‘find’ and method_missing, and you can forget about making genre.tracks.create work because life is far too short.
There’s no doubting that the solution is clever; it took a couple of hours of concentrated thought and experimentation to come up with it. However, it’s definitely Bad Clever.
What is Bad Clever?
Bad Clever is when something was hard to do, but it looks hard to do and, worse, is hard to understand. Bad clever is when the complexity of finding a solution pokes through into the form of that solution. Bad Clever is when you end up feeling like you haven’t been clever enough. Bad Clever is the proof that Four Colours Suffice, Java’s static type system, WS-Deathstar and inline Javascript. Bad Clever is “Dammit, if it was hard to do, it should be hard to understand!”.
What about Good Clever then?
Good Clever delights. Good Clever is modest, only revealing exactly how clever it is when you look more closely. Good Clever has a simple set of Just Stories – the complexity of the problem domain is exposed gradually. Good Clever has the Quality Without a Name. Good Clever is what happens when you have that simplifying insight that turns a bit of ugliness into something that isn’t going to embarrass you.
Most of the time, Rails is Good Clever; Smalltalk has the Good Clever nature in abundance. Good Clever is Cantor’s proof that the real numbers are uncountable, Turing’s proof that the Halting Problem can’t be solved. Good Clever is the Atom Publishing Protocol, Haskell’s static type system, and Unobtrusive Javascript. Good Clever is Origin of Species and Mick Jagger’s phrasing on Sympathy for the Devil. Good Clever rocks.
If only I could achieve it…
You never know when a coracle will come in handy
A few years back, we got dad a place on a coracle building course as a birthday present. He had a great time and came back with a half finished coracle that he never quite got round to finishing. Until now, it seems.
Despite what you might think, the photo above isn’t taken from inside my parents’ boathouse, it’s taken from inside their garage. That’s not a river, that’s the road.
You have to love this June weather don’t you?


