A tiny Ruby niggle
You know what? I’m starting to miss compulsory semicolons as statement terminators in Ruby.
“What?” I hear you say. “But not needing semicolons is one of Ruby’s cardinal virtues! Are you mad?”
I don’t think so, but maybe you’ll disagree after I explain further.
Here’s a piece of code that I might write if semicolons were the only way of terminating a statement:
Category.should_receive(:find_by_permalink)
.with('foo')
.and_return(mock_category);
Or how about a complex find query:
def find_tags_for(tag_maker, order = 'count')
klass = tag_maker.class
find :all
, :select => 'tags.*, count(tags.id) count'
, :group => Tag.sql_grouping
, :joins =>
"LEFT JOIN taggings ON "
+ " tags.id = taggings.tag_id "
+ "LEFT JOIN bookmarks ON "
+ " bookmarks.id = taggings.taggable_id "
+ " AND taggings.taggable_type = 'Bookmark' "
+ "LEFT JOIN #{klass.table_name} ON "
+ " #{klass.table_name}.id = bookmarks.#{klass.to_s.underscore}_id"
, :conditions => conditions_for(tag_maker)
, :order => (order == 'count')
? 'count(tags.id) desc, tags.name'
: "tags.name"
, :readonly => true
;
end
I first came across the idea of the leading comma in Damian Conway’s excellent Perl Best Practices. The idea is that, by leading with the comma it’s very easy to add a new argument to an argument list or hash specification without having to remember to stick a comma on the end of the preceding line if it was at the end, and also, the leading comma makes it very plain that the line is a continuation of its predecessor in some way.
To make the examples work in Ruby, you have to add a \ to the end of each line that has a continuation, so the first example has to be written:
Category.should_receive(:find_by_permalink) \
.with('foo') \
.and_return(mock_category);
Lining up the \s helps to stop them disappearing, but it’s an awful faff.
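For comparison, here is a minimal sketch of the ways such a chain can be broken across lines. The Expectation class is a hypothetical stand-in written for this example, not RSpec’s real API. The backslash and trailing-dot forms should parse on any Ruby; the leading-dot form is only accepted by Ruby 1.9 and later.

```ruby
# Hypothetical stand-in for a mock expectation; NOT RSpec's real API.
# Every method records its call and returns self so calls can be chained.
class Expectation
  attr_reader :calls

  def initialize
    @calls = []
  end

  def should_receive(name)
    @calls << [:should_receive, name]
    self
  end

  def with(*args)
    @calls << [:with, *args]
    self
  end

  def and_return(value)
    @calls << [:and_return, value]
    self
  end
end

# 1. Backslash continuation, as in the post.
backslash = Expectation.new.should_receive(:find_by_permalink) \
  .with('foo') \
  .and_return(:mock_category)

# 2. Trailing dot: the parser sees an unfinished expression and reads on.
trailing = Expectation.new.should_receive(:find_by_permalink).
  with('foo').
  and_return(:mock_category)

# 3. Leading dot: needs Ruby 1.9 or later.
leading = Expectation.new.should_receive(:find_by_permalink)
  .with('foo')
  .and_return(:mock_category)
```

All three variables end up holding the same recorded sequence of calls; only the layout differs.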
What tends to happen (in the Rails source especially) is that Ruby programmers simply don’t break their lines up. A quick search of the Rails source finds plenty of lines more than 160 characters long.
Of course, some will argue that it doesn’t matter, that the old 80 column limit is a silly hangover from the days of steam when the only way to interact with your code was through an 80 column, green phosphor terminal. They have a point. An arbitrary line limit is silly, and we should get over it, especially in source code. However, unless you’re going to go around with every window open to its maximum width, lines will wrap, and they won’t do it nicely, or respect the indentation conventions of your language. Long lines are murder in diffs too: finding the point of difference is so much easier when your eye doesn’t have to scan an epic line.
It’s a shame there’s no way of forcing Ruby’s parser to require semicolons as statement terminators for those programmers like me who think that the restriction that a statement must end with a semicolon is worth the freedom to break lines where we like without needing to escape every line break. It’s a shame too that popular tools like TextMate are so clumsy when it comes to dealing with line breaks. I would attempt to hold Emacs up as a paragon in this respect, but its Ruby mode tends to get a wee bit bemused once you start breaking lines, so that’s no good.
Domino theory
It’s amazing how far reaching seemingly simple language design decisions can be, isn’t it? Just getting rid of the need to terminate statements with a semicolon has an enormous effect on the way code in Ruby looks. I’m just not sure that it looks better.
Maybe Smalltalk got it really right – they chose to use the most valuable syntactic character of all, the space, to denote sending a message. That freed up the . for use as a statement (sentence?) terminator. Then that freed up ; for use in one of Smalltalk’s most distinctive patterns – the cascade. Where a Rails programmer might write:
form_for(@comment) do |f|
f.input(:author)
f.input(:title)
f.input(:body)
end
A Smalltalk programmer might eliminate the need for a temporary variable by doing:
Comment>>printOn: html
(html formFor: self)
input: #author;
input: #title;
input: #body.
All those input: ... messages get sent to the result of html formFor: self. Once you get the hang of it, it’s a really sweet bit of syntax.
Incidentally, there’s been some discussion on the Squeak mailing lists of a companion to the cascade, which would use a ;; as a sort of ‘pipe’. The idea is to be able to replace code like:
((self collect: [:each | each wordCount])
inject: 0 into: [:total :each | total + each])
printOn: aStream.
with
self collect: [:each | each wordCount]
;; inject: 0 into: [:total :each | total + each]
;; printOn: aStream.
(NB: Please ignore what those code snippets do, because that’s gruesome. Concentrate on how they do it.)
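In Ruby the ordinary method-call dot already behaves much like that pipe, since each call is sent to the result of the one before it. A rough analogue, assuming a plain array of strings in place of the Smalltalk collection (the lines variable and the whitespace-based word count are illustrative choices, not part of the Squeak proposal), written with trailing dots so that it parses on any Ruby:

```ruby
# Illustrative data; any array of strings would do.
lines = ['do not go gentle', 'into that good night']

# Each call is sent to the result of the previous one, so no
# nested parentheses are needed to show the order of evaluation.
total = lines.
  map { |line| line.split.size }.
  inject(0) { |sum, count| sum + count }

puts total
```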
Nobody’s quite proposed going as far as Haskell does with its Monads, which can be thought of as a magical land where the meaning of the semicolon changes according to what sort of Monad you’re in. (In an IO monad for instance, the semicolon imposes an evaluation order. In some other monad, the semicolon could just as easily denote a backtracking point). Then again, there’s nothing to stop the dedicated Smalltalker implementing something Monadish – every Smalltalk class can specify how its methods should be compiled after all…
In conclusion…
I’m not sure I’ve got a real conclusion for all this. I’m mostly musing. However, I do think it’s useful to think carefully about restrictions and what they free us to do as programmers. Lispers will wax lyrical about the way that their language’s pared down syntax lets them do amazing things with macros. Smalltalkers will defend to the death the idea that the only way to do anything is to send messages to objects. Pythonistas love their syntactic whitespace. Haskellers love their static typing (admittedly, they have an incredibly flexible notation for expressing type that leaves most other programming languages standing).
And any English speaker with ears will know that a poem like Dylan Thomas’s Do Not Go Gentle Into That Good Night gains much of its power from its form, the villanelle, one of the most restricted forms of poetry there is. Two lines repeating through the poem and a staggering number of rhymes to find:
Do not go gentle into that good night,
Old age should burn and rave at close of day;
Rage, rage against the dying of the light.
Though wise men at their end know dark is right,
Because their words had forked no lightning they
Do not go gentle into that good night.
Good men, the last wave by, crying how bright
Their frail deeds might have danced in a green bay,
Rage, rage against the dying of the light.
Wild men who caught and sang the sun in flight,
And learn, too late, they grieved it on its way,
Do not go gentle into that good night.
Grave men, near death, who see with blinding sight
Blind eyes could blaze like meteors and be gay,
Rage, rage against the dying of the light.
And you, my father, there on the sad height,
Curse, bless me now with your fierce tears, I pray.
Do not go gentle into that good night.
Rage, rage against the dying of the light.
If that’s not making a virtue of a restriction, I don’t know what is.

My theory is that the Rails source has such incredibly long lines because people in Rails core are using Apple Cinema Displays. I got one the other day and picked up the uberlongline habit immediately. It changes the code you write, it damn near changes the way you think.
I have a 30” Cinema Display at work, which is where my editing windows live. (That meant Emacs, until I did a foolish cvs up and recompile, only to discover that I could suddenly outtype it. It was like being back at University trying to use Emacs on a Vax shared with a bunch of other students.) A slave screen has a couple of nailed up iTerms running autotest, a couple of log tails, and a rails console.
I still try and keep lines short – I use the extra width to let me use windows side by side.
My gut feeling is that Rails has such epic lines in part because TextMate sucks at indenting broken lines. One of these fine days I’m going to sit down and try and distil my gut feelings about how to lay out Ruby code into a set of patterns akin to Kent Beck’s patterns for laying out Smalltalk code.
Once I’ve done that, it’s ‘just’ a matter of writing an Emacs mode or TextMate bundle that does the work. It’s a shame that Ruby doesn’t appear to have anything analogous to perltidy just yet. It should be a good deal easier to implement – Ruby’s a good deal more parseable than Perl.
> The idea is that, by leading with the comma it’s very easy to add a new argument to an argument list or hash specification without having to remember to stick a comma on the end of the preceding line if it was at the end, and also, the leading comma makes it very plain that the line is a continuation of its predecessor in some way.
I agree on both counts. I finally figured out that, cognitively, I process the code (irrespective of language) not as individual lines, but as several-line blocks of text. Anything I can do to give my code a more regular, tabular appearance is a good thing, and helps to point out typos early on.
On the Domino effect… One of my favourite languages of old, Pop-11, used “->” for assignment. So Perl’s “$answer = 42” would be “42 -> answer”.
I still really like this. At a stroke it removes the whole assignment vs. equality confusion so many folk new to programming encounter, removes the =/== typo bug, and lets us use “=” for equality.
> Of course, some will argue that it doesn’t matter, that the old 80 column limit is a silly hangover from the days of steam when the only way to interact with your code was through an 80 column, green phosphor terminal. They have a point. An arbitrary line limit is silly, and we should get over it, especially in source code.
I’d have to disagree that limits on (arbitrary?) line length are silly. Au contraire: for centuries typographers have known that the optimal column width is around 66 characters (it only took around 400 years to figure it out). There is some minor dispute about it. Some have gone as far as to claim it is closer to 54, others that it is a bit longer, but I have never encountered anybody claiming that overly long lines are good just because they are convenient to write.
The 80 column width limit is not arbitrary, although it is a historical coincidence from the days of punch cards. However coincidental one may deem it, it is not and never will be silly. Reading is important, even more so for the code of important software, especially when it plays an ever more important part in the infrastructure of society.
It is a common experience that reading is easier with short lines. Newspapers have narrow columns, which of course is no coincidence, since a large mass of people need to be able to read them easily.
Of course, programmers are not people in the general sense and have particular needs outside the scope of newspapers’ narrow columns, but nevertheless we programmers still need to be able to read code easily and grasp it at a glance.
Code formatting plays an important role in supporting that ability. Your very own post suggests this. Column widths silly? Certainly not! Please, let’s wait at least a hundred years before we make any judgements on column widths for program texts. But for now, may I suggest that history itself has taught us that long lines are silly!
Currently don’t have time to read this Yegge-sized post, but my initial reaction to the opening paragraph is to just end each line with a period, rather than starting with one. That way the compiler is happy, and you can break your code up nicely.
It seems to me that you are using edge cases (really long lines) as a motivation for changing the syntax of every case, including the default cases – which to my mind really benefit from not having semicolons.
If you want to get rid of both semicolons and ambiguous line endings, there’s no better solution than Haskell and Python’s: make layout syntactically significant. It’s a zero-burden solution (programmers already indent their code) that increases code comprehension (no more misleading layout [1]) and reduces code noise (closing braces and end keywords, which are redundant, go away).
After having spent years in layout-significant and layout-ignorant languages, I’m firmly convinced that layout significance is the better approach.
Cheers. —Tom
[1] For example, how many C/C++ programmers have been fooled by layout that doesn’t match the actual parse tree?:
It happens. But not in languages where layout is significant.
“Maybe Smalltalk got it really right”
There’s no maybe about it, Smalltalk did get it right. It’s no accident that Smalltalk expressions are sentences.
What better way to end a thought than a period, just like in writing.
Imagine how silly you’d think writing were if you had to explicitly continue a line with a special character rather than just simply wrapping the expression.
Or.how.silly.a.paragraph.would.look.using.a.period.to.separate \
.messages.instead.of.using.a.space.while.ending.your.thoughts \
.with.a.semi.colon;
Smalltalk syntax is nearly perfect. Though a pipe operator would be nice.
Desmond: I know about the 66 character line being optimal for readability. It’s about what I’ve aimed at as my main column width for this blog after all.
Sam: I think we’ve had this discussion over a drink before now. You’re wrong, but I won’t hold it against you. My point here isn’t that I want to change the language for everyone, but it would be nice if there were some sort of strict :statements; pragma which would allow programmers to declare that, for the current code block, statements have to be terminated with a semicolon.
Tom: Python? Ick. But the Haskell offside rule is pretty damned gorgeous, especially when it’s combined with the rest of Haskell’s syntax. The only reason I’m not spending far more time with Haskell right now is that I’ve got this blog software to maintain in my free time, and rails code to write in my office time…
Ramon: You’re probably right, but I also reckon that Smalltalk gets to have such a tiny syntax because the Smalltalk environment does an awful lot of the stuff that has to be handled with syntax in block structured languages. If Smalltalk were laid out in text files like pretty much everything else, the syntax would start to feel awfully clunky. Which is arguably just one more reason why Smalltalk got it right…
Further to Sam’s suggestion of ‘Just put the periods on the end’. That would lead to code that looked like:
and if you can’t see how awful that is there may be no hope for you. A slight improvement might be:
But those periods still get lost remarkably easily.
Looks to me like you could slap an instance_eval in the form_for method, and then the Ruby version suddenly looks as clean or cleaner than the Smalltalk version.
As for the rest of this discussion, well, I find the whole, “How does the chained method notation hold up in multi-line statements?” a bizarre metric to judge against.
Mind you, that’s probably because I consider the entire “should(x).with(y).and_return(z)” method chaining mock/test style to be seriously misguided (in any language). If I wanted to program in English, I’d use Cobol.
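A minimal sketch of that instance_eval idea, assuming a hypothetical FormBuilder class and a cut-down form_for (neither is Rails’ real implementation; both exist only to make the example run):

```ruby
# Hypothetical builder; it just records the field names it is asked for.
class FormBuilder
  attr_reader :fields

  def initialize(record_name)
    @record_name = record_name
    @fields = []
  end

  def input(name)
    @fields << "#{@record_name}[#{name}]"
    self
  end
end

# Cut-down form_for: the block is evaluated with the builder as self,
# so the block needs no |f| parameter.
def form_for(record_name, &block)
  builder = FormBuilder.new(record_name)
  builder.instance_eval(&block)
  builder
end

form = form_for(:comment) do
  input :author
  input :title
  input :body
end

puts form.fields.inspect
```

The catch is that instance_eval changes self inside the block, so the instance variables and helper methods of the calling object are no longer reachable there; only local variables carry over.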
Yeah, an instance_eval would definitely help the form_for – life gets a little tricky if you need to generate a single input for another object though – you end up having to nest another block method. Admittedly, in that case the cascade breaks down for the Smalltalk version without a little bit of low cunning. Just implement an input:at: method on the proxy object and do:
Method chaining mocks are a matter of taste, but it’s really in the ‘big hash as parameter object’ case that I find myself missing statement terminators the most. Once you get used to the leading comma/operator style, you really don’t want to go back.
Maybe I just need to train Emacs and TextMate to insert a \ at the end of the preceding line if I type an operator character as the first non-whitespace character on the current line. Combine that with greedy deletion which takes out the backslash if its accompanying linebreak goes away and you’d be laughing.
Gah! I’m just plain wrong, and apparently there’s no hope for me.
Piers, I struggle to understand how you can be so objective over such a clearly subjective subject.
Hopefully, one day you might find some space in your heart to accommodate differing views – even ones you don’t agree with ;-)
I agree with the comment that blames Rails’ long lines on the 30” Cinema Displays.
I generally value readability over extreme conciseness, and the Rails source varies substantially in readability depending on whose code it is.
As much fun as it is to be extremely concise, writing maintainable code can add up to 10% more lines, just to help it read easily.
Sam: I’m not being objective. I’m being hyperbolic. And if you can’t tell the difference…
Ah… maybe I won’t do that one twice.
For the record: when I’m blogging about something that you think is subjective, I’m right and everyone who disagrees with me is wrong and should be shot out of cannons.
Unless I change my mind, in which case I shall not be shot out of a cannon myself.
Unless being shot out of a cannon looks like fun.
You totally stole the cannons bit from When I Am Emperor.
Yup. Firing people out of cannons is too good an idea to keep to yourself mate.
Anyway, what are you going to do about it? Fire me out of a cannon?
Piers wrote:
Well, if you like Haskell’s layout rule but can’t stand Python, you can at least use layout in Perl via the hacktastic LayoutRule.pm.
Can’t say I ever convinced anybody to use that module, however …
Cheers. —Tom
I’ve had times where I wasn’t sure what the most readable format was for long/chained statements. Luckily there’s so much to love in Ruby, this isn’t all that big a deal.
This very thing came up at work today, and I was surprised that my coworker thought something like
would work.
With Perl, I was used to ending all hash and array assignments with an extra comma since it wasn’t an error and it made adding extra elements on extra lines simple.
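The same trick carries over to Ruby, which also accepts a trailing comma after the last element of a hash or array literal. A small sketch (the option names are only illustrative, borrowed from the find example in the post):

```ruby
# Trailing comma after the last pair: adding another option later
# means adding one line, without touching the line above it.
options = {
  :select   => 'tags.*, count(tags.id) count',
  :order    => 'tags.name',
  :readonly => true,
}

# The same goes for array literals; no extra nil element is created.
columns = [
  'tags.id',
  'tags.name',
]

puts options.size
puts columns.size
```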
And as far as making a virtue of a restriction, your Dylan Thomas example is wonderful (as he is one of my favorite poets), but I have to respond with something a little lower-brow: an explanation of the daily dino comic
There’s no maybe about it, Smalltalk did get it right.
Why did Ramon’s comment not surprise me? ;-)
There is actually a perltidy for Ruby:
http://www.arachnoid.com/ruby/rubyBeautifier.html
To be fair, though, I haven’t tried it yet.
Yeah, Ramon! You’re so predictable!
Go LazyWeb! Thanks Giles.
We have a policy whereby no line should be longer than 120 characters. This works well as a sensible limit. We also set TextMate’s “right margin indicator” to 120 and turn off word wrap, so that we can easily see when we run over.
I have found that, if we exceed those 120 chars through method chaining etc., it is generally because we are doing something wrong and need to refactor.
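A limit like that could also be checked mechanically. The following is only a sketch: long_lines is a hypothetical helper name, and the 120 default simply mirrors the policy described above.

```ruby
# Return [line_number, length] pairs for every line longer than `limit`.
# Line numbers start at 1; the trailing newline is not counted.
def long_lines(source, limit = 120)
  source.each_line.with_index(1).
    map { |line, number| [number, line.chomp.length] }.
    select { |_number, length| length > limit }
end

# Illustrative input: one short line and one 130-character line.
sample = "short line\n" + ('x' * 130) + "\n"

long_lines(sample).each do |number, length|
  puts "line #{number}: #{length} characters"
end
```

To run it over a source tree, one option would be to pass it File.read(path) for each path returned by Dir.glob('**/*.rb').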
But wouldn’t an obligatory semicolon reduce another of Ruby’s sweet spots: Domain Specific Languages?
“Domain Specific Languages”? I don’t think that phrase means what you think it means.
Assuming you mean what I’ve been calling pidgins, then no, not really; your pidgin will just need semicolons as a statement terminator if you’re using the pidgin in a scope with required semicolons.
Once more: I don’t want to force everyone to use semicolons as terminators, but I would like the option to tell the parser that, for the rest of the current file, semicolons will be the only valid statement separators.
So, say you have a pidgin which lets you declare your instance variables and their accessors, you could do:
Heck, if you code it right you could make strictness bind to a scope so you could be strict on a per method basis. It would probably make sense to have an ‘anti-strict’ as well, so one could write liberal :statements to turn off the requirement for explicitly terminated statements.
Everyone reading that can tell that the lines starting with a dot are meant to be continued. So why not just make that particular construct valid Ruby? If the first non-whitespace character of a line is a dot, it is a chained method. No ugly semicolons needed.
I got your niggle right here: I want to read more stuff like this, but your category links are broken.
The problem, Piers, is the strawman of “long lines.” I understand the value of leading with the separator, I’m a big fan of Conway’s best practices myself. But the alternative is not 160-column-wide lines. It’s separators on the end. Sam is right about that. You’re right for wanting to mess with Ruby’s syntax. I’d like to be able to do that as well, if it was scoped. You both make fine points, and it’s okay to acknowledge that. No need to act so threatened on your own blog.
Jay, thanks for the bug report – it’s fixed now. It turned out that one of the (necessary, dammit) controller refactorings I’ve been making in Typo broke one of Scribbish’s assumptions about partials.
We’ve yet to come up with a way of testing themes I’m afraid.
@Zack, if only there were a language that supported lexical grammar overrides. (No, not Lisp. I meant a language with actual syntax.)
@Piers, wouldn’t obligatory parentheses around method arguments reduce another of Ruby’s sweet spots?
(ouch!)
@chromatic “if only there were a language that supported lexical grammar overrides.” – Pop-11. Not as easily as Perl 6 will, but possible.
@chromatic: Smalltalk lets you specify the compiler for a class’s methods. If you wanted to change the syntax for the duration of a block, say, you’d have to swap in a compiler/parser that supported changing the compiler on a block by block basis. I wouldn’t fancy coding it though.
I’m not entirely sure what you’re getting at with the reference to obligatory parentheses though.