A Billion Monkeys Can’t Be Wrong

Election 2008 – some fun

I’ve been following the U.S. Presidential election on the excellent site www.fivethirtyeight.com, which features sophisticated statistical aggregation of all the published polls done in the U.S., and also at the prediction market Intrade.com. But as it comes down to the election I know I’ll need a frequently updated dashboard for watching the results. So I came up with this. (N.B. Requires Firefox 3 and at least as much screen real estate as a 15" Powerbook.)

Inspired by a page put together by Randall Munroe of xkcd fame, my dashboard periodically fetches the market price of Intrade’s state-by-state election markets, which represent the probability, as assessed by the Intrade traders, that a given candidate will win a given state. From those probabilities I compute the overall probability of various scenarios and color the map appropriate shades of blue and red. I also provide some dials and knobs (sliders, actually) to allow you to play some real-time “what if” games with the results.

There are some flaws with my statistical methodology (most notably the almost certainly erroneous assumption that the state probabilities are all independent) but it should nonetheless be a good way of tracking the results as they come in: assuming the Intrade traders, who’ve got real money on the line, stay on the job, the Intrade values will head toward certainty as exit polls and actual results become available and my dashboard will reflect that.
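To make the independence assumption concrete, here is a minimal sketch of how per-state probabilities can be combined into an overall probability. It is written in Common Lisp and is not the dashboard's actual code; the function names and the three-state example are invented for illustration. The idea is to treat each state as an independent coin flip and build up the distribution of electoral votes one state at a time.

```lisp
;; Sketch only: every state is treated as an independent event, which is
;; exactly the assumption flagged above as almost certainly wrong.

(defun electoral-vote-distribution (states)
  "STATES is a list of (electoral-votes . win-probability) pairs for one
candidate. Returns a vector whose Nth element is the probability that
the candidate wins exactly N electoral votes."
  (let* ((total (reduce #'+ states :key #'car))
         (dist (make-array (1+ total) :initial-element 0)))
    ;; Before any state is counted the candidate has 0 votes for certain.
    (setf (aref dist 0) 1)
    (loop for (votes . p) in states
          do (let ((next (make-array (1+ total) :initial-element 0)))
               (dotimes (n (1+ total))
                 (let ((mass (aref dist n)))
                   (unless (zerop mass)
                     ;; Candidate loses the state: vote total unchanged.
                     (incf (aref next n) (* mass (- 1 p)))
                     ;; Candidate wins the state: add its votes.
                     (incf (aref next (+ n votes)) (* mass p)))))
               (setf dist next)))
    dist))

(defun win-probability (states needed)
  "Probability of winning at least NEEDED electoral votes."
  (let ((dist (electoral-vote-distribution states)))
    (loop for n from needed below (length dist)
          sum (aref dist n))))
```

With made-up numbers, (win-probability '((10 . 1/2) (20 . 1/2) (5 . 9/10)) 18) returns 1/2, since the only way to reach 18 votes is to win the 20-vote state. Using rationals keeps the arithmetic exact; floats work just as well for real market prices.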

Last updated 2008-11-01T15:59:19+7:00.

Practical Common Lisp Japanese translation

I was recently notified – somewhat to my surprise – that the Japanese publisher Ohmsha is publishing a Japanese translation of Practical Common Lisp which should now be available in bookstores and on amazon.co.jp. I had known that there was a small group working on a translation but hadn’t realized they had found a publisher. My thanks to those translators and to Masako Omata at Franz who, as I understand it, did a fair bit of work to make it all happen. I got my copy in the mail the other day. Looks good though I can’t say much about the quality of the translation other than that it seems to contain quite a number of Japanese characters. I’ll be interested to hear from any Japanese readers what they think of it.

Last updated 2008-08-04T13:45:48+7:00.

California bans home schooling!?

This is the kind of thing, I imagine, that turns people into right-wing lunatics. Walking the dog today I saw the front page headline of the San Francisco Chronicle, “Homeschoolers suffer setback: Appeals court rules parents who teach children at home must be credentialed.” Uh-oh. Our daughter is only a year and a half old so we’ve got a few years before we have to officially decide whether we’re going to home school but that’s the current plan.

Except that all of a sudden that may no longer be an option unless this appeals court ruling is overturned, the legislature defies the teachers unions and changes the state’s education laws to specifically allow home schooling by uncredentialed parent-teachers, or we leave the state. Equally suddenly, I’m on the side of the right-wingers ranting about judges legislating from the bench and the nanny state trying to take over our lives. Heck, suddenly James Dobson of Focus on the Family, who spent his radio show today decrying the ruling, is my ally.

We are not religious so that’s not our motivation for wanting to home school, but we are not really all that different from the homeschoolers who are. While we don’t object to the secularism of public schools—that’s one of their good points as far as I’m concerned—we object to other parts of mainstream culture: the relentless consumerism, the regimentation of academic instruction, and the emphasis on competition and working for extrinsic rewards. I’m sympathetic to the need for society (i.e. the state) to look out for the welfare of kids whose parents aren’t taking proper care of them. But to have the state tell me I have to send my daughter to the schools the state has approved and to be taught only in the way the state thinks is best makes me start thinking about holing up in a compound somewhere with too many guns and a couple years’ worth of canned food in the root cellar.

The quote from the Chronicle story that really killed me was from Leslie Heimov, the executive director of the Children’s Law Center of Los Angeles. She said her organization was mostly concerned that children be “in a place daily where they would be observed by people who had a duty to ensure their ongoing safety.” Uh, wouldn’t parents have a duty to ensure the safety of their children? To say nothing of looking after their education and moral development. Hmmm, I really must be turning into a crazy right-wing nutjob.

Last updated 2008-03-08T06:43:22+8:00.

Grammar School Grammar

Once again following around after Language Log’s Geoffrey Pullum yields food for thought. Long ago Pullum wrote a blog entry entitled “More timewasting garbage, another copy-editing moron” in which he heaped scorn on the copy editors who edited Mark Pilgrim’s Dive into Python for their many grammatical incorrections. That post was the one that made me a Language Log fan since I had written a book for the same publisher and had been made batty by all the same incorrections.

John McIntyre, the assistant managing editor for the copy desk at the Baltimore Sun, must have seen that post because when he recently wrote a piece for his blog about the old that vs. which usage bugaboo, after defending Fowler’s made-up rule, he tried some preemptive self-defense saying, “That will probably bring down on my head the wrath of the linguists at Language Log …, who appear to hate copy editors’ guts.” He then went on to say:

But I’m just a simple country boy from Kentucky who learned English grammar in Mrs. Jessie Perkins’ fifth- and sixth-grade classes at Elizaville Elementary School and who just tries to get by on what is reasonable and useful.

Why is it that English grammar is one of the few fields where what we learned in fifth and sixth grade is considered state of the art? I doubt the Sun’s assistant managing editor in charge of political reporting would explain the paper’s approach to election coverage by saying: “I’m just a simple country boy from Kentucky who learned about U.S. politics in Mr. Bobbie Smith’s fifth- and sixth-grade Social Studies classes at Elizaville Elementary School.”

We all understand that the way we teach politics, and just about everything else, to ten- and eleven-year-olds is simplified, if not over-simplified. But that’s usually okay since most folks who grow up to be newspaper editors or, for that matter, newspaper readers, probably go on to high school, college, and maybe even graduate school where they are exposed to less and less simplified versions of our collective understanding of things.

Yet when it comes to grammar and the use of the English language, most folks, including professionals, seem happy to work with the version they learned when they still thought members of the opposite sex had cooties.

Last updated 2008-03-07T07:25:11+8:00.

Singular they

I see via a Geoffrey Pullum Language Log post that yet another otherwise intelligent person—this time David Gelernter, a Yale computer science professor—has been found ranting in public about the imminent destruction of the English language due to folks using they as a singular pronoun.

Pullum does his usual fine job highlighting the absurdities of this kind of rant: in this case the wild disparity between the magnitude of the social devastation allegedly being wrought and the veniality of the linguistic sins, if any, being committed. Pullum—co-author of the massive and comprehensive Cambridge Grammar of the English Language—also points out that Gelernter is simply wrong about many points of grammar and historical linguistics. Pullum, however, doesn’t really take Gelernter’s argument seriously, presumably because it’s absurd and ignorant and doesn’t deserve to be. On the other hand, Gelernter’s rant is such a fine example of an anti-singular they diatribe that it merits a closer analysis.

Gelernter’s thesis is that “the English language has become a wholly-owned subsidiary of the Academic-Industrial Complex”. In particular he claims “feminist authorities” effected a significant change to the rules of grammar such that “agreement between subject and pronoun was declared to be optional” allowing they to be used as a singular pronoun. As Pullum and regular Language Log readers of course know, this claim is deliciously ironic. The last time the Academic-Industrial Complex unilaterally changed the rules of grammar was in the 18th century, when grammarians, taking a bit too much of a cue from Latin, made up a rule that pronouns had to agree in number with their antecedents, a “rule” which, in fact, had been regularly violated by such writers as Chaucer, Shakespeare, and Jane Austen to say nothing of thousands of less notable authors and, no doubt, hundreds of thousands of plain old native English speakers.

Having made up their rule, these grammarians were then forced to choose a singular pronoun to use with indefinite antecedents (e.g. everyone, nobody) and singular nouns that could refer to a person of either gender (e.g. a person). Given the times and the fact that the grammarians were mostly men, the “natural” solution was to use he, his, and him. But one must wonder whether sentences like, “A grammarian should always keep his inkwell full”, sounded natural to an 18th-century grammarian because he really felt his was gender neutral or because he was envisioning a grammarian much like himself and all the other grammarians he knew who was, of course, also a “him”. Hard to say. I don’t have any 18th-century grammarians’ writings at hand but Gelernter gives up the game a bit with this sentence: “Who can afford to allow a virtual feminist to elbow her way like a noisy drunk into that inner mental circle where all your faculties (such as they are) are laboring to produce decent prose?” Surely that should be “elbow his way”.

Of course we’ve come a long way since the 18th century. Now that grammarians really are as likely to be women as men, maybe it’s silly to get hung up on gender-neutral he. On the other hand it’s also worth considering how recently we’ve really made progress on that front. Gelernter is full of praise for Strunk & White’s The Elements of Style (excepting editions published since E.B. White’s death, which have softened the guide’s position on the absolute correctness of the gender-neutral he and are thus “a disgrace to his memory”.) But, as Pullum pointed out in a 2004 Language Log posting, when Strunk was passing along the rule that singular they should be changed to he, women in the United States still didn’t have the vote. And when E.B. White wrote about Strunk’s “little book” in his “Letter from the East” for the July 27, 1957 New Yorker that essay—which later became the introduction to White’s revision of the guide—appeared next to an advertisement with this text: “Traveling men get juicy steaks on ‘The Executives’—United’s for-men-only nonstops to Chicago.”

Gelernter does, however, inadvertently if somewhat belligerently, get to the real point of the singular-they debate when he asks: “Why should I worry about feminist ideology while I write? Why should I worry about anyone’s ideology? Writing is a tricky business that requires one’s whole concentration, as any professional will tell you; as no doubt you know anyway.” Yes, writing is a tricky business. Largely because it’s about communicating with other human beings. Unlike computer programming, where a deterministic and strictly logical computer is the ultimate arbiter of the meaning of your creation, writing is about conveying thoughts from your mind to that of your reader, a process that is neither deterministic nor entirely logical. For better or worse, readers can be thrown for a loop by clumsy diction, abstruse vocabulary, or violation of what they consider norms, whether grammatical or social. The harsh reality of a writer’s life is that some readers will be jolted out of their concentration by a singular they while others, who would have glided right past it, will trip over a gender-neutral he.1

There’s no good way out of this mess, at least in the short term. I do believe that for Gelernter, and for many others, sentences using a singular they really do “skreak like fingernails on a blackboard.” That they do so for silly, historical reasons is no consolation. (That Gelernter considers them evidence that feminism is destroying the possibility of rational thought is, however, just stupid.) On the other hand, I’m quite certain that singular they will prevail in the long run. It was standard usage long before the 18th-century grammarians put their oar in and it is even more attractive now for people who wish to avoid implying that executives and grammarians are always men. It also follows the pattern set when the plural pronoun you drove out the singular thou.

So I do my part by using singular they in my own writing. And any writer who wants to reclaim the English language from the dead hand of the 18th-century Academic-Industrial Complex, well, they should do likewise.


1. And some of us will be distracted by both, noting both singular theys and gender-neutral hes as instances of a writerly choice.

Last updated 2008-03-05T21:43:46+8:00.

Coders at Work sweet sixteen

Woohoo. It's all over except the interviewing. And transcribing. And editing. And reinterviewing. And more editing and then the publishing. But I've got my sixteen interviewees! I’ve just received an email from Miguel de Icaza agreeing to be interviewed, which fills out my roster. The complete list (in alphabetical order) is:

There’s some more information about Coders at Work on the website and anyone should feel free to leave a comment if you have thoughts about stuff I should ask these folks.

Last updated 2007-11-26T22:35:19+8:00.

Uh oh!

Today I interviewed Donald Knuth for my book Coders at Work. The contents of the interview will, of course, be published in the book itself, but before the interview proper started he told me something that's a bit worrisome. According to Knuth (who may have been relating something someone told him) there are three kinds of people: people who have written no books, people who have written one book, and people who have written many books. I guess unless I stop now my fate is sealed.

Last updated 2007-09-25T19:34:34+7:00.

John Carmack and Linus Torvalds, please call your offices.

Things are going well on the Coders at Work front. On Tuesday I did my second interview (with Jamie Zawinski) and since I last posted about signing up Donald Knuth I have added Anders Hejlsberg, Guy Steele, and Dan Ingalls to my list of interviewees. The always-up-to-date list of who’s agreed to participate is on the Coders at Work website.

Two folks I’ve been trying to sign up for a while are John Carmack and Linus Torvalds. I emailed them both a few weeks ago and have pinged them each once since but haven’t gotten any reply. No doubt these guys get a ton of mail and mine may have gotten lost in the shuffle or been eaten by some spam filter. So I come to you, gentle readers, for help. If you happen to be close, personal friends with either of these guys and think, as I do, that their insights into the art, science, and/or craft of programming would be an interesting addition to those of Armstrong, Cosell, Deutsch, Hejlsberg, Ingalls, Peyton Jones, Kay, Knuth, Norvig, Steele, Thompson, and Zawinski, please drop them a line and ask them to get in touch with me.

Last updated 2007-09-20T12:29:31+7:00.

Bomb me--please!

I wrote Practical Common Lisp because I felt that Common Lisp needed a new introductory book that could ease folks raised on other languages into Common Lisp and then show them what it’s really all about. Based on emails from readers, reviews on Amazon, word of mouth in the Lisp world, and the fact that the online version of PCL is the top hit when you Google for “lisp book”, I’ll say I succeeded tolerably well. So imagine my dismay when someone pointed out to me today the Google results for “lisp tutorial”.

The top hit is a page which apparently hasn’t been updated since around 1999 and isn’t really a tutorial anyway, so much as a large list of links including a link to the Hyperspec when it was hosted at harlequin.com.1 The next few “lisp tutorial” hits are — with all due respect — exactly the sort of dated, dry tutorials that inspired me to write Practical Common Lisp in the first place and to do a deal with Apress to allow me to keep it online even after the dead tree version was published. Practical Common Lisp doesn’t appear anywhere, as far as I can tell, in the results for “lisp tutorial”.

With that in mind I did a small bit of search engine optimization today to make sure that the phrase “Common Lisp tutorial” appears on the main page of the Practical Common Lisp web site. If you also think Lisp might be better served if PCL was at least one of the results returned to a would-be Lisper searching for a Lisp tutorial you can help out: if you have a web page where it would be reasonable to do so, consider linking to the url http://www.gigamonkeys.com/book/ with a link text of “lisp tutorial” or “common lisp tutorial”. Yes, I’m asking you to participate in a Google bombing. But it’s for a good cause. Think of the children.

Update: Based on the first couple folks I’ve seen providing links to the PCL website (thanks, guys!) I must not have made myself quite clear enough. The name of the game in a Google bombing is for everyone to use the same text for the link. If you want to play along, your HTML should look like this:

<a href="http://www.gigamonkeys.com/book/">Common Lisp tutorial</a>
        

or

<a href="http://www.gigamonkeys.com/book/">Lisp tutorial</a>
        

1. Harlequin doesn’t exist anymore. The canonical home of the Hyperspec is at Lispworks which is now two spin-offs removed from Harlequin: Global Graphics bought Harlequin and spun off Xanalys which in turn, in 2005, spun off Lispworks.

Last updated 2007-09-20T07:09:51+7:00.

Donald Knuth!

Today I got a phone call from Donald Knuth who has agreed to be interviewed for Coders at Work. Yippee! I’ll be interviewing him later this month. Which means I’ve got half of my sixteen interviewees signed up. The folks who have agreed so far are:

And my spam filter on the comments page seems to be working well — it has correctly identified as ham the half dozen or so real comments that have been posted since I installed it and unerringly id’d the couple hundred spams that have been posted in the same period.

Last updated 2007-09-04T10:22:19+7:00.

Coders at Work: more interviewees signed up

As of today, I’ve got seven interviewees signed up (assuming that other publisher doesn’t try to hoard Simon Peyton Jones). They are:

I haven’t done any more interviews but I’ve been working on the transcription of my first interview with Peter Norvig. It’s all coming back to me how terribly painful it is to transcribe audio. Even with my snazzy homebrew transcription software, I’ve been proceeding at a rate of about 4:1 real time to interview time. That is, to transcribe 15 minutes of interview takes me about an hour. Which means that for every two-hour interview I do, it’s going to take me a full day just to prepare the raw transcript. Yes, I know you can hire people to do this but a) I don’t really have a budget for that and b) correcting a transcript can be about as much work as preparing it, particularly if the material is technical and your transcriptionist is not. The good news is that as tedious as the transcribing is, it does give me a chance to start mulling the material over in my head, getting ready to edit it or to prepare for subsequent interviews.

Finally, I got around to putting a spam filter on the Coders at Work comments page so all the stupid link-bombing spammers’ efforts should now be completely for naught. It’s been very satisfying watching it eat up the steady stream of spam comments. Of course these days the spammers are about the only comments I get so I haven’t had any live tests of my filter’s ability to recognize ham. If you have any comments about or questions for the folks I’ve signed up to interview or suggestions of who I should try to get to fill in the last ten spots, feel free to leave a comment.

Last updated 2007-08-30T20:25:32+7:00.

First Coders at Work interview done

The interviewing has begun! Yesterday I sat with the people’s choice, Peter Norvig, for our first interview session. I’ve also signed up Joe Armstrong, inventor of Erlang, and Bernie Cosell, one of the software geniuses behind the original ARPANET IMPs. Both of these guys were not only kind enough to agree to be interviewed but also invited me to stay in their homes when I travel to interview them. Simon Peyton Jones has also agreed to be interviewed assuming the publisher of another book of interviews that has already signed him up for a group interview about Haskell doesn’t object. I’ve got a half-dozen other queries out with more on the way. Hopefully soon I’ll have my full roster of sixteen interviewees signed up and can start figuring out how to stretch my rather small travel budget to get me all the places I need to go to do the interviews.

On a geekier note, after I did a practice interview with a friend and went to try out the transcription software I had downloaded, I discovered that it didn’t understand the WMA files that my digital voice recorder produces. Turns out on GNU/Linux the best way to convert WMA to MP3 involves using the program Mplayer to convert the WMA to a WAV file. While skimming through the Mplayer man page I discovered it has a “slave mode” where you can type commands at it. “Hey,” I thought, “If I can control it by typing commands at it that means I can also control it by having Emacs type commands at it.” A bit of elisp hacking later and I turned Emacs+Mplayer into a quite nice bit of transcription software. A few features:

I experimented with having it automatically pause whenever I hit the backspace key and resume when I started typing again but that didn’t turn out to be as handy as I thought it would. The nice thing, of course, about having written my own software to do this is that as I get into the work of transcribing all these interviews I’ll be able to spend an endless amount of time procrastinating by fiddling with my transcription software. Er, that is, I’ll be able to tweak it just the way I want it to maximize my transcribing efficiency. (Ironically, because of either a limitation of the WMA format or a bug in Mplayer the jumping to timestamps doesn’t work quite right when transcribing WMA files so I still end up having to convert to MP3 files.)
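For anyone curious what driving Mplayer's slave mode looks like, here is a rough sketch. It is in Common Lisp rather than elisp, to match the other code on this page, and it is not the code described above. It assumes Mplayer was started with the -slave option and that you already have a stream connected to its standard input; how you get that stream depends on your Lisp implementation and isn't shown. The function names are made up.

```lisp
;; Sketch: STREAM is assumed to be connected to the standard input of an
;; Mplayer process started with -slave. Slave mode reads one command per line.

(defun send-mplayer-command (stream command &rest args)
  "Write one slave-mode command line, e.g. \"seek 90 2\", to STREAM."
  (format stream "~(~a~)~{ ~a~}~%" command args)
  (finish-output stream))

(defun toggle-pause (stream)
  ;; The pause command toggles between paused and playing.
  (send-mplayer-command stream "pause"))

(defun jump-to-seconds (stream seconds)
  ;; The trailing 2 asks for an absolute seek, measured in seconds.
  (send-mplayer-command stream "seek" seconds 2))

(defun skip-back (stream seconds)
  ;; With no type argument, seek is relative to the current position.
  (send-mplayer-command stream "seek" (- seconds)))
```

Because the commands are just lines of text, anything that can write to a pipe can play the same game, which is what makes the Emacs trick possible.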

Last updated 2007-08-24T20:18:35+7:00.

Coders at Work: looking for information

Since I started work on my new book, Coders at Work, in mid-June, hundreds of people have suggested names of programmers I should interview and helped me sort them in a variety of ways. Thanks! Now, having digested everyone’s feedback, I’m getting ready to contact the folks I think I’d like to interview.

To prepare for this next step, I’ve done a big cleanup of the Coders at Work website and have added a way for anyone who wants to to add information about individual coders in a more structured way than leaving a comment or emailing me. For an example, take a look at Peter Norvig’s page; on the left side of the page is the information I’ve already got and on the right forms where anyone can add more. At the moment most other coders’ pages will have much less information than Norvig’s; my main task now is to rectify that, at least for the folks I’m most interested in interviewing.

With that in mind, adding information about your favorite coder is a great way to increase the chance that I end up interviewing them for the book because a) most people get more interesting the more you know about them and b) the more information I have at hand about someone, the easier it will be to prepare for an interview and with at least sixteen interviews to do, I’ve got to apply a certain amount of tactical laziness.

Last updated 2007-08-02T09:38:45+7:00.

Compiling queries without EVAL

Gary King recently blogged a question about whether a use of EVAL he had recently made in some Common Lisp code was legit. This caught my eye because I suspect this question came up while Gary was working on a project that I once worked on. Anyway, since he asked, the short answer is, “No. The old rule that if you’re using EVAL you’re doing something wrong is still true.” A slightly longer answer follows.

The problem Gary set out to solve is that he’s got a bunch of chunks of text coming out of some sort of database and he wants to query them with logical expressions like this one:

(or 
 (and "space" "mission")
 (and (or "moon" "lunar") "landing"))
        

These expressions are obviously not Lisp but rather a query language where the logical operators AND and OR (and presumably other ones like NOT) have their usual logical meaning and literal strings should evaluate to true if the text we’re matching against contains the string and false if not. Thus the expression above would match text containing the words “space” and “mission” or either of “moon” or “lunar” along with “landing”. Gary considered, and rejected, writing “a simple recursive evaluator to deal with ands and ors”. Instead he wrote some code to munge around expressions like the one above into a form that he could EVAL. In other words, instead of writing his own interpreter he just munged his code into a form that Lisp could interpret for him. Which is okay. Except for the rule that if you’re using EVAL you’re doing something wrong.
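For comparison, the kind of simple recursive evaluator Gary decided against might look something like the following. This is my guess at the shape of such a thing, not his code; the name MATCH-QUERY-P is made up, and the substring matching is only a stand-in for whatever word matching he really wanted.

```lisp
;; Hypothetical interpreter for the query language: it walks the query
;; every time it is asked about a piece of text.

(defun match-query-p (query text)
  "Return true if TEXT satisfies QUERY, interpreting QUERY directly."
  (etypecase query
    ;; A literal string matches if it occurs in TEXT, ignoring case.
    (string (and (search query text :test #'char-equal) t))
    (cons
     (destructuring-bind (operator &rest operands) query
       ;; Compare operators by name so the package of the symbol
       ;; doesn't matter.
       (cond
         ((string-equal operator "and")
          (every (lambda (q) (match-query-p q text)) operands))
         ((string-equal operator "or")
          (some (lambda (q) (match-query-p q text)) operands))
         ((string-equal operator "not")
          (not (match-query-p (first operands) text)))
         (t (error "Unknown operator: ~s" operator)))))))
```

This works, and EVERY and SOME even short-circuit the way AND and OR do, but it re-examines the structure of the query for every piece of text it is applied to.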

What he should have done is written a compiler. Luckily Common Lisp comes with a Lisp compiler built in so all we have to do to write our own compiler is translate our source language into Lisp. Here’s how I’d do it.

First define a function that takes a query expression and returns two values, an equivalent expression with all string literals replaced with GENSYM’d variable names, and an alist of the strings and the corresponding GENSYM’d names. In this phase we also detect if the same literal string appears more than once so we only generate a single binding for each unique string.

(defun translate (tree &optional bindings)
  (typecase tree
    (cons
     (multiple-value-bind (newcar newbindings)
         (translate (car tree) bindings)
       (multiple-value-bind (newcdr newbindings2)
           (translate (cdr tree) newbindings)
         (values (cons newcar newcdr) newbindings2))))
    (symbol (values tree bindings))
    (string 
     (let ((binding (assoc tree bindings :test #'equal)))
       (cond
         (binding (values (cdr binding) bindings))
         (t 
          (let ((sym (gensym)))
            (values sym (acons tree sym bindings)))))))))
        

With this function we can translate an expression like:

(or (and "a" "b") (and (or "c" "d") (or "a" "b")))
        

Into this expression:

(OR (AND #:G1 #:G2) (AND (OR #:G3 #:G4) (OR #:G1 #:G2)))
        

and this list of bindings:

(("d" . #:G4) ("c" . #:G3) ("b" . #:G2) ("a" . #:G1))
        

Now we just need to use those two values to build up a bit of Lisp. Gary’s solution was to build up an expression that he could EVAL. But it’s better to generate a LAMBDA expression because then we can compile it. Here’s the function:1

(defun make-lambda-expression (expr)
  (multiple-value-bind (tree bindings) (translate expr)
    (let ((string (gensym)))
      `(lambda (,string) 
         (let (,@(loop for (word . sym) in bindings collect
                      `(,sym (find-word-in-string ,word ,string))))
           ,tree)))))
        

If we pass the same expression to this function we get the following lambda expression back.

(LAMBDA (#:G9)
  (LET ((#:G8 (FIND-WORD-IN-STRING "d" #:G9))
        (#:G7 (FIND-WORD-IN-STRING "c" #:G9))
        (#:G6 (FIND-WORD-IN-STRING "b" #:G9))
        (#:G5 (FIND-WORD-IN-STRING "a" #:G9)))
    (OR (AND #:G5 #:G6) (AND (OR #:G7 #:G8) (OR #:G5 #:G6)))))
        

We could COERCE this expression to a function and FUNCALL it, which would at least get us out of using EVAL.2 But the real advantage of this approach is that we can compile this expression. Since Gary said he wanted to find all the strings in his database that match a given expression, he’s probably going to be evaluating his query once per string in his database. In that case it’s probably worth it to take a small up-front hit in order to speed up the execution of the query since we’re going to be executing it many times. Luckily compiling a lambda expression is about as trivial as EVALing any other expression:

(defun compile-expression (expr)
  (compile nil (make-lambda-expression expr)))
        

This function, fed a query expression, returns a compiled function that takes a single string argument and returns true if the query expression matches and false if not. On Lisps with native compilers the returned function will be compiled down to machine code just the same as if we had written it by hand in our source code. We can FUNCALL this function, as in this code that collects all the strings returned by a cursor function that match the query expression:

(defun query-strings (query database)
  (loop with predicate = (compile-expression query)
     with cursor = (string-cursor database)
     for string = (next-string cursor)
     while string
     when (funcall predicate string) collect string))
        

We can also use the query function with all of Lisp’s higher-order functions, such as REMOVE-IF-NOT:

(remove-if-not (compile-expression query) *all-strings*)
        

Now, one could argue that there’s not a whole heck of a lot of difference between using EVAL and wrapping something in a lambda expression and compiling it — in both cases you can generate, and then evaluate, arbitrary Lisp code. But there is an important difference, namely that EVAL just evaluates — it takes some data and interprets it as Lisp and gives you an answer straight away whereas compiling a lambda expression gives you a function, something that can interact with the rest of your code, as an argument to higher-order functions, and so on. Or, if you don’t buy that, at least you avoid breaking the no EVAL rule.

Update: It hit me as I was brushing my teeth that while nicely avoiding EVAL, my first solution had a big problem — because all the FIND-WORD-IN-STRING calls are done in the LET before the boolean expression is evaluated we lose all the advantage of AND and OR’s short-circuiting behavior. That is, the generated code searches for all the strings and then combines the results of all those searches. Much better (and simpler) would be to implement TRANSLATE and MAKE-LAMBDA-EXPRESSION this way:

(defun translate (tree &optional (stringvar (gensym)))
  (typecase tree
    (cons
     (cons (translate (car tree) stringvar) (translate (cdr tree) stringvar)))
    (symbol tree)
    (string `(find-word-in-string ,tree ,stringvar))))

(defun make-lambda-expression (expr)
  (let ((string (gensym)))
    `(lambda (,string) ,(translate expr string))))
        

This has the slight disadvantage that if the same literal string appears in the pattern more than once, we will potentially search for it more than once. But that’s probably much less of an issue than the problem this code fixes. Obviously if we cared to, we could generate code that caches searches once they are done to get the best of both worlds. But I’ll leave that as an exercise for the reader.
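For readers who would rather not do the exercise, here is one possible sketch. It is only one way to do it, and the names TRANSLATE-CACHED, MAKE-CACHING-LAMBDA-EXPRESSION, COMPILE-CACHING-EXPRESSION and the :UNSEARCHED marker are all invented here. Each unique string gets a variable that starts out as :UNSEARCHED; the first time the generated code needs that string it does the search and stores the result, and later uses just read the variable. FIND-WORD-IN-STRING is repeated from footnote 1 so the sketch stands alone.

```lisp
(defun find-word-in-string (word string)
  (search word string :test #'char-equal))

(defun translate-cached (tree stringvar bindings)
  "Return two values: the translation of TREE, and BINDINGS extended with
a (string . cache-variable) pair for each new literal string seen."
  (typecase tree
    (cons
     (multiple-value-bind (newcar bindings)
         (translate-cached (car tree) stringvar bindings)
       (multiple-value-bind (newcdr bindings)
           (translate-cached (cdr tree) stringvar bindings)
         (values (cons newcar newcdr) bindings))))
    (string
     (let ((binding (assoc tree bindings :test #'equal)))
       ;; One cache variable per unique string.
       (unless binding
         (setf binding (cons tree (gensym "CACHE")))
         (push binding bindings))
       (let ((var (cdr binding)))
         ;; Search only on first use; FIND-WORD-IN-STRING returns a
         ;; position or NIL, never :UNSEARCHED, so the marker is safe.
         (values `(if (eq ,var :unsearched)
                      (setf ,var (find-word-in-string ,tree ,stringvar))
                      ,var)
                 bindings))))
    ;; Operators such as AND and OR, and the NIL ending each list.
    (t (values tree bindings))))

(defun make-caching-lambda-expression (expr)
  (let ((string (gensym "STRING")))
    (multiple-value-bind (body bindings)
        (translate-cached expr string '())
      `(lambda (,string)
         (let ,(loop for (nil . var) in bindings
                     collect `(,var :unsearched))
           ,body)))))

(defun compile-caching-expression (expr)
  (compile nil (make-caching-lambda-expression expr)))
```

Since the searches now happen inside the boolean expression rather than in the LET, AND and OR still short-circuit, and a string that appears several times in the query is searched for at most once per call.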


1. This function generates calls to a helper function FIND-WORD-IN-STRING. Gary’s version was somewhat more complex but for the purposes of illustration this definition should do:

(defun find-word-in-string (word string)
  (search word string :test #'char-equal))
            

2. Gary’s solution was haired up a bit because he didn’t simply generate a LET form to EVAL but instead generated just the boolean expression and then wrapped the call to EVAL in a PROGV to establish dynamic bindings at run-time. Which, while sort of clever, was another sign from the gods that he had gone down a wrong path somewhere.

Last updated 2007-07-28T00:07:59+7:00.

Coders at Work questions list

As I mentioned the other day I’ve started working on a set of general questions for my Coders at Work interviews. They are now up on the Coders at Work web site. Some of them are, no doubt, dumb and there may still be some redundancies in the list but it’s a start. As always I’m interested in what other folks think. Are there questions you’d like me to ask some or all of the programmers on my list?1 Email them to me or leave a comment on the Coders at Work comment page.


1. Technically, that’s not really my list; that’s the combined wisdom of the 150+ people who have sorted the complete list of names. Since I’ve seen a number of comments about this list, I should point out that while it’s useful (and fascinating) for me to see how other people rank the programmers on my list, in the end I’ll be making the final decision of who to approach for interviews. So don’t get too upset if you look at that list and see someone way too high or too low for your tastes. If there’s someone you’d really like to see interviewed the best thing to do is to let me know why you think they’d be so interesting to hear from.

Last updated 2007-07-19T09:18:29+7:00.

Latest, greatest Coders at Work name sorter

It’s been a while since I’ve posted anything about Coders at Work. Last week I put up, but did not broadly announce, a new, improved web page for sorting the nearly 300 suggested names of people to interview I’ve received. I mostly created it for my own benefit but anyone who wants to play with it can feel free to make their own sorting. (I’m mostly announcing it now because I think it’s sort of a nifty UI — much more fun to play with than the previous one — though I am also interested to see how other folks would sort the list of names I’ve got. Unfortunately, like the previous sorter, this page probably doesn’t work in IE and is only known to work for sure in Firefox.) One feature of this page that the other short list selector didn’t have is the ability to effectively down vote names — you can put folks who you are particularly uninterested in hearing from at the bottom of the list and that tells me something.

If you think you’ve got a sorting that would make for a great book, click the Save button then “Permalink” on the next page to get a URL for your specific sorting and send it to me. I’m also still interested in new suggestions of folks to interview though as I’m getting pretty close to starting to contact folks and since I’m ultimately only going to interview on the order of sixteen people, new names have to be pretty obviously better than over 250 of the names I’ve already got to break into serious contention. As always, I’m also interested in your comments about why you think someone would (or wouldn’t) make a good interview subject.

I’ve also started serious work on a set of general questions. So far I’ve got 191 questions in 22 categories. Which is a lot — obviously I’ll have to pare that down unless I’m going to interview people for 10 or 20 hours. And that’s not even counting questions more directly tailored to the individual subjects. But it’s still better to have too many than too few so if you’ve got a question you’d like me to ask either everyone I interview or specific people, feel free to email or leave it on the Coders at Work comment page.

Update: I didn’t mention before but you can see the combined results of everyone who has submitted a sort via this new page here.

Last updated 2007-07-17T14:16:38+7:00.

In theory, practice is no different from theory, in practice …

Yesterday I was taking my 10-month-old daughter to our parent/infant swim class at the Berkeley YMCA. She happened to be wearing a blue shirt. The woman riding down with us in the elevator from the parking garage asked how old she was and said, “Oh, what a cute little boy.”

“Girl,” I said.

“Oh, sorry!”

“No worries. It’s the shirt. And the short hair.”

“I know, we’re all so color coded.”

As we were getting out of the elevator she said, “You know, I should be the last person in the world to do that … I teach feminist theory.”

Last updated 2007-07-12T09:59:19+7:00.

Categorizing potential Coders at Work interview subjects

Since I put up the short list submission page1 for Coders at Work a week and a half ago, I’ve received 162 short lists of sixteen names. You can see which programmers appear on the most short lists on this page. Thank you to everyone who took the time to submit a list.

For the next step in my selection process I’ve put together a page on which I’ve lumped the potential interview subjects into various categories such as “Old school Unix hackers”, “New school Unix hackers”, “Language designers”, and so forth. My theory is that it’ll be more interesting to read interviews with different kinds of programmers than a bunch of interviews with people who’ve all done the same basic kind of work. It’s also a useful way to identify potential interviewees — I can look at each category and ask, is there someone not on this list who’d be a better representative of the category?

As usual, there are ways you can help me out, if you’re so inclined. I enumerate them on the categorization page but basically they are to send me email or leave a comment nominating someone as the best representative of a category, helping me categorize the folks I’ve already got, and suggesting interesting categories that I don’t have yet.


1. I’ve still not been able to get this page working in IE, mostly due to the lack of any easy way to run IE myself. If any Javascript guru wants to let me know what changes I need to make to my code to make it work in IE I’ll be forever grateful.

Last updated 2007-07-07T14:14:05+7:00.

Make a Coders at Work short list

My previous post provoked an outpouring of suggestions of people to interview for my book Coders at Work, so I now have approximately three to four binary orders of magnitude more names than I can possibly interview for the book. So I put together a web page with a bit of Javascript on it to help me sort through the names and build a short list of (for now) sixteen names. Since it’s sort of an interesting exercise, and because I’m curious what other people think, I’ve put the page up on the Coders at Work website. If you want to try your hand at picking sixteen names out of the almost 256 nominees I’ve got, go for it1. I’ve also put up a comments page so people can submit suggestions and flames about the book in a more public forum than my email inbox.


1. I don’t have an easy way to test that page in IE but I’ve got a sneaking suspicion it doesn’t work properly. If anyone can confirm that, one way or another, that’d be helpful. Even more helpful would be if some Javascript/HTML wizard could tell me what I have to do to make it work portably.

Last updated 2007-06-26T05:48:38+7:00.

Coders at Work website is up

I’ve put up a preliminary web site for my in-progress book Coders at Work that I mentioned the other day at www.codersatwork.com. At the moment it consists of an ever growing but still somewhat arbitrary list of potential interview subjects and a tiny bit of information about each of them. I’m going to be adding names to this list fairly quickly and then getting to work filling out details about each person while also winnowing it down to a reasonable number to actually interview. So now’d be a good time to email me suggestions of programmers who you’d like to read an interview with. Or if there are folks already on my list that you’re particularly interested in, feel free to let me know why and to point me to good sources of information about them. The more background I have going in, the better the interviews are likely to be.

Last updated 2007-06-19T07:52:39+7:00.

Slight change of plans

In my first post to this blog I said I was working on a book about programming in groups. A few weeks ago I met with Gary Cornell the CEO of Apress and my editor on Practical Common Lisp to talk about my book idea. His reaction was, more or less, “That’s interesting but I’ve got another book you should do first.” So, as of today I’m working on a book, tentatively titled Coders at Work, which will be a collection of Q&A interviews with interesting programmers.1 It will also be a companion volume to Apress’s recently published Founders at Work by Jessica Livingston, which is a collection of interviews with founders of high-tech startups. I hope soon to have a web site set up where I’ll be putting up pages listing people I’m hoping to interview and collecting suggestions for possible subjects and questions to ask them. I’ll post here when it’s up. In the meantime, if there’s anyone you think would make an interesting interview subject or questions you think I should ask let me know.


1. I’ve been unofficially working on it more or less since I talked to Gary which is why I’ve been remiss in my posting here.

Last updated 2007-06-12T20:56:33+7:00.

Brooks’s Law and intercommunication: you talkin’ to me?

Surely everyone involved in developing software has heard of Brooks’s Law. First presented in the eponymous chapter of Frederick P. Brooks, Jr.’s classic The Mythical Man Month, it states: “Adding manpower to a late software project makes it later.” This “law” is much beloved by software developers as a handy bucket of cold water with which to cool the ardor of overly enthusiastic managers and executives. Lately, however, I’ve been thinking about Brooks’s Law and rereading The Mythical Man Month and I’m no longer as impressed with Brooks’s analysis as I once was. This is the third in a series of posts discussing some of the reasons why. The first post in the series discussed training costs and the second talked about sequential constraints.

In addition to training costs and sequential constraints, part of the justification for Brooks’s Law is the claim that as a team expands the cost of communication between team members grows faster than total productivity of the team. As with the issue of training costs, Brooks has a point — the number of possible pairwise communications paths between a team of n people is n(n - 1)/2; that’s a simple fact of math. If you then assume that every possible pair will need to spend a certain amount of time communicating or, equivalently, that the whole team will have to get together for meetings whose total length is determined by the number of people on the team, then it’s true that the number of person-hours spent communicating will grow as the square of the number of people while total productivity, again measured in person-hours, will only increase linearly. We can all see where that’s heading — pretty soon the amount of time spent on communication will be greater than the total amount of time available to work on anything at all and nothing else will get done. But how soon?

To take a concrete example, suppose we’ve got a six-person team that we’re thinking of expanding to eight; should we be concerned that the increasing communication costs will eat up any additional productivity we might get from the two extra people? We can figure it out. Suppose that each pair on our team gets together for a pure-overhead, one-hour tête-à-tête every week. Assuming a week is five eight-hour work days, there are 15 pairs and each meeting costs two person-hours, so the whole team spends 30 person-hours on communication per week out of 240 person-hours worked, leaving 210 person-hours of productive work. What happens if we expand the team to eight? Each person will now spend seven hours a week in pairwise communication and the team as a whole will spend a total of 56 person-hours a week communicating. But the team will also now be able to do a total of 320 person-hours of work, leaving 264 person-hours to be spent on productive work, or 54 more hours. This calculation does demonstrate Brooks’s larger point, that a “man month” is not a useful measure of productivity — if it were, then expanding a team from six to eight, a 33% increase in size, would likewise increase productivity by 33%, not the approximately 26% we actually get. But this example doesn’t justify, by itself anyway, a blanket claim that adding people to a project will always slow it down — the eight-person team can, in fact, get more done than the six-person team and therefore should finish the same amount of work sooner, all other things being equal. Of course all other things are not necessarily equal — training costs can reduce the initial productivity of new team members and it’s conceivable the sequential constraints introduce a long leg that can’t be reduced by adding people. In a later post I’ll discuss whether these three factors together might be enough to justify Brooks’s Law.
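The arithmetic in this example is easy to check mechanically. Here is a minimal sketch, assuming the same simple model as the example (every pair meets for c hours a week and each meeting costs both participants c hours); the function names PAIR-COUNT, COMMUNICATION-HOURS, and PRODUCTIVE-HOURS are made up for illustration.

```lisp
;; Number of distinct pairs on a team of N people: n(n - 1)/2.
(defun pair-count (n)
  (/ (* n (1- n)) 2))

;; Person-hours per week spent communicating, assuming every pair meets
;; for C hours and each meeting costs both participants C hours.
(defun communication-hours (n c)
  (* 2 c (pair-count n)))

;; Person-hours per week left for productive work on an N-person team
;; working H-hour weeks.
(defun productive-hours (n h c)
  (- (* n h) (communication-hours n c)))
```

Here (productive-hours 6 40 1) and (productive-hours 8 40 1) give the 210 and 264 person-hours worked out above.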

So, for this team, a jump from six to eight people probably won’t add more communication costs than the productivity of the new people. But obviously Brooks is right that eventually quadratic growth will outpace linear growth.1 With a bit of math, we can figure out exactly when the costs of communication start outpacing gains in the ability to do useful work for a given amount of per-pair communication overhead and a given number of hours worked per week. If we have a team of n people that work h-hour weeks and in which each pair spends c hours per week on communication overhead, then the cost to the existing team members of adding one person to the team is n×c because each of the n current team members will now be part of a pair with the newcomer and will have to spend c extra hours tending to that pair’s communication needs. Meanwhile the newcomer will also spend n×c hours per week on pairwise communication which, when subtracted from the h total hours they’ll work each week, gives us the amount of non-communication work per week they’ll add to the team’s total capacity. When the amount of productivity lost from the existing team members is the same as the amount of productivity added by the newcomer there’s no point in expanding the team. So we can solve this equation for n:

n×c = h − n×c

to get this:

n = h/(2×c)

With this formula we can determine that for a team with one hour per week of per-pair overhead, working 40-hour weeks, the biggest the team can get without losing more productivity than it gains is 20.
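As a quick check on that result, here is a small sketch of the break-even calculation under the same assumptions as the text (n people, h-hour weeks, c hours of overhead per pair per week). MARGINAL-GAIN and BREAK-EVEN-TEAM-SIZE are hypothetical names chosen for this illustration.

```lisp
;; Net change in weekly productive person-hours from adding one person to
;; an N-person team: the newcomer contributes H - N*C hours and the
;; existing members lose N*C hours between them.
(defun marginal-gain (n h c)
  (- h (* 2 n c)))

;; Team size at which adding one more person gains nothing: n = h / (2c).
(defun break-even-team-size (h c)
  (/ h (* 2 c)))
```

For the six-person team above, (marginal-gain 6 40 1) and (marginal-gain 7 40 1) are 28 and 26, which add up to the 54 extra hours from going to eight people, while (marginal-gain 20 40 1) is zero.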

But all of these computations may be beside the point, as they’re based on the assumption that communication has to be overhead. What if we had a team of six people that spend not one, but eight hours per day on pairwise communication because they spend all their time pair programming? If we add two people to that team, there is no change in time spent per person on pairwise communication — the only change is, assuming the team rotates partners, that each person will pair with each other person less often. But the communication time obviously can’t be all overhead — the only time anything gets done is when two people are communicating.

So how can this possibly square with Brooks’s analysis? One possibility is to join the ranks of the XP skeptics and simply deny that the pair programming team could possibly get anything done. I’ve had good experiences with pair programming though so I can’t buy that. I think the problem is with Brooks’s underlying assumptions. As I’ve mentioned previously, Brooks assumes that an n-person team will partition the task of writing whatever software they need to write into n pieces, each to be written by one person. To the extent that those pieces of software need to talk to each other, so do the people writing them and this communication is extra work on top of the base amount of work required to write the software. His arguments about training costs, intercommunication, and sequential constraints are all aimed at demonstrating that a task that a single developer could complete in x months will take n developers more than x/n months because the amount of work required for n developers is no longer simply x but x plus overhead.

But there’s another possibility. What if, as I suggested in my post about Sisyphus, n people don’t have more than x work to do but less than x because n people working together and communicating a lot are much more likely to discover a better solution than any one of them working alone? In that case, time spent communicating is not extra work but a way of reducing the total amount of work done.


1. One thing to note about the growth of intercommunication costs is that it is quadratic, not — as some writers have described it — exponential. Quadratic growth is faster than linear, for sure, but nowhere near as fast as exponential. Populations with no limits on their growth — bacteria in Petri dishes or rabbits in Australia — grow exponentially. If communication costs did grow exponentially with the size of the team, then a team would go from spending just slightly over half its time on communication to being able to do nothing but communicate, just by adding one person. One author who should certainly know better is Steve McConnell who described the growth of communication paths as “exponential” in Software Estimation (p. 57). In fact he did know better — in his earlier book, Rapid Development, he described the growth, correctly, as “multiplicative” (p. 311).

Last updated 2007-05-29T09:54:06+7:00.

A little person with a sense of humor

Today my wife was downstairs playing with our eight-month-old daughter, Amelia, and all I could hear was the sound of Amelia laughing, laughing, laughing. I mean, really cracking up then settling down a bit and cracking up all over again. I’m sure I’m far from the first person to have had this feeling but it gives me some small measure of hope for the human race that this little person who barely knows her own name and doesn’t know enough not to crawl off the edge of the bed, has, if nothing else, a sense of humor.

Last updated 2007-05-28T18:28:46+7:00.

Practical Common Lisp going into 3rd printing

I just found out that Apress has decided it’s time for a third printing of Practical Common Lisp. If I recall correctly, the first printing was 5,000 copies, the second 3,000 more. New printings are called for when the publisher thinks they’re going to run out of copies to sell to distributors so this must mean I’m not crazy to dream of someday having a 10k-copies-sold party.

This also means now would be a good time, if you’ve read the book and noticed any errors that you’ve not emailed me about, to send a note. If you put “pcl errata” in the subject it’ll make my life a bit easier. Note, however, that this is just a new printing not a new edition. For a new printing we just fix minor typos and so forth so now is not the time to tell me that there should really be a chapter about how to connect to RDBMSes or what have you.

Last updated 2007-05-26T16:44:36+7:00.

Brooks’s Law and sequential constraints: one damn thing after another

Surely everyone involved in developing software has heard of Brooks’s Law. First presented in the eponymous chapter of Frederick P. Brooks, Jr.’s classic The Mythical Man Month, it states: “Adding manpower to a late software project makes it later.” This “law” is much beloved by software developers as a handy bucket of cold water with which to cool the ardor of overly enthusiastic managers and executives. Lately, however, I’ve been thinking about Brooks’s Law and rereading The Mythical Man Month and I’m no longer as impressed with Brooks’s analysis as I once was. This is the second in what I expect will be a series of posts discussing some of the reasons why. The first post in the series discussed training costs.

As part of his “demythologizing of the man-month” (p. 25) Brooks points out that developing software is subject to what he calls “sequential constraints”. Brooks actually makes two points about sequential constraints, but he doesn’t draw a particularly clear distinction between them in his exposition, so I’ll start by teasing them apart. They are:

  1. All tasks, including software development, have some sequential constraints on their subtasks that determine the minimum time in which the whole task can be completed.
  2. Even if the rest of the work can be parallelized, communication needed to coordinate work can act as a sequential constraint, putting a lower bound on the time needed to complete a task.

Point one is a simple matter of logic. Virtually every task, from harvesting a field of crops, to having a baby, has some sequential constraints on some of its subtasks that determine that certain subtasks can only be done after other subtasks are complete. It’s impossible to complete the whole task in less time than the time it takes to do the longest sequential chain. By definition, subtasks that are not sequentially constrained can be done in parallel and so more people working on them at the same time will get them done sooner than fewer people.

Tasks vary in both the nature and extent of the sequential constraints that apply to their subtasks. Brooks gives harvesting crops as an example of a task with very few sequential constraints and bearing a child as one that is nothing but a long sequentially constrained chain. It’s worth noting, however, that all real-world tasks are sequentially constrained at some level — even harvesting a field of crops, for instance, requires that someone get to the farthest corner of the field, harvest what’s there, and bring it back. No matter how many field hands you hire and no matter how minuscule the part of the field each is responsible for there’s still no way to harvest the whole crop faster than that long leg could be completed.

For practical purposes, however, Brooks is right: harvesting crops is almost entirely parallelizable and bearing a child is almost entirely not. Before we get to how communication itself can act as a sequential constraint, let’s consider another task which is more sequentially constrained than harvesting crops but less so than having a baby, namely baking a cake. Consider for instance this simple cake recipe:

  1. Pre-heat the oven to 350°.
  2. Prepare the cake pans — greasing, lining, and flouring.
  3. Sift together flour, baking soda, and salt.
  4. Cream the butter, shortening, and sugar until light and fluffy.
  5. Add dry ingredients to butter/shortening/sugar mix.
  6. Mix in three eggs.
  7. Pour batter into cake pans.
  8. Bake for 25 to 30 minutes.
  9. Cool on racks for 10 minutes.
  10. Remove from pans and continue cooling.

As with most recipes, there are both opportunities for parallelism and unavoidable sequential constraints. If you had three cooks in the kitchen one of them could prepare the cake pans while another sifts together the dry ingredients and a third creams the butter, shortening, and sugar. After that, the next three steps, up to pouring the batter into the cake pans, while sequentially constrained relative to each other, could be done in parallel with the oven heating. Thereafter, everything is sequentially constrained. No matter how many cooks you have, you have to heat the oven before you bake the cake and bake the cake before it can cool. Thus there’s no way to decrease the total time it takes to make a cake below about 45 minutes.

Now let’s consider how software development is sequentially constrained. As anyone who has written software knows, there are sequential constraints but where do they come from? There are certainly no physical constraints such as the one that keeps a baker from pouring batter into cake pans before the batter has been made. If, somehow, we knew at the beginning of a development project all the lines of code that needed to be written, we could type them in any order we wanted — the software would work just as well in the end. But the notion that we could know in advance all the code that needs to be written and type it like we were taking dictation is just crazy. Programming isn’t primarily a typing problem, it’s a thinking problem. And thoughts need to be thought in the proper order.

In fact, the only way to figure out how a software system ultimately fits together is to build it. In order to know, in detail, how part X is going to work we need to know how part Y, with which it interacts, is going to work. And the only way to know how Y is going to work is to build it. It may be that we can completely build X and then build Y or we may need to alternate — build a bit of X in order to develop enough information to build a bit of Y from which we learn enough to build another bit of X, and so on. It might also be equally possible to start by building X and then build Y or to start with Y and then build X. But however we do it, the flows of information about the system we are building are the inherent sequential constraints we operate under. Note that this has nothing to do with communication — even if the system were being built by a single developer these constraints would still constrain the order in which various parts of the system could be built.

Now, keeping in mind these inherent sequential constraints, let’s consider Brooks’s second point, that the need to communicate can itself act as a sequential constraint. This is the software development equivalent of Amdahl’s Law from parallel computing which says that no matter how much you can parallelize a computation, it’ll never complete faster than the time it takes to combine the results of all the parallel computations. In both software development and parallel computing this is because communication is inherently sequential. If ten people — or ten CPUs — each have six minutes worth of information to convey to each other, it’s going to take at least an hour of elapsed time no matter how you slice it; ultimately each person is going to have to spend six minutes “transmitting” their information and fifty-four minutes “receiving” information from the other nine people.

To see how this effect plays out, imagine we have an idealized software development task whose coding can be partitioned among however many developers we like but for every ten hours a developer is going to spend coding, they need to spend one hour writing down what they’re going to do for the benefit of the rest of the team and everybody has to read everyone else’s notes. In other words, before each ten hours’ worth of coding, a developer spends an hour writing an email about what they are about to implement and sends it to all the other developers. Then they have to read the other developers’ emails, spending an hour to absorb each one. After all that communication, the developers can each code for ten hours. For simplicity, we’ll assume that even a developer working alone would spend the hour writing notes for themself documenting what they plan to do in the next ten hours.

Suppose the total coding time needed to develop the system is 100 person-hours. A single developer could do it in 110 hours, ten chunks of an hour of note writing followed by ten hours of coding. Two developers could do it in five chunks of work with each chunk consisting of twelve hours of work: an hour writing notes, an hour reading the other developer’s notes, and ten hours coding. Thus for the team of two, the total elapsed time would be 60 hours, of which 10 would have been spent on communication. Five developers could complete the project in only two chunks but each chunk would be fifteen hours: one hour writing, four reading, and ten coding. Thus their elapsed time would be 30 hours with 10 hours spent on communication. Ten developers would be done in 20 hours elapsed time — ten hours of development after ten hours of communication.
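The pattern in those numbers can be captured in a few lines. This is only a sketch of the idealized model described above, assuming the coding divides evenly among the developers; ELAPSED-HOURS and its keyword parameters are names invented for the illustration.

```lisp
;; Elapsed hours for the idealized project: TOTAL-CODING person-hours of
;; coding split evenly among N developers, worked in chunks of CHUNK hours.
;; Before each chunk a developer spends NOTE hours writing notes and NOTE
;; hours reading each of the other N - 1 developers' notes.
(defun elapsed-hours (n &key (total-coding 100) (chunk 10) (note 1))
  (let* ((chunks-per-developer (/ total-coding (* n chunk)))
         (hours-per-chunk (+ (* n note) chunk)))
    (* chunks-per-developer hours-per-chunk)))
```

Calling ELAPSED-HOURS with 1, 2, 5, and 10 developers gives the 110, 60, 30, and 20 hours from the text. In every case ten of those hours go to communication; only the coding time shrinks as developers are added.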

Even if we could scale down the communication cost proportionally, so less than ten hours of individual work requires a proportionally smaller amount of email writing and reading, the elapsed communication time still stays at ten hours: twenty developers would spend a half-hour writing their emails and nine and a half hours reading nineteen emails before coding for five hours. Indeed, as the number of developers approaches infinity, the amount of time spent coding approaches zero as does the amount of time each developer spends writing their own email while the amount of time spent reading the infinite number of infinitesimally short emails from other developers approaches ten hours and the project as a whole still takes a minimum of ten hours to complete. Thus even when there are no other sequential constraints — when we assume that an infinite number of developers can each be given an infinitesimally small part of the project to work on in isolation — communication remains the one activity that must be performed sequentially.
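To watch that ten-hour floor emerge, here is a hedged sketch of the proportionally scaled version of the model, assuming each developer writes notes for one tenth of their own coding time and reads everyone else's notes in full. SCALED-ELAPSED-HOURS and its parameters are hypothetical names, not anything from Brooks.

```lisp
;; Elapsed hours when communication scales with the size of each
;; developer's share: writing takes NOTE-RATIO times the developer's own
;; coding time, and reading takes that long for each of the other N - 1
;; developers. Returns total, writing, reading, and coding hours.
(defun scaled-elapsed-hours (n &key (total-coding 100) (note-ratio 1/10))
  (let* ((coding (/ total-coding n))
         (writing (* note-ratio coding))
         (reading (* (1- n) writing)))
    (values (+ writing reading coding) writing reading coding)))
```

Writing plus reading always comes to n times the writing time, which is NOTE-RATIO times TOTAL-CODING, so under these assumptions the communication time is ten hours for any team size and the total approaches ten hours from above as n grows.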

In real software projects, of course, things are more complicated. The inherent sequential constraints — those that would affect a single developer working alone — interact with communication-induced constraints in all sorts of complicated ways. For one thing, if we assume — as Brooks seems to — that the overall task is partitioned into subtasks, each to be developed by a single developer, then the way we do the partitioning can have dramatic effects on the amount of communication needed. If we split the system at its natural joints, then communication will be minimized — if subsystems are naturally decoupled then developers can work on their bit for a while, developing lots of information about how their part of the system works, which only they need to know, and just a little bit of information that they need to share with other developers. On the other hand, if the partitioning is poor, each developer’s part of the system will depend on many details of other parts and the developers will either need to communicate much more often or, more likely, will all go off in their own directions for a bit too long and then discover, when they compare notes, that they need to backtrack and rework things in order to make everything fit together.1

Another issue, which Brooks doesn’t mention, is that the need to communicate can stall productive work. One of the idealized aspects of the hypothetical project above is that the developers work in perfect lockstep — everyone communicates and then works for exactly ten hours and the cycle repeats. At no point is anyone stalled waiting for someone else. In real projects, some subtasks will be bigger than others, leaving developers whose pieces happen to be smaller waiting after they’ve finished their work to communicate with developers whose pieces are larger. Every hour that they spend waiting is an hour that gets added to the total number of person-hours it takes to complete the project.

That all said, there’s nothing that says the only way to divide up a task is by partitioning it into pieces that are each implemented by a single developer. In fact there are all sorts of reasons, which I’ll talk about in a later post, that that might be a bad idea. For now, let’s just note that if we could avoid a strict upfront partitioning, and could let developers share ownership of the system as a whole, working together frequently and sharing ideas about how it all fits together, they could probably much more closely emulate the order of development that we would see if we watched a single developer build the whole system, constrained only by the inherent constraints of needing to build enough of X in order to know enough to build Y and discovering, as they go along, enough bits that can be naturally carved off and done in parallel to keep everybody busy.2

So how does all this relate to Brooks’s Law? In the concluding paragraph of the chapter, right after he has stated his Law, Brooks goes on to say:

The number of months of a project depends upon its sequential constraints. The maximum number of men depends upon the number of independent subtasks. From these two quantities one can derive schedules using fewer men and more months. ... One cannot, however, get workable schedules using more men and fewer months. (pp. 25-6)

The last sentence is only true if the number of workers currently on the project is sufficient to take advantage of all the opportunities for parallelism. For instance, suppose we have a project consisting of forty individual tasks, each of which will take a week’s worth of work by one person. Now suppose ten of those tasks are inherently sequentially constrained while the other thirty tasks can be done at any time, in any order. Because of the ten sequentially constrained tasks, the project can’t be completed in any less than ten calendar weeks. But suppose the project has been assigned to a two-person team. It will take them twenty weeks to do the whole project, ten weeks longer than the minimum. Clearly in this case, we can get “workable schedules using more men and fewer months” by adding one or two people to the team. A team of three would finish in a bit over thirteen weeks and four would finish in the minimum time of ten weeks. To say that we can’t reduce calendar time because of sequential constraints would only be correct if we had originally assigned the project to a four-person team.
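The forty-task example can be checked with a few lines of Python. This is only a sketch of a lower bound, under the assumptions that every task takes one person exactly one week, that the free tasks can always be fitted around the sequential chain, and that fractional weeks are allowed (which is how a team of three gets "a bit over thirteen weeks"). The function name `minimum_weeks` is made up for the illustration.

```python
def minimum_weeks(total_tasks, sequential_tasks, team_size):
    """Lower bound on calendar weeks for the project.

    Two floors apply: the sequential chain cannot be compressed by adding
    people, and the total work cannot be done faster than it divides
    among the team.
    """
    chain_floor = float(sequential_tasks)
    capacity_floor = total_tasks / team_size
    return max(chain_floor, capacity_floor)


if __name__ == "__main__":
    for team_size in (2, 3, 4, 5):
        weeks = minimum_weeks(40, 10, team_size)
        print(f"{team_size} people: at least {weeks:.1f} weeks")
```

Beyond four people the bound stops improving, which is the point at which Brooks's sentence about sequential constraints actually becomes true for this project.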

In general, given that Brooks’s Law is talking about late projects, that is, ones we badly underestimated in the first place, what’s the likelihood that our estimate of how many people we needed was exactly right? The real question, if we’re concerned about sequential constraints, is whether or not there’s work that could be done in parallel. Sometimes there is and sometimes there isn’t and assuming that there never is is just as foolish as assuming that there always is.


1. In other words, the only thing worse than paying the costs of communication is not paying the costs of communication. Because we will pay them eventually, with interest.

2. Obviously if the team pair programs then the partitioning problem is made quite a bit easier as n people need only n/2 tasks to keep everyone busy, rather than n.

Last updated 2007-05-22T08:56:21+7:00.

Brooks’s Law: training costs, but not as much as you might think

Surely everyone involved in developing software has heard of Brooks’s Law. First presented in the eponymous chapter of Frederick P. Brooks, Jr.’s classic The Mythical Man Month, it states: “Adding manpower to a late software project makes it later.” This “law” is much beloved by software developers as a handy bucket of cold water with which to cool the ardor of overly enthusiastic managers and executives. Lately, however, I’ve been thinking about Brooks’s Law and rereading The Mythical Man Month and I’m no longer as impressed with Brooks’s analysis as I once was. This is the first in what I expect will be a series of posts discussing some of the reasons why.

When Brooks says that adding manpower makes a late project later, he doesn’t specify what he means by later. Later than it already is? Almost certainly, but so what? Later than your new wildly optimistic estimate? Probably, but again not all that interesting. The slightly paradoxical interpretation that makes Brooks’s Law such a perennial on amusing quotation lists is: later than it would have been if you had just left well enough alone.

Of the various reasons Brooks gives in the chapter “The Mythical Man Month” for projects running out of calendar time, the only one that has specifically to do with adding staff to an existing project is the cost of training the added staff. There are other costs associated with having a bigger team, such as potentially increased intercommunication costs and the need to repartition tasks. I’ll discuss those costs in later posts but for now I’m concerned only with whether Brooks’s own analysis of the costs of training holds water.

If we were to take Brooks’s Law as literally true, then we would have to believe that the costs of training new staff will always be higher than any capacity for productive work they might eventually develop. That seems unlikely. However, Brooks’s Law only refers to “late” projects so perhaps there’s something about being late that makes it true. Unfortunately, he doesn’t define “late” any more than he defines “later” so if we want to apply Brooks’s Law wisely we’re on our own — we need to ask, when can we get more done by adding staff than by not?

Much more often than Brooks lets on, it seems. In the section “Regenerative Schedule Disaster” Brooks uses a hypothetical project, originally estimated to be twelve person-months of effort and assigned to a three person team, to demonstrate how training costs affect our ability to speed up a project. In his scenario the project has been divided into four milestones, each of which should be completed in one calendar month by the team of three, i.e. three person-months per milestone. Unfortunately it takes the team two calendar months, or six person-months, to finish the first milestone, so there are only two months left to complete the remaining three milestones. Brooks then considers two sub-scenarios — one where only the first milestone was mis-estimated, in which case there are nine person-months worth of work left and two months in which to do it, and another where the underestimation was systematic so the three remaining milestones are all, like the first, six person-months of work leaving eighteen person-months of work. The question he then poses is, what happens if a manager attempts to get the project finished in the remaining two calendar months by adding staff.

In the first sub-scenario, a manager who ignores training costs would calculate that they need four and a half people to do nine person-months of work in two months. Rounding up to five, subtracting the three they’ve already got, and they add two people. In the second scenario, eighteen divided by two is nine, subtract the three they’ve got, and they’d need to grow by six. Brooks then analyzes the first sub-scenario, making the rather conservative assumption that it’ll take one month of full-time work by one of the existing team members to train the two newcomers before they’ll be able to do any work. Under that assumption, only two people will do productive work during the third month so only two more person-months of actual work will be done, leaving seven. In the fourth, and final, month, the new people will start contributing and the trainer can get back to real work but it’s too late — they’ll get five person-months worth of work done but with seven left to do the schedule is blown.

But there’s another way to look at it. With the two newcomers, the team managed to complete a total of thirteen person-months worth of actual (non-training) work, or almost 87% of the originally planned functionality (assuming the revised estimate of fifteen person-months for the whole project is correct.) What would have happened if they had heeded Brooks’s Law and just kept going with the original three-person team? They’d have completed only twelve person-months, or 80% of the originally planned effort. Or, if it’s more important to deliver 100% of the functionality as soon as possible, the original team would have needed another month, blowing the schedule by 25% while the augmented team would only need an additional two-fifths of a month, or about 10% over the original schedule.

In Brooks’s second sub-scenario, where the actual project size is assumed to be twenty-four person-months, the benefits of adding staff are even more pronounced. Assuming the same one month of full-time training, the augmented team finishes almost 71% of the originally planned effort in four months compared to only 50% by the original team. Or they can finish the whole project in a bit less than five months total, extending the original schedule by about 20%, compared to the 100% by which the original team would blow the original schedule.
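The numbers for both sub-scenarios can be reproduced with a small Python sketch. It assumes the training model described above (one veteran trains full time during month three, the newcomers produce nothing that month, and everyone is fully productive from month four on) and counts the six person-months already spent on the first milestone as work done. The function names are invented for this illustration.

```python
def work_by_month_four(team, new_hires=0, trainers=0, done_so_far=6):
    """Person-months of real (non-training) work finished after four months."""
    month_three = team - trainers      # trainers do no project work this month
    month_four = team + new_hires      # newcomers are now productive
    return done_so_far + month_three + month_four


def extra_months(total_work, done, team_size):
    """Months past the four-month deadline needed to finish what is left."""
    return max(0.0, (total_work - done) / team_size)


if __name__ == "__main__":
    # name -> (actual project size in person-months, newcomers added)
    scenarios = {"first sub-scenario": (15, 2), "second sub-scenario": (24, 6)}
    for name, (total, hires) in scenarios.items():
        augmented = work_by_month_four(3, new_hires=hires, trainers=1)
        original = work_by_month_four(3)
        print(name)
        print(f"  augmented team: {augmented / total:.0%} done, "
              f"{extra_months(total, augmented, 3 + hires):.2f} months over")
        print(f"  original team:  {original / total:.0%} done, "
              f"{extra_months(total, original, 3):.2f} months over")
```

In both cases the augmented team misses the four-month deadline but still finishes sooner than the original team would have.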

The problem is not that adding staff to the project didn’t help; it’s that it didn’t help quite enough. You might ask, why not account for the training costs when figuring out how many new staff are needed? Brooks briefly considers that idea and rejects it on the grounds that the seven person team needed in the first sub-scenario to finish the remaining seven person-months worth of work after training would be too different in kind from a five person team for it to be feasible. That may be true but the question remains, what’s the alternative? Brooks considers attempts to finish the project on the original four-month schedule “disastrous” and recommends that we should instead “reschedule” or “trim the task”. Both of those are probably wise strategies but even with Brooks’s conservative assumptions about training time, the expanded team would still get more done needing either less of a schedule slip or less trimming of functionality.

At any rate, it’s not the case, in either sub-scenario, that training costs on their own would cause the project to finish later with additional staff than it would have without. Brooks does, however, make one important point when he says:

Notice by the end of the third month, things look very black. The [second] milestone has not been reached in spite of all the managerial effort. The temptation is very strong to repeat the cycle, adding yet more manpower. Therein lies madness. (pp. 24-5)

It is important not to lose one’s nerve. If you’ve already used up two months of a four-month schedule, it’s going to be queasy-making to reduce your productivity by a third for another whole month. If you do, you’ve got to stick with it to reap the benefits as your new workers get up to speed. Brooks’s warning also suggests two bits of tactics. First, make sure you add enough new staff. If you’re going to take the hit of losing the output of one or more of your currently productive workers to training you want to make sure you get as big a return on that investment as possible: add as many people as you can afford and as you think can be trained in a reasonable amount of time. Second, make sure you invest enough in training. In his own reappraisal of Brooks’s Law Steve McConnell called Brooks’s assumptions about training costs “absurdly conservative” and they may be. But notice that even with those conservative assumptions the investment can still pay off quickly. It can be tempting to try to cheat, adding staff without explicit training, hoping they’ll somehow get up to speed on their own. If it works, great, but more likely they’ll just nibble away at the productivity of the current staff without ever becoming productive enough to offset the cost. Better to plan conservatively and then end the training ahead of schedule if they’re ready to get to fully productive work sooner than planned.

Last updated 2007-05-18T09:46:31+7:00.

One woman can't have a baby in nine months

As all software developers know, nine women can’t have a baby in a month. Or in Fred Brooks’s more elegant phrasing: “The bearing of a child takes nine months, no matter how many women are assigned.” (The Mythical Man Month, p. 17) The point, of course, is that some tasks are, as Brooks would say, “sequentially constrained”. They’re going to take a certain amount of time no matter what — the time can’t be reduced by having more people work on them.

On the other hand, is it actually the case that one woman can have a baby in nine months? Suppose we have just been put in charge of Project New Baby, which must produce a brand new baby in nine months. How should we staff the project? Easy enough: nine women can’t have a baby in a month, right? No point in overstaffing so we’ve just got to find a couple, make sure they’re both fertile and interested in having a kid, and we’re good to go. But wait a sec’, what’s the chance they’ll miss our nine-month deadline by more than a month? Pretty high, it turns out.

Typically a couple trying to get pregnant has about a 16% chance in any given month. Once they’ve conceived, there’s, sadly, about a 15-20% chance of miscarriage, usually within the first three months. So the chance our couple will produce a baby nine months from now is only .16 × .85 or 13.6%. If we wanted to we could compute the average time we should expect it to take for one couple to have a baby, using math similar to that in an earlier post. But suppose the deadline is hard — we really, really need to finish Project New Baby in nine months — is there anything we can do?

Sure. Throw bodies at it. While a single couple has an 86.4% chance of missing our deadline, if we had two couples, the chance that they’d both miss it is only .864² or about 74.6%. With three couples, the chance of blowing it is down to about 64.5%. To figure out how many couples we need to have a P chance of hitting our deadline, just plug P into this formula:

$$ n = \left\lceil \frac{\log(1 - P)}{\log(0.864)} \right\rceil $$

Of course, this could get expensive if we need to be really certain of hitting that deadline: to have a 90% chance of hitting it we’d need sixteen couples. But depending on how important Project New Baby and its deadline are, it might be worth it.
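For anyone who wants to play with the numbers, here is a minimal Python sketch of the staffing calculation. It assumes, as the text does, that each couple independently has a .16 × .85 chance of delivering on time; the function names are just labels chosen for this example.

```python
import math

# Chance that one couple delivers on time: conceive this month (16%)
# and no miscarriage (85%), as assumed in the text.
P_ON_TIME = 0.16 * 0.85


def chance_all_miss(couples, p_on_time=P_ON_TIME):
    """Probability that every one of the couples misses the deadline."""
    return (1 - p_on_time) ** couples


def couples_needed(target, p_on_time=P_ON_TIME):
    """Smallest number of couples giving at least `target` chance of a baby on time."""
    return math.ceil(math.log(1 - target) / math.log(1 - p_on_time))


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"{n} couple(s): {chance_all_miss(n):.1%} chance of missing")
    print("couples for a 90% chance:", couples_needed(0.90))
```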

So what in software is like making babies? Let’s take a look at how Brooks himself tied making babies to making software:

The bearing of a child takes nine months, no matter how many women are assigned. Many software tasks have this characteristic because of the sequential nature of debugging. (p. 17)

Unfortunately, I haven’t been able to find anywhere where he explains what he means by “the sequential nature of debugging” but I can see how debugging is like having a baby. And not just in that they both can be incredibly painful and that you have a great feeling of relief when you’re done. The similarity that I see is that the time it takes to find a bug has a large random component, like trying to conceive a child. Basically when you’re looking for a bug, there’s some probability p that you’ll find the bug for each unit of time that you spend looking, just like a couple has a 16% chance of getting pregnant for each month they spend trying. If you’re a skillful debugger and know your code really well p will be higher but there’s always a random element: if you go down the wrong path it can take you a while to realize it and all that time is lost whereas if you had tried a different path first, you might have found the bug right away. This is why it’s almost useless to try to estimate how long it will take to find a bug. You could find it in the next five minutes or five weeks from now. Once you find the bug of course you also have to fix it but that tends to be less random. Unlike a pregnancy, which always lasts about nine months after conception, different bugs will require more or less work to fix, but once you’ve found one you can usually characterize how big a job it’ll be. And for many, if not most, bugs finding them is the hard part: once you’ve well and truly tracked them down, the fix is often trivial.

All of which suggests we can use the same technique to speed up debugging as we did on Project New Baby: throw bodies at it. Suppose we’re ten days from the end of a release and there’s one last serious bug to be tracked down. Suppose my chance of finding it is 10% per day. The chance that I won’t find it in the next ten days is (1 − .1)^10 or about 35%. But if there’s someone else who can also look for it (say, a pair programming partner) who also has a 10% chance of finding it per day, and we both work at it separately, then the chance that the bug will remain at large by the end of the release drops to 12%. If we can throw even more developers at it, then the chances of the bug escaping drop even more: 4% chance with three developers, 1% with four, 0.5% with five.
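The percentages above come from a one-line formula, sketched here in Python. The sketch assumes that every developer searches independently with the same 10% daily chance of success, which is the simplification the paragraph makes; `chance_bug_survives` is a name invented for the example.

```python
def chance_bug_survives(developers, days=10, p_find_per_day=0.10):
    """Probability that nobody finds the bug before the release.

    Each developer-day is treated as an independent trial, so the bug
    survives only if all days * developers trials fail.
    """
    return (1 - p_find_per_day) ** (days * developers)


if __name__ == "__main__":
    for developers in range(1, 6):
        print(f"{developers} developer(s): "
              f"{chance_bug_survives(developers):.1%} chance the bug escapes")
```

If the developers' searches overlap (everyone tries the same wrong path first, say), the independence assumption fails and the real improvement will be smaller than this suggests.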

Obviously, to be able to take advantage of this strategy requires having multiple developers with enough familiarity with the code to be able to pitch in. Which seems to me a strong argument for practices such as pair programming and collective code ownership. An interesting side question is whether, if you do have developers to throw at debugging in this way, it is better for them to work independently or should they pair up for the debugging on the grounds that two heads are better than one?

Last updated 2007-05-10T20:57:55+7:00.

If Sisyphus had only had a partner

While working on another blog entry (still in progress) about Brooks’s Law, I got to thinking about pair programming and how it’s possible that two people working together, sitting at one computer, can be more productive than the same two people working on their own and combining their work. I certainly believe they can, based on my own experiences with pair programming. But after immersing myself in Brooksian notions of how communication costs quickly eat up all available productivity it seems a bit of a paradox. To the extent that writing software is like carrying rocks up a hill — and doesn’t it often feel that way? — here’s an explanation.

Suppose you have a hundred heavy rocks that you need to carry up a hill. They’re not so heavy that you can’t do it but they’re heavy enough that moderately often you’ll lose your grip and the rock will roll back to the bottom of the hill. Let’s say on each attempt to carry a rock up the hill there’s a 70% chance you’ll lose your grip. Assume that when you don’t drop the rock it takes five minutes to carry it up the hill and a minute to walk back down. First question: how long will it take you to get all the rocks to the top of the hill? Obviously in practice it depends on how often that 70% chance of dropping the rock actually bites you, but we can figure out an expected value. If the drops are randomly distributed (sometimes near the bottom of the hill and sometimes near the top) you’ll lose an average of three minutes per drop. But once you drop a rock you have to start all over again with it and there’s a chance you’ll drop it again. Thus the amount of time you should expect to spend on each rock is six minutes plus the sum of this infinite series:

$$ \sum_{n=1}^{\infty} 3 \times 0.7^{n} = \frac{3 \times 0.7}{1 - 0.7} = 7 $$

Add that seven minutes to the six minutes to get it to the top of the hill without dropping and we get an average of thirteen minutes per rock, or 1,300 minutes for all one hundred rocks.
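The thirteen-minute figure can be checked numerically. The sketch below assumes the lost time is the geometric series of three minutes times 0.7 to the n-th power (a reading that reproduces the seven minutes quoted above), and compares a long partial sum with the closed form. The function names are made up for the illustration.

```python
def expected_lost_minutes(p_drop=0.7, lost_per_drop=3.0, terms=200):
    """Partial sum of lost_per_drop * p_drop**n for n = 1..terms."""
    return sum(lost_per_drop * p_drop ** n for n in range(1, terms + 1))


def expected_minutes_per_rock(p_drop=0.7, clean_trip=6.0, lost_per_drop=3.0):
    """Closed form: a clean round trip plus lost_per_drop * p / (1 - p)."""
    return clean_trip + lost_per_drop * p_drop / (1 - p_drop)


if __name__ == "__main__":
    print("lost to drops (series):", round(expected_lost_minutes(), 6))
    print("minutes per rock:      ", round(expected_minutes_per_rock(), 6))
    print("minutes for 100 rocks: ", round(100 * expected_minutes_per_rock(), 3))
```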

Now suppose you had a partner. Assuming there’s room for two people to carry rocks at the same time, one way to reduce the time it takes to get all the rocks to the top of the hill would be to simply each carry fifty rocks — the 1,300 minutes would be cut in half, to 650 minutes. But there’s another possibility — since the rocks are just a bit too heavy for one person to manage 100% reliably perhaps the two of you working together would be strong enough to never drop a rock. In that case, you could carry all hundred rocks up without dropping any and the whole job would take only 600 minutes, even better than splitting the work.

Of course if the chance of one person dropping a rock was lower, then working separately might be a better bet. In fact we can figure out exactly what probability makes it better to work separately or together by solving this inequality for p:

$$ \frac{x + \sum_{n=1}^{\infty} \frac{x}{2} p^{n}}{2} > x $$

The numerator of the left hand side represents the expected time it’ll take for one person to get one rock to the top if it takes x minutes with no drops. We divide by two to account for the fact that there are two people working at it. The right hand side represents the time taken with both folks working together and never dropping a rock. After some algebra the xs all go away and it turns out that when the probability of dropping a rock is greater than ⅔ it’s better to pair up than to work separately.
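Here is a small Python sketch of that comparison. It assumes, as in the rock example, that each drop costs half of a clean trip on average, so the expected solo time per rock is x + (x/2) · p/(1 − p); the function names are invented for the illustration.

```python
def minutes_per_rock_separately(p_drop, x=6.0):
    """Effective time per rock when two people each carry rocks alone.

    One person expects x + (x / 2) * p / (1 - p) minutes per rock; two
    people working in parallel halve the time per rock.
    """
    solo = x + (x / 2) * p_drop / (1 - p_drop)
    return solo / 2


def minutes_per_rock_together(x=6.0):
    """Time per rock when the pair carries each rock jointly and never drops it."""
    return x


if __name__ == "__main__":
    for p_drop in (0.5, 2 / 3, 0.7, 0.8):
        apart = minutes_per_rock_separately(p_drop)
        better = "pair up" if apart > minutes_per_rock_together() else "work separately"
        print(f"p = {p_drop:.3f}: {apart:.2f} min/rock apart -> {better}")
```

At exactly ⅔ the two strategies tie, so floating-point rounding may tip the printed verdict for that row either way.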

Now, a ⅔ chance may seem fairly high but it’s worth thinking about where that probability comes from. Let’s consider how a ⅔ chance of dropping the rock over five minutes relates to the chance that we’ll drop it in any single minute. To back out the per-minute chance of dropping, given the total probability of dropping and the number of minutes, we start by recognizing that the probability of dropping is equivalent to one minus the probability of not dropping. And to not drop for five minutes we need to not drop for one minute, five times in a row. More generally, to not drop for m minutes, we need to not drop for one minute, m times in a row. If h is the probability of holding (i.e. not dropping) for one minute, and the probability of holding in any one minute is independent of any other minute (i.e. dropping is more or less random and not the result of fatigue), then the combined probability of holding for m minutes is h^m. Thus if D is the probability that we’ll drop a rock at any time in m minutes, then we can figure out h, the probability that we can hold a rock for a minute, and from there, trivially, d, the probability that we’ll drop it in any one minute, for a given D and number of minutes m as shown here:

$$ D = 1 - h^{m} $$

$$ h^{m} = 1 - D $$

$$ h = (1 - D)^{1/m} $$

$$ d = 1 - h = 1 - (1 - D)^{1/m} $$

Plugging our ⅔ chance and 5 minutes into this formula we find that a ⅔ chance of dropping over five minutes is equivalent to about a 20% chance of dropping in any single minute. If we want to find out the probability that we’ll drop a rock over an m-minute trip, given d, we can use this formula:

$$ D = 1 - (1 - d)^{m} $$

Or perhaps more to the point we can solve this inequality:

$$ 1 - (1 - d)^{m} > \frac{2}{3} $$

to determine the relationship between d and m that determines when the total probability of dropping is greater than the ⅔ chance that makes it worthwhile to pair up rather than working separately:

$$ (1 - d)^{m} < \frac{1}{3} $$

$$ 1 - d < \left(\frac{1}{3}\right)^{1/m} $$

$$ d > 1 - \left(\frac{1}{3}\right)^{1/m} $$

With this formula we can see that if it only took us three minutes to climb the hill, we could live with up to a 31% chance of dropping per minute before pairing would make sense. But if it took us 20 minutes, then we’d do well to pair up even if every minute we had nearly a 95% chance of keeping our grip.
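The conversions between the per-minute and per-trip probabilities are easy to get wrong by hand, so here is a short Python sketch of them. It assumes the minutes are independent, as stated above, and uses the ⅔ break-even derived earlier; the function names are chosen just for this example.

```python
def per_minute_drop(total_drop, minutes):
    """d: per-minute chance of dropping, given D over a trip of `minutes`."""
    return 1 - (1 - total_drop) ** (1 / minutes)


def per_trip_drop(per_minute, minutes):
    """D: chance of dropping at some point during a trip of `minutes`."""
    return 1 - (1 - per_minute) ** minutes


def pairing_threshold(minutes, break_even=2 / 3):
    """Per-minute drop chance above which pairing beats working separately."""
    return per_minute_drop(break_even, minutes)


if __name__ == "__main__":
    print(f"2/3 over 5 minutes is {per_minute_drop(2 / 3, 5):.1%} per minute")
    for minutes in (3, 5, 20):
        print(f"{minutes}-minute climb: pair up above "
              f"{pairing_threshold(minutes):.1%} per minute")
```

For the twenty-minute climb the threshold comes out a little above 5% per minute, which is why a grip that holds just under 95% of the time each minute is already enough to make pairing worthwhile.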

So is developing software like carrying rocks? I’d argue that in many ways it is. Programming requires keeping a bunch of things in mind and if you lose your mental grip on any of them for a moment you either have to backtrack to re-figure out how things fit together or, worse yet, you proceed with a faulty understanding and introduce a bug which later requires a lot of time to track down and fix. In fact programming is in some ways worse than carrying rocks because the cost of a momentary slip of concentration can be much more than simply the equivalent of a rock rolling back to where you started. A bug that you create a few hours into a programming session may take many hours or even days to track down and fix. Luckily pairing can help there too — while one partner is focusing their mind on the next thing the other partner’s mind may linger for a moment and have a “Wait a sec’” moment that catches a bug before it gets too far away, the equivalent of catching a dropped rock before it rolls all the way back down the hill. Or when bugs do get in, a pair can often find them faster than a single programmer, much the way two people would be able to find a dropped rock if it didn’t just roll back to the bottom of the hill but bounced off in some random direction into thick weeds.

Last updated 2007-05-08T21:00:26+7:00.

Software estimation considered harmful?

A few weeks ago I was in the midst of reading Steve McConnell’s Software Estimation: Demystifying the Black Art when I had the conversation with my friend Marc that I have written about previously. Marc, as I mentioned before, is the founder of and Chief Product Officer at a small startup called Wesabe. When I told him what I was reading, he asked, “Do you believe it?”

“Sure,” I said. By which I meant nothing in the book had struck me as patently bogus. A lot of it is good sense about the limits of estimation, the relation of estimation to planning, and why estimation is so hard. Parts II and III of the book present specific estimation techniques that, assuming one had the relevant inputs and historical data, seem likely to be capable of producing fairly accurate estimates. Now, the descriptions of some of these techniques made me think — wow, if that’s what it takes to produce good estimates, no wonder we all muck along with crappy ones. But it did for the most part seem like the kind of thing we Serious Software Professionals™ should be doing.

Marc — it turns out — is far more skeptical about the whole enterprise of software estimation. He tells me that at Wesabe they never make schedule estimates. He manages his developers, as he has explained in a blog entry, by trying to get them excited enough about the features he thinks should be added to Wesabe that they decide to go ahead and add them. Or they may get excited about their own ideas and add them instead. Marc retains final authority over what gets added to the product and a Wesabe developer who consistently gets excited about developing things that Marc refuses to allow into the product should probably make sure their resume is up to date. But he never asks them for estimates. He does encourage his developers to spend most of their time working on things that they can finish quickly and get into the product, but that seems to be as much about what he thinks most likely to keep his developers happy as anything else. When they need to, Wesabe developers will tackle bigger projects, still without estimating how long they will take. Marc’s point of view, I take it, is that the only reason to do these big projects is because you have to, and if you have to, it doesn’t really matter how long it’s going to take.

Now, Marc’s a smart guy and he’s been managing software developers for as long as I’ve known him (more than a decade) and I know he thinks a lot about how he does what he does. On the other hand, Steve McConnell’s also a smart guy whose books I’ve been a big fan of for about as long. So, how to reconcile these two points of view? Is Marc’s approach only tenable in a startup? Or maybe McConnell’s approach to estimation is only worthwhile in big organizations, that are doing more or less the same thing over and over again.

So I returned to reading Software Estimation with a new question in mind: Why estimate? The nearest McConnell comes to an answer is in section 1.7 “Estimation’s Real Purpose”, where he gives this definition of a good estimate:

A good estimate is an estimate that provides a clear enough view of the project reality to allow the project leadership to make good decisions about how to control the project to hit its targets.

I think the key word in McConnell’s definition is “targets”. The reason Marc can get away with not estimating is because he’s found a way to manage without setting targets. So the question “Why estimate?” is better expressed as “Why set targets?”

Sometimes we set targets in order to convince others, or ourselves, that something can be done. We may set targets to inspire ourselves to do more, though it’s not clear that’s a winning move, and even less so when managers set a target to “inspire” the folks who work for them. (See DeMarco and Lister’s Peopleware, chapter 3 and the discussion of Spanish Theory managers.) We may also set targets to give ourselves a feeling of control over the future, illusory though that feeling may be. After the fact, a target hit or missed can tell us whether or not we did what we set out to do. However if we missed a target, we can’t know whether that’s because the target was unrealistic or because we didn’t perform as well as we should have. Setting and hitting targets does make it look like we know what we’re doing but we need to keep in mind that targets rarely encompass all the things we care about — it’s much easier to set a target date for delivering software than a target for how much users will love it.

If the goal is simply to develop as much software as we can per unit time, estimates (and thus targets) may be a bad idea. In chapter 4 of Peopleware, DeMarco and Lister discuss a study done in 1985 by researchers at the University of New South Wales. According to Peopleware the study analyzed 103 actual industrial programming projects and assigned each project a value on a “weighted metric of productivity”. They then compared the average productivity scores of projects grouped by how the projects’ estimates were arrived at. They found, as folks had long suspected, that programmers are more productive when working against their own estimates as opposed to estimates created by their boss or even estimates created jointly with their boss, averaging 8.0 on the productivity metric for programmer-estimated projects vs 6.6 and 7.8 for boss-estimated and jointly-estimated. The study also found that on projects where estimates were made by third-party system analysts the average productivity was even higher, 9.5. This last result was a bit of a surprise, ruling out the theory that programmers are more productive when trying to meet their own estimates because they have more vested in them. But the real surprise was that the highest average productivity, with a score of 12.0, was on those projects that didn’t estimate at all.

There is, however, one other reason to estimate: to coordinate our work with others. The marketing department would like the developers to estimate what features will be included in the next release so they can get to work writing promotional materials. Or one developer wants to know when a feature another developer is working on will be ready so he can plan his own work that depends on it. Note however, that in cases like this, estimates are really just a tool for communication. Marketing needs to know what’s going to end up in the release and the developers, by virtue of being the ones building it, have the information and somehow that information has to be communicated from the developers to the marketers. But there are lots of ways that could happen. In a small company it might happen for free — everyone knows what everyone else is working on so the marketers will have as good an idea as anyone what’s actually going to be ready for the release. If water-cooler conversations are insufficient, then marketing and development could meet to talk about it on a regular basis. Or the developers could maintain an internal wiki about what’s going on in development. Some of these methods may work better than others but it’s not a given that using estimates is always the best way.

To decide whether estimates are a good way to communicate, we need a way to compare different methods of communication. I’d argue that all methods of communication can be modeled, for our present purposes, with the following five parameters:

The idea is that communication happens in this pattern: one or more people spend some amount of time preparing to communicate. This would include activities such as thinking, writing, estimating, etc. Then the communication proper happens, which takes some time. This may require time from both the sender and the receiver (conversations, meetings, presentations, etc.) or only the receiver (reading an email, looking at a web site).

After the communication is complete, some amount of information has been conveyed and also, sadly, some amount of misinformation. The misinformation may arise from faulty preparation, from misstatements by the sender, or from misunderstandings by the receiver. Obviously different methods of communication will be able to convey more or less information in a given amount of time and will be more or less prone to miscommunication.

Finally, different methods of communication can have benefits beyond the information conveyed and costs other than the time spent and the misinformation conveyed. For instance, chatting around the water-cooler may build team cohesion while highly contentious meetings may have the opposite effect. Another important kind of benefit is that the preparation and communication phases may themselves improve the communicators’ understanding of what they are communicating about. For example, writing clearly on any topic invariabl