Statistics and Scrabble, Together At Last

Sitting on my to-do list for a while now has been an exploration of Scrabble from an experimental design point of view: how to design a tournament so that the variance is as small as possible while still preserving the feel of the home game for its players. One goal was to figure out a way to carry out a true "duplicate" version of Scrabble, in which multiple pairs of players have access to the same tiles, unlike the duplicate version currently popular in Europe, which has no defensive element to it.

I'm proud (relieved?) to say that I've finally finished the first draft of this work for two-player head-to-head games, with a duplication method that ensures that if the game were repeated, each player would receive tiles from the reserve in the same sequence: think of the tiles being laid out in order (but unseen to the players), so that one player draws from the front and the other draws from the back. Like Lady and the Tramp with spaghetti:

tramp.jpg
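For the programmers in the audience, here is a minimal Python sketch of the two-ended draw. It is not the code I added to Quackle; the names `TwoEndedBag` and `make_tile_order` are made up for illustration, and the tile counts are the standard English-language distribution, with `?` standing in for the blank.

```python
import random
from collections import deque

# Standard English-language Scrabble distribution (100 tiles); "?" is the blank.
TILE_COUNTS = {
    "A": 9, "B": 2, "C": 2, "D": 4, "E": 12, "F": 2, "G": 3, "H": 2, "I": 9,
    "J": 1, "K": 1, "L": 4, "M": 2, "N": 6, "O": 8, "P": 2, "Q": 1, "R": 6,
    "S": 4, "T": 6, "U": 4, "V": 2, "W": 2, "X": 1, "Y": 2, "Z": 1, "?": 2,
}


def make_tile_order(seed):
    """Return one reproducible shuffled ordering of the full tile set."""
    tiles = [letter for letter, n in TILE_COUNTS.items() for _ in range(n)]
    random.Random(seed).shuffle(tiles)
    return tiles


class TwoEndedBag:
    """Hypothetical bag: player 0 draws from the front, player 1 from the back."""

    def __init__(self, order):
        self._tiles = deque(order)

    def draw(self, player, n):
        """Draw up to n tiles from this player's end of the line."""
        drawn = []
        for _ in range(min(n, len(self._tiles))):
            if player == 0:
                drawn.append(self._tiles.popleft())
            else:
                drawn.append(self._tiles.pop())
        return drawn

    def __len__(self):
        return len(self._tiles)


order = make_tile_order(seed=42)
bag = TwoEndedBag(order)
rack_a = bag.draw(0, 7)  # first seven tiles of the order
rack_b = bag.draw(1, 7)  # last seven tiles of the order, back to front
print(rack_a, rack_b, len(bag))
```

Because each player only ever pulls from their own end, the sequence of tiles a player sees depends on how many tiles they have drawn, not on when the opponent draws. That is what lets the same tile order be replayed by another pair.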

I modified the Scrabble simulator Quackle to accept a predetermined tile order, then simulated over one million matches between two copies of Quackle's "Speedy Player", using each of 10,600 tile orders 100 times. One goal of this was to figure out how much of the variance in score comes from the tile order and how much comes from play on the board once the tile order is fixed. It turns out to be about half bag, half board, so if this scheme could be used in tournaments, it would visibly cut down the number of matches needed to figure out the best player (though it would need a Goldbergian apparatus to implement in live games).
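One way to think about the bag/board split is as a one-way random-effects decomposition: the spread of the per-order averages gives the "bag" part, and the spread of repeated games within a single order gives the "board" part. The sketch below is only an illustration of that idea, not necessarily the analysis in the draft. It runs on synthetic numbers rather than Quackle output, and the function names and the standard deviations of 50 points are assumptions chosen so that the true split is half and half.

```python
import random
from statistics import mean, variance


def variance_components(scores_by_order):
    """Method-of-moments split for a balanced design.

    scores_by_order: one list of scores per tile order, all the same length.
    Returns (between_order_variance, within_order_variance).
    """
    n_rep = len(scores_by_order[0])
    order_means = [mean(s) for s in scores_by_order]
    # Board component: average variance among replays of the same order.
    within = mean(variance(s) for s in scores_by_order)
    # Bag component: variance of the order means, less the share of that
    # variance that is just replay noise averaged over n_rep games.
    between = max(0.0, variance(order_means) - within / n_rep)
    return between, within


def simulate(n_orders, n_rep, sd_order, sd_board, seed):
    """Synthetic scores: an order effect plus independent replay noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_orders):
        effect = rng.gauss(0, sd_order)
        data.append([effect + rng.gauss(0, sd_board) for _ in range(n_rep)])
    return data


# Assumed values for illustration only, not results from the simulations.
data = simulate(n_orders=500, n_rep=100, sd_order=50.0, sd_board=50.0, seed=1)
between, within = variance_components(data)
share = between / (between + within)
print(f"share of variance from tile order: {share:.2f}")
```

With equal standard deviations for the two sources, the estimated share should land near one half, which is the pattern described above.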

Some other findings from the simulations:

  • The blank is worth about 30 points to a good player, and each S about 10.
  • The Q is a burden to whichever player receives it, effectively serving as a 5-point penalty. It reduces bingo opportunities, since the player needs either a U or a blank for a chance at a bingo and its 50-point bonus.
  • The J is essentially neutral pointwise.
  • The X and the Z are each worth about 3-5 extra points to the player who receives them. The difficulty of fitting them into bingoes is offset by their usefulness in short words.

I have yet to draw any other conclusions about how I think the game should be modified, mainly because it's premature to do so without testing these ideas out on human players. Any volunteers?
