The New York Times has an article on Wordnik, the online dictionary. But some of the reporting, and even some of the commentary, provides the wrong impression, letting the reader believe that the process is entirely automated, done “without the arbiters.” It may be so on Wordnik’s part, but the databases on which it relies for its definitions have been populated by human lexicographers. Wordnik does not “pre-select and pre-prune,” as Erin McKean, one of Wordnik’s founders, puts it, but the databases it relies on have been pre-selected and pre-pruned by others. Wordnik is a great resource, but not one I would recommend without some trepidation.

Now don’t get me wrong. I really like Wordnik. It has become my go-to dictionary when I need to quickly look up a word and don’t need the detailed info that the OED provides. It’s fast and provides a wealth of information, but it does require some sophistication to use it well. While I like it for my own use, I would be hesitant to recommend the site to my undergraduates.

First, it’s important to differentiate between the two main parts of a Wordnik entry, the definitions and the usage citations. For the definitions, Wordnik’s algorithms search and cull from various public domain or licensed dictionaries. These definitions are not “without arbiters”; rather, the arbiters are one step removed. The definitions have all been created by human lexicographers, just not by ones employed by Wordnik. And because the sources are largely public domain, the definitions are mostly outdated. The American Heritage Dictionary, fourth edition, is more current than Webster’s 1913 or the Century Dictionary, both excellent dictionaries in their day but woefully outdated now, but even the AHD 4th isn’t the latest edition of that dictionary. Wordnik also often supplies definitions from Wiktionary or other web sources, which, while current, are of wildly varying accuracy. Having the old definitions is really useful, but the user must know that they can’t be relied upon for current usage.

The usage citations are better, and here the selection is truly done without human arbiters. But unlike traditional dictionaries, Wordnik does not usually provide a date for the citation. Many of the citations of “davenport,” to use the example quoted by Geoffrey Nunberg in the NY Times article, are from nineteenth-century novels found on Gutenberg.org. They are good citations, but you need to know the source in order to interpret them correctly. The site does not tell you, for instance, that Madeline Payne, The Detective’s Daughter, which it uses for three citations, was penned in 1883 and is not likely to represent current usage. Wordnik provides a link, so it’s easy to look the source up, but the user must be sophisticated enough to know that it’s important to click the link.

Wordnik does represent at least one way that dictionaries will go in the future, and as such it does represent the cutting edge of lexicography. But it is not truly “without arbiters,” and it does need some work to make it friendly for the casual user, the person who just wants the answer and either doesn’t know enough or doesn’t want to ponder how the entry was constructed and how it should be interpreted.

The NY Times article also contains this amusing correction, at least for now. Once the editors discover it, I’m sure it will be corrected. But will they issue a correction to the correction?:

Correction: December 31, 2011

An earlier version of this article misspelled the given name of Wordnik’s cheif executive. He is Joe Hyrkin, not Joel.

Disclaimer: Erin McKean was editor of my book, Word Myths.

