Wednesday, July 18, 2007

the sentence dictionary

If I may brag just a little bit, I am really excited to have just been hired as a freelance lexicographer for Oxford University Press. They (no, we!) are beginning a new collaborative online project to compile something called the Sentence Dictionary. Essentially it will be a web-based dictionary built on the existing New Oxford Dictionary of English (NODE) and New Oxford American Dictionary (NOAD), Oxford's major single-volume dictionaries of contemporary (British and American, respectively) English. As I understand it, the Sentence Dictionary will include headwords, pronunciations, definitions, and etymologies identical to those in NODE and NOAD, but will add much more thorough documentation of each word's usage "in the wild," as it were, in the form of a list of real-life examples for each headword. This will be particularly useful for foreign learners of English, and for native speakers looking up unfamiliar words that they wish to use idiomatically in their own writing.

The examples are to be culled from a corpus of contemporary English, the billion-plus-word Oxford English Corpus, which contains many thousands of books, web sites, magazine articles, stories, journals, blogs, and the like, all of which have been hand-selected, tagged for their country of origin, and vetted for quality to some degree. Oxford's corpus software is able, in many instances, to tell when a word is being used as a particular part of speech and in a particular sense; for example, it can tell with reasonable accuracy when "ground" is being used as a noun meaning "earth," as the past participle of "grind," or as a verb meaning "stop (a plane, etc.) from flying." But the software's accuracy is far from 100%, and even when it does make the correct choice, a human is needed to determine which are the "best" examples. An example sentence that disparages a particular politician, for instance, is undesirable both because it expresses a strong opinion that may distract from the sentence's illustrative function and because it contains a reference that could soon seem dated or obscure.

This is where we freelance lexicographers come in. In a test exercise I did for Oxford as part of the application process, I went through hundreds of potential example sentences (all for words beginning with adv-) and determined which were the most suitable. (Let me say that I have a much fuller appreciation for the subtleties of the word "adventure" than I did before!) Apparently my soon-to-be bosses agreed with my decisions in enough cases that they want me to do more such work for them. It's not the most dynamic sort of task, but it is kind of dorkily influential. I am now a lexicographical taste-maker of sorts, however anonymously. How exciting!

2 comments:

Jared Alessandroni said...

Your nerdiness exceeds all expectations. Congrats!

Shifra said...

You are so cool. Congratulations on being the biggest badass ever!