Work of the Week: Week 4
Posted in: Linguistics News
Welcome to Week Four of “Work of the Week”! Each Monday, we will post a list of the ten most frequent words in a well-known work of fiction or non-fiction, along with a count of the number of times each word occurred. The challenge, if you choose to accept it, is to figure out what the work for the week is. At the end of the week, we’ll post the answer along with the ten most frequent words from another work.
Some caveats:
Most of the works we’ve chosen were written in English, but a few are widely read translations into English.
The function words (pronouns, prepositions, auxiliary verbs, etc.) have been removed. These words would dominate the top ten lists but are usually uninformative.
If a top ten word would completely give away the name of the work, e.g., the word Moby, we’ve deleted it manually.
If a word occurred in different forms, e.g., spear and spears, or throw and threw, the forms have been merged together for the count.
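For readers curious how such a list might be produced, here is a minimal Python sketch of the steps described in the caveats above. It is not the students' actual program: the stop list `FUNCTION_WORDS`, the lookup table `BASE_FORMS`, and the function name `top_words` are all hypothetical and deliberately tiny, and a real pipeline would likely use a full stop list and a proper lemmatizer.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop list (assumption for illustration);
# a real list of function words would be far longer.
FUNCTION_WORDS = {
    "the", "a", "an", "and", "of", "to", "in", "at", "he", "she", "it",
    "his", "her", "is", "was", "have", "has", "had",
}

# Hypothetical lookup that merges inflected forms into one base form
# (assumption for illustration); a real system would use a lemmatizer.
BASE_FORMS = {
    "spears": "spear",
    "throws": "throw",
    "threw": "throw",
    "thrown": "throw",
}


def top_words(text, n=10, giveaways=()):
    """Return the n most frequent content words in text as (word, count) pairs."""
    # Crude tokenization: lowercase the text and keep runs of letters only.
    tokens = re.findall(r"[a-z]+", text.lower())
    # Words to skip: function words plus any manually listed giveaway words.
    banned = FUNCTION_WORDS | set(giveaways)
    counts = Counter()
    for token in tokens:
        # Merge different forms of the same word before counting.
        base = BASE_FORMS.get(token, token)
        if base not in banned:
            counts[base] += 1
    return counts.most_common(n)


if __name__ == "__main__":
    sample = "He threw the spear. She throws spears at Moby."
    for word, count in top_words(sample, giveaways=("moby",)):
        print(word, count)
```

The `giveaways` argument stands in for the manual deletion step: any word passed there is dropped from the count in the same way as a function word. The letters-only tokenization is a simplification that splits contractions and hyphenated words.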
Here is the answer for Week Three’s “Work of the Week”. See if you guessed right!
monica 612
miss 509
make 458
rhoda 448
barfoot 420
woman 398
man 342
day 302
widdowson 291
time 281
Answer: The Odd Women by George Gissing
Now, here are the ten most frequent words, with their counts, from this week’s work:
en 344
de 298
tom 293
wilson 221
make 163
dat 155
man 129
time 120
good 102
house 102
Can you identify this work? Find out next week!
Except for the manual deletion of giveaway words like Moby, the lists were generated automatically from Project Gutenberg texts by programs written by students in the Computational Linguistics Certificate Program at Montclair State University.