I discount heavily for the fact that the researchers are not linguists and are not used to dealing with this sort of thing. Mark Liberman at Language Log has (as you might expect) a good take on it; in his recent post he sums up thus:
In other words, essentially all of the “drastic increase in the death rate of words” is in fact due to changes in the rate of mistakes at various stages of the data production process – spelling, editing, type-setting, and optical character recognition – that leads from the history of language to the lists of strings in the Google unigram corpus. And some portion — maybe most — of the “dramatic” decrease in the birth rate of g-words is also really a dramatic decrease in such book production and data processing errors, which have nothing to do with the life or death of “words” in the linguistic sense at all.
Which is not to say there’s no there there, but the results are much less dramatic than the WSJ piece makes them appear.
More on “culturomics” at the Log here.