I’m glad they came to the sensible decision, but they couldn’t resist clinging to the old idiocies along the way:

strictly-speaking, data is a plural term. Ie, if we’re following the rules of grammar, we shouldn’t write “the data is” or “the data shows” but instead “the data are” or “the data show”.

For “the rules of grammar” read “the rules of Latin grammar,” and it makes no sense to try to follow the rules of Latin grammar in English.  (I recently had someone tell me in all seriousness that if you’re talking about five sputniks [to pick an example—I’ve forgotten the actual loan word in question] you should say “five sputnikov,” using the Russian genitive plural because that’s what you use after the number “five” in Russian!  I asked whether we should count “one sputnik, two sputnika, three sputnika, four sputnika, five sputnikov,” the way they do in Russian, but never got an answer.)

To be fair to the Guardian, though, they do mention the OED’s discussion of the term “data”, which, itself, says that in English “data” was historically used with plural verbs (and that it is still used with plural verbs in some scientific fields).  It isn’t totally clear if they are basing their statement regarding the “root” of the issue on 1) the rules of Latin grammar, 2) historical usage of the term in English, 3) current use of the term in some technical circles, or 4) some sort of poorly-defined mish-mash of the above three things.  But I agree that their identification of the “root” of the issue seems to miss the point, which is whether, as a publication that is aimed at a wide audience of modern, mostly non-technical, English readers, it should use “data” in a manner consistent with the usage of modern, non-technical, English speakers, or whether it should insist on a usage that would be jarring to most such users out of a desire to appease the prescriptivists (who will likely find much to complain about in The Guardian in any event).

I’m more than a little puzzled by the distinction the WSJ apparently makes in a half-concession to current usage. It seems to direct its writers to use “data” with singular verbs when “data” refers to an identifiable collection of information, but to insist on plural verbs when referring to a not-yet-collected body of information.  The rationale seems to be that “data” can only be properly used as a singular when it refers to a “collection” of information, and a body of information cannot be said to be a collection if it has not been collected.  So, one should say “the data that is available supports Zandar’s theory”, because there is a discrete and identifiable body of data that has been collected about that theory as of this moment, but one should say that “the data are still being collected” because, by definition, data that are in the process of being collected cannot be said to be a collection of data.

This strikes me as a rather silly distinction that, in addition to being pointless, would likely be opaque to the WSJ’s readers, who, I think, would most likely assume that a writer erred if he or she wrote that “the data that is available supports the x theory” but later wrote in the same article (and, perhaps, the same sentence) that “the data are still being collected”.

And what is the point of the distinction?  Those who favor “the data are” probably favor such usage in all circumstances, while those who find “the data are” to clang in the ear likely find it jarring in all circumstances.

The Guardian, I take it, simply has its writers say “the data is” in all circumstances, which seems like a happier solution.

I don’t intend to get into this tired old argument - has long since shown me the way away from prescriptivism. Let people say what they like (they will anyway). As for style guides, I’m all for them... in the places where they belong. But that brings me to this person David Marsh. I must say that I take exception to being disparaged as “increasingly hyper-correct, old-fashioned, and pompous”, because I’m used to saying “the data are” (I also call an item in a set of data “a datum” - what else?). Let me say what I like, too! Why must this man stoop to the vocabulary of political invective to boost his argument?  “Old-fashioned” I don’t object to being called, mind; more often than not, nowadays, that’s a compliment. I enjoy civil discourse, too - more, it seems, than does Mr. Marsh. “Guru”? If we’re going to get into a slanging-match, try “asshole” for size, Mr. Marsh.

In my field, it is common to treat the word data as a plural. Perhaps this is because we are typically dealing with countable numerical samples.