Can you trust the etymology dictionaries? 
Posted: 06 November 2013 08:03 PM
Total Posts:  64
Joined  2010-11-02

I was on this discussion board three years ago complaining that all the English dictionaries, in their brief summary etymology statements for dictionary words, are confidently reporting stories that often do not have decent supporting evidence. I focused on English words that the dictionaries were reporting to be of Arabic ancestry.

On and off since then I wrote up http://EnglishWordsOfArabicAncestry.wordpress.com . It is focused on giving the etymologies of the English words that are of would-be or actual Arabic ancestry, as such, and it does not focus on criticizing or highlighting the errors of the dictionaries as such. But I’d like to reiterate the complaint I was making here three years ago, backed up now by what’s at that presentation. I only deal with words that are not rare. I’ve excluded Islamic words because it’s too easy to show they’re from Arabic. About 175 not-rare, not-Islamic words are reported by English dictionaries as coming from Arabic. I’ve gone through these words, one by one, looking at the evidence. I find the dictionaries are wrong in a major way in about 15 percent of the words (26 words). The other 85 percent are correct, or nearly correct, excepting some relatively smaller errors.

I find the error rate of each of the dictionaries is practically the same. In quite a few cases it happens that most dictionaries give the same erroneous report, but one dictionary, or maybe two, doesn’t make the particular error on the particular word. In these cases, the dictionary that is not making the error is pretty much random and unpredictable. An exception is that the NED says more often that there’s no evidence when in fact there’s no evidence. Today’s Concise OED has got more serious errors than the NED, due to the Concise OED’s tendency to report something as a certainty when the NED says it’s a speculation. However, the NED also makes errors of the other type (“type II errors”); e.g. the NED says marcasite, safflower and spinach are words of uncertain origin, whereas the medieval Arabic origin of those words was well documented at the time the NED was being written. Today’s Concise OED is correct in saying they’re from Arabic, although it is in error in saying the Arabic source-word for safflower was asfar = “yellow”, because the Arabic source was usfur = “safflower”.

At http://EnglishWordsOfArabicAncestry.wordpress.com , two-thirds of the text is in the footnotes, and only one-third of the text is in the primary body of the presentation; i.e. the evidence is mostly in the footnotes. The footnotes have hundreds of external links to online evidence sources in Arabic, Latin, French, Spanish, Italian, Catalan, German, and English. The words that have the longest footnotes tend to be ones that the dictionaries have made the worst errors on; those words include CALIBER, CORK, GUITAR, LILAC, NATRON, SODA, RACQUET.

When I was here three years ago I raised questions about the following twelve words that the dictionaries claim are from Arabic: albacore, alizarin, almanac, caliber, cork, genet, lilac, hazard, massage, racquet, massicot, and scarlet. To those I now add fourteen more: alkanet, amber, attar, borage, carafe, fustic, gauze, guitar, natron, sandalwood, soda, tobacco, tambourine/tambour (meaning a drum), and typhoon. In addition, I find relatively smaller errors on eight more: abelmosk, alfalfa, curcuma, garble, jar, lac, safflower, and zenith. I count it as a major error if the English dictionaries correctly report that a word is from Arabic but are very incorrect about the way the word was transferred from Arabic to the West.

As you know, there are many words whose history can be well-documentedly carried back to Old English or Ancient Greek, and the etymology stops there, and the etymology is good. There are many other words whose truly clear and obvious history starts in the late medieval or early post-medieval centuries, and for this class of words there is a worthwhile effort to carry the etymology of the word back a step farther, to explain where the late medieval or early post-medieval word came from. This effort is a success in many cases, as it produces documentary evidence that is convincing when you think about it. In other cases, as you know, what’s produced is mere speculation, with unconvincing documentary evidence. The worst problem, though not the only problem, is that today’s dictionaries have an ugly tendency to accept and report the speculations as if they were supported by convincing evidence. The result is that for most words, picked blindly at random and considered individually, you cannot trust the dictionaries to be correct about the word’s history (and that includes words that the dictionaries tell you can be carried back to Old English). If you need to be sure about a particular word, you need to seek out the evidence elsewhere. You cannot trust the dictionaries to have done the job reliably for you.

Another problem from my experience is that for a large minority of words it’s not easy to find a place that gives the evidence, despite umpteen places that give summary conclusions and dogmas.

Posted: 07 November 2013 04:56 AM   [ # 1 ]
Administrator
Total Posts:  4811
Joined  2007-01-03

Be careful about citing old dictionaries and extrapolating the results to current scholarship. For instance, the NED (a.k.a. OED1) does get the etymology for marcasite wrong (entry published in 1905). The OED3 corrects this error. Spinach and safflower, the other two entries from the OED1 you claim to be wrong, haven’t been updated yet. If you consult references that are over a century old, you will find errors, regardless of what the reference is or what field you are studying. You wouldn’t take a biology textbook from 1905 and complain that it doesn’t discuss the structure of DNA; the same goes for dictionaries.

“As you know, there are many words whose history can be well-documentedly carried back to Old English or Ancient Greek, and the etymology stops there, and the etymology is good. There are many other words whose truly clear and obvious history starts in the late medieval or early post-medieval centuries, and for this class of words there is a worthwhile effort to carry the etymology of the word back a step farther, to explain where the late medieval or early post-medieval word came from.”

Many English dictionaries, the OED included, have a policy to stop tracing the origin of a word once the language it has been directly borrowed from has been identified. This policy is in place because otherwise few entries would ever be completed. Etymology can suck up many hours of research and editorial time. (Especially as further etymological work often requires knowledge of languages that is not easily found, e.g., how many scholars of Aramaic are there?) Often the OED editors will make an exception and go further back for a particularly compelling entry, but they never promise to do so. Any reference has its limitations, and a good researcher has to learn and take those limits into account. Yes, it would be nice for dictionaries to include full and detailed etymologies for every word. Is this a realistic expectation? Not in the slightest.

“Another problem from my experience is that for a large minority of words it’s not easy to find a place that gives the evidence, despite umpteen places that give summary conclusions and dogmas.”

Welcome to the world of research. Secondary sources, like dictionaries, are by nature summaries. There are limits to what they can cover in space and with available editorial time. For the gory details, you have to wade through bibliographies, journals, specialized references, and often primary source material. This is true not only of dictionaries, but just about everything.

Posted: 07 November 2013 06:46 AM   [ # 2 ]
Total Posts:  3149
Joined  2007-02-26

“Many English dictionaries, the OED included, have a policy to stop tracing the origin of a word once the language it has been directly borrowed from has been identified.”

Hmm, but the OED traces words borrowed from French back to Latin, so they seem to apply that policy unevenly.

Posted: 07 November 2013 07:39 AM   [ # 3 ]
Administrator
Total Posts:  4811
Joined  2007-01-03

As I said, they make exceptions. I would suppose a lot depends on the ease with which one can take a word back further. From French to Latin doesn’t require much work or an unusual skill set on the part of the editors.

Posted: 07 November 2013 07:54 AM   [ # 4 ]
Total Posts:  3538
Joined  2007-01-29

In general, the revised entries in the OED not only take words as far back as they can be taken (to PIE, if possible), they also provide a wide range of cognate words in other languages. I think your “exceptions” are in fact the rule.
