Futuristic Swearing

6 October 2006

There was a mention above of the show Battlestar Galactica, which has competed with The Wire for the honorific of the best show currently on television. For those who remember the cheesy 1970s television series of that name starring Lorne Greene, this current incarnation of the show is very different from the original. While it retains the basic premise of the original (a fleet of ships, led by the battlestar Galactica, carries the remnants of the human race in search of the mythical planet known as Earth) and many of the characters have the same names as in the original, the show is a superbly written and acted drama.

But of linguistic interest is the word frak. It is an expletive with the same meanings and emphasis as the morphologically similar English expletive. It is one of a group of fictional expletives that are used in science fiction tales to simulate linguistic change (and get past the real-world censors). The crew of the Galactica use frak in exactly the same ways that the English word is used, including in combinations like motherfraker and as a verb meaning to copulate.

Joss Whedon’s short-lived science fiction show Firefly and the subsequent feature film Serenity brought us gorram, a future dialectal version of god damn. Whedon also has his characters lapse into Chinese when they go into a fit of cursing, a hint of the demographics of the future.

Veteran writer Robert Heinlein used many such invented curses in his science fiction books. Frimp, meaning the sex act, and kark, meaning excrement, appear in I Will Fear No Evil; kink is a swear word of ambiguous meaning in The Door Into Summer; and slitch, a blend of slut and bitch, appears in Friday.

The Star Trek series used their share of invented swear words. There is the Klingon epithet p’tahk, meaning lowly being, jerk. Andorians refer to humans as pinkskins. And in one particularly absurd episode of the original series, space hippies refer to Captain Kirk as Herbert, meaning square or nerd.

In Richard Adams’s Watership Down, which is not science fiction but an enchanting tale of the secret lives of rabbits set in 20th century England, the lapines use hraka, their word for excrement, as an expletive.

It’s common for science fiction writers to take such liberties with swearing, both to make their material more acceptable to a wider audience and to add a bit of linguistic flavor to their future worlds.

Finally, in Douglas Adams’s The Hitchhiker’s Guide to the Galaxy the ultimate swear word was Belgium, described as “the concept it embodies is so revolting that the publication or broadcast of the word is utterly forbidden in all parts of the galaxy except one, where they don’t know what it means.”

American Dialect: Baltimore & The Wire

6 October 2006

It’s rare for a television show to accurately portray regional or social dialects, but one show that has done so consistently over the past few years is HBO’s The Wire. Set in inner-city Baltimore, the show details the exploits of a police department wiretap unit and the drug dealers it pursues. Throughout the show, which has just started its fourth season on the pay-TV channel, the characters speak in authentic Baltimore dialects.

In addition to its linguistic accuracy, it’s probably the best show on television right now–at least, that is, until tonight’s season premiere of Battlestar Galactica. The drama is gripping, the dialogue well-written, and the characters multi-layered and intricate. It’s also a very adult program, not at all suitable for children, with horrifying portrayals of the violence that surrounds illegal drug trafficking.

But here we’re primarily concerned with dialects, and there are two major ones that appear on the show. The most striking, at least to most viewers, is the speech of the drug dealers, a local variation on the inner-city African-American dialect. The second is the almost Southern dialect of Baltimore’s white working class, heard on the show most often from the mouths of police officers.

The opening scene of this season’s first episode demonstrates the difference between these two dialects, both linguistically and socially. Snoop, a cold-blooded assassin, is buying a nail gun in a hardware store. The store clerk shows her one that is the "Cadillac of nail guns." Later on, Snoop tells a fellow assassin "He mean Lexus, but he ain’t know it." A deft bit of writing that is both authentic and demonstrative of the social distinctions contained within automobile brands.

Last season, when a police commander permitted drugs in a particular section of west Baltimore in order to control crime everywhere else in his district, the police officers took to calling the area Amsterdam, because drugs were "legal" there. The dealers, not understanding the reference to Dutch drug laws, took this to mean Hamsterdam, because it was like a cage where the police could watch them like pet rodents. Another bit of clever writing that combines a pun with social commentary.

Among the dealers, what up and yo are the common greetings. One boy tells a friend after spotting an undercover cop, "yo, he police." The most common epithets are bitch, used to refer to both men and women (or perhaps more accurately boys and girls as most of the dealers are, sadly, still in their teens), and nigger.

Working-class black dialect is also captured. An African-American political campaign manager assures his white candidate that it is possible for him to win the black vote, "Black folk been voting white for a long time. It’s y’all that don’t never vote black."

Distinct from the African-American dialects portrayed on the show, but just as authentic, is the speech of the white police officers. Bawlmer accents abound, particularly among the minor characters–it’s clear that the producers have done a lot of local casting. Police slang is also common. The word police is used to mean police officer, as in "being a police isn’t just about carrying a gun." Innocent bystanders are taxpayers, ghetto youths are hoppers, and quick arrests without a lengthy investigation are rips. And of course, the famous Baltimore honorific hon is often heard.

If you subscribe to HBO, I highly recommend tuning in to The Wire and listening closely to the dialects portrayed there.

Planetary Update: Xena Is Cancelled

15 September 2006

The dwarf planet 2003 UB313 has been given an official name by the International Astronomical Union (IAU). Unofficially named Xena, after the television heroine, by its discoverer Mike Brown of Caltech, the object will now officially be known as Eris, the Greek goddess of discord. And the dwarf planet’s moon, previously dubbed Gabrielle after Xena’s companion on the TV show, will now be known as Dysnomia, Eris’s daughter and demon of lawlessness.

The discovery of Eris/Xena was what prompted the recent definition of the term planet by the IAU.

The Greek name brings Eris into line with the planets, which bear the names of Greek or Roman gods. It departs from the convention used for other non-planets such as Sedna and Quaoar, which are named for deities from other mythological traditions.

It should also be noted that the name Dysnomia shares its first syllable with Brown’s wife Diane. Similarly, Pluto’s moon Charon shares an initial syllable with Charlene, the wife of James Christy, its discoverer.

(Any connection between Dysnomia’s association with lawlessness and the fact that the part of Xena was played by actress Lucy Lawless is assumed to be entirely coincidental.)

Word Watch: pretexting

15 September 2006

The Hewlett-Packard scandal involving its Chair Patricia Dunn hiring private investigators to spy on other board members has brought the term pretexting to the fore. Pretexting is the obtaining of private records about an individual by pretending to be someone authorized to access them. The term comes from the idea of creating a false pretext justifying access to the data.

The term is not new, however, having been around for at least 14 years. From the 9 March 1992 issue of Computerworld magazine:

Another technique, called “pretexting,” is to get the data by phone after claiming to be an [Social Security Administration] employee from another office where the computer is down.

Somewhat earlier is the more general use of the term to mean the creation of false pretenses. From the Usenet group soc.culture.vietnamese, 4 February 1992, Vietnamese Legend (The Happy Dream):

Worried at not finding him back, he sent for Sinh several times; but the latter refused to return to the Court, pretexting that he had to stay for a while to organize the administration of the occupied country.

Classifying Human Knowledge, Part 2

8 September 2006

Last week we looked at the Dewey Decimal and the Cutter Expansive Classification systems for organizing books. The other major system in use by English-language libraries is the Library of Congress Classification or LCC system. The LCC is used and maintained, obviously, by the Library of Congress in Washington, DC and is also used by most of the larger libraries in the United States, including most university and research libraries.

Originally designed by Herbert Putnam in 1897, the LCC replaced Thomas Jefferson’s classification system in the Library of Congress. The top-level hierarchies are based on Cutter’s classification system, but the rest of the LCC system differs markedly from Cutter’s.

The great advantage of the LCC is that usually the same number, the one assigned by the Library of Congress, is used by all other libraries, so finding a book across libraries is easier. Its classification of technical and scientific subjects is also well regarded.

But it does have some distinct disadvantages. Unlike Dewey, the subcategories are not consistent. Each major category is subdivided on its own without consideration of what designations are used in other categories. It is also designed with the needs of the U.S. Congress in mind. So categories are often broken out geographically, even when this doesn’t make much sense for many researchers.

The LCC divides all human knowledge into 20 major categories, each one given a letter:

  • A – General works

  • B – Philosophy, psychology, religion

  • C – Auxiliary sciences of history (e.g., archeology, heraldry, genealogy, biography, diplomatic history)

  • D – History, general and Old World

  • E & F – American history

  • G – Geography, anthropology, recreation

  • H – Social sciences

  • J – Political science

  • K – Law

  • L – Education

  • M – Music

  • N – Fine arts

  • P – Language and literature

  • Q – Science

  • R – Medicine

  • S – Agriculture

  • T – Technology

  • U – Military science

  • V – Naval science

  • Z – Bibliography, information resources

You can see the government influence in the breakout of law, agriculture, military science, and naval science as top-level domains.

Second-tier domains are designated with a second letter. The breakout for category P, language and literature, is as follows:

  • P – Philology, linguistics

  • PA – Greek & Latin

  • PB – Modern & Celtic

  • PC – Romanic

  • PD – East & North Germanic

  • PE – English

  • PF – Other West Germanic

  • PG – Slavic

  • PH – Uralic

  • PJ – Semitic

  • PK – Indo-Iranian

  • PL – East Asian, African, Polynesian

  • PM – Native American & artificial

  • PN – General literature

  • PQ – Romance literature

  • PR – English literature

  • PS – American literature

  • PT – Other Germanic literature

  • PZ – Fiction and children’s literature

These second-tier categories are further subdivided into categories designated with numbers. The PE subclass is broken out as follows:

  • PE101-458 – Old English

  • PE501-693 – Middle English

  • PE814-896 – Early Modern English

  • PE1001-1693 – Modern English

  • PE1700-3602 – Dialects

  • PE3701-3729 – Slang

These numbers are followed by a Cutter number for the author’s last name and the year of publication. The LCC number for Word Myths is PE1584 .W55 2004.
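The structure of a call number like the one above can be sketched in a few lines of Python. This is only an illustration: the regular expression and the function names parse_call_number and pe_section are my own assumptions, not anything published by the Library of Congress, and the pattern handles only simple call numbers of this one shape. The PE ranges are the ones listed above.

```python
import re

# Assumed pattern for simple call numbers shaped like "PE1584 .W55 2004":
# class letters, class number, a Cutter number, and a year.
CALL_NUMBER = re.compile(
    r"^(?P<letters>[A-Z]{1,3})"
    r"(?P<number>\d+(?:\.\d+)?)"
    r"\s*\.(?P<cutter>[A-Z]\d+)"
    r"\s+(?P<year>\d{4})$"
)

# Number ranges for the PE subclass, as listed in the text.
PE_RANGES = [
    (101, 458, "Old English"),
    (501, 693, "Middle English"),
    (814, 896, "Early Modern English"),
    (1001, 1693, "Modern English"),
    (1700, 3602, "Dialects"),
    (3701, 3729, "Slang"),
]


def parse_call_number(text):
    """Split a simple call number into its component parts."""
    match = CALL_NUMBER.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized call number: {text!r}")
    parts = match.groupdict()
    return {
        "top_level": parts["letters"][0],   # e.g. P, language and literature
        "subclass": parts["letters"],       # e.g. PE, English
        "number": float(parts["number"]),
        "cutter": parts["cutter"],
        "year": int(parts["year"]),
    }


def pe_section(number):
    """Return the PE section a class number falls in, or None."""
    for low, high, name in PE_RANGES:
        if low <= number <= high:
            return name
    return None


parsed = parse_call_number("PE1584 .W55 2004")
section = pe_section(parsed["number"])
```

Run on the Word Myths number, this splits out the subclass PE, the number 1584, the Cutter number W55, and the year 2004, and places the book in the Modern English range.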

A high-level outline of the LCC is available here. (I must go off on a rant here. The Library of Congress charges many hundreds of dollars for access to the complete classification system. While OCLC (Dewey) and the UDC consortiums also charge, these are non-profits and the fees fund the continued maintenance of the systems. The LCC, however, is paid for by the U.S. taxpayer and should be available for free to all who ask. This is unconscionable.)

So far the three systems we’ve examined have represented a single ontological methodology. They are all designed to organize books on the shelves of a library. They are all designed to group like things in a category that is assumed to be useful to researchers. The aim is to place books on the same topic in proximity to one another so that researchers can scan the shelves of the appropriate section for relevant books.

The great advantage of this system is that the collection can grow without reorganizing the catalog. As shelf space in a particular section is used up, the books in that section can be moved without all the other categories changing or moving–all that needs to be changed is the map showing where each section is. It’s a great methodology if your intent is to catalog physical items that can be located in only one place–like books.

There are some obvious drawbacks. One is that most books can be classified into multiple categories. Is T.S. Eliot an American or British poet? Does Cassell’s Dictionary of Slang belong with the dictionaries or with the books on slang? Why is the book Marley & Me, about an unruly Labrador retriever, filed under animal husbandry? If you think about it, dogs are domestic animals, but how many people are actually going to think to look under this category for a book about a house pet?

Another disadvantage is that the categories that are useful change over time. Why does the Library of Congress classify Bantu and Mandarin in the same category (PL)? Because when the categories were created, there weren’t many books in the Library’s collection in those languages. Also, the Library of Congress has a category on East Germany. This was once very useful. But now, aside from books about the period from 1945-89, it’s not terribly relevant. Do we continue to file books about the region in this category?

But most of all, in a digital age such a system isn’t that important. If your "books" are electronic, who cares where they’re "shelved." You can create "virtual shelves" of material as users demand.

One method that is increasingly being used to categorize electronic resources is tagging. It’s not a formal system and information is not classified in advance. Instead, people assign tags to a resource as they go. An electronic source about baseball slang, for example, could be tagged with terms like baseball, slang, language, sports, American sports, etc. There are no restrictions on what tags can be used or on how many can be assigned to each resource.

The idea is that if enough tags are assigned by a number of different people, then patterns will begin to emerge. The baseball slang resource will appear in searches for both baseball and language. The advantage is that since the information can be classified in an unlimited number of categories the problem of looking in the wrong place goes away. With enough tags, you will find the book no matter what search path you take.
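As a rough sketch of why the wrong-place problem goes away, here is a tiny tag index in Python. The names tag_index, add_tag, and search are invented for illustration, as is the title "Baseball Slang Guide"; no real tagging site is claimed to work exactly this way.

```python
from collections import defaultdict

# Maps each tag to the set of resources that carry it.
tag_index = defaultdict(set)


def add_tag(resource, tag):
    """File a resource under a tag; a resource may have any number of tags."""
    tag_index[tag.lower()].add(resource)


def search(tag):
    """Return every resource filed under the tag, in alphabetical order."""
    return sorted(tag_index.get(tag.lower(), set()))


# A hypothetical electronic resource on baseball slang, tagged as in the text.
for tag in ["baseball", "slang", "language", "sports", "American sports"]:
    add_tag("Baseball Slang Guide", tag)

# A book that a shelf system must put in one place can carry both tags here.
add_tag("Cassell's Dictionary of Slang", "slang")
add_tag("Cassell's Dictionary of Slang", "dictionaries")

by_baseball = search("baseball")
by_language = search("language")
by_slang = search("slang")
```

The same resource turns up under baseball and under language, and the slang dictionary is found whether you look under dictionaries or under slang.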

What would seem to be an obvious criticism of tagging, that there will be little consistency between how different people apply tags, actually turns out to be a great advantage. One person may tag works as being about cinema, while another uses movies. Wouldn’t it be better to use a single tag instead of both of these? Or would it? Aren’t cinema and movies really two similar, but distinctly different categories? In one you have The Bicycle Thief, in the other you have Titanic. There will be some overlap of course, with some using the same tag for both. But by ordering search results by "relevance," the person searching under cinema will be presented with The Bicycle Thief first, with Titanic being way down on the list. Someone searching on movies will get the opposite result.
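One simple way to picture ordering by relevance is to count how many times each tag has been applied to each work and sort on that count. The sketch below assumes exactly that; real sites may rank quite differently, the tag counts are invented purely for illustration, and the names tag_work and search_by_relevance are my own.

```python
from collections import Counter, defaultdict

# For each tag, how many times it has been applied to each work.
tag_counts = defaultdict(Counter)


def tag_work(work, tag, times=1):
    """Record that a tag was applied to a work some number of times."""
    tag_counts[tag][work] += times


def search_by_relevance(tag):
    """List works carrying the tag, most frequently tagged first."""
    return [work for work, _ in tag_counts[tag].most_common()]


# Invented counts: most taggers call one film cinema and the other a movie,
# but a few use the other word.
tag_work("The Bicycle Thief", "cinema", 40)
tag_work("Titanic", "cinema", 3)
tag_work("The Bicycle Thief", "movies", 5)
tag_work("Titanic", "movies", 60)

cinema_results = search_by_relevance("cinema")
movies_results = search_by_relevance("movies")
```

With these counts, a search on cinema lists The Bicycle Thief ahead of Titanic, and a search on movies reverses the order.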

And tagging makes virtual shelves possible. In the older systems, classifying 20th century American authors would be alphabetical. So Robert Heinlein would be interposed between Zane Grey and Larry McMurtry. Wouldn’t it be better to have Heinlein classified with Isaac Asimov in Sci Fi, and group Grey and McMurtry with the other Westerns? Tags let you organize 20th century American authors alphabetically or by genre, whatever your needs are at any particular moment.
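A virtual shelf is really just a different sort or filter over the same records. A minimal Python sketch, using the four authors mentioned above and tag spellings and function names of my own choosing:

```python
# Each author carries a set of tags; the tag names are illustrative.
authors = {
    "Isaac Asimov": {"sci fi"},
    "Zane Grey": {"westerns"},
    "Robert Heinlein": {"sci fi"},
    "Larry McMurtry": {"westerns"},
}


def surname(name):
    """Sort key: the last word of the name."""
    return name.split()[-1]


def alphabetical_shelf():
    """One shelf for everyone, ordered by surname."""
    return sorted(authors, key=surname)


def genre_shelf(tag):
    """A shelf holding only the authors who carry the given tag."""
    return sorted((name for name, tags in authors.items() if tag in tags),
                  key=surname)


everyone = alphabetical_shelf()
sci_fi = genre_shelf("sci fi")
westerns = genre_shelf("westerns")
```

The alphabetical shelf puts Heinlein between Grey and McMurtry, while the genre shelves pair him with Asimov and leave the two Western writers together. Nothing is moved; the shelf is built on request.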

Tagging is not restricted to online resources. One can also use it to classify books and other physical objects. In such a case, each book must be assigned a unique identifier (e.g., ISBN). This ISBN can then be used to locate the physical object when needed. This does not shelve these books in the same place as other books on the same topic, but it is not a disadvantage in a closed stack library where books are brought to the researchers.

An example of a catalog that uses tagging is www.librarything.com, which brings us back full circle, for it was that web site that got me thinking about cataloguing. Another example is del.icio.us, a site that catalogs web bookmarks.

To read an excellent discussion of ontological methodologies and a more complete description of tagging, check out Clay Shirky’s Ontology is Overrated.