April 26, 2010

Fantasy TV in the service of science: An open letter to HBO about "Dothraki"

This article was published in Scientific American’s former blog network and reflects the views of the author, not necessarily those of Scientific American

Editor's note: Joshua Hartshorne is a graduate student in Harvard University's Psychology Department interested in human behavior and language. He wrote the open letter below because HBO is currently creating a new fantasy language, called "Dothraki," for an upcoming television adaptation of George R. R. Martin's A Game of Thrones. At least some fans are guaranteed to try to learn Dothraki, just as thousands have studied Klingon, Sindarin and Na'vi. The letter, addressed to Martin, the show's executive producer David Benioff and Dothraki creator David Peterson, suggests a few elements or structures for the language. Including them could do science a favor: Dothraki would then contain exactly the features that researchers would like to test, to see whether subjects (in this case, the show's highly motivated fans) can learn them.

Dear David Benioff, David Peterson and George R. R. Martin,

As a long-time fan of George R. R. Martin's A Song of Ice and Fire, I have eagerly followed news of HBO's upcoming adaptation of A Game of Thrones, the first book in the series. I followed this news as a fan only—until I learned that the creative team at HBO had commissioned David Peterson to create Dothraki, a language spoken by several important characters in the story. As Mr. Martin explains in his blog, this will add detail to the rich tapestry of the story. It also presents a unique opportunity for science, and I urge you to consider the possibilities. I lay out the reasoning below:

Language universals and the human mind

Fundamental to understanding humanity is understanding language. Something about the human brain allows nearly every human child (but no chimp, mouse or kangaroo) to learn a language; children will even invent a new language if none is available to learn. Linguistic universals provide clues as to what this something is.

Although the variation in human language is incredible, some aspects remain the same. Russian has the vowel "yery," which is lacking in English, and English has the consonant "h," which is lacking in Russian. Yet Russian, English and all other spoken languages have vowels and consonants (sign languages, not surprisingly, work differently). In English, verbs come before direct objects ("Mary kicked the ball") whereas in Japanese, verbs come after direct objects (roughly: "Mary-wa ball-o kicked"), but both languages have subjects, verbs and objects.

These universals may reveal the structure of our minds: all languages share these properties because we can't learn languages that work differently. However, the fact that all languages do share a property isn't proof that they must share that property. Universals could also arise by historical accident. The fact that Barack Obama's name sounds similar in every language isn't evidence of a universal, innate property of how the human mind conceives of the 44th president—rather, every language has imported his name from the same source.

Artificial languages

The only way to prove humans are incapable of learning a language with feature Y is to create such a language and prove that people can't learn it. In fact, researchers regularly teach people "artificial languages" in order to study language learning by humans (and animals) under experimentally controlled circumstances (here's a fun experiment you can do online), and some of these probe the psychological reality of language universals.

For instance, Spanish divides its nouns into masculine and feminine (other languages use other groupings). For some nouns, the distinction seems arbitrary (the word for "book" is masculine but the word for "novel" is feminine), but generally the distinction is not completely arbitrary (words describing male people and animals are masculine, whereas words describing female people and animals are feminine). This raises the question of whether you could have a language in which nouns were randomly divided into arbitrary categories. Artificial language experiments suggest that learning such arbitrary categories is difficult or impossible, but people easily learn categories built around some aspect of meaning (e.g., gender) or phonology (the way words sound).

There is an obvious limitation to this work: children take years to learn their first language, whereas most artificial language experiments last only an hour. It is very difficult to convince a volunteer to spend more than a few hours studying a useless, artificial language, unless that language is spoken by a popular fictional species. Hundreds or thousands of people have spent untold hours mastering the finer syntactic points of Klingon (Star Trek), Sindarin (Lord of the Rings) or Na'vi (Avatar). Fans are already lining up to learn your Dothraki. This presents a unique opportunity to create a language that violates known language universals and see just how well people can learn it.

Some universal suggestions

There are many lists of proposed linguistic universals on the Web (my favorite is this well-curated, if dry, list of 2029 proposed universals). Below are my own personal favorite universals you might try violating:

Action verbs. For action verbs in English and possibly all languages, the subject is the doer and the object the do-ee ("Mary broke/kicked/threw the vase"). Although a few languages are more complicated, prominent theorists posit that this pattern is an innate part of our linguistic minds. Others, however, argue that the dominance of this pattern is a historical accident and that verbs where the doer is the object and the do-ee is the subject should be perfectly learnable. Numerous studies have shown that both adults and preschoolers find it very difficult to learn subject-do-ee verbs ("The vase shbroke Mary" = "Mary broke the vase"), but these studies are short, so perhaps the participants simply didn't spend enough time learning and using the new verbs. Use this pattern for Dothraki or, even better, have some verbs follow one pattern ("break") and other verbs the other ("shbroke"), and we'll see how well students can do given more time.

Word order. Klingon made a nice start here by using the extremely rare object-verb-subject word order. This word order can appear in many languages in unusual constructions or in poetry ("The drink drank I"), but is not the default in any, except perhaps a few rare, poorly studied languages such as Hixkaryána (600 speakers in the Amazon) and Huarijío (2,800 speakers in Mexico). Other rare word orders you might try are object-subject-verb and verb-object-subject. Although such word orders are not completely nonexistent in human language, their scarcity could suggest that they must be learned differently from more typical patterns.

Xor. Most languages have a word that means "neither," a word that means "both" and a word that means "one or both" ("or"). Interestingly, although logicians have invented words to mean "one but not both" (xor) and "zero or one but not both" (nand), natural languages do not have such words. Similarly, languages have words for "none" and "all" and "more than none" ("some"), but no language has a word that unambiguously means "more than none but less than all" ("only some") or a word that means "less than all" such that it includes the possibility of none.

The question is why. The fact that logicians use words like xor and nand suggests that such words aren't impossible to learn (though whether even logicians can use them as fluently as they use "or" is an open question). Some have argued that the missing words are missing simply because they aren't needed. Interestingly, a language with only nand doesn't need "or," "and" or "both," since each of them can be constructed by combining several uses of nand. So you might provide the Dothraki with only nand to see how well they get by; alternatively, give them xor and we can see if people use it.
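To make the claim about nand concrete, here is a minimal sketch in Python. The function names (nand, not_, and_, or_, xor_, truth_table) are my own illustrative choices, and the constructions are the standard ones from logic rather than anything specific to Dothraki. Every connective below is built from nothing but nand:

```python
from itertools import product


def nand(a: bool, b: bool) -> bool:
    """'Not both': false only when both inputs are true."""
    return not (a and b)


# Everything below is defined using nand alone.
def not_(a: bool) -> bool:
    return nand(a, a)


def and_(a: bool, b: bool) -> bool:
    # "both": negate the nand by feeding it to itself
    return nand(nand(a, b), nand(a, b))


def or_(a: bool, b: bool) -> bool:
    # "one or both": nand of the two negations
    return nand(nand(a, a), nand(b, b))


def xor_(a: bool, b: bool) -> bool:
    # "one but not both", using four nands
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


def truth_table(op):
    """Results for the inputs (F,F), (F,T), (T,F), (T,T), in that order."""
    return [op(a, b) for a, b in product([False, True], repeat=2)]


for name, op in [("and", and_), ("or", or_), ("xor", xor_)]:
    print(name, truth_table(op))
```

Note how quickly the nesting grows: a single "or" costs three uses of nand. That may hint at why a nand-only language would be cumbersome for speakers even though it loses no expressive power.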

Situational words. Some words mean what they mean regardless of context. "George Washington" refers to George Washington whenever and however you say it. Other words change meaning depending on who is speaking ("I," "we"), who is being spoken to ("you") or the day the word is spoken ("today"). However, this contextual dependence is relatively straightforward: as a first pass, "I" refers to the person speaking and "today" refers to the day the word is spoken (this is not quite right, but it's close). I don't know of any languages with pronouns that change meaning depending on the day of the week, or with time-related words (like "today") that adjust depending on who you are talking to. Is that because they are unlearnable or simply because nobody has invented one yet?

For fun, you could try even more complex words: a word that means "blue" on Monday, "green" on Tuesday and "black" any other day of the week, or perhaps a word that means "dog" when used as the subject of a sentence but "elephant" when used as the object. Or a language that uses subject-verb-object word order in the morning and object-verb-subject word order after noon. If these seem like absurd features for a language, they are, but many features of language are absurd (the Russian word for "manliness" is considered feminine). The important question is: Just how flexible is our language-learning capacity?
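One way to see what these hypothetical words would demand of a learner is to write each meaning as a function of its context. The sketch below is purely illustrative: the function names and the choice of Python are mine, and nothing here reflects actual Dothraki.

```python
import datetime


def color_word(day: datetime.date) -> str:
    """A word meaning 'blue' on Monday, 'green' on Tuesday, 'black' otherwise."""
    weekday = day.weekday()  # Monday is 0, Tuesday is 1
    if weekday == 0:
        return "blue"
    if weekday == 1:
        return "green"
    return "black"


def animal_word(role: str) -> str:
    """A word meaning 'dog' as a subject but 'elephant' as an object."""
    meanings = {"subject": "dog", "object": "elephant"}
    return meanings[role]


def word_order(hour: int) -> str:
    """Subject-verb-object before noon, object-verb-subject from noon on."""
    return "SVO" if hour < 12 else "OVS"


# April 26, 2010 (the date of this letter) fell on a Monday.
print(color_word(datetime.date(2010, 4, 26)))
print(animal_word("object"))
print(word_order(15))
```

Each rule is trivial to state as a lookup. The open question is whether human learners can acquire a mapping whose input is the calendar or the clock rather than the speaker or the listener.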

And then the science

Are human languages the way they are by accident, or does their structure reveal deep truths about the human mind? By constructing Dothraki such that it violates common principles of existing languages, you could set the stage for a powerful natural experiment. I imagine you are going to the expense and trouble to construct a full language for Dothraki at least partly out of an interest in and love of language. I only ask that you take it one step further.

Of course, there are limitations to such an experiment, since the learning of Dothraki will differ in many ways from typical language learning. Even die-hard fans won't use Dothraki as their primary means of communication. There won't be native speakers to learn from. The learners of Dothraki will probably be adults, who generally do not learn language as well as toddlers (though, to be clear, I don't recommend teaching toddlers Dothraki). However, these limitations are no worse than what we scientists normally deal with, and I believe this experiment would provide valuable new data.

I realize the miniseries is already in production and much work on Dothraki has already been completed. I suspect there is still time to sneak in a few of the experiments suggested above, but if not, I look forward to Braavosi in Season Four, or to High Valyrian.

Best Regards,

Joshua Hartshorne

P.S. Any who wish to contribute to the scientific study of language without creating or learning an entire fictional language should feel free to participate in my short experiments at GamesWithWords.org.

Image credit: Joshua Hartshorne
