
Visions: Predictive Text

This article was published in Scientific American’s former blog network and reflects the views of the author, not necessarily those of Scientific American


In the series "Visions," science fiction about the very latest research will be paired with analysis looking into the facts behind the fiction. The goal is to marry ripped-from-the-headlines science fiction with analysis into the possibilities hinted at by new discoveries.

The search engine's spokesman brought the reporters down the aisles of the company's giant Missouri server farm.

"So we've long tracked trends in search queries, which has given scientists insights into flu activity, among other things," he said as the computers hummed quietly around him. "We've also experimented with computer-generated news stories based on publicly available data."




"Now, here's our latest development," he said, waving his arm expansively. "We call it Early Edition. Based on the search queries that are most popular at any given moment, we figure out what news people might most want updates on, and based on trends seen in news and other databases we mine, Early Edition automatically generates stories about what might happen next."

A brief shocked silence was followed by the question "How accurate are these stories?" blurted out at the same time by the sports and business reporters, one imagining sports scores appearing before the games were played, the other envisioning stock prices.

"Oh, they're often completely inaccurate," the spokesman chortled. "Or rather, Early Edition typically generates multiple stories, one for each possible future it sees. It ranks these stories according to how probable it thinks each story is, but of course even improbable events sometimes happen."

Seeing all the perplexed looks on the journalists' faces, the spokesman explained, "The idea of Early Edition isn't to predict the future, as one might think, but rather to help users prepare for the most likely futures, like the storyboarding or previsualization directors often conduct before shooting movies. It helps people think about what they might want to search for now to help understand what might come. Advertisers can also benefit by catering to potential future needs. If you come this way, I can show you just what I mean..."

Later, the spokesman flopped down onto the couch in the company president's office.

"So what did they think?" the president asked.

"Oh, they seemed to buy it," the spokesman said, taking a swig from the bourbon in hand.

"That's what Early Edition predicts," the president agreed.

The spokesman chuckled, then furrowed his brows and sighed. "How far out can it look right now?" he asked.

"Right now? A little less than a day," the president responded, leaning back in his chair. "Any more requires even more servers than we currently have, exponentially more. Still, it's good enough. Early Edition hasn't been wrong yet."

The spokesman craned his head sideways to eye the president. "You know, one of these days, Early Edition will be wrong. Quite wrong. All the theorists say it."

"Oh, I have no doubt, no doubt," the president said. The hint of doubt in his voice when he said that, however, was palpable.

***

Scientists are increasingly mining Google and Twitter for valuable data. As trivial as the latter might often seem, the sheer volume of messages now tweeted daily — an average of 230 million per day, according to September statistics — is enabling researchers to unearth insights into human behavior, such as global patterns in mood, as one paper last week demonstrated.

Search engines also use predictive text to guess what queries users might type in. Of course, predictive text is far from always accurate: comedy site Damn You Auto Correct! has created a cottage industry collecting especially laughable instances of iPhone autocorrect mistakes.

Journalists are often no better at predicting the future. In addition to the infamous "Dewey Defeats Truman" flub — incumbent U.S. President Harry Truman actually defeated Republican challenger Thomas Dewey in the 1948 presidential election in an upset victory — newspapers are still making such pre-writing mistakes, with the Daily Mail recently erroneously announcing that Amanda Knox was found guilty.

Who knows where all the data and number-crunching now taking place might lead in the future? Still, I would take any predictions, made by computer or otherwise, with a large grain of salt.

You can email me regarding Visions at toohardforscience@gmail.com.

Charles Q. Choi is a frequent contributor to Scientific American. His work has also appeared in The New York Times, Science, Nature, Wired, and LiveScience, among others. In his spare time, he has traveled to all seven continents.
