September 20, 2012

How to Not Find What You're Looking For

This article was published in Scientific American’s former blog network and reflects the views of the author, not necessarily those of Scientific American

On August 7th, Zookeys published a paper on the discovery of Semachrysa jade, a new species of green lacewing. The discovery was noteworthy enough to be picked up by Science two days later because Shaun Winterton, the primary researcher, didn't encounter the insect in its native Malaysia, but on the photo-sharing website Flickr.

The discovery made by Winterton and photographer Hock Ping Guek should be heartwarming—not just for utopian-minded futurists and procrastinators seeking justification, but for researchers looking to capitalize on the largest centralized repository of information ever seen. But to make serendipitous discoveries more common, we must first understand their nature.

The word serendipity itself comes from Horace Walpole, who wrote that the main characters in “The Three Princes of Serendip” were “always making discoveries, by accident and sagacity, of things they were not in quest of.” We seem to have no trouble remembering the accident part of chance findings, but the second part is worth repeating: a successful discovery lies not just in the unexpectedness of what we find, but in our ability to make sense of it and connect it to what we already know.

The seemingly paradoxical idea of planning for serendipity online is only slightly younger than the internet itself, dating to when the potential of its vast quantity of information became apparent. “Discovery is Never by Chance: Designing for (un)Serendipity,” presented at the ACM conference on Creativity and Cognition, lays out for computer scientists a few major distinctions within serendipitous information retrieval, such as whether the person was looking for something in particular or just browsing.

Users engaged in casual browsing may be the most receptive to information that’s just outside their specific goals. In “An Algorithm for Discovery,” an editorial for Science, neurologists David Paydarfar and William J. Schwartz distilled their recommendations for the discovery process down to five essential elements. The first step, they wrote, was “Slow down to explore.”

And yet, slowing down and exploring is at odds with how scientists are trained to conduct research.

“We are taught that research is a very stepwise type of process that follows specific elements, and there's really no formal acknowledgement of serendipity and unexpected discoveries in this process. We looked at a number of models of information literacy and they’re all very linear,” says Sanda Erdelez, a professor at the University of Missouri’s School of Information Science.

A relentless focus on testing, and noticing, nothing but the hypothesis may lead to an efficient workflow, but it also guarantees that a lab will never make a truly remarkable or unexpected discovery. That’s not to say that slowing down can’t be learned. Randall Peterson, Associate Professor of Medicine at Harvard Medical School and coauthor of “Systematizing Serendipity for Cardiovascular Drug Discovery,” knows firsthand that it’s not just the responsibility of individual researchers to slow down and take note.

“What we try to do in our lab is create a culture where unexpected observations are valued, talked about and pursued. We give freedom to people to pursue a lot of unexpected findings.”

It’s important to note just how Winterton, who is quite fond of using Flickr himself, made the discovery. “The images I came across by Kurt [Hock Ping Guek] were in fact random, as Flickr presents you with random images when you sign in, presumably based on your previous interest,” wrote Winterton in an email interview.

Had the photos on Winterton’s sign-in page been shown completely at random, he would have seen photos of weddings, landscapes, cities, and cats. Instead, Flickr’s randomness was highly personalized, displaying photos of interest to Winterton based on his user habits.

“The reason personalization creates opportunities for serendipity is that people don’t know what to do with random new information. Instead, we want information that is at the fringe of what we already know, because that is when we have the cognitive structures to make sense of the new ideas,” wrote Jaime Teevan, coauthor of “Discovery is Never by Chance,” via email. “Personalization helps us find things at the fringes of our current knowledge.”

Scientists and researchers can capitalize on the potential benefits of personalization by creating user accounts on websites they frequent.

Each scientific discipline has its own language; jargon saves time when two psychologists want to discuss ideas, but it’s a huge linguistic roadblock when you try to enter an interdisciplinary realm.

“My wife [Suzanne Kennedy-Stoskopf] is a veterinarian and an immunologist, and I was reading something she had about clarifying the solution. I asked, ‘What is this?’” said Dr. Michael Stoskopf, a veterinarian who received his PhD in Toxicology and is a professor at NC State University. He’s also the author of “Observation and Cogitation: How Serendipity Provides the Building Blocks of Scientific Discovery,” which calls for an embrace of observation-based discoveries.

“She said, ‘When you put it in a centrifuge.’ I asked, ‘Why don’t you call it centrifuging it?’ She said, ‘Well, we call it clarifying.’ I laughed, because clarifying made me think of clarifying butter.”

Systems that directly allow for fuzzy searching between disciplines may be much further away than we’d like: the experts themselves are having a hard time getting them off the ground. As Microsoft information retrieval and human-computer interaction researcher Susan Dumais notes, “Interestingly, we have referred to this problem as verbal disagreement, vocabulary mismatch, and statistical semantics.”

Not only do identical ideas get called by different names, but compatible ideas are completely lost in the mix. A cognitive psychologist studying the primacy effect might benefit from an insight about first-mover advantage, but may be completely unaware of the idea. The best workaround for this scenario is the oldest one in the book: make use of social connections.

Social bookmarking websites like Pinboard, Delicious, and Diigo allow users to store a webpage alongside metadata. If you’re interested in networks, for example, looking at websites tagged “networks” or “SNA” might yield some interesting results. But here, the scientific community is at something of a disadvantage: these websites are only as good as their users, and right now, there don’t seem to be many researchers on board. (Meanwhile, Diigo has a large educator population and techies seem to flock to Pinboard.) Pinterest offers bookmarking and grouping for images (a great format for amplifying the prospect of making serendipitous discoveries), but right now it’s flooded with recipes and home decorating ideas. A similar website catering to scientists and researchers would allow for a more centralized location to share interesting findings.

With scientific journals behind paywalls and indexed in different databases, increasing the use of Open Science initiatives and third-party repositories for sharing interesting findings currently appears to be the best bet for users looking for more serendipitous discoveries. No one benefits when the world’s most brilliant insights are published on an unread blog. Hock Ping Guek, the professional photographer in Malaysia who first captured Semachrysa jade, places plenty of high-quality photos on his Flickr page and uses that site for networking. Because his work had also been attracting the attention of bug blogs, the odds were surprisingly good that at some point, the right scientist would see a photo of Semachrysa jade.

“Coincidentally, a colleague of mine with an interest in photography came across the same images posted to a photography website by the same guy and drew my attention to them later that week,” wrote Winterton in an email interview. If the Flickr algorithms hadn’t shown him the photos, it’s possible that Winterton would have made the discovery anyway because of his social network. Maintaining contacts with others who share similar (but not entirely overlapping) interests is another key to making discoveries. Social discovery may be one of the most valuable ways to make a serendipitous discovery online because of its favorable signal-to-noise ratio.

But social discovery is impossible unless you have a good network and a good way to share interesting information with others or save it for yourself.

“There are tools like OneNote that allow you to grab a piece of information very swiftly and put it in a place where it’s searchable later on. You don’t know right away if what you’re going to check is going to be a future Nobel Prize-winning discovery,” says Erdelez. There are plenty of alternatives to OneNote, like Evernote (which also has a shared notebooks feature), DEVONthink, and Springpad. Author Steven Johnson has maintained one document for the last eight years that serves as a repository for all of his ideas, which he makes a point to read occasionally. “In a funny way, it feels a bit like you are brainstorming with past versions of yourself,” he writes.

One of the most important elements of being a high information-encountering individual, to use Erdelez’s nomenclature, is to have lots of interests and give yourself time to pursue them. To use that information successfully, and to receive good information from others, you’ve got to store it, revisit it, and share it. Because of Winterton’s information-sharing and networking habits, another scientist passed him the photos later that week. It may not have been as quick as Flickr’s algorithm, but there’s no way he could have missed the photo in his inbox.

Papers Mentioned:

Winterton, Shaun L., Guek, Hock Ping, Brooks, Stephen J. A charismatic new species of green lacewing discovered in Malaysia (Neuroptera, Chrysopidae): the confluence of citizen scientist, online image database and cybertaxonomy. Zookeys. 2012; (214): 1–11. DOI: 10.3897/zookeys.214.3220

Perkins, Sid. New Species Discovered, Thanks to Flickr. ScienceNOW. 2012 Aug 9. http://news.sciencemag.org/sciencenow/2012/08/scienceshot-new-species-discover.html

André, P., Schraefel, M. C., Teevan, J. and Dumais, S. T. Discovery is never by chance: designing for (un)serendipity. In C&C '09: Proceedings of the Seventh ACM Conference on Creativity and Cognition. 2009. 305-314. DOI: 10.1145/1640233.1640279

Paydarfar, David and Schwartz, William J. An Algorithm for Discovery. Science. 2001 April 6; 292(5514): 13. DOI: 10.1126/science.292.5514.13

Schlueter, Peter J. and Peterson, Randall T. Systematizing serendipity for cardiovascular drug discovery. Circulation. 2009 July 21; 120(3): 255–263. DOI: 10.1161/CIRCULATIONAHA.108.824177

Stoskopf, MK. Observation and cogitation: how serendipity provides the building blocks of scientific discovery. ILAR Journal. 2005. 46(4):332-337. PMid:16179740