If the adage "if you don't have anything nice to say, don't say anything at all" were applied to the Web, most comment threads would be empty. Some comments posted in response to a news article, blog, video or other online content successfully advance a debate or challenge assertions made by reporters, bloggers and editors. Many comments, however, come across as primarily hostile or entirely irrelevant, and drag the level of discourse down to anonymous mudslinging.

A team of researchers from Internet search giant Yahoo and Pomona College in Claremont, Calif., is studying Web comment threads and experimenting with software that can determine when such threads cross that line between constructive and destructive. Today "the line" is often determined by a community editor or Web site manager who, in addition to dozens of other responsibilities, must monitor comment threads.

Anonymous environments, where commenters are allowed to hide their identities or create a cryptic or jocular username, are more likely to yield aggressively worded comment threads, says Elizabeth Churchill, a principal research scientist at Yahoo Research who manages the company's Internet Experiences Group. This aggression affects not only the people participating in the thread, who come to feel as though intimidating or insulting language is the norm, but also potential newcomers, who are less likely to jump into a debate if it has already turned negative, she adds.

Churchill and Pomona College assistant professor of computer science Sara Owsley Sood led the research group, which developed software that they dubbed a "sentiment engine" to analyze the words used across comment threads. The researchers distinguished off-topic negative comments from on-topic negative comments that, while critical, are offered in the spirit of debate. Their approach combined relevance analyses, which detect on-topic and off-topic comments, with sentiment detection methods that sorted comments into three broad categories: happy, sad and angry.
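To make the combination described above concrete, here is a minimal sketch in Python. It is not the researchers' sentiment engine, whose internals the article does not describe. It assumes relevance can be approximated by word overlap (cosine similarity) between a comment and the article, and that sentiment can be approximated by a tiny hand-made word list. The names `SENTIMENT_LEXICON`, `relevance`, `sentiment` and `classify`, the word lists and the `threshold` value are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical toy lexicon (an assumption for illustration only);
# a real system would rely on a far larger, trained resource.
SENTIMENT_LEXICON = {
    "happy": {"great", "love", "thanks", "wonderful", "glad"},
    "sad": {"sorry", "tragic", "unfortunate", "miss", "loss"},
    "angry": {"idiot", "stupid", "hate", "shut", "moron"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def relevance(comment, article):
    """Cosine similarity between the word-count vectors of two texts."""
    a, b = Counter(tokenize(comment)), Counter(tokenize(article))
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def sentiment(comment):
    """Return the category with the most lexicon hits, or None if there are none."""
    tokens = tokenize(comment)
    counts = {
        label: sum(token in words for token in tokens)
        for label, words in SENTIMENT_LEXICON.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


def classify(comment, article, threshold=0.1):
    """Pair a topic judgment with a sentiment label (threshold is an assumed value)."""
    topic = "on-topic" if relevance(comment, article) >= threshold else "off-topic"
    return topic, sentiment(comment)
```

Under these assumptions, a critical comment that shares vocabulary with the article is labeled on-topic and angry, while a bare insult is labeled off-topic and angry. Only the second kind is the sort the researchers want to surface. Plain word overlap is crude (common words such as "the" inflate the score), so a practical version would at least filter stop words.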

Sentiment analysis is a Web site-specific problem: the same words that are perfectly fine to describe the weather (cold, hot, etc.) have a different meaning when they are used to describe people, for example. "Phatic statements and conversational comments are often the glue that moves a web site from being informational to being social," according to the research report, which Churchill will present Thursday at the 2010 Grace Hopper Celebration of Women in Computing conference in Atlanta. "Ideally in a community news comment site, we would want both of these elements."

Rather than automatically shutting down comment threads, the goal is to alert community and Web site managers to a potential problem but then to leave discretion in human hands, Churchill says. The ultimate outcome would be balance in comment threads, which may express anger or sadness but are nonetheless relevant. Another option might be to offer separate threads for commenters who want to "take it outside" and continue expressing their disagreements in the equivalent of a virtual parking lot.

The researchers found that most comment threads turn negative after some period of time, even if the first few comments express something positive. The researchers are looking for patterns that indicate when the conversation is about to go south and whether this is a product of the number of comments in a thread (sooner or later someone will have something unkind to say) or the influence of one negative comment that gives others the impression that negativity is acceptable.
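One simple way to look for the point where a thread goes south is to scan the sequence of per-comment sentiment labels for the first stretch in which hostile comments form a majority. The sketch below illustrates that idea only. The article does not say how the researchers search for such patterns, and the function name `turning_point`, the window size and the choice to count only "angry" as negative are assumptions.

```python
def turning_point(labels, window=3, negative=("angry",)):
    """Return the index of the first comment that opens a window in which
    more than half of the labels are negative, or None if no window qualifies.

    `labels` is a list of per-comment sentiment labels in posting order.
    `window` and `negative` are assumed, tunable parameters.
    """
    for start in range(len(labels) - window + 1):
        chunk = labels[start:start + window]
        if sum(label in negative for label in chunk) > window / 2:
            return start
    return None
```

Comparing the returned index against thread length across many threads would be one way to start separating the two explanations mentioned above: sheer comment volume versus the influence of a single early negative comment.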

The researchers acknowledge that there's much more work to be done to improve the accuracy of their sentiment analysis systems. "It's very much a preliminary investigation into sentiment detection techniques for trying to surface when people are being inappropriately aggressive in comment threads," Churchill says. "I'm very interested in how emotions affect what people do online."
