Guest Blog


Commentary invited by editors of Scientific American

Edward, Bella, and McGurk: Why Bad Lip-Synching Is So Funny

The views expressed are those of the author and are not necessarily those of Scientific American.





“You slapped a fiiiish. Why would you do that?”

“I wanted some seafood.”

At nearly 16,000,000 views at the time of this writing, this “bad lip-synching” of Edward and Bella is objectively hilarious. Funny lip-synching videos litter the Internet, putting ridiculous words in the mouths of everyone from Mitt Romney to Bane. A shared love of making fun of people seems to dictate that these videos are funny, and that’s that. But why? The best “bad lip-synchs” take advantage of how our brains process speech.

Not Just What We Hear

Speech recognition is a concatenation of many diverse internal pattern-seeking programs, all looking for minute changes in everything from the tone and volume of speech to the physical motion of a person’s mouth. So it’s not just what you hear, but also what you see.

Even though speech is primarily auditory, we prioritize the kind of information we are getting depending on the context (both consciously and unconsciously). For example, during a particularly lengthy foreign film, we learn to ignore both the visual (i.e., the mouth movements) and the auditory aspects of speech to focus solely on the words on screen. This isn’t speech as we typically recognize it, but anyone who has suffered through a badly dubbed film knows that the more difficult it is to link those words on screen to the person speaking, the more aware you become of your aching backside. Our brains seek to synch even disembodied words to their owners.

Likewise, imagine that you see a friend across the room at a crowded “End of the World” party. They are barely audible above the din of “Play Gangnam Style!” requests, so you focus intensely on their mouth movements. Add the minimal auditory input to the “enhanced” visual input, and you can just make out that they want another beer.

Diminishing either what we see or what we hear during speech diminishes the whole, which suggests that speech perception draws on more than one sense. It is multimodal.

But the interpretation of speech is not the only case where our brain crosses multiple wires, so to speak. Taste is another multimodal perception. For example, when water spouting from a bubbler with iron piping tastes like iron, you are in reality smelling the iron, and your brain then folds that smell into the “irony” taste of the water (the tongue has no “iron” taste receptors).

As another example of how strong this connection can be, just think about eating a green French fry or yellow steak. Even if the food were perfectly normal, I’d bet you would hesitate to bite into it. Or consider the sad case of Crystal Pepsi. In 1992 Pepsi decided to change the color of its soda from brown to clear, while maintaining the same flavoring and ingredients. Sales of the soda plummeted, and it was pulled from the shelves in 1993.

Just as what you smell and what you perceive on your tongue together make up what you taste, what you see and what you hear when people are speaking combine to form your perception of what someone is saying.

The McGurk Effect

There is perhaps nothing that makes us question how we actually sense the world more than illusions. Not only do they amaze us, they offer clues into how the brain processes sensory information. One of the most common optical illusions, the “Necker Cube,” is so mystifying with its shifting depths because our brains have competing 3D models of what the cube should look like. As our perception arbitrarily flips between them (somewhat driven by attention to certain details), our pattern-seeking minds reveal their software. There are also speech illusions.

The McGurk Effect is a phenomenon where the auditory component of one sound is combined with the visual component of a second sound, changing the sound we perceive. To produce this illusion effectively, you need a dubbed video. In it, a speaker mouths the syllables “va/va/va” while the sounds of “ba/ba/ba” are played over the video. What you see then overrides what you hear, changing the played sound of “ba/ba/ba” to “va/va/va” in your mind, even though the audio never changes. You can watch this BBC video if you want to have your mind sufficiently blown by this illusion. The really amazing part is that, during the illusion, if you close your eyes and therefore shut off the visual part of your speech recognition, the illusion immediately dissipates! (The video linked to above does a great job of pointing this out.) The on/off switch to this illusion couldn’t make it any clearer: speech perception is much more than what we hear.

This of course brings us back to Twilight.

Making Fun of Sparkling, Pasty Vampires

To successfully mess with our speech perception, the words substituted in the “bad lip-synching” Twilight video need to have accompanying mouth movements that, when spoken, mimic the original lines in the movie. The humor then emerges from this tip-toeing on a tightrope of plausibility: a lip-synch that is close enough to confuse us, yet still far enough from perfect, is hilarious. It gets funnier as the words synch up with the mouth movements more squarely (and a good impression of each character helps, as in the case of this uncanny Bane impression). Combine all of this with the fact that you are watching Bella scold Edward for punching a fish and you get a viral video.

It’s not that you are seeing incorrect speech in these videos; you are in fact seeing a different speech. Just as coloring a French fry green can make it taste repulsive, a vampire talking about eating cake with the seemingly correct mouth movements is LOL-inducing because we tentatively perceive it as the genuine article.

Watch one of the videos again and notice how you are inevitably drawn to studying the mouths of the speakers to see just how close the match is, to examine whether it is “real.” Even when the synching is not perfect, because we are looking to be entertained, we give leeway to the inevitable shoehorning of ridiculous words and phrases into the video; the asinine becomes the authentic.

And when we don’t have a mouth to examine, words can shape what we recognize as speech. Case in point, this video shows how easy (and hilarious) it is to mistake the classical composition “O Fortuna” for a song about men liking cheese. Combine lip-synching with text set to what we hear, and you get an elf who is sick of Barack Obama.

I think it all comes down to believability. Right off the bat we do not believe that Edward asked whether or not mice have “wee-wees.” But if the impression is decent, if the mouth movements synch up, we suspend our disbelief and revel in a reality where teen-dream vampires ask such questions. Likewise, many of us know that most music videos are actually lip-synched, but we have gotten so good at synching them that nobody seems to mind. Bottom line for prospective video makers: take advantage of our multimodal speech perception well enough, and you can make a ventriloquist’s dummy out of anyone.

A particularly tight synching garners the immediate “It looks like that is what they are actually saying!” response. In a way it is, and it’s damn funny.

Further Watching: More “bad lip-synching” videos

About the Author: Kyle Hill is a freelance science writer and communicator who specializes in finding the secret science in your favorite fandom. Follow on Twitter @Sci_Phile.

2 Comments

  1. bucketofsquid 4:32 pm 12/21/2012

    It is truly a shame that Buffalax had all of their videos pulled from Youtube for copyright violations. I discovered many wonderful artists from other countries via them and now it is hard to find any source of the originals. The only one I can reliably find is Daler Mehndi. There is just something special about a group of Japanese singing about how “Grandpa gave me aids!”

  2. rexnnye 4:21 am 12/24/2012

    Its just a movie scene … and an expression of love and romance. Whatever they do in movie is no relation with reality and in fact what they were in doing in movie is they paid for it.
    You live in apps world … http://www.appostrophic.com/
