April 25, 2019

“Like All Good Stories, It Starts with Pigeons”

Suresh Venkatasubramanian tells us about one of the most important tools he uses to root out algorithmic bias

Two pigeons stand in front of a brick wall — Frédéric Bisson *Flickr* (CC BY 2.0)

This article was published in Scientific American’s former blog network and reflects the views of the author, not necessarily those of Scientific American

On this episode of our podcast My Favorite Theorem, my cohost Kevin Knudson and I were happy to talk with Suresh Venkatasubramanian, who is in the computer science department at the University of Utah. You can listen here or at kpknudson.com, where there is also a transcript.

Suresh Venkatasubramanian. Credit: Chris Coleman

Dr. Venkatasubramanian likes to describe himself as a bias detective or computational philosopher. One of the main focuses of his work in the past few years has been algorithmic bias, the idea that algorithms can reinforce human prejudices. I talked with him for an article I wrote about algorithmic bias for the children’s science magazine Muse, which is also available here, and you can find one of his recent papers about algorithmic fairness here. He chose to talk about Fano’s inequality, which is important to his work but also has much broader applications in computer science and statistics.

Fano’s inequality begins as all good stories do, Dr. Venkatasubramanian says, with pigeons. Specifically, the pigeonhole principle, the intuitively obvious fact that if the number of pigeonholes is smaller than the number of pigeons you have, at least two pigeons will share one pigeonhole. He describes the way the pigeonhole principle forms the foundation for several other observations that underlie many lower bounds in computer science, including Fano’s inequality. (In computer science, a common goal is to find a lower bound on the number of steps any algorithm needs to solve a given problem: is there a theoretical minimum amount of time it can take?)
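For readers who like to see the pigeonhole principle in action, here is a minimal Python sketch. It simply tries every possible way of putting a small number of pigeons into holes and checks whether two pigeons always end up sharing. The function names `has_shared_hole` and `every_assignment_collides` are made up for this illustration and do not come from the podcast.

```python
from collections import Counter
from itertools import product


def has_shared_hole(assignment):
    """Return True if some pigeonhole holds at least two pigeons.

    `assignment` is a sequence where entry i is the hole chosen by pigeon i.
    """
    return any(count >= 2 for count in Counter(assignment).values())


def every_assignment_collides(num_pigeons, num_holes):
    """Brute-force check: does every possible assignment force a shared hole?"""
    all_assignments = product(range(num_holes), repeat=num_pigeons)
    return all(has_shared_hole(a) for a in all_assignments)


if __name__ == "__main__":
    # More pigeons than holes: a shared hole is unavoidable.
    print(every_assignment_collides(4, 3))
    # As many holes as pigeons: each pigeon can have its own hole.
    print(every_assignment_collides(3, 3))
```

Brute force only works for tiny numbers, of course. The point of the principle is that the conclusion holds no matter how many pigeons and holes there are, as long as the pigeons outnumber the holes.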

Fano’s inequality is about the amount of entropy, or uncertainty, in a relationship between two variables. Dr. Venkatasubramanian used the example of American Caucasian names and genders. Few if any people named Nancy are men, and few if any people named David are women, but there are fair numbers of both men and women (and people of other genders) named Dylan. So if an algorithm wants to predict a person’s gender based on their name, it is more likely to get it right if the name is Nancy or David than if it is Dylan. Fano’s inequality makes that observation, which is again intuitively obvious, precise. It limits how accurately any algorithm can predict a variable x from a variable y, in terms of how much uncertainty about x remains once y is known. For more details on Fano’s inequality, see Dr. Venkatasubramanian’s post about it here. A more advanced introduction, by Bin Yu, is here (pdf).
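To make the name example concrete, here is a rough Python sketch. In its standard form, Fano’s inequality says H(X|Y) ≤ H_b(P_e) + P_e log2(k − 1), where H(X|Y) is the uncertainty left in X after seeing Y, P_e is the probability that a guess of X from Y is wrong, H_b is the entropy of a coin with bias P_e, and k is the number of possible values of X. The probabilities in `JOINT` below are invented purely for illustration and are not real name statistics, and the function names are hypothetical.

```python
import math

# Hypothetical joint probabilities P(name, gender). These numbers are invented
# for illustration only; they are not real data about names.
JOINT = {
    "Nancy": {"woman": 0.32, "man": 0.005, "other": 0.005},
    "David": {"woman": 0.005, "man": 0.32, "other": 0.005},
    "Dylan": {"woman": 0.14, "man": 0.17, "other": 0.03},
}


def entropy(probs):
    """Shannon entropy, in bits, of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def conditional_entropy(joint):
    """H(gender | name): the average uncertainty left after seeing the name."""
    total = 0.0
    for row in joint.values():
        p_name = sum(row.values())
        total += p_name * entropy(p / p_name for p in row.values())
    return total


def best_guess_error(joint):
    """Error probability when guessing the most likely gender for each name."""
    return 1.0 - sum(max(row.values()) for row in joint.values())


def fano_bound(p_error, num_classes):
    """Right-hand side of Fano's inequality: H_b(P_e) + P_e * log2(k - 1)."""
    return entropy([p_error, 1.0 - p_error]) + p_error * math.log2(num_classes - 1)


if __name__ == "__main__":
    p_error = best_guess_error(JOINT)
    print("H(gender | name):", conditional_entropy(JOINT))
    print("Error of best guess:", p_error)
    print("Fano right-hand side:", fano_bound(p_error, 3))
```

Read the other way around, the inequality says that when a lot of uncertainty remains after seeing the name (as with Dylan), no predictor, however clever, can push its error rate below a certain floor.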

In each episode of the podcast, we ask our guest to pair their theorem with food, beverage, art, music, or any delight in life. Dr. Venkatasubramanian picked goat cheese and jalapeño jelly. You’ll have to listen to the episode to find out why it’s the perfect accompaniment for Fano’s inequality. (For those who want to see the hot pepper-eating orchestra I mention in the episode, the video is here. But why would you want that?)

You can find Dr. Venkatasubramanian at his website and on Twitter. He and some of his colleagues blog about their work and related issues at Algorithmic Fairness. You can find more information about the mathematicians and theorems featured in this podcast, along with other delightful mathematical treats, at kpknudson.com and here at Roots of Unity. A transcript is available here. You can subscribe to and review the podcast on iTunes and other podcast delivery systems. We love to hear from our listeners, so please drop us a line at myfavoritetheorem@gmail.com. Kevin Knudson’s handle on Twitter is @niveknosdunk, and mine is @evelynjlamb. The show itself also has a Twitter feed: @myfavethm and a Facebook page. Join us next time to learn another fascinating piece of mathematics.

Previously on My Favorite Theorem: