Guest Blog

Commentary invited by editors of Scientific American

“Anything but Country”: What Factor Analysis Reveals about Our Tastes for Tunes



What’s your favorite type of music?

Lil’ Wayne? AC/DC? Anything but country?

We like to think that our musical preferences are somehow deeply unique and meaningfully representative of who we are as individuals. But what if I told you that when it boils down to it, we’re not all that different from each other? In fact, most of those seemingly “nuanced” differences in musical taste can be summed up by a mere five factors.

What Exactly Is A Factor – And What Is A Factor Analysis?

A factor can be thought of as an underlying concept that explains the variability in a given dataset.

To understand the basic theoretical idea behind what a factor analysis involves, take this picture of the Muppets from Sesame Street. Conceptually a factor analysis is like asking, “How can we explain the most about how these Muppets are different from each other while using the fewest adjectives possible?” Ideally, you’d want to use enough descriptors to be thorough without going overboard, perhaps focusing on 2-5 of the most crucial, defining differences. Individually describing every single Muppet wouldn’t be very helpful, nor would it be helpful to say, “Well, some of them are furry.” However, if you posit that the most important features to focus on are furriness, color, and clothing, you’ve done a pretty good job of briefly (yet thoroughly) summarizing the main ways in which the Muppets differ from each other – and those features could also be thought of as factors.

Behind all of the numbers, figures, and statistics, this is the conceptual basis of a factor analysis.

Moving From Muppets To Music

In an effort to understand the science of musical taste, Peter Rentfrow, Lewis Goldberg, and Daniel Levitin conducted a factor analysis of musical preferences. Much like in the Sesame Street example, the researchers were looking for factors to explain differences in the data – only instead of Muppets, we are now examining people’s ratings of 15-second music samples. The final set of “factors” should do a thorough (yet succinct) job of explaining overarching “like” and “dislike” patterns in people’s ratings (for example, a pattern showing that rock & roll fans also tend to like punk and heavy metal).

To conduct this analysis, the researchers first picked 52 songs that sound similar to well-known songs from several major genres, but never became popular (to avoid any bias from sheer overexposure).

A sampling of some songs that were chosen by the research team

They then asked all of the participants to rate how much they liked a 15-second clip of each song on a scale from 1 to 9.

Printed below is the full list of songs, and the numbers on the right are factor loadings. The Roman numerals at the very top (I, II, III, IV, and V) indicate the five factors that the researchers planned to find – the musical equivalent of five features summarizing the Muppets’ differences.

It’s important to note that there’s a reason why they ended up settling on five factors – they went through a bunch of different possibilities first, and five was the number that did the best job of balancing thoroughness with brevity. I’ll discuss this more later.

In the meantime, you need to know two things to understand what these numbers mean: They range from -1.0 to 1.0, and a high number (preferably over 0.40) means that the song “loads onto” (or belongs to) that factor. Factor analysis is a bit like a puzzle; it’s not as if the statistical program spits out the adjectives for you. You have to look at what loads onto each “factor” and then figure out what the descriptors should be for yourself; there’s no “right answer” lying around simply waiting to be discovered.
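To make the 0.40 rule of thumb concrete, here is a minimal Python sketch. The clip names and loading values are made up for illustration (they are not the paper's numbers), and `factors_for` is a hypothetical helper rather than part of any statistics package.

```python
import numpy as np

# Hypothetical loadings for four made-up clips on five factors (columns I-V).
# These values are illustrative only; they are not taken from the paper's table.
clips = ["classical clip", "country clip", "metal clip", "pop clip"]
loadings = np.array([
    [0.78, 0.05, -0.10, 0.12, -0.03],
    [0.02, 0.71, 0.08, 0.15, 0.04],
    [-0.20, 0.01, 0.82, -0.05, 0.10],
    [0.03, 0.10, 0.35, 0.20, 0.38],
])

CUTOFF = 0.40  # the rule of thumb mentioned in the post


def factors_for(row, cutoff=CUTOFF):
    """Return the 1-based factor numbers whose loading is above the cutoff."""
    return [i + 1 for i, value in enumerate(row) if value > cutoff]


assignments = {clip: factors_for(row) for clip, row in zip(clips, loadings)}
for clip, factors in assignments.items():
    print(clip, "->", factors or "no factor above cutoff")
```

Notice that the last clip does not clear the cutoff on any factor, even though it comes close on two of them. Where to draw that line is a judgment call, which is part of why interpreting the loadings feels like solving a puzzle.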

For example, in the first column, the samples displaying high numbers mostly consist of classical, jazz, and instrumental music; now you have to find a word that best represents those musical pieces. The researchers called it “Sophisticated,” but you likely could have come up with other ideas. How about the second column? Country rock, New country, Mainstream country, Bluegrass, Rock ‘n’ Roll…in this case, the researchers went with “Unpretentious,” but you probably could have found an equally plausible alternative (perhaps “country” or “rockabilly”). The third, fourth, and fifth columns were defined by the researchers as “Intense,” “Mellow,” and “Contemporary,” respectively.

Now that we’ve seen the statistics, what does this look like graphically?

The researchers ran the analysis four times, each time planning for a different number of factors (2, 3, 4, or 5). This is a graphic representation of all four solutions, with the five-factor solution at the very bottom, and the two-factor solution on the second row. We actually get some good information from looking at each possible solution, which is why it’s really interesting to go back and look at all of the options rather than simply focusing on the researchers’ final five-factor solution. For example, the two-factor solution is highlighted in red below:

When the researchers looked for 2 factors, they ended up with Sophisticated Music vs. Everything Else. This actually tells us something fairly important: When you specifically try to find the single most important “this or that” distinction in musical preferences, you end up with sophisticated on one side, consisting mostly of classical, jazz, and instrumental music, and everything else on the other, including tunes ranging from country to heavy metal.
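For readers who want to see what "trying different numbers of factors" can look like in practice, here is a rough Python sketch. Everything in it is an assumption made for illustration: the ratings are simulated (and are not on the 1-to-9 scale), the analysis is a plain principal components decomposition of the correlation matrix, and "share of variance explained" is just one of several criteria someone might use. The post does not say which criteria the researchers actually relied on.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (an assumption for illustration): 300 listeners rate 20 clips.
# Five hidden "taste" dimensions drive the ratings, four clips per dimension.
n_listeners, n_clips, n_hidden = 300, 20, 5
tastes = rng.normal(size=(n_listeners, n_hidden))
pattern = np.zeros((n_hidden, n_clips))
for clip in range(n_clips):
    pattern[clip % n_hidden, clip] = 1.0
ratings = tastes @ pattern + 0.7 * rng.normal(size=(n_listeners, n_clips))

# Principal components of the clip-by-clip correlation matrix.
corr = np.corrcoef(ratings, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(corr)
order = np.argsort(eigenvalues)[::-1]  # largest eigenvalue first
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Share of total variance captured by the first k components.
explained = {}
for k in (2, 3, 4, 5, 6):
    explained[k] = eigenvalues[:k].sum() / eigenvalues.sum()
    print(f"{k} components explain {explained[k]:.0%} of the variance")
```

Because the simulated data were built from five hidden dimensions, each component up to the fifth adds a sizable chunk of explained variance, while the sixth adds very little. That drop-off is one informal way to think about balancing thoroughness with brevity.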

Let’s revisit the title of this post. Doesn’t it sometimes seem like everyone’s default response to the “favorite music” question is anything but country? That may be one of the more common responses, but it’s probably not true. What this answer likely indicates is a preference for some combination of rap, pop, and/or rock music, but it’s incredibly unlikely that these respondents have a particular affinity for Celtic music, Swedish death metal, or polka. In fact, here’s what the two-factor solution really tells us: When it comes to a “this” vs. “that” distinction in musical taste, it doesn’t come down to “country” vs. “everything else” – it comes down to “sophisticated” vs. “everything else.” If someone really wants to give an all-or-nothing response to the favorite music question, it would probably be more accurate for him to say he likes anything but the high-brow stuff. Or, conversely, anything but the low-brow stuff.

Additionally, notice that the Sophisticated box remains consistent in every single row; what this tells us is that the songs from that original category don’t tend to move over and become a part of different categories, even as more potential options appear. This gives us even more insight into that musical category – not only is it conceptually distinct from everything else in the two-factor solution, but even when more categories open up and there’s a chance for some of the songs in that “box” to be re-grouped with other samples, they don’t budge. It really seems like the “classical, jazz, and instrumental” classification is its own beast, and it has very little in common with other types of music.

Once you allow for a third factor, some of the songs from that “everything else” category split off into another box called “Intense.” What this means is that the rock, heavy metal, and punk songs are separating out from the rest of the songs, leaving the rest behind in that ambiguous catch-all category (now termed “Unpretentious” by the researchers). Once a fourth factor is added in, that catch-all box breaks down a bit more and makes room for a “Mellow” category.

(As a tip, you can actually tell which categories are “breaking down” and losing some of their songs to the new factors based on the numbers printed on the arrows. The closer the number is to 1.00, the more that box remains the same from row to row. The lower the number is, the more that box has changed.)
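One plausible way to produce numbers like those on the arrows is to correlate the factor scores from one row with the factor scores from the next row. The Python sketch below does that under several assumptions that the post does not confirm: the data are simulated, the factors are varimax-rotated principal components, and the arrow values are treated as correlations between scores at adjacent levels. The functions `varimax`, `rotation_for`, and `cross_level_correlations` are illustrative names, not library calls.

```python
import numpy as np


def varimax(loadings, max_iter=200, tol=1e-8):
    """Return an orthogonal rotation matrix found by the usual varimax iteration."""
    n_factors = loadings.shape[1]
    rotation = np.eye(n_factors)
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        gradient = loadings.T @ (rotated**3 - rotated * (rotated**2).mean(axis=0))
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt
        if s.sum() < objective * (1 + tol):  # no meaningful improvement
            break
        objective = s.sum()
    return rotation


# Simulated ratings (an assumption, not the study's data).
rng = np.random.default_rng(1)
n_listeners, n_clips, n_hidden = 300, 20, 5
tastes = rng.normal(size=(n_listeners, n_hidden))
pattern = rng.uniform(0.0, 1.0, size=(n_hidden, n_clips))
ratings = tastes @ pattern + 0.7 * rng.normal(size=(n_listeners, n_clips))

corr = np.corrcoef(ratings, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(corr)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]


def rotation_for(k):
    """Varimax rotation for the k-component solution."""
    loadings = eigenvectors[:, :k] * np.sqrt(eigenvalues[:k])
    return varimax(loadings)


def cross_level_correlations(k):
    """Correlations between rotated component scores at levels k and k + 1."""
    r_k, r_next = rotation_for(k), rotation_for(k + 1)
    # Standardized, unrotated component scores are uncorrelated with each other,
    # so the correlation between rotated scores depends only on the rotations.
    embed = np.eye(k, k + 1)
    return r_k.T @ embed @ r_next


for k in (2, 3, 4):
    print(f"{k} -> {k + 1} factors")
    print(np.round(cross_level_correlations(k), 2))
```

In each printed table, a row with one entry near 1.00 (or -1.00, since the sign of a component is arbitrary) corresponds to a box that carries over almost unchanged, while a row whose weight is spread across two columns corresponds to a box that is splitting.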

The final grouping of factors that the researchers settled on was a set that they cleverly termed MUSIC: Mellow, Unpretentious, Sophisticated, Intense, and Contemporary. What does this mean? Simple: These are the five most important descriptors involved in categorizing music and explaining differences in our musical tastes. If we like a piece of music from the “Mellow” box, we probably also like other music from the “Mellow” box.

Are Genres So Important After All?

So what does this mean for how we understand music?

Interestingly, this is one of the only studies on musical preferences that does not break the results down by genre. Classifying music based on this MUSIC model is not the same thing as saying “If you like rock music, you will like other rock music.” It may seem like the MUSIC adjectives look like genres, but in reality, dozens of genres can fit into just one of the MUSIC categories. This study really suggests that the specific “genre” of a song may not be so important after all; really, our tastes may be more guided by these underlying musical characteristics that span a wide variety of industry-imposed labels. If you like Barry White and Jack Johnson, categorizing those songs based on genre makes you seem idiosyncratic, but if you realize that they’re both Mellow, it doesn’t seem quite so strange.

Pandora’s Overcomplicated Box

Most likely, you are familiar with Pandora, the music-streaming website built on the Music Genome Project. If not, the Music Genome Project is a well-known effort to “sequence” songs; if you put in a beloved song or artist, Pandora plugs this preference into its algorithm to find other songs that you should like.

Essentially, the Pandora algorithm categorizes every song based on 400 musical characteristics, ranging from lead vocalist gender to the level of electric guitar distortion. Once you give a song a “thumbs up,” the software looks at that song’s scores on all 400 characteristics and compares them with the scores of every other song in the database; whichever songs have the “closest” scores on the highest number of characteristics get played next.
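The general idea of "find the songs with the closest scores" can be sketched in a few lines of Python. To be clear, this is not Pandora's actual algorithm, which is not described in that level of detail here. The catalog, the four made-up characteristics, the `closest_songs` helper, and the choice of Euclidean distance are all assumptions for the sake of illustration.

```python
import numpy as np

# Hypothetical catalog: each song is scored on a handful of characteristics.
# The post mentions 400; four made-up ones keep the example readable.
songs = ["song A", "song B", "song C", "song D"]
scores = np.array([
    [0.9, 0.1, 0.8, 0.2],
    [0.8, 0.2, 0.7, 0.3],
    [0.1, 0.9, 0.2, 0.9],
    [0.5, 0.5, 0.5, 0.5],
])


def closest_songs(liked_index, scores, top_n=2):
    """Rank the other songs by Euclidean distance to the liked song."""
    distances = np.linalg.norm(scores - scores[liked_index], axis=1)
    distances[liked_index] = np.inf  # never recommend the seed song itself
    return [int(i) for i in np.argsort(distances)[:top_n]]


recommended = [songs[i] for i in closest_songs(0, scores)]
print(recommended)  # ['song B', 'song D']
```

With 400 characteristics instead of four, the arithmetic is the same; there are simply more columns to compare.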

It’s very impressive, and plenty of people use (and love) Pandora, but there’s one thing this research can tell us: That might not all be so necessary. How much do we need 400 characteristics when our musical taste patterns can be summarized so well by a mere five?

There’s even some anecdotal evidence to support the idea that despite all of the effort behind its creation, Pandora may be deferring to this five-factor model more than the sequencers even realize. I’ve heard countless stories of Pandora choosing seemingly random songs to play on a given station; one of my favorite personal anecdotes is the time I had the Lil’ Wayne Pandora station on in the background while I worked, and eventually realized that the last three songs it had played were Mandy Moore, Jessica Simpson, and Britney Spears. At the time this made no sense, but here’s the funny thing: According to the MUSIC model, this may have made perfect sense. After all, all four of those artists fall into that last “Contemporary” box. Lil’ Wayne and Britney may seem to have little in common on the surface, but conceptually, they’re more similar than artists like Rihanna (Contemporary) and AC/DC (Intense).

And to be fair, I can’t really complain. I do love both Weezy and Brit.

References: Rentfrow, P. J., Goldberg, L. R., & Levitin, D. J. (2011). The structure of musical preferences: A five-factor model. Journal of Personality and Social Psychology, 100(6), 1139-1157. PMID: 21299309


About the Author: Melanie Tannenbaum is a doctoral candidate in social psychology at the University of Illinois at Urbana-Champaign, where she received an M.A. in social psychology in 2011. Her research focuses on the science of persuasion & motivation regarding political, health-related, and environmental behavior. You can follow her on Twitter @melanietbaum or visit her personal webpage.

The views expressed are those of the author and are not necessarily those of Scientific American.






Comments

  1. neuromusic 1:28 pm 08/8/2011

    First, explaining Factor Analysis with muppets is awesome. I am totally going to use muppets to explain PCA in the future. Calculate the EigenMuppet.

    Anyhow.

    The category names seem a bit… contrived? forced? like they were picking category names to make sure that they worked out to be a convenient, easy-to-remember acronym? Like “MUSIC”?

    The thing is that each of these categories is already its own section in a record store… err, the iTunes store.

    Isn’t “unpretentious” a non-category, defining everything else as pretentious? Oh, wait, it’s simply Country and Rock. Perhaps it should have been labelled “NASCAR”. But that would kill the acronym.

    What I think is really interesting, though (and perhaps part of the reason the authors tried to make an acronym and define each of the categories in the most positive language they could) is that the factors seem to break out along social and economic lines. Well, real society has more subdivisions than 5, but they at least correspond to stereotypical high school cliques.

    Sophisticated = Band nerds
    Unpretentious = FFA
    Intense = Sk8rs (do those even exist anymore?)
    Mellow = Speech & Debate
    Contemporary = Jocks & Cheerleaders

    Are these categories any better? not really. The problem I have is that the authors are using this contrived MUSIC acronym to push forward their argument that these factors are “genre-free” (their words, from the paper). Which I think is BS.

    They note in the paper…
    “Genre-based measures also assume that participants share a similar understanding of the genres. This is an obstacle for research comparing preferences from people in different socioeconomic groups or cultures because certain musical styles may have different social connotations in different regions or countries. Finally, there is evidence that some music genres are associated with clearly defined social stereotypes (Rentfrow & Gosling, 2007; Rentfrow et al., 2009), which makes it difficult to know whether assessments based on music genres reflect preferences for intrinsic properties of a particular style of music or for the social connotations that are attached to it.”

    They do some more math to try to argue that this isn’t the case… to argue that these categories are about the *attributes* of the music and not the *genre*. But I’m not convinced that the two can be easily separated, even with fancy math.

  2. Melanie Tannenbaum 7:47 pm 08/8/2011

    Neuromusic,

    Thank you for the thoughtful comment! There are aspects I didn’t get to cover in the post, and I’m glad to have a chance to visit them here in the comments.

    First, I also groaned when I saw how conveniently they managed to backronym their way into “MUSIC.” I think a lot of the specific terms themselves are unnecessary stretches devised for the sake of coming up with a cutesy model name. The one I take particular issue with is actually, as you noted, “Unpretentious.” I also agree with you that the classist implications are a little weird. In fact, the authors note in the body of the paper that most of the music that eventually fell into that factor (in this sample and the 2-3 replications that followed) was not even Country & Rock, but actually Country & Singer/Songwriter. Singer/Songwriters, unpretentious?! Maybe some of them, but please. Anyone who’s heard John Mayer give an interview would beg to differ. I think a more accurate description would have been something along the lines of “Acoustic” or “Minimal Instrumental Background” (most country *and* most singer/songwriter songs are just a voice and a guitar, or a voice and a piano). Much like you, I’m more than a little annoyed that they prioritized the sanctity of the “backronym” over descriptive accuracy.

    The other point that I want to make, however, really involves the nature of factor analysis and how it differs conceptually from a cluster analysis. For anyone who does not know (I won’t assume that you either know or do not, but I’m sure there are some readers who’d benefit from the explanation), there’s a subtle yet crucial difference between the two. Going back to the Muppet example, while a factor analysis identifies the underlying adjectives that best summarize the differences between the Muppets (“furriness,” “color,” etc.), a cluster analysis would literally group the Muppets together in the most logical way. For example, you might get something like…Cluster A = Humanoid Muppets (e.g. Bert, Ernie, The Count), Cluster B = Muppets Based On Actual Animals (e.g. Big Bird, Kermit the Frog), and Cluster C = Made Up Creatures (e.g. Grover, Elmo, Cookie Monster).

    To get back to the point, this is where I will (very respectfully) disagree with you when you say that there’s no meaningful difference between “attributes” and “genres” as the authors describe them. In my mind, genres are akin to a cluster analysis; they take musical samples and group them in a logical way. You start with a bunch of music, and you end up with something like Cluster A: Rap/Hip-Hop, Cluster B: Pop/Top 40, Cluster C: Singer/Songwriter, Cluster D: Country, etc.

    That isn’t to say that this doesn’t make perfect sense! Cluster analyses are really informative. But what I think is really important to get from this paper is the point that genres don’t explain everything, much like saying that Bert is Humanoid, Big Bird is Real-Animal-Like, and Elmo is Made-Up is informative, but doesn’t explain all the ways in which they differ. People often like music from more than one genre, and these patterns are usually pretty predictable. An AC/DC fan may very well love Mozart…but if there were a forced choice scenario between Liszt and Ludacris and you had to bet on which one the AC/DC fan would prefer, it’s *far* more likely that you’d place your money on Luda. When you’re simply looking at the genres, there’s no rationally sound reason why you’d predict this – you’d sort of just *know.* After all, just based on “genre”-speak, Rap is theoretically just as separate from Rock as Classical is. Looking at the factor analysis, however, you can soundly reason that 1. based on what we know from the two-factor solution, we see that Classical music is a lot more different from Rock than Rap is, and 2. both Rock and Rap probably share some aspects of that “Intense” attribute.

    I greatly simplified the factor analysis explanation for the purposes of this post, but we know that each sample can definitely “load onto” multiple factors. These results don’t necessarily mean that Metal (or Pop, or R&B, or Singer-Songwriter) can’t be “Sophisticated,” nor does it mean that Jazz isn’t Mellow or that Rap/Hip-Hop isn’t Intense. All that it means is that those 5 “attributes” do the best job of explaining patterns in the data. As an example, you can see in the figure with the factor loadings that Ludacris’s song (“Intro”) comes really close to loading onto that “Intense” factor along with the rock/heavy metal songs, and if the researchers had used a slightly more lenient “cutoff,” they easily could have said that it did. That kind of shows that a heavier Rap song can be seen as somewhat similar to a rock or heavy metal song, because they are both “intense” — but simply using genre-speak, you couldn’t realistically say that two songs by Ludacris and Iron Maiden really belong to the same “genre.”

    I hope that this (very lengthy, eek) comment made sense!

  3. disgruntledphd 9:29 am 08/9/2011

    Right, I registered for this website just to complain about this post, so I suppose I’d better be clear.

    Firstly, factor analysis is a descriptive method; by itself, it tells us little about anything. Essentially, what it does is look for correlations between items on a scale (in this case, ratings of songs) and try to examine how many factors we need to account for the observed ratings. So far, so good. However, the authors of this paper, while doing some things right, did many, many things wrong.

    Firstly, they used principal components analysis rather than factor analysis. This is a problem as factor analysis per se will give us estimates of how much variance the items have in common, while principal components just tries to reduce the data to a more manageable number of components.

    Secondly, they report no fit statistics for the various factor solutions. It is obvious that five factors will fit the data better than 1, so this tells us little about anything. I would have liked to see fit statistics which penalised the solutions based on the number of factors.

    Thirdly, despite having three studies, they failed to perform what is called a confirmatory factor analysis in any of them. This is where you specify your factors and how they relate, and then put that model to an algorithm which sees how well this model can recreate your data. Without this, their study is intriguing, but essentially meaningless. The sad part is that this (possibly intriguing, definitely flawed) study was published in JPSP, which is a high-status journal, and will undoubtedly be cited many, many times, despite its flaws.

    It’s worth noting that I am a junior researcher, familiar with factor analysis, and it took me less than five minutes to spot these flaws. It’s terribly sad that the reviewers at JPSP do not seem to have given even the same consideration to this paper.

  4. Melanie Tannenbaum 11:00 am 08/9/2011

    Disgruntled,

    Thank you for your comment. I agree with some of your points, though minorly disagree with one unless I am misreading it (they definitely should not have performed a CFA on a dataset that had already been analyzed using exploratory methods…unless you are just saying they should have done a confirmatory analysis using a completely separate sample. Personally, I’m not actually a fan of CFA and I even have a paper very roughly in the works about as much, but that’s a topic for another day I suppose.)

    Anyway, I mostly wanted to thank you for your insight, but also note that while the study itself may be flawed, my main purpose in writing this post was to provide a very simplified, general-audience-friendly explanation of the conceptual basis of a “factor analysis,” intended for those who hear the word “statistics” and immediately get scared off, using a topic area that’s easily accessible and greatly interesting to the average person. I toyed around with the idea of specifically explaining what a principal components analysis is, but decided that even that was a level of detail too unnecessary for these purposes; I really wanted it to be a primer that was relatively thorough, but simple enough that non-PhDs (like my parents and high school friends, for example) could understand the basics of factor analysis by the end of it. I hope that despite your issues with the original article itself, this point still came across intact.

  5. GAry 7 2:46 pm 08/10/2011

    Guess I must be an outlier. I see no songs by Jerry Garcia or any member of the Grateful Dead. Since my first musical love was classical it seems natural to me that I would be attracted to the expertise of the Dead, so I guess THEY would be included in the “sophisticated/mellow/intense” box,,,er,,,oops,,,it appears you have no such box,,,never mind,,,(darn. I forgot “unpretentious”).

  6. bucketofsquid 4:53 pm 08/11/2011

    I like most classical and instrumental but hate most jazz. I also like hard rock, rap, reggae, punk, techno, bluegrass and some country or a cappella. While I have no problem with using music to explain factor analysis, I do have a major issue with such bizarre and arbitrary “factors”. I would prefer something more along the lines of tonal quality or range, complexity, social direction (if it has lyrics, are they meaningful?), pace and perhaps diversity of sound patterns.

  7. Taras 6:47 am 12/21/2011

    That is quite interesting research. Thank you for the article, Melanie.
    May I ask just a few questions though?

    1. I did not really understand how the songs were scored on each of the 5 factors. You say that participants rated each song from 1 to 9. How did they rate it? On each of the 5 factors? And what comes next? I am just wondering how an alternative rock song got .12 points for being sophisticated while a heavy metal song got -0.2, for example.

    2. I did not really see what the practical use of this research is. Does it simply work as, let’s say, a simplified version of Pandora or Last FM? Or is there anything else?

    I hope I was clear: English is not my native language.

    Thanks a lot!
    Taras.

