Guest Blog

Commentary invited by editors of Scientific American
Don’t lose the context! Response to: Are you maternal enough to be a woman?

The views expressed are those of the author and are not necessarily those of Scientific American.

Are you maternal enough to be a woman? I saw this headline on Scientific American blogs, and was intrigued. As a researcher in intra-sex variation in personality, I was eager to see any reference to maternal inclinations, given that it is the subject of my most recent paper. Hang on a sec? I realised this was about my most recent paper! Both Kate Clancy and Scicurious seemed to have very strong reactions to the paper, and I was quite surprised at their responses. I felt compelled to reply, firstly to clear up several misrepresentations of our paper, but also to provide some balance to the misconceptions about evolutionary psychology as a discipline.

In case you haven’t read them, have a gander at the two blog posts I’m talking about (Framing and definitions: Are you maternal enough to be a woman? and The more feminine you look the more children you want. It must be science.)

Before I talk about the blog posts, I’ll give you a quick synopsis of the results of our research. In the first study, we found a significant positive correlation (0.436) in young women (aged 18-21) between urinary estrogen metabolite levels (at the late-follicular stage of the menstrual cycle) and self-reported desired number of children; that is, women with higher estrogen levels reported wanting more children than those with lower estrogen levels.

Late-follicular urinary estrone-3-glucuronide levels (E1-3G: creatinine ratio) and reported ‘ideal number of children’ in 25 nulliparous women aged 18–21 (from Law Smith et al. 2011)

For the second study, we made composite faces (‘averaged’ the facial characteristics) of the women who wanted the most children and the women who wanted the fewest, in two independent samples. We asked people to look at the pairs of faces and decide which one they thought looked most feminine. We found that both men and women judged the faces of those wanting many children to look more feminine than the faces of those wanting fewer children. Have a look for yourself at the pairs below.

Composite faces of 18 women with lowest ‘ideal number of children’: Mean=1.39 children, SD=.69 (left) and 18 women with highest ‘ideal number of children’: Mean=4.33 children, SD=.85 (right) from Sample 1 (n=84) (from Law Smith et al. 2011)

Ok, so back to the blogs. Scicurious admits she got pretty angry after reading this paper and that she found it hard to step back and approach in a scientific manner. Kate was a little more measured. This is the intriguing bit for me. What is it about this kind of research that makes rational scientists get so hot under the collar? They both concede the methods, data, and analysis are sound, and our conclusions were appropriate – we made no wild conclusions of causation (as it is a correlational study). So why all the fuss?

Kate Clancy’s post was titled ‘Framing and definitions: Are you maternal enough to be a woman?’ and she writes on the blog titled Context and Variation. So I found it a little ironic to see the paper taken so out of context:

“in the introduction, they point out only the biological underpinnings of maternal tendencies in a way that is essentialized, reduced to an individual’s hormones prenatally and in adulthood…”

Steady on there. Let’s get a bit of context. There are obviously HUGE effects on ‘ideal number of children’ preferences from social, cultural, and circumstantial factors. Who in their right mind would dispute this? This paper certainly does not. But to cover all those in an introduction in a research paper in a specialist journal would be inappropriate, as we were not investigating any of these variables. I could understand the objection, if we had written only about hormones in the context of a broad review paper of maternal behaviour, or a piece for popular consumption in a newspaper. But scientific research is necessarily specific.

We are evolutionary psychologists working on how hormones relate to behaviour; our research question concerned possible links between hormones and behaviour (in this case, maternal preferences); and we published in the journal ‘Hormones and Behavior’! Our study follows on from previous research in women demonstrating hormonal and physical correlates of maternal tendencies. All these studies sit in the broader comparative context of well-established links between maternal behaviour and hormones in many animal species. And therein lies the rationale for investigating this in humans.

Of course there are undoubtedly MASSIVE effects on maternal inclinations from social, cultural and circumstantial pressures. Our results certainly support this. In our sample, estrogen levels could predict 19% of the variance in ‘ideal number of children’. Although this is statistically pretty impressive for a biological correlate of personality (a correlation of 0.436), this still means that 81% of the variance is up for grabs. So that’s the vast majority of variance we can speculate is related to the plethora of social, cultural, and circumstantial variables. But scientific progress is about acquiring little bits of knowledge, one study at a time. No study would, should, or could attempt to answer all the questions, all at once.
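For anyone who wants to check the arithmetic behind those percentages, here is a minimal sketch. It assumes the usual reading of a Pearson correlation, in which the squared correlation gives the proportion of shared variance; the function name is purely illustrative and does not come from the paper.

```python
def variance_explained(r: float) -> float:
    """Proportion of variance shared by two variables with Pearson correlation r."""
    return r ** 2


# Correlation between estrogen levels and 'ideal number of children' quoted above.
r = 0.436

explained = variance_explained(r)   # share of variance linked to estrogen levels
unexplained = 1.0 - explained       # share left for every other factor

print(f"explained: {explained:.1%}, unexplained: {unexplained:.1%}")
# prints "explained: 19.0%, unexplained: 81.0%"
```

Squaring the correlation is what turns a figure that sounds large (0.436) into the more modest 19%, which is why the remaining 81% matters so much for interpretation.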

Both bloggers criticise our use of a WEIRD sample (Western, Educated, Industrialized, Rich, and Democratic). Aside from loving this acronym (which is brilliant!), I have to point out that from a design point of view, when looking at hormones, a sample has to be homogeneous, as there is so much variation in hormonal profiles across ages and ethnicities. It is the nature of good research design to try to reduce the possible confounding variables, which would mask any effects if they were present. Further studies should certainly look at samples of different ages, ethnicities, socio-economic backgrounds and across multiple cultures to see if the hormonal associations we found are present in different samples. But this does not undermine the results of this study. We found what we found, no more, no less.

I can’t help but wonder, would all these criticisms be made of a research paper looking at … hmm let’s say.. genetic variation and osteoarthritis? A paper of this ilk would no doubt be published in a genetics journal, and would not review the lifestyle and other circumstantial factors that relate to arthritis (of which there are many), but instead it would focus on concisely reviewing previous genetics related evidence, providing the rationale for the study. The sample would certainly be homogeneous in terms of ethnicity, in order to minimise confounding variation. The results might show a certain significant percentage of variance in risk for arthritis that can be linked to variants in specific genes. The results would be published, most likely reported in the scientific and popular press. And that would be that – no one would get angry. So why, when it comes to studies like ours, do scientists from other disciplines momentarily forget their scientific training and opt for emotional responses, personal anecdotes, and sweeping generalisations about a broad academic field of study? I can’t help but think there is something about the nature of evolutionary psychology research that makes some people distinctly uneasy.

It seems that evolutionary psychology has got a bit of a bad name. For some, it conjures up ideas of universals, blanket claims of specialised behaviours, evolved modules in the brain for the smallest of preferences, and ‘just-so’ stories for how the behaviours we possess have come to be. No scientific discipline is immune from a few dubious studies. But the overwhelming majority of evolutionary psychology research is none of the above. It is the scientific investigation of preferences and behaviours in humans in the context of evolutionary theory, encompassing human behavioural ecology, comparative psychology and traditional Evolutionary Psychology (EP). It is a relatively young discipline by scientific standards, and early pioneering studies in the late 1980s investigating sex differences (e.g. finding that men prefer youth in a partner, whereas women prefer resources, across 37 cultures, Buss 1989) were the essential building blocks for later work. These ‘main effects’ needed to be established before individual differences and intra-sex variation could be explored. It is this variation which much of the current research in evolutionary psychology investigates.

I felt the final line in Kate Clancy’s blog post was quite inflammatory, which again took our findings completely out of context, much like the emotive headline of “Are you maternal enough to be a woman?”

“Not wanting a baby today, or any day, does not make you less feminine.”

If I’d read this in a newspaper, fair dues, such are the perils of science reported by journalists in the popular press. But on a science blog, written by scientists? I was a little disappointed. It seems to be the proverbial straw-man fallacy, the setting up of a caricatured argument, as it is easier to criticise than the facts. Arriving at the interpretation that our study suggests not wanting a baby makes a woman less of a woman is, at best, wildly out of context; at worst, provocative and misleading.

I think that the two authors’ emotive reactions to the study, may not necessarily have been about what the paper apparently implies to these authors, but rather a reaction to the actual data and what it showed. I think it perhaps came down to how our findings made them feel. Is this paper really so threatening to how we feel about ourselves as women? Only if we seek to define ourselves only in relation to our ability or preferences for having children.

So what if a small proportion of our desire for children turns out to be associated with our hormone levels? So what if it does actually turn out that estrogen is one of the causal factors? So what! What is it that made this notion so repugnant? We should celebrate our diversity in personality and preferences, and embrace all the factors that have shaped us; culture, upbringing, circumstances, and, heaven forbid, some natural biological variation.

A hot topic at the moment is the concept of neurodiversity. It’s mostly used in the context of Asperger’s Syndrome, an area I currently research. But in general, neurodiversity can be applied to any variation in personality or way of being and perceiving the world that may have partly biological roots. Neurodiversity is about celebrating our differences, and appreciating that there is no right or best way of being, no normal and no abnormal, just a whole spectrum of being, with each personality difference bringing its own unique platter of strengths to the table. It takes women of all sorts to make the world go round, of all shapes and sizes, and of all personality styles and types. Finding that there may be some biological links to some of this diversity does not undermine or denigrate all the other experiential factors that undoubtedly shape us.


Buss, D.M. (1989). Sex differences in human mate preferences: Evolutionary hypotheses tested in 37 cultures. Behavioral and Brain Sciences, 12, 1-49.

Law Smith, M.J., Deady, D.K., Moore, F.R., Jones, B.C., Cornwell, R.E., Stirrat, M., Lawson, J.F., Feinberg, D.R., Perrett, D.I. (2011). Maternal tendencies in women are associated with estrogen levels and facial femininity. Hormones and Behavior. DOI: 10.1016/j.yhbeh.2011.09.005

About the Author: Dr. Miriam Law Smith is a psychologist who gained her PhD in Evolutionary Psychology from the University of St Andrews (Perception Lab). She studies personality and social cognition from an evolutionary perspective. She is also training as a Clinical Psychologist (DClinPsych) and her current interest is the relevance of evolutionary principles in the field of clinical psychology, most importantly, the concept of neurodiversity. Follow on Twitter @DrMiriam.

24 Comments

  1. kclancy 10:47 pm 10/26/2011

    Miriam, I’m so glad you submitted this post to the Guest Blog. I think blogs serve as a great place for post-publication peer review and discussion.

    I have to admit I was a bit surprised that your critiques of Sci and me largely boiled down to the fact that you thought we were over-emotive (particularly as the tone of your own post is a bit emotive itself, to be honest). You also didn’t respond to the bulk of my critique, which was:

    “In addition to the usual issues of studying a WEIRD population (Western, Educated, Industrialized, Rich, and Democratic), studying the hormones of a population that young may yield different results than in older populations. Girls can have irregular cycles for as many as twelve years after their period (Vihko and Apter 1984), and they have lower hormone concentrations than adult women (Lipson and Ellison 1992). How might that interact with the questions the study authors are asking? How important is it that a teenage girl’s hormone concentrations correlate with maternal tendencies?

    Then, there is the way that they define maternal tendencies. The study authors asked subjects at what age they wanted to have children, and how many they wanted, in order to arrive at that subject’s maternal tendencies. Again, I wonder how the study’s results might change if they asked an older population, or a population from another country or ethnicity.”

    What are your thoughts on the age of your sample? I understand that from your perspective you prefer to sample a homogenous sample; we tend to do the opposite. But what about adolescent subfecundity? What do you think about the fact that since you’re sampling western college students they are likely not going to have kids for many more years, tendencies or not? And, what do you think you are truly testing with the way you defined your “maternal tendencies” variable?

    The other issue I had was:

    “Overall women have been having children at older ages and, when they have control over their reproductive lives, generally choose to have fewer of them. But if anything, estrogen would be increasing in these populations, as they are often also well-fed and therefore never needing to divert resource away from reproductive hormone production.”

    Any thoughts on this? My first thought, upon reading it again and being away from this material for a while, is that there may be a difference in within and between population variance based on this issue.

    Finally, I was surprised to see you characterize the last sentence of my post as “inflammatory.” I was pretty careful in my word choice and tone throughout the post, complimented the carefulness and rigor of your paper, and also made sure to point out part of my concern was with how the media would spin your paper, not entirely what you wrote. Now, perhaps that is my own fault, I may not have parsed those things out well enough (a symptom of my supposedly histrionic and hysterical post?). But it would be nice to hear more about how you would contextualize your own work. In your post here, it seems to me like you are diminishing the results of your own paper. So… what do your results mean? How can we contextualize them? What predictions would you make about what you would find if you asked women from different populations this same question? If in your discipline it makes sense to sample from a homogenous population, what is the impact of the results?

    All the best,

  2. scicurious 10:55 pm 10/26/2011

    Hi Dr. Law Smith (I’m sorry I didn’t use the right last name when I blogged this paper, I’ll correct it). I can’t tell you how pleased I am to see an author on a study so comfortable with expressing their finding through blogs. It’s something I hope to see more often and I’m so glad that you’re one of the pioneers! I appreciate you offering your perspective and response to my critique.

    With regard to your findings, I did find the study itself to be well conducted (especially that you measured across menstrual cycle, so many people fail to do that). Did you find any variations in measure across the cycle? I noticed that you only showed results from the late follicular phase, did you get significant variations in phase?

    Obviously I don’t expect a full review on the sociological pressures present in various societies for women to have children, but I was surprised that, while in the paper you modulate your conclusions such that estrogen levels present a component of reproductive tendency, you don’t explicitly mention other factors at all. It would be very possible to talk about them having an effect at least in a general sense, and it seems like this is, as you note above, important considering your effect size.

    I also am not sure that using a WEIRD sample here is actually good, though it’s certainly a good place to start. While homogeneity of sample may reduce variance (and variance is always huge in human populations, so I completely understand why you’d want to do that), I don’t know that college students reporting on how many kids they want eventually is a good predictor of how many kids they end up having. It’s also going to be so homogeneous that I’m not sure how well the correlation would apply to other groups (especially groups that are not exclusively Caucasian, which I noted in your sample).

    And I still have an issue with this conclusion:

    “Importantly, our results demonstrate that female facial femininity is a cue to the behavioral characteristic of maternal tendencies (in addition to the previously established link between femininity and fertility).”

    I don’t think that female facial femininity here is a “cue”, I think it’s a correlation. And I am also not convinced that you measured maternal tendencies half so much as future reproductive goals (given that your women were nulliparous). I think the study is very solid and I think studies like this are important, but I am not sure about the interpretation of the results. My concerns are not merely those of “emotive reaction”. While I did have an emotional reaction to the paper, I do not think it makes my concerns less valid. I am concerned that with a lack of interpretation of these findings, the media will interpret them for you. I am aware that evolutionary psychology has produced many valuable and interesting studies, and where it has developed a reputational problem, that has come in part from the public and media response to the findings, which generally get interpreted such that correlation implies causation, and are often supportive of norms which are more cultural than biological. Without explicit notes to the contrary in such papers, I’m afraid we can only expect more of the same.

  3. ejwillingham 11:01 pm 10/26/2011

    “Arriving at the interpretation that our study suggests not wanting a baby makes a woman less of a woman is, at best, wildly out of context; at worst, provocative and misleading.”
    That’s not what Clancy said. What she said was, “Not wanting a baby today, or any day, does not make you less feminine.” So, it seems that here again, you have equated “feminine” with “womanness.” The title of the paper attempts to link “maternal-ness” with estrogen levels and facial femininity, something I take issue with below. Clancy’s asserting that maternal-ness–or lack thereof–is not necessarily the cornerstone of femininity.

    In the paper, however, you’ve equated “womanness” and “maternal” with “number of children desired.” Do you have support for how placing a value on a number of children reflects maternal desire or “maternal-ness?” Because I don’t see how expressing a desire for a number of children–rather than simply for children, period–equates to that. That, to me, is a strangely materialistic and numerical expression of something that cannot be quantified, and it uses children as a unit, which strikes me as odd. Why would a woman who wants one child, or two, be less maternal or womanly or feminine than a woman who wants four? Why not five? Wouldn’t the better comparative have been between women who want NO children (i.e., literally don’t want maternity) to women who want children *at all*?

    The title of your paper is, “Maternal tendencies in women are associated with estrogen levels and facial femininity.” Yet the faces were composites of 18 women each divided based on how many children they wanted…from “independent samples.” Why did you not use the 25 women whose estrogen levels you measured as the basis for your composites? Again, I don’t think you use the right metric for “maternal tendencies” (instead using a metric for “child collection tendencies”), so the title of this paper does not work for me any better than the titles of the blog posts you cite worked for you.

    Looking at the data you provide for estrogen levels vs # of children desired, those data points are all over the place. Your R² is 19%, which if I remember correctly means that I can infer that 19% of the “number of children” data can be explained by the estrogen values. With only 25 women…I can’t see a lot of strength there.

    I also looked at the values (and read your paper). You’ve got 25 values. If we take 15 as the midpoint for estrogen values, below 15, there are 16 women. Of these, 1 woman wanted no children. Eleven wanted 1-2 children. Four wanted 3-4 children.

    Above 15, there are nine women. Of these, three (a third) wanted 2 children. The remaining six wanted 3-4 children, although two are very close to that 15 midline (based on my eyeball of the graph and not having the raw data).

    Put another way, of your 25 women, 15 (3/5) or the majority, wanted 0-2 children, regardless of estrogen levels. The woman who wanted no children? She had higher estrogen levels than at least three of the other women and the *same* estrogen levels as a few women who wanted 2 or even 4 children. I can’t see a clear pattern here between “wants some children” and “wants more children” and estrogen levels. Had the faces of these same women been used in the “feminine or not” testing, that would have been of interest.

    I see three main issues in this paper:
    1. The graph displayed suggests that 19% of the variability in number of children desired is attributable to estrogen levels, for a very small group of women. A close look at the data reveals no clear high-low estrogen/high-low #children pattern.
    2. The use of “number of children” as a measure of maternal desire is, I think, arbitrary and indicates more materialism than maternalism; a better metric would have been the binary yes/no to “do you want to have children?” As I noted, the sole woman who did express that had higher E levels than several women who want to have children.
    3. The division of women in the “feminine or not” faces group based on how many children they wanted instead of whether or not they wanted children. Also, I’m not clear on why you didn’t use the same women whose estrogen levels you measured. Without that link, the title of the paper doesn’t seem entirely valid.

    Ultimately, I think it’s erroneous to equate maternal desire with number of children desired. That, in my mind, is a flawed premise, which makes the title and the conclusions of the paper flawed, as well.

    Finally, neurodiversity, which you mention, is about perspective taking. I think the perspectives the two bloggers took were of interest and valid. Your perspective is of interest, as well, and although I found the tone patronizing, I am in love with the fact that you responded so quickly and with a blog post to these critiques of your paper.

  4. kclancy 6:48 am 10/27/2011

    A small point: I re-read my post. You claimed the final line of my post was:

    “Not wanting a baby today, or any day, does not make you less feminine.”

    I remember thinking to myself, “That’s funny, that doesn’t sound like how I would end a post. Also, those words aren’t particularly inflammatory.” So I was surprised to look again: the final TWO lines of my post were actually:

    “Not wanting a baby today, or any day, does not make you less feminine. And when the media onslaught begins over these findings, we would do well to remember it.”

    In that context — and your post here prioritizes contextualizing things, yes? — my sentences here are a warning about the way the media will likely spin your study, not some kind of defamatory comment on your paper. It’s a shame that you took my sentence out of context in a way that made it seem to be directed at your study, when in situ it is related to my primary concern, which is about media coverage.

    Scientists shouldn’t necessarily concern themselves with how their papers are covered (not necessarily — so I’m very glad you’re doing it, but I don’t think the burden should always fall on scientists). But bloggers and journalists should be concerned with this. So as a blogger, I was very much doing my job in making sure to put your work into context and asking some important questions about it, to head off those less savory journalists who want to simply parrot a press release, or extend your paper to all humankind.

  5. nerdygirl 9:58 am 10/27/2011

    It looks to me that the more “feminine” face is actually the more childlike. How does that affect perception?

  6. criener 1:15 pm 10/27/2011

    I wrote a longer comment but then it got cleared by some error. Argh.
    I’ll keep it shorter, or make two.
    I mostly agree with Dr. Law Smith, in that most of the criticisms I find misplaced.

    The major one is a confusion between rated femininity and “essential” femininity.
    This is an enduring criticism of much of psychology, and an unfair one in that most scientists are not essentialists. Psychologists who study intelligence, memory, happiness, etc., are not concerned about what those abstract, seemingly unquantifiable constructs *really* are. They just make their measurements and see if they can predict any other interesting measurement. An intelligence researcher can’t tell you what “really” makes someone smart, and an attractiveness researcher can’t tell you whether someone is actually attractive. All we can say is that the particular people in this study rated those particular stimuli a certain way.
    Throughout the blog posts and the comments here I see the implication that Dr. Law Smith is somehow attempting to investigate some sort of essence of womanhood.
    Clancy: “Not wanting a baby today, or any day, does not make you less feminine.”
    No, but it may cause you to be rated less feminine on average.
    Clancy: “So, is this the end of the story? Women who are more feminine have higher maternal tendencies?”
    No, women whose faces are on average rated as more feminine (using this photoshop method that lots of other people have used for reasons that are not ironclad but acceptable in the literature) have higher maternal tendencies as measured by the easy convenient way that we measured it.
    Scicurious: “Doesn’t mean less “feminine” ladies are going to want fewer kids ALL the time and vice versa. When you keep that in mind, this paper’s not really a big deal.”
    Not less “feminine” but rated less feminine. And this criterion for “a big deal” is bizarre: if this ALL the time were the criterion, then we would have to throw out most of psychology.
    ejwillingham: “Ultimately, I think it’s erroneous to equate maternal desire with number of children desired.”
    No one is equating. The authors operationalized maternal desire with a simple, convenient self-report measure. They admit as much. This makes the study a limited first step; this is a limit, not necessarily a flaw.

    As psychologists, we take the vast glory of the human condition and we slice it and dice it. We measure things that many consider immeasurable. We find wonder in this, even as many others (and this is a long tradition) see us as materialist robots, robbing human nature of its random, unpredictable beauty.

    I’m glad this conversation is taking place, and glad to see the reaction, but I have to agree that the blog posts that began it were unnecessarily and unfairly critical. Clancy’s was titled “Are you maternal enough to be a woman?” insinuating that “being a woman” was somehow at stake. Scicurious described the results and added a sarcastic “It must be science.” The criticisms were not of the study itself, but essentially of the word count policy of the journal and of the conventions of that field. If that is a real concern (and I can agree that sometimes it is) go ahead and say that, but acknowledge that you are similarly dismissing most of the other articles published in that journal.

    I also find the concern about the way the media will spin the article misplaced. First, scientists can’t write for all audiences all the time. In a journal article intended for other researchers on Hormones and Behavior, a long introduction explaining the various limits on this research and the vastness of the factors which explain (intelligence/memory/attractiveness/etc) is simply unsuitable. Second, instead of worrying about what FOX News is going to say, be the media that you want to see. Instead of taking the scientist to task for not writing for a wider audience in a research journal, provide that context of a broader biocultural milieu that you mention.

  7. criener 2:16 pm 10/27/2011

    Obviously failed at keeping it short. Not the first time.

  8. ejwillingham 12:50 am 10/28/2011

    “ejwillingham: “Ultimately, I think it’s erroneous to equate maternal desire with number of children desired.”
    No one is equating. the authors operationalized maternal desire with a simple, convenient self-report measure. They admit as much. This makes the study a limited first step, this is a limit, not necessarily a flaw.”
    Obviously, I disagree. Using desired number of children as a proxy isn’t nearly as clear-cut as using a yes/no desire for children, period. The former involves much more than maternal desires and the latter is just as “convenient.”

  9. mcshanahan 2:00 am 10/28/2011

    criener – I understand where you’re coming from. I’m not a psychologist but am a researcher specializing in social measurement (and trained with several psychologists). And you’re right, there is a difference between reporting on how people rate traits and essentializing those traits. But to say that the meaning of those traits is unimportant is, I think, disingenuous. You cite intelligence researchers and say that they don’t know what intelligence really is. But studying intelligence begins with the assumption that intelligence is a meaningful construct. The necessary accompanying assumption is that the way it is operationalized is meaningful as well. It is therefore well within a reader’s purview to question the way something is operationalized, as ejwillingham has done. If the operationalization is suspect, so too are the results. I study identity. It is nebulous and difficult and has many meanings to different people, like femininity. This means though that I should pay more attention to how I measure it, more attention to how I operationalize it, and more attention to how my results could be interpreted. If I don’t, then what am I really studying: answers to a couple of questionnaire items? Researchers should be able to defend their choices to measure constructs the way they do. Saying that it’s just what psychology is like is a weak defense.

  10. 10. mcshanahan 9:12 am 10/28/2011

    I also just wanted to add a thank you to all who’ve been involved so far. Dr. Law-Smith, kclancy, ejwillingham, nerdygirl, criener, Sci Curious – it is truly a pleasure to participate in a discussion with all of you about research online. It has been both thought-provoking and informative. Thanks for engaging in this way!

  11. 11. criener 10:52 am 10/28/2011

    @ejwillingham: I don’t disagree that a yes/no is a more clear-cut operational definition of maternal desire (although doesn’t it still suffer from just as much sociocultural influence?). But I don’t think face validity is the best and only way to criticize. In this case, it seems to have the effect of making the results uninterpretable because there is no variation at all (which is kind of odd to me, honestly. Who needs a non-WEIRD sample, wouldn’t this be different at a place like Wellesley?). But I think that your question about the validity of this measure as a proxy does not end with a period, but with a look at the literature that Law Smith cites to support her choice of that question (the two Deady papers, and the Moore).
    Which gets to my response to @mcshanahan.
    I agree with your point up to “The necessary accompanying assumption is that the way it is operationalized is meaningful as well.” I think this is where the rubber meets the road, and it is very seldom an assumption, but an empirical question. Cattell thought he could measure intelligence by reaction time, lung capacity, pain thresholds, and others. But the correlations with other measures didn’t work out, so we don’t do that anymore. Yes, face validity is one way of evaluating operational definitions, but better is convergent validity or predictive validity. You could say that I am silly for measuring ear lobe length and saying I am measuring intelligence. And I would be, the first time. But if I found high correlations between ear lobe length and academic performance, reading speed, standard tests of short term memory, job satisfaction, salary, I would start to have a better case that ear lobe length is measuring something interesting.
    The researchers in this case did defend their choice of measures. Does trait estrogen exist? Chatterton et al seem to have found stability. Is asking how many kids you want a valid way of assessing maternal desire? I am sympathetic to ejwillingham’s criticism, but if I were criticizing this choice, I would want to read the two Deady articles and the Moore.
    This gets to a skepticism I have about this sort of conversation and open science/open peer review in general. When I am giving a general talk to people outside of my field (even scientists) I have to front load a whole lot of explanations about why I have chosen to measure the way I have. I have to summarize the literature. But when I am justifying to scientists in my field, I can just say “Proffitt, 2006” and people can know that if they want to find the twenty studies that justify why using that particular measure of hill slant is valid, they can go there.
    I absolutely agree that scientists should do more science writing for a general audience, and more science education. But we shouldn’t hold them accountable for not doing that in their reports to others in their field. I am all for openness and for more discussion, but there are real limits to critically evaluating an experiment without having read any of the cited literature. I thought Kanazawa deserved every bit of criticism he got, but I read this article and did not find it in the same league.
    Hopefully that was a slightly less weak defense :)

  12. 12. mcshanahan 12:24 pm 10/28/2011

    Hi criener – Much stronger :)
    Still, I’m going to disagree a bit. It’s true that we can communicate in our own fields by referring to the literature in the way that you’ve explained. And I agree that this is appropriate in a journal publication for a specialist audience. But it still doesn’t preclude questioning and probing whether particular ways of measuring are appropriate or optimal. It’s an empirical question as you say but, when social constructs are involved, one that can never really be settled completely. Convergent validity is very useful but, and this may be a discipline thing on my part, if there is no explanatory power in the measurements then what is the point? Even if ear lobe length correlates strongly with other measures of intelligence, if there is no explanatory connection then it’s just a proxy and should be treated with caution. It isn’t a measure of intelligence, it’s a measure of ear lobe length that happens to do a reasonable job of predicting intelligence as measured with other instruments that themselves have flaws. But really that’s a point that has more to do with the internal thinking that we do as researchers than this paper specifically. While I agree that it’s really tough to write publications for both specialist and non-specialist audiences at the same time, I don’t think it ever hurts to have outsiders and insiders challenge our assumptions (and validity always involves assumptions no matter how good the numbers are).
    And I’m honestly shocked that you invoked Kanazawa. The criticisms of his studies concluded they were almost entirely without merit. The response in these blog posts is nowhere near that and nor should it be. I thought this was a friendly discussion about interpretations and assumptions, with the critics acknowledging that the paper was strong in many ways. kclancy has been very clear above that she hoped her critical post would help others interpret the research. I don’t think anyone has tried to say that this is illegitimate research.

  13. 13. kclancy 3:14 pm 10/28/2011

    +1 to mcshanahan!

    “But studying intelligence begins with the assumption that intelligence is a meaningful construct. The necessary accompanying assumption is that the way it is operationalized is meaningful as well. It is therefore well within a reader’s purview to question the way something is operationalized, as ejwillingham has done. If the operationalization is suspect, so too are the results.”

    Additional +1s to everything else you said. I am a little tired of Sci and me being set up as over-emotive straw people, which is essentially what criener is doing when he brings up Kanazawa.

    Also, aside from starting to address ejwillingham’s and my concerns with how maternal tendencies is operationalized as a variable, none of my other questions have been addressed. So I do appreciate this conversation in some ways, but in other ways… it doesn’t feel like a conversation.

  14. 14. criener 3:57 pm 10/28/2011

    You’re right that it was an exaggeration to mention Kanazawa in the way that I did. You are right that the posts here are not on that level. I was thinking of the discussions that followed (Kanazawa), and the many times that I read about how ridiculous it was to measure x using a self-report questionnaire, or that something was totally subjective, ignoring the fact that subjectivity can still have a pattern, and that pattern can be interesting. And honestly, I haven’t found this conversation uniformly friendly and polite. But agreed that the Kanazawa reference is taking it too far, a cheap shot, and unfair.

    As far as the above section goes, I am a little confused about what the difference is between explanatory power and convergent validity. Do you mean a mechanism? That is, if I can’t say how ear lobe length relates to intelligence, then measuring correlations is always just a proxy? If that is what you mean, then that is certainly true, but again, true of most of psychology. There is relatively little understanding of mechanisms in social psychology, for example, and a whole lot of correlation. That places limits on correlation research, and limits on a good deal of exploratory psychology research, even if it isn’t correlation. But these limits do not mean that the research should be dismissed, which is the overall sense I get from kclancy, from scicurious, and from ejwillingham above.

    I do agree that it is good to have assumptions challenged. But we can’t get anywhere productive if critics treat background empirical evidence, or even the experiment itself, as an assumption.

  15. 15. criener 4:20 pm 10/28/2011

    Ok, sorry again for the Kanazawa reference.
    I don’t think I am setting up emotive straw people; it was not me but first Sci (in the sarcastic title, and then in the first paragraph), then Law Smith, who brought up emotions. I was interested in what I am sure seems to be semantic hair-splitting but illustrated something important to me.
    But it seems like it is best for me to stop here. Apologies for not conversating. I saw some themes I was interested in, and obviously didn’t have either the time or the inclination to address the rest.
    I hope you all enjoy your weekend, and I apologize again for the godwinning via Kanazawa.

  16. 16. goolick 12:07 am 11/2/2011

    That distribution is a complete joke, and I don’t think you can really draw any conclusion from it at all. R-squared of 19%? What was the standard deviation there, a full year? Change that last point on the right down to a woman that doesn’t want any children, and all of a sudden you have absolutely nothing. The variation looks completely random. Also, your sample size was 24? You have got to be kidding me.

    Also, the second study seemed like garbage too. First of all, I’m not really buying any sort of merging of facial features into a “composite face” as evidence. Then you say that you “found that both men and women judged the faces of those wanting many children, to look more feminine, than the faces of those wanting fewer children.” You just presented your argument in an awkward, three-part sentence with strange comma placement. I had to read through that several times.

    This isn’t science. It’s conjecture. It’s MADNESS. And this is just the first four paragraphs. How do you misspell an acronym, anyways? Did you really think that weird was spelled that way?

    I admit that I wasn’t interested enough to read the entire post, but why read your squabbling and calling out of other people when I’m not convinced of your argument in the first place?

  17. 17. goolick 12:09 am 11/2/2011

    Sorry, 25.
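
    For readers who want to check the arithmetic behind the "R-squared of 19%" remark, here is a minimal sketch. It uses only the two figures quoted in the post (r = 0.436, n = 25) and the standard t-test for a Pearson correlation; the variable names are illustrative and nothing here comes from the study's raw data. The critical value 2.069 is the usual two-tailed 5% table entry for 23 degrees of freedom.

```python
import math

# Figures quoted in the post (not recomputed from raw data).
r = 0.436   # reported Pearson correlation
n = 25      # reported sample size

# Share of variance in one variable linearly accounted for by the other.
r_squared = r ** 2

# Standard t statistic for testing whether a Pearson r differs from zero,
# with n - 2 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Two-tailed 5% critical value of Student's t for 23 d.f. (standard table).
T_CRIT_DF23 = 2.069

print(f"r^2 = {r_squared:.3f}")
print(f"t({n - 2}) = {t_stat:.2f}, exceeds 5% critical value: {t_stat > T_CRIT_DF23}")
```

    Under these assumptions a correlation can account for under a fifth of the variance and still clear the conventional significance threshold at this sample size, so the two observations in the thread are not in conflict with each other.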

  18. 18. DrMiriam 6:11 pm 11/7/2011

    Thanks to all for the comments on my post. Apologies if I wasn’t able to respond to all of the critiques fully in my original post; I felt my post was getting long enough already! But I will do my best to respond to the rest of them here in the comments.

    Regarding the issue of the age range of the women in the study sample: certainly, studying the hormones of an older population may yield differing results, and this is a logical research question generated from the paper, going forward. Yes, some girls can have irregular periods extending into adulthood (you cited Vihko & Apter 1984). In our sample (as with all our samples used for research with hormones), we only used women who reported regular periods, and this was clearly seen in the menstrual timings (predicted from self-report diary dates and actual timings), and hormone variation across the cycle. So this is unlikely to be a source of confound.

    I’m not sure 18-22 year olds could be considered girls? But they are certainly young women. Lipson & Ellison (1992) looked at salivary progesterone levels across age in a cross-sectional design and found the highest levels in the 25-34 age bracket (although they were primarily concerned with whether fecundity across ages correlates with progesterone levels, and they didn’t measure estrogen). For sure, estrogen levels may rise with age into women’s late twenties, but it doesn’t necessarily matter that our sample’s estrogen levels may be higher when they are older. We were not specifically interested in absolute fecundity levels, but rather in looking at individual differences in estrogen levels between group members, and whether this correlated to ‘ideal number of children’. It is likely that individual differences within the group may remain, regardless of absolute levels, and this is the variation of interest in relation to our psychological variable.

    Just to clarify our theoretical justification for the use of a young sample (as opposed to testing a sample of women in their late 20s), our aim was to test young women at an age before they had started having children – and thereby attempt to capture their ‘ideal’ number of children, as opposed to reproductive decisions made when trade-offs and constraints have come into play. In that sense, we were interested in their ‘reproductive ambition’ outside of contextual constraints as far as possible. This touches on another point raised in the comments on the use of the term ‘maternal tendencies’. Interestingly, ‘reproductive ambition’ is the term we originally used, in an earlier paper, to define the ‘ideal number of children’ variable, as opposed to ‘maternal tendencies’. I can understand that the term ‘maternal tendencies’ may not appeal to people from a more general perspective, but written in a scientific paper within our specific field, we were aiming to encapsulate the psychological variable we were attempting to measure.

    On to the other issue of age. Yes, women in the West have certainly been having children at older ages… and generally choosing to have fewer of them. Certainly, these population level shifts (‘the demographic transition’) have been occurring in the West. Yes, estrogen levels may well be at their potential maximum, as women in the West are well-fed and bodily resources have not been diverted away from reproductive hormone production. But these issues are at the population level. Individual differences in estrogen levels, between the members of that population, are still likely to be present, and it is these individual differences in estrogen that we are investigating in relation to maternal tendencies. From our results, we could predict that the correlation may be present in other (homogeneous) samples of women from other populations. Hormonal levels on a group level may be higher, or lower, in other ethnic groups – but we may still expect (based on our results) to see within these groups, individual differences in estrogen that relate to maternal tendencies. Based on the life-history theory strategies present in the populations, you may expect different sizes of effect.

    I don’t feel that I diminished the results of our study in my post. But I don’t overstate them. Like I said, they showed what they showed. The impact of the findings in evolutionary psychology is that in our sample we have demonstrated a biological correlate of ideal number of children (with a large effect size). A huge amount of research has demonstrated biological correlates of other preferences and behaviours (e.g. testosterone and dominance/aggression, testosterone and face preferences, estrogen/progesterone and face preferences across the menstrual cycle). So from our perspective, extending this into maternal tendencies is certainly quite interesting, and is the first step in future research in other samples and populations.

    I still do think that these critiques would not be levelled, if we were talking about another variable, and not ‘ideal number of children’. A great paper was published last week in Nature looking at genetics and psychiatric conditions. Not a trace of any other contextual variables mentioned in the introduction, or discussion. And rightly so! But no one would suggest for one second that the authors are implying that there are no other factors outside of genes that contribute to psychiatric illness.

    I do maintain that the last sentences in your post, and especially the headline, were quite provocative. “Are you maternal enough to be a woman?” This undoubtedly insinuates that our paper suggests a lack of maternal drive equates to being less of a woman. This is a value judgement mistakenly attributed to our paper, and one that we just did not make. It certainly would elicit emotion from women, which is why I labelled it emotive.

    In my post I responded to your critiques of our paper. I certainly share your concern with how findings like these are reported in the media. This is a separate issue, like you say in your comment. And one which can’t be lumped onto the results and journal write-up of our study. Maybe, as you say, you didn’t parse the two out clearly enough. I certainly don’t recall calling you histrionic or hysterical? (Your words not mine, and perhaps a little extreme :)

    I totally agree with your concerns with how findings like ours are reported in the media. So I think that your comment that scientists shouldn’t necessarily concern themselves with how their papers are covered (in science blogs) is a little contradictory. I’m glad I was given the opportunity to respond in the Guest Blog, as I think it has opened up dialogue about the nature of writing research for publication. I, and other comments here, have reiterated that the specialist journal is not the place for the general context. The interface for layman context should be in our efforts to communicate the science to the public, most commonly when talking to journalists in the mainstream press. Many research groups routinely put out press releases when research is published, and take the time to talk with journalists from the popular press, in order to help translate our findings, and try to ensure that the results are not overstated, or generalised to all humankind, as you say.

    The potential media spin should not affect our write-up of a study in the academic journal, but it certainly should affect the way we help a University Press office to tailor a press release, and in our talking to journalists. Journalists often find themselves juggling between accessibility and accuracy, and so the onus should be on scientists to try to ensure our research is accurately portrayed, when journalists are angling to make a story accessible and catchy. It doesn’t always work, some media pieces are less than accurate! But that doesn’t mean we should stop engaging. I definitely agree with you that it is the job of science bloggers to be concerned with accurate portrayal of scientific studies. Perhaps more so arguably than newspaper journalists, as many more look to science blogs as a reliable portal to scientific findings.

    Thanks for the compliments! I’m pleased to engage in discussion through this blog format, and to open up debate. I think I’ve hopefully covered your comments on homogeneity of the sample in my comment above to Kate. In response to your point on variation across the cycle, that is a good question. We reported the late follicular phase, as it is the phase we could reliably pinpoint, as there is a peak in estrogen, and progesterone rises after ovulation (until menstruation). As we were able to map this phase most easily, we could reliably compare across all participants (we did 4-6 weekly samples in order to be able to catch each participant at the late follicular stage once each). In theory, with daily samplings, you could compare other stages reliably. We would certainly predict that the correlation would be comparable in any stage of the menstrual cycle, as essentially what we are comparing is individual differences in “trait” estrogen levels.

    With regard to the conclusion that facial femininity represents a cue to maternal tendencies: if there’s a reliable association between a non-visible variable (maternal tendencies) and a visible variable (sexual dimorphism/femininity of facial appearance), then the visible characteristic represents a cue to the non-visible characteristic. The terminology used in the paper comes in the context of evolutionary theories of biological signalling systems. When you query calling facial sexual dimorphism a “cue”, and say it is instead a correlation, I think you might be confusing two separate issues (a biological cue vs. a biological signal, and correlation vs. causation). I hadn’t mentioned it in my blog post, but you make a similar confusion in your original post…
    “Now, correlation is not causation. So let’s keep that in mind. Doesn’t mean less “feminine” ladies are going to want fewer kids ALL the time and vice versa.”
    The latter bit is about a correlation coefficient being less than 1 (in our study the correlation coefficient was in the order of 0.4). A correlation of 1 would mean that (in a sample) ALL less feminine faced women wanted fewer children than more feminine faced ones. The former sentence, “correlation is not causation”, is about whether we can attribute causality from one variable to another. The two issues are separate.

    With regard to your other comments I think I could be in danger of repeating myself, so I hope that my response to Kate re the writing-up of the research in an academic journal vs. the media, covers your other points regarding context. I don’t think that a generic explicit note in the academic paper about other socio-cultural variables existing, would stave off any media interpretations, and again I don’t think this should be a concern when writing up in context of an academic journal, but certainly should be in a press release or any dealings with the press. Thanks again for your thoughtful comments.

    Ok. I did not in my post (or previously) equate femininity, in the general sense, with woman-ness. That is what occurred when the headline was written “are you maternal enough to be a woman?” and the statement “Not wanting a baby today, or any day, does not make you less feminine”, as if our paper had implied it did. Also I have to make a distinction between me having an issue with a misleading headline about our research in a science blog on Scientific American, and you having an issue with my title of my research paper which is entirely appropriate in the context of where it was – a specialist academic journal.

    In terms of the methodological justification for looking at the variance in number of children desired: it is entirely appropriate to look at this as a continuous variable, rather than a binary “wanting children” or “not wanting children”. From an evolutionary perspective, it is of value to look at variables associated with number of children, as opposed to “some” or “none”, as the former investigates variation, whereas the latter simply segregates women into those who might reproduce and those who might not. If we are interested in potential reproductive outcomes, the former continuous variable is more accurate in representing the full extent of variation present. Using children as a unit may strike the layperson as “odd”, but in the context of evolutionary theory, it is a key component of interest in measuring inclusive fitness, and therefore of great interest if there are biological correlates of preferences for different numbers of children. To reduce the data to 0 vs. “some” children makes much less sense theoretically, for either the estrogen analysis or the composites, in the context of our hypotheses and our theoretical stance.

    With regard to the eyeballing of the data that you and @goolick both engage in; there’s not much I can say here, as the statistics speak for themselves. It doesn’t matter if the data points “seem all over the place” to you both; the analyses we used are robust, the correlation co-efficient is comparable using both parametric and non-parametric statistics. To expect the data points to line up neatly in a line would be to expect a correlation co-efficient of close to 1! I can’t really engage in meaningful discussion about the data, as (and I really don’t mean to seem rude here) from your comments it would suggest that neither of you have a good grasp on statistics.

    @Nerdygirl: Yes, that is a great observation, that the more “feminine” face appears more childlike. Neoteny (or the retaining of a childlike appearance) is a key component of the difference between male and female faces or sexual dimorphism.

    @Criener: thanks for your many interesting comments; on i) highlighting of the distinction between rated femininity of facial appearance (specialist terminology in our study) and the essence of femininity or womanhood in the layman or general sense; ii) noting the misplaced requisite for a correlation co-efficient of 1! iii) the reiteration of context in the writeup of research for publication in a specialist journal vs. potential portrayal in the popular press.

    It has been interesting to engage in discourse about our study, and also the broader issues that it generated debate on. So thanks to everyone that contributed constructively in this discussion.

  19. 19. kclancy 8:41 am 11/8/2011

    Just to be clear, ejwillingham is not a layperson, but an accomplished and knowledgeable biologist and writer, and she has a very strong grasp of statistics.

    Thanks for responding to our comments.

  20. 20. ejwillingham 1:08 pm 11/8/2011

    “With regard to the eyeballing of the data that you and @goolick both engage in; there’s not much I can say here, as the statistics speak for themselves. It doesn’t matter if the data points “seem all over the place” to you both; the analyses we used are robust, the correlation co-efficient is comparable using both parametric and non-parametric statistics. To expect the data points to line up neatly in a line would be to expect a correlation co-efficient of close to 1! I can’t really engage in meaningful discussion about the data, as (and I really don’t mean to seem rude here) from your comments it would suggest that neither of you have a good grasp on statistics.”

    I disagree that my observations suggest that at all. I also don’t recall asserting that the points need to “line up” for the data to be relevant. Statistics don’t “speak for themselves.” A good illustration of that is here: Anscombe’s quartet (thanks to @drzen for that). The data and distribution always warrant further investigation. In this case, my closer look at them suggested some of the points I made above, three points that still remain inadequately or completely unaddressed.

    Using number of existing offspring as a unit is a measure of individual fitness, but that’s not what you counted; you counted how many children were *desired* intellectually, which is not the same thing, nor is it the same thing as predicted number of offspring. What you measured has any number of confounders, according to the literature, including influence of number of siblings in each woman’s own family (which influences number of children desired), social influences (e.g., religious and cultural expectations), education level, socioeconomic expectations and, as I noted, a certain acquisitiveness or materialism. For a group this small, those confounders will play a considerable role. Again, the binary choice of true maternal desire–do you want children (not “some” but “any”) or not–would have been the less confounded one.

    Also, I’m not sure you quite understand what inclusive fitness is, which relates to fitness of the individual in terms of contribution to the group as a whole and the manifestations of altruism, rather than to the fitness of the individual specifically. It is measured as both the contributed offspring of that individual *and* that individual’s contributions, harmful or helpful, to others in the community (sometimes, specifically, kin). Indeed, using a self-focused marker of “number of children desired” would not relate to inclusive fitness or kin selection, but to direct personal genetic and phenotype representation–or desired representation in the next generation. In other words, materialism, not maternalism.

    As @kclancy observes, I am not a layperson. I did my PhD work in this lab (you’ll see the journal in which your paper appears on that page), completed a postdoctoral fellowship in molecular developmental genetics/endocrinology, and in a former life had a tenure-track position as an endocrine physiologist focusing on vertebrate phys. I was approaching this discussion as one would a weekly seminar focusing on a specific paper as the topic, but have yet to experience in that milieu being referred to as a “layperson” or as having an insufficient grasp of statistics. The give-and-take was typically more focused on the observations and queries about data, design, results, and conclusions, and I thought that is what the engagement was here, as well.

  21. 21. ejwillingham 1:51 pm 11/8/2011

    Also, I am curious. In the study currently under discussion, the faces you show above and label as “high maternal” and “low maternal” are similar to the faces from this study in 2005:

    There were 59 women in the 2005 study, according to the article. In the current study, you say that you made a composite of the faces of 18 women with the lowest ideal number of children and of 18 women with the highest ideal number of children, for a total of 36 women. The resulting composites are similar to pictures of the women posted in that 2005 article for women having high and low estrogen levels. Were these two groups of women the same group?

    For the current work, you “made composite faces (‘averaged’ the facial characteristics) of women who wanted the most children and the least children, in two independent samples” while the 2005 images appear to be composites of women with low and high estrogen levels.

    Did you consider taking the four sets of composites–low/high maternal and low/high estrogen and having volunteers rank their femininity? Also, was the group of women for each set of composites the same? If so, it would be of interest to see the estrogen levels from the 2005 study in the context of the number of children they expressed desiring in this later work.

  22. 22. mcshanahan 8:22 pm 11/8/2011

    Thank you again for your response DrMiriam. I really appreciate the time it takes to engage in these discussions and your willingness to do so.
    From my perspective, I still have questions about the operationalization of “maternal tendencies” as number of children desired.
    I’m not an evolutionary scientist but a social research specialist, and from that perspective I wonder why that single item is sufficient to represent a complex idea like maternal tendency. The papers cited to support it do not seem to use it in the same way. The Moore et al. paper uses the same single item but reports it as representing only that single construct (i.e., ideal number of children) rather than the broad idea of maternal tendency. On the other side, the Deady and Law Smith (2006) paper uses it as one of seven items that are combined after a principal components analysis. This seems to create a scale that captures a much more nuanced meaning of maternal tendency, including an item that asks precisely what ejwillingham has suggested along with others related to, for example, maternal feelings. I have to agree that she still has a point that titling the paper with the term “maternal tendencies” and interpreting it with that meaning doesn’t seem all that well supported by that one single item.
    Furthermore, Deady et al (2006) seems to engage in the same type of parsing of the sample by median splits that ejwillingham suggests. Hers is done by eye-balling because she doesn’t have the data but the idea is the same. Statistics can’t ever speak for themselves – they need to be explored, challenged, and interpreted to be meaningful.

  23. 23. DrMiriam 7:19 pm 11/13/2011

    I am genuinely shocked that out of all my comments posted here in response to your many questions, you chose to take two phrases out of context and respond only to them. I can only assume that your lack of response to all of my comments means that you realise your original critiques were unwarranted.

    I did not refer to @ejwillingham as a layperson. I responded to the comment that using “number of children as a unit” seems “odd”, and I stated that I can fully understand that to a layperson this may seem odd, and so described our scientific rationale for using number of children, rather than the “zero vs. some”.

    I have no doubt that @ejwillingham is a highly accomplished and knowledgeable biologist and writer, and I never suggested otherwise! My sole comment was that her comments (namely the disregarding of our own analyses and what they demonstrate, and instead insisting that her eyeballing of the data concluded there is no clear pattern) would suggest she may not have a good grasp of statistics. Please see my further comments on this below.

    I addressed your original points about the data, with my description of how our parametric and non-parametric analyses produced comparable correlation co-efficients. I presumed that you would realise what that demonstrates, but as this is not the case, I will explain in detail (although I do not think this is the responsibility of a study author to have to explain what the analyses demonstrate, to unfamiliar readers).

    The link you provided to Wikipedia illustrates my original comment wonderfully! Our analysis is robust precisely because we did not rely only on Pearson’s correlation co-efficient for analysis; we also used Spearman’s rank, a test less susceptible to outliers. As you will see in that link, Anscombe’s quartet refers to the possibility that an identical *Pearson’s* correlation co-efficient can be produced from 4 very different sets of data, which is why one must not rely solely on this Pearson’s statistic without analysing the distribution of the data. If Pearson’s (parametric) and Spearman’s (non-parametric) produce similar correlation co-efficients (as in our study), then it demonstrates that data are elliptically distributed and that there are no prominent outliers. So your reference to Anscombe’s quartet is not relevant to the discussion of our data; mentioning it simply re-highlights that you have disregarded and/or not understood our own analyses.
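A minimal sketch of the Pearson versus Spearman comparison on Anscombe’s quartet, for readers who want to check it themselves. It uses the standard published quartet values, not the study’s data, and the helper names (`pearson`, `average_ranks`, `spearman`) are illustrative assumptions written in plain Python rather than any particular statistics package.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson's product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Anscombe's quartet (standard published values).
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

results = {name: (pearson(xs, ys), spearman(xs, ys))
           for name, (xs, ys) in QUARTET.items()}

for name, (r, rho) in results.items():
    print(f"set {name}: Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Pearson’s r comes out close to 0.816 for all four sets, while Spearman’s rho differs noticeably between them (0.5 for set IV and about 0.99 for set III). The sketch only illustrates that the two statistics can diverge when the data are oddly distributed; it says nothing about the study’s own data.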

    You are setting up a straw-man by suggesting that I claim data and distribution never warrant further investigation, and that (in your twitter correspondence) I imply we should simply “hail the almighty r value”! You seem to presume that we have not looked at our own data ourselves, tested distribution, and chosen the appropriate set of statistics to analyse the data. You suggest that your “closer look at the data” (eyeballing and disregarding the analyses that have been performed) reveals “no clear pattern” between high and low estrogen groups. Surely you understand that using the full variance in a measure will produce a more powerful analysis of a linear relationship than reducing the variance of the data into a binary split? This is why our correlational result can be considered a much stronger test of a relationship than a significant result in a high/low split. We did a median split in our initial preliminary analyses, and it did (of course, this could be predicted from our combination of reported analyses) show the high estrogen group reporting a significantly higher ideal number of children than the low estrogen group.
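The point about a binary split discarding variance can be sketched with a small simulation. Everything here is an assumption made for illustration: the data are simulated bivariate normal values rather than the study’s measurements, the population correlation is set to the 0.436 reported in the post, and the sample size, seed, and function names are hypothetical.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson's product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def simulate(rho=0.436, n=20000, seed=0):
    """Correlate y with continuous x, then with x median-split into 0/1."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        # y shares correlation rho with x; the remainder is independent noise.
        y = rho * x + sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        xs.append(x)
        ys.append(y)
    median = sorted(xs)[n // 2]
    split = [1.0 if x > median else 0.0 for x in xs]  # high vs. low group
    return pearson(xs, ys), pearson(split, ys)

r_full, r_split = simulate()
print(f"continuous x:   r = {r_full:.3f}")
print(f"median-split x: r = {r_split:.3f}")
```

With normally distributed data, a median split is expected to shrink the correlation by a factor of about sqrt(2/pi), roughly 0.80, so the dichotomised version recovers less of the same underlying relationship than the continuous one.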

    This bizarre disregard for our actual analyses of data and distribution, is a little more patronising (and I use this term because this is how you labelled my initial Guest Blog post) than anything I have suggested, and as you say, not something I am used to in the context of a normal give-and-take discussion of research.

    Your comments re individual fitness and inclusive fitness are surprising. I did not in my comments, or previously, claim that we are measuring individual or inclusive fitness. To reiterate, I stated that “as number of children is a key component of inclusive fitness”, it is of value to investigate the full variance in a variable (ideal number of children) which may potentially relate to number of children; and that therefore making the binary split between “none” and “some” children makes less theoretical sense. I really cannot work out why you cannot see that this is valid! Of course, ideal number of children is not the same as predicted number of children or actual number of children! Nowhere have I said that they are the same, so making the argument to me that they are not the same is unfortunately falling foul of the straw-man fallacy again.

    Ok. What we measured (ideal number of children) has of course many other factors that will relate (e.g. woman’s own family, education, etc. as you listed). These factors are just as likely to relate to predicted number of children, and actual number of children. So I’m not sure how you are using this point as a critique against the use of our measure? The point you are making is confused, and seems to return to one of the original critiques that there are many other factors that will relate to ideal number of children, besides any biological correlate in hormone levels. Again, I have to reiterate that this point was never put up for debate! Of course there are a huge number of contextual and circumstantial variables that will relate to ideal number of children. These will relate in a sample of any size. How would a binary analysis of “none” vs. “some” children be less confounded by contextual variables? I think your seeming confusion in the points you made may relate to the issue that you have with the label “maternal tendencies”, rather than with the analysis that we used.

    As I reiterated earlier, I did not refer to you as a layperson. It is not necessary to list your full credentials, as I have no doubt that you are an accomplished and published biologist! I referred only to your comments, as they are what are relevant to the discussion. It is those comments (and not your credentials) that seemed to call your grasp of statistics into question. This is not a personal attack, so do not defend it as such; I can only comment on the comments you make.

    Ok! On to the last comments you made. My posts are getting longer and longer, as it seems that no sooner have I answered your original critiques than new ones are generated! I am flattered that you have taken such an interest in my research as to take the time to look up my past publications. All the images have been created from separate samples. The composites created in that sample (from the 2006 paper) may well look quite similar to our maternal composites, to a layperson, or to a scientist not specialising in facial imaging techniques. So I will attempt to explain a vast methodological field in a few sentences… When you create facial composites (indeed, this is the purpose of doing so), you “average out” all the characteristics that are not associated with the variable of interest. Therefore, each face in the composite pair will look similar, apart from those specific characteristics associated with the variable you have divided the faces based on. In theory, if you take over 10-12 random faces from a homogeneous set (in terms of age and ethnicity) and make an average composite, it will look very similar to any other 10-12 random faces, as the variance will have been averaged out. Therefore, when we create composite pairs based on a variable, we always use at least 12 faces in a composite, so we can ensure that the differences shown in a pair are not caused by random variation alone, but are reliably associated with the variable of interest. So the composites should look similar to each other, and to other pairs, except for the features related to the variable of interest (in the 2006 paper, hi/low oestrogen levels; in the 2011 paper, hi/low maternal tendencies). The camera set-up in the lab in St Andrews is standardised and colour calibrated, which is why all composite pairs will have the same tone, backdrop etc.

    We did not consider taking the oestrogen composites and the maternal composites and having participants rank their femininity. I’m not sure theoretically why one would do this? (But I can speculate if you would like, that if estrogen is the mediating variable behind facial femininity and maternal tendencies, then the high estrogen face would be predicted to be ranked the most feminine, next the high maternal tendencies (as proxy of high estrogen), then low maternal tendencies (as proxy of low estrogen), then low estrogen).

    @ mcshanahan
    I appreciate your comment on the operationalisation of “maternal tendencies” as number of children desired. I can certainly understand that as a social scientist you would question the use of that single question to represent the idea of maternal tendencies in the general sense. We certainly would not suggest that in the general sense of the term, the whole of maternal tendencies could be boiled down to a single question “ideal number of children”. In our earlier study (Deady & Law Smith 2006), ideal number of children was highly correlated with the overall factor based on many questions; there was only 1 factor related to maternal tendencies, indicating that all questions were measuring the same construct. From our perspective as evolutionary scientists, “ideal number of children” has much more theoretical relevance (to potential inclusive fitness) than the other questions we initially asked in our pilot work (and as it was highly correlated with the overall one factor) it was the variable that we chose to use. As a side note (and this is probably more your area than mine), I am reminded of Rosenberg’s questionnaire for self-esteem (consisting of many questions)… wasn’t that found to be as accurately represented by the one question asking how much self-esteem a person has? So perhaps a single item question does not necessarily always negate the potential for an accurate measure of a variable; it depends on the context of what you are investigating. But I certainly understand that from a social scientist perspective, you would be more interested in investigating the nuances of a complex phenomenon that could also be described using the term “maternal tendencies”.

    Lastly, you note in our pilot study (the Brief Report in Biological Psychology, Deady et al. 2006) the use of a median split in one part of our analysis, for women’s testosterone levels in high vs. low scorers on the Masculinity domain of the Bem Sex Role Inventory (Bem, 1971). In that sample, the resulting correlation co-efficient from using the full variance of a continuous variable was not significant (with the small effect size of 0.2), and therefore we further analysed using a high/low median split, which revealed high Bem Masculinity scorers had significantly higher levels of testosterone. A significant result in this data with this form of analysis demonstrates a weaker relationship than would a significant result with a correlational analysis. The correlational analysis in our current study between estrogen and “ideal number of children” represents a stronger or more conservative test of a relationship than a median split (hi/low, between group result). Therefore, I really cannot understand the fascination with performing a median split with our current data! (From our reported results, one could deduce this would necessarily be significant also, and as I mentioned earlier it formed part of our preliminary analysis, and produced a significant result).

    Just to reiterate, I did not ever suggest that data and distribution and statistics should not be explored, challenged and interpreted. My comment was in reference to ejwillingham’s disregard for i) our own thorough exploration of our data and its distribution, ii) our selection of multiple statistical analyses in order to interpret the data and demonstrate the result, and iii) what the combination of analyses statistically demonstrate; and instead she engaged in eyeballing to conclude definitively that there is no relationship present!?!

    Thank you mcshanahan for the thoughtful acknowledgement of the time it takes and of my willingness to engage in discussion. I was very happy to do so. And for the most part it has been really interesting to engage with scientists from different disciplines to discuss our findings. I do however feel disappointed by the response by ejwillingham publicly on twitter, including derisory comments about my age and personal choice of research interests. I do hope that this type of response does not put off other scientists from engaging with bloggers in the future about their research, as I can only imagine that it will.

  24. shane_richard84 1:06 pm 01/6/2012

    Really good!
    I am searching for this kind of information. The contents are very interesting & valuable.
    This site helps us to know much unknown information.
    I have also read some news from another site like this.
    Thanks a lot.

