In May of this year, the sports world’s self-appointed judiciary, the international Court of Arbitration for Sport (CAS), upheld a controversial regulation that prevents women with naturally high testosterone (T) from competing in the women’s category in long sprint and middle distance running events. South African middle-distance runner Caster Semenya brought the case, with Athletics South Africa, against the International Association of Athletics Federations (IAAF), arguing that IAAF’s rule is unscientific, unethical and discriminatory.

The CAS panel affirmed that the rule is discriminatory because it only applies to the women’s category, and only to some women within that category. But in a two-to-one decision, CAS deemed the discrimination to be “justified” based on the IAAF’s arguments about sex differences and T. Men are, on average across athletics events, 9–12 percent better than women. The IAAF claims that T is “the main driver” of this difference. By extension, it also claims that women with T levels in the typical male range have an “insuperable advantage” over women with T in the typical female range.

While this case raises important questions of ethics, human rights and medical harm, the IAAF defends the rule with claims about scientific consensus, glossing over profound disagreements about the evidence. Here, we address four myths central to these debates.

Myth 1: T is the “master molecule of athleticism.” T’s effect on athletic performance isn’t always positive, as the IAAF’s own data on elite women athletes well demonstrates. Its initial analysis of data from two world championship competitions showed that women with higher T had significantly better performances in only five of 21 events.

Serious methodological problems with the IAAF paper prompted independent researchers to call for the paper’s retraction, and the IAAF issued a correction. But the corrected version still undermines the regulation. In three of 11 running events, the lowest T group did better, and the strongest association across all events was the negative association between T and performance in the 100 meters, where lower T athletes ran 5.4 percent faster than the highest T athletes. In none of the events where high T athletes performed better was the gap greater than 2.9 percent.

One independent group requested and obtained a subset of the IAAF data, concluding: “The results of [the IAAF’s first study] are clearly unreliable, and those of [the second study] are of unknown validity,” making it “impossible” to discern the real relationship, if any, between T and performance. Clearly, though, neither this study nor the broader sports science literature supports the IAAF’s claim that targeted athletes “have the same advantages over [other] women as men do over women.”

Many studies across a range of sports show similar mixed relationships between performance and T. Consider a recent analysis of teenage Olympic weightlifters, in which the best predictor of strength was lean body mass, which has a complicated relationship to T. Among girls, body mass was initially the only significant predictor of weightlifting performance, and T was a predictor of body mass. But, counterintuitively, once the investigators controlled for the girls’ size, they unmasked a strong negative relationship between T levels and performance: girls with lower T lifted more weight.

The researchers noted that T affects muscle, which is crucial to force, but T also affects breast tissue and fat localization in the lower limbs; the latter may be especially important for certain powerlifting moves. Controlling for body mass, there were no relationships between any hormones and performance in boys, even though their T levels ranged from 0.5 to 30.2 nanomoles per liter. In short, T (and other steroids) affect multiple body systems, and the relationships sometimes work in a positive synergy to improve performance, but they sometimes detract from performance.

Myth 2: The best way to see what T does for athletic performance is to compare men and women. It’s simple, some people argue: men have “greater lean body mass (more skeletal muscle and less fat), larger hearts (both in absolute terms and scaled to lean body mass), higher cardiac outputs, larger hemoglobin mass, larger VO2 max (a person’s ability to take in oxygen), greater glycogen utilization and higher anaerobic capacity”; these all affect athletic performance, and are all affected by T, so men’s greater athletic performance must be due to their higher average T levels.

This is a series of linked, but not necessarily logically connected, propositions. The wide range of physiological and social differences between women and men athletes confounds such comparisons. Even characteristics that are influenced by T are also affected by multiple other factors. They can’t simply be boiled down to T, either in adulthood or during earlier development.

Scientists overwhelmingly prefer within-sex comparisons to answer most questions about the factors influencing sports performance, though sometimes it’s useful to analyze data both within and across sexes. Emerging research using both types of analysis reveals that some factors long thought to be fundamentally sex-differentiated turn out to hinge on other elements. For instance, most studies have shown that men have a greater proportion of fast-twitch muscle fibers, a difference traditionally attributed to genetics.

A recent study of elite weightlifters, though, found women had as many, or more, fast-twitch fibers as men, and concluded that “athlete caliber and/or years competing in the sport influence [muscle fiber proportion] more than sex per se.” T also likely has a relationship with fiber type via body mass, but as the teenage weightlifter study shows, the relationship between T and body mass isn’t straightforward. Simple comparisons of women and men athletes can’t reveal the specific relationships that underlie athletes’ physiologies, and can obscure the recursive, sometimes positive and sometimes negative relationships with T that are in the mix.

Myth 3: Suppressing an athlete’s T reduces performance, so different T levels between athletes must similarly affect performance. The IAAF says it has data showing women athletes’ performance suffers following abrupt and dramatic T suppression. While this may be true, the organization isn’t justified in using this observation to conclude that women with higher T levels “possess a very clear performance advantage” over their peers, on par with what men typically have over women.

The IAAF narrative suggests that manipulating T only affects athletically relevant aspects of function, disregarding the fatigue, sleep disturbance, metabolic changes and other physical problems that accompany significant hormonal disruption. The drop in performance might be attributable to these many side effects, physiological changes and psychological influences.

Moreover, manipulating T in individuals can’t illuminate how T levels figure in differences across athletes: this confuses intra-individual analysis with inter-individual analysis. There’s not just a logical problem with this conflation, but a data problem, which P.J. Vazel, an elite track and field coach and member of the Association of Track and Field Statisticians, underscores when he notes that in individuals, “raising or lowering T will show a relationship with performance,” but in analyses across athletes, often “there is no relationship found with performance.”

Faryal Mirza, a clinical endocrinologist at the University of Connecticut Medical Center, suggested that one reason studies don’t always find consistent links between T level and physiological variables is that sometimes high T signals that a person isn’t very efficient at using T: the body is producing more T precisely in order to arrive at “typical” function. (Faryal Mirza, conversation with the authors.)

Myth 4: These regulations are solely about T and performance. Scientific claims are central to this debate, but so is the broader context in which IAAF officials communicate their beliefs about women’s bodies. The vehicle for performance differences is supposed to be T but, as the IAAF has been forced repeatedly to defend the T regulations, it has revealed its concern lies less with the T level than with the source of the T.

The IAAF has made this concern explicit by narrowing the group of women to whom the regulations apply. Women with polycystic ovarian syndrome (PCOS), the most common reason that women have naturally high T levels, and women with congenital adrenal hyperplasia (CAH) were recently explicitly excluded from the 2019 regulations even when their levels exceed the threshold, though the IAAF has argued that women with PCOS and CAH derive “advantage” from high T.

Likewise, recent IAAF statements highlight sex-atypical chromosomes and gonads, an emphasis that functions as a dog whistle to suggest that the targeted women athletes are not “really” women. Yet this rule was supposed to be different from prior sex testing regulations, precisely because it focused on T rather than on other aspects of sex biology that are variable among women (and men). Even for those who accept that endogenous T makes an outsized contribution to athletic performance, the defining feature is supposed to be the level of T, not the source of it.

The IAAF’s enforcement of gender normativity is also evident in its rebuttal of concerns raised by the World Medical Association and the United Nations, among others, about mandating that healthy athletes undergo medically unnecessary interventions in order to compete. Rather than viewing the serious and long-term consequences of lowering testosterone as “side effects,” the IAAF proposes that they are “the desired effects.” These changes—including reduced muscle and increased fat—are supposed to produce the kind of body that Stéphane Bermon, Director of the IAAF Health and Science Department, has presented as the “ideal female phenotype” at scientific conferences.

La Maja Desnuda by Francisco Goya (c. 1797–1800) presented by the IAAF as the ideal female phenotype. Credit: Francisco de Goya, La maja desnuda, 1795-1800, Museo del Prado, Madrid

Disregarding women athletes who have resisted these interventions, even to the point of bringing legal challenges against the regulation, the IAAF insists that these “medications are gender-affirming” and “change their body to better reflect their chosen gender.” The latter statement insinuates that women athletes who do not willingly modify their bodies to fit IAAF standards actively “choose” their gender, which deliberately encourages confusion with transgender athletes.  

These four myths are subtle yet powerful. Owing to the ubiquity of what we call “T Talk,” it can be hard even to recognize them. A jumble of science and folklore, T Talk directs attention away from the most important consequences of the regulations by doing what it does best: making challenges to “common sense” thinking about T and gender seem antiscience. Meanwhile, the mounting criticisms of the interventions as medically unnecessary intrusions in women’s health are, unquestionably, based on science.

The unwanted manipulations of women athletes’ bodily integrity also contravene both international human rights law and medical ethics. At the end of May, Semenya filed an appeal to the Federal Supreme Court of Switzerland, saying, “I am a woman and I am a world-class athlete. The IAAF will not drug me or stop me from being who I am.”