
Small Is Beautiful

Huge medical trials are vital to testing novel drugs and other interventions—but smaller studies are also crucial

This article was published in Scientific American’s former blog network and reflects the views of the author, not necessarily those of Scientific American


In the world of clinical trials where data rules with near-total power, there is a constant drumbeat that sounds like “more data … more data … more data.” This increasingly loud and rapid pounding has led many people to believe the only way to determine if a new drug or treatment has any value is to rely on the results of large clinical trials.

However, in any experiment, including clinical trials, a delicate balance exists between the magnitude of the intervention’s effect and the number of observations needed to detect a difference between that intervention and a suitable control condition. The relationship is pretty clear—the larger the effect, the fewer patients you will need to enroll in a clinical trial (or the less time you will need to detect an effect).
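
To make that trade-off concrete, here is a minimal sketch (my own illustration, assuming a two-arm trial with a continuous outcome, a two-sided alpha of 0.05 and 80 percent power) of the standard normal-approximation sample-size formula:

```python
# Sketch of the effect-size/sample-size trade-off for a two-arm trial:
#   n per arm ~ 2 * (z_{1-alpha/2} + z_{1-beta})^2 / d^2
# where d is the standardized effect size (Cohen's d).
from statistics import NormalDist

def n_per_arm(effect_size, alpha=0.05, power=0.80):
    """Approximate subjects needed per arm to detect a standardized effect."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired power
    return 2 * (z_alpha + z_beta) ** 2 / effect_size ** 2

for d in (0.2, 0.5, 0.8):  # conventional small, medium and large effects
    print(f"standardized effect d={d}: ~{n_per_arm(d):.0f} subjects per arm")
# d=0.2 needs ~392 per arm; d=0.5 needs ~63; d=0.8 needs ~25.
```

Because the required sample size scales with 1/d², halving the expected effect roughly quadruples the enrollment, which is exactly why subtle effects push sponsors toward very large trials.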

But when the effect of a new drug or other intervention is likely to be less dramatic, it can be tempting to skip small trials altogether. And even in cases where there is a reasonable expectation of a large effect, study sponsors may feel vulnerable to criticism that any results from a small study are inherently untrustworthy, so they skip this step. This may sound reasonable: if small clinical trials are just a waste of time, resources and money, are patients and society not best served by going directly into large trials?


I would like to make the case that small clinical trials are not only a good idea; they are essential in learning about a new compound or intervention so that subsequent clinical trials are designed and informed by experience, not just opinion. The most common critique of small clinical trials is that they are not as informative as larger studies, to which I reply: that depends on how a small trial is designed, what questions you ask of it and how you approach its analysis.

One of the main drivers for large sample sizes in clinical trials, particularly in the area of neuroscience, is the fact that patients can and will respond to placebo treatments. It has been amply demonstrated that patients with a broad range of neurological disorders and even those with severe forms of certain disorders can still respond to inactive therapy and that the placebo response is physiological, not just psychological. So how can one mitigate the effect of placebo? 

One of the best tools available to researchers is a methodology developed by Maurizio Fava (Department of Psychiatry, MGH/Harvard), known as the sequential parallel comparison design, in which patients are initially randomized to active or control treatments and, after a period of observation, only those subjects who were initially randomized to control and did not respond are randomized again to active or placebo. Because this sequential randomization removes placebo responders from the second stage, sample sizes can be reduced by approximately 50 percent. If you are at the early stages of testing a compound and you want to see whether the compound has any chance of working, this is a time-, cost- and resource-efficient approach.
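
As a rough illustration of why this enrichment works, here is a minimal simulation sketch of the two-stage idea; the responder rates below are my own illustrative assumptions, not figures from Fava’s work:

```python
import random

random.seed(0)
P_DRUG = 0.55       # hypothetical responder rate on the active drug
P_PLACEBO = 0.35    # hypothetical placebo responder rate in stage 1
P_PLACEBO_2 = 0.10  # assumed placebo rate among prior placebo non-responders

def run_spcd(n_total):
    """Simulate one sequential parallel comparison trial; return pooled effect."""
    # Stage 1: randomize all subjects 1:1 to drug or placebo.
    drug1 = [random.random() < P_DRUG for _ in range(n_total // 2)]
    plac1 = [random.random() < P_PLACEBO for _ in range(n_total // 2)]
    # Stage 2: re-randomize only the stage-1 placebo NON-responders 1:1.
    nonresp = sum(1 for responded in plac1 if not responded)
    drug2 = [random.random() < P_DRUG for _ in range(nonresp // 2)]
    plac2 = [random.random() < P_PLACEBO_2 for _ in range(nonresp - nonresp // 2)]
    # Pool the drug-placebo differences from the two stages (equal weights here).
    diff1 = sum(drug1) / len(drug1) - sum(plac1) / len(plac1)
    diff2 = sum(drug2) / len(drug2) - sum(plac2) / len(plac2)
    return 0.5 * diff1 + 0.5 * diff2

print(f"Pooled drug-placebo difference in responder rate: {run_spcd(200):.2f}")
```

Because stage 2 contains almost no placebo responders, the drug-placebo contrast there is sharper, and that sharper contrast is where the sample-size savings come from.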

In some cases, the treatment being considered is a symptomatic therapy that is expected to have a reasonably rapid onset and fairly short duration. In such cases, the N-of-1 clinical trial also may be useful. The fundamental premise of these types of studies is that each subject is exposed to both active and control conditions multiple times in a random fashion. What is the value here? 

One of the other drivers of large sample sizes is intra-subject variability. Most medical conditions—even those that are chronic and generally stable—have some degree of variability. Having a subject go through multiple periods on both active and control treatments allows researchers to get more accurate and precise data from every subject. N-of-1 studies have been in and out of fashion several times, but with the advent of technologies that allow the collection of physiological data non-invasively (Fitbit-like devices), the potential for N-of-1 studies finally may be unlocked. It is entirely feasible to aggregate multiple N-of-1 studies to get estimates of how an intervention would fare across larger populations.
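
Here is a minimal sketch of that aggregation with invented numbers: each simulated subject completes several randomized active and control periods, the treatment effect is estimated within each subject first, and the per-subject estimates are then pooled:

```python
import random
from statistics import mean

random.seed(1)

def n_of_1(treatment_effect, n_periods=8, noise=1.0):
    """One subject: balanced, randomly ordered active/control periods."""
    baseline = random.gauss(50, 5)  # subject-specific baseline severity
    schedule = ["active", "control"] * (n_periods // 2)
    random.shuffle(schedule)        # randomize the order of treatment periods
    scores = {"active": [], "control": []}
    for arm in schedule:
        effect = treatment_effect if arm == "active" else 0.0
        scores[arm].append(baseline + effect + random.gauss(0, noise))
    # The subject serves as his or her own control.
    return mean(scores["active"]) - mean(scores["control"])

# Aggregate several independent N-of-1 trials into a population-level estimate.
effects = [n_of_1(treatment_effect=2.0) for _ in range(10)]
print("Per-subject effect estimates:", [round(e, 1) for e in effects])
print(f"Pooled estimate across subjects: {mean(effects):.2f}")
```

Because each subject’s baseline cancels out of his or her own comparison, the between-subject variability that inflates parallel-group sample sizes largely drops out.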

Sometimes, one is limited to doing a small clinical trial by practical factors like funding, limited access to patients or a limited supply of an investigational compound. In these cases, extracting as much information as possible from such a study is an absolute necessity. The most common approach to interpreting the results of a study is to compare the outcome measure from the active vs. control group and apply a statistical test to see how likely the results are due to chance (what we call a p-value). This form of inferential statistical testing rests on a set of assumptions, such as random sampling from the entire population, that are generally not true for clinical trials and are clearly not true for a small clinical trial.

An alternative method of exploring the likelihood of observing a given outcome is to use permutation tests. Imagine that you ran a clinical trial in which eight subjects were randomized to active or placebo (four each). You then take the data from each subject and create every possible combination of eight subjects taken four at a time, which in this case yields 70 combinations. You then check how many of these combinations yield a difference at least as extreme as what you saw in your original experiment. The beauty of this approach is that it requires essentially no distributional assumptions and is closer to the question we are asking: what is the probability of seeing these data given the hypothesis?
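
A worked version of that eight-subject example (the outcome values are hypothetical) might look like this:

```python
from itertools import combinations

# Hypothetical outcome scores for the eight subjects.
active = [7.1, 6.4, 8.0, 6.9]    # observed active group
placebo = [5.2, 4.8, 6.1, 5.5]   # observed placebo group
data = active + placebo

observed = sum(active) / 4 - sum(placebo) / 4  # observed mean difference

count = 0
for idx in combinations(range(8), 4):          # all 70 possible "active" groups
    group_a = [data[i] for i in idx]
    group_b = [data[i] for i in range(8) if i not in idx]
    diff = sum(group_a) / 4 - sum(group_b) / 4
    if diff >= observed:                       # at least as extreme (one-sided)
        count += 1

print(f"{count} of 70 relabelings are at least as extreme; p = {count / 70:.3f}")
```

With these invented data the observed split happens to be the most extreme of the 70, so the one-sided permutation p-value is 1/70, or about 0.014, and no distributional model was needed to get it.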

Lastly, and most importantly, the value of small clinical trials is rooted in the notion of independent replication. When one is testing a hypothesis, the scientific method requires that the hypothesis generate testable experiments that can falsify it. No single experiment can prove a hypothesis, but all it takes is one well-crafted experiment to refute it. Confidence in a hypothesis grows in proportion to the number of testable predictions it makes and how consistently tests of those predictions yield results that agree with it. When we are making multibillion-dollar decisions about developing new treatments for conditions like Alzheimer’s disease, anything we can do to increase our confidence that the hypothesis we are testing is right would be welcome news.

Michael Gold, M.D., trained in neurology at the Albert Einstein College of Medicine in New York and completed a fellowship in Behavioral Neurology at the University of Florida. He is currently a vice president in the Global Product Development group at PPD, where he works with pharmaceutical and biotechnology companies to help them develop innovative therapies for CNS disorders.
