This is a guest post from my friend and former colleague Tami Lieberman. She's a postdoc in the Kishony Lab in the Department of Systems Biology at Harvard Medical School, and you can follow her on Twitter @conTAMInatedsci.

As a recently minted PhD, I study the evolution of bacteria during infection. I want to better understand how they acquire antibiotic resistance and what else they might be adapting to as they try to survive inside of us. It is super fun to think about invisible evolutionary processes happening inside of people.

I also have fun thinking about the few pounds of bacteria in and on my body: which bacteria live there, how they got there, and what that means for my health. And, of course, how my microbiome might be evolving. Maybe you have also been thinking about the invisible ecosystems on you.

So when two different campaigns on Indiegogo offered me the chance to peer at the composition of my microbiome, I was sold. I had my genome profiled by 23andme years ago and this was clearly the next step.

But I was simultaneously cautious, because microbiome profiling is messy (and I’m not just talking about the sample collection). For example, a story broke a few years ago that each of our guts belongs to one of three canonical types, called ‘enterotypes.’ Since then, the notion of discrete enterotypes has come under considerable scrutiny. The problem is that microbiome research is hard. The DNA profiling technologies and analysis methods are imperfect and still developing. There is no gold standard, because we do not know how to grow most gut bacteria in the lab, to check if they are really there. Moreover, our microbiomes are a moving target, changing with age and diet. How much then can I really learn from a snapshot?

The "experiment"

I set out to see how much information about my ecosystem I could really garner from a single swab. I bought two gut-sampling kits from the American Gut and one from uBiome. I wanted to ask two questions—(1) How different are the two providers’ results? and (2) How much variability is there within a person, even within a single poop?

One Thursday morning in October, after receiving the kits, I was finally ready to take a very special poop (only in that I was going to learn a lot about it). Both providers requested that I swab soiled toilet paper, and I modified this slightly to fit my comparison. I used two different pieces of toilet paper to grab two very small pieces of stool from the same log. I did my best to homogenize each sample by rubbing the toilet paper against itself. Then I swabbed one sample twice—once for each provider—and swabbed the other for the American Gut. I sent in the samples, registered the kits online, begrudgingly filled out a whole bunch of diet and health surveys (for each provider’s own research), and waited for a few more months for my results to arrive.

uBiome vs. American Gut

My uBiome results arrived in March. Then, a few weeks ago, coincidentally while I was at my first conference about the microbiome, my American Gut results came in. I was psyched.

I had initially expected there to be a significant amount of disagreement between providers. On a really positive note, the American Gut and uBiome results taken from Sample 1 essentially agree. All of the bacteria found by American Gut above 5% frequency were also found by uBiome at similar frequencies, and the reverse is also true. I had expected to find more differences (as another user found). As these providers are presumably using slightly different methods for DNA preparation and analysis, I found these results both surprising and heartening.

The biggest difference between the two providers was the way the data was presented. American Gut provides a table and a PDF with some hard-to-interpret graphs. Importantly, American Gut only compares you to various averages of people, without any mention of the ‘range’ that I would expect among healthy people (Figure 2). Making it worse, they show multiple bars that represent averages of different populations, and these are very similar to one another (because they are averaging over large, similar populations). Next to these bars, my microbiome looks like it might be an outlier when, as we will see below, it is actually within the normal variation of healthy people’s guts.

uBiome has some graphs with the same issue, but also has very informative interactive graphs that avoid this problem and show where your results fall within the range of normal variation. As we can see in Figure 3, anywhere from 50% to 76% Firmicutes is within the normal range. The uBiome interactive website, while still in beta, is super easy to navigate and allows several very interesting types of investigation that I won’t get into here.

Intra-stool variability

The real surprise came when comparing the two areas of the same poop. Check out how many Firmicutes are in the second swab that I sent to American Gut! The number of Firmicutes jumps from 64% in Sample 1 to 76% in Sample 2.

Firmicutes are thought to be associated with obesity. Does this mean that I should try to alter my microbiome and decrease the number of Firmicutes in order to lose weight? Probably not, because I doubt (for reasons I’ll expand upon below) that this swab provides meaningful results.

To understand why these two samples varied in abundance of Firmicutes, we can zoom in. The bar graphs I’ve shown so far classify bacteria at the level of phylum (one taxonomic step finer than kingdom), but the type of data used by both providers can classify bacteria down to the family and/or genus level. I downloaded the detailed abundance tables from American Gut and made a quick plot, using the lowest classification provided for each identified bacterium. Bacteria with similar abundances in the two samples will fall on the diagonal line; the further from this line, the more different the two samples. Members of the Firmicutes phylum are plotted in red in Figure 5:

While most of the abundances vary only slightly between samples, there are two huge outliers (highlighted in blue) that explain the differences seen in the bar graphs (Figure 4). The second sample is 15% Lactobacillus (a member of the Firmicutes), while the first sample showed virtually none (in results from both providers). The second sample had no Prevotella, whereas both uBiome and American Gut agreed that there was about 12% Prevotella in the first sample!
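For anyone who wants to try the same comparison on their own abundance tables, here is a minimal sketch of the idea behind the plot: line up the two samples taxon by taxon and flag anything that sits far from the diagonal. This is not the script I used. The function name `abundance_outliers`, the 5-percentage-point threshold, and the dictionary layout are all assumptions made for illustration, and only the Lactobacillus and Prevotella numbers come from my results (rounded); the other entries are made-up placeholders.

```python
def abundance_outliers(sample_a, sample_b, threshold=0.05):
    """Return taxa whose relative abundance differs by more than `threshold`.

    sample_a, sample_b: dicts mapping taxon name -> relative abundance (0 to 1).
    A taxon missing from one sample is treated as having abundance 0.
    The threshold of 0.05 is an arbitrary, hypothetical cutoff.
    """
    outliers = {}
    for taxon in set(sample_a) | set(sample_b):
        a = sample_a.get(taxon, 0.0)
        b = sample_b.get(taxon, 0.0)
        # Distance from the diagonal of the scatter plot, in abundance units.
        if abs(a - b) > threshold:
            outliers[taxon] = (a, b)
    return outliers


# Prevotella and Lactobacillus values are approximate figures from the post.
# Bacteroides and Roseburia values are hypothetical placeholders.
sample_1 = {"Prevotella": 0.12, "Lactobacillus": 0.0,
            "Bacteroides": 0.20, "Roseburia": 0.04}
sample_2 = {"Prevotella": 0.0, "Lactobacillus": 0.15,
            "Bacteroides": 0.21, "Roseburia": 0.05}

flagged = abundance_outliers(sample_1, sample_2)
for taxon, (a, b) in sorted(flagged.items()):
    print(f"{taxon}: {a:.0%} in sample 1 vs {b:.0%} in sample 2")
```

A fixed absolute threshold is the simplest choice; rare taxa would probably need a relative or log-scale comparison instead.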

What does this variability mean?

This intra-stool variation, particularly in Prevotella (0% vs. 12%), is a pretty big deal. Several recent studies have focused on Prevotella as a distinguished sometimes-member of our microbiomes. With varying rigor, these studies have pegged Prevotella as a determinant of enterotypes, enriched in certain cultures, and correlated with rheumatoid arthritis and predictors of heart disease.

I haven’t tested this, but I bet I could get a microbiologist or three to look at the results from Sample 1 and tell me how this abundance of Prevotella reflects my diet. Jeff Leach recently wrote a long blog post about the microbiome that focused on Prevotella and attributed the varying Prevotella abundances between him and his friends to differential consumption of whole grains. Michael Pollan recently wrote a NYTimes feature that lamented his loss of Prevotella after a course of antibiotics. Would these microbial signatures that Leach, Pollan, and others have focused on disappear if they had taken a second swab?

The take home message from this intra-stool variability is that we need to resist the urge to interpret our personal microbiome data. Notice that I said “interpret” and not “over-interpret.” Many people are doing a decent job taking studies with a grain of salt. However, we need to apply this doubt to the data itself – keeping spatial heterogeneity and other sources of variability in mind. This is especially important as people are using these results for all kinds of things, including monitoring their own at-home fecal transplants. We simply cannot get a good picture of a person’s microbiome from a single sample.

Is this random heterogeneity or systematic biogeography?

Some experts I spoke to at the recent microbiome meeting were not surprised that I found intra-stool variability. They told me that it is obvious that we cannot learn much from a single sample (if this weren’t “obvious,” I might be trying to write a manuscript instead of a blog post…). Our microbiome, after all, responds to what we eat—and what I ate for breakfast is different than what I ate for dinner.

To get around this and other sources of variability, researchers are comparing many people when doing comparative studies (obese vs lean, for example) and taking many samples from an individual over time.

But what if these differences are not random fluctuations over space, but instead systematic differences that reflect the biogeography of stool? There are likely systematic responses of the microbiota to moisture, time, and location in the colon. Some studies have hinted that such biogeography exists. Yet, to the best of my knowledge, there have been no systematic studies of the microbial community across a stool sample, although there is this amusing analysis of hookworm egg distribution across stool.

If there are important spatial differences across stool, taking many samples from an individual over time may not average out this variation. The poop from someone who consistently poops before breakfast might look microbiologically different than the poop from someone who poops midday. Two groups (for example, healthy people vs. people with celiac disease) could have systematic differences in the shape, length, or consistency of stool. Their microbiomes could be essentially the same yet appear significantly different!
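To make that argument concrete, here is a toy simulation, not an analysis of real data. It assumes two groups with the same true abundance of some bacterium, where one group's swabs consistently land in a region of stool that reads a bit higher. Every number in it (the true abundance, the offset, the noise level, the sample count) is hypothetical, as is the function name `simulate_mean_abundance`.

```python
import random
from statistics import mean


def simulate_mean_abundance(true_abundance, spatial_offset, noise_sd,
                            n_samples, rng):
    """Average n_samples noisy readings that all share one spatial offset.

    Random noise differs between readings; the spatial offset does not,
    so averaging shrinks the noise but leaves the offset untouched.
    """
    readings = [
        true_abundance + spatial_offset + rng.gauss(0.0, noise_sd)
        for _ in range(n_samples)
    ]
    return mean(readings)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical values: both groups truly have 10% of the bacterium,
# but group B's swabs systematically read 5 percentage points higher.
group_a = simulate_mean_abundance(0.10, 0.00, 0.02, 1000, rng)
group_b = simulate_mean_abundance(0.10, 0.05, 0.02, 1000, rng)
gap = group_b - group_a

print(f"group A mean: {group_a:.3f}")
print(f"group B mean: {group_b:.3f}")
print(f"apparent difference: {gap:.3f}")
```

Even with a thousand samples per group, the apparent difference stays close to the built-in offset, although the underlying microbiomes are identical by construction.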

I hope that gut microbiome researchers include intra-stool variability in their (long) list of possible confounding factors.

Well, poop

It is important to keep in mind that this “experiment” had a tiny sample size. It is possible that my two samples showed extreme intra-stool variation and that most people will show more consistency across their poop.

But my feeling is that there is a consensus in the microbiome field that there is too much intra-personal and inter-personal variation to draw meaningful conclusions from a single sample.

Yet, the providers obscure the notion of variability. They do include warnings that these tests aren’t diagnostic (perhaps what 23andme should have done), but no disclaimers that the data itself might not be a good representation of your microbiome. The reports from American Gut obscure inter-personal variability. With interest in our microbiome skyrocketing and people performing risky at-home fecal transplants, the disclaimer of sample variability ought to be loud and clear.

You can read more about Tami's research on the evolution of bacteria during infection here and here.