Over the next few months, I plan on writing a lot about research on microbial communities. This is somewhat self-serving, as my own research is moving in that direction, but I also happen to think it's fascinating and highly relevant to the most current research involving food. Complex microbial communities are of course involved in fermented foods (my recent obsession), but they are also critically important for the health of the soil in which we grow our crops, for the ocean ecosystems that generate the majority of the planet's oxygen, and, more directly, for human health, where they influence everything from inflammatory bowel disease to obesity and diabetes.

But the way that scientists investigate these crucial microbes is changing rapidly, largely enabled by a precipitous drop in the cost of DNA sequencing.

Cost of DNA sequencing per million bases. (image from Nature.com)

DNA contains the blueprints for living systems, and analyzing it allows us to peer into previously opaque corners of the living world. Traditional microbiology involves isolating individual microbial species and culturing them independently. This permits deep understanding of that bug, but ignores the wider context of the ecosystem in which it lives. New approaches are allowing us to view that wider context (albeit at lower resolution), enabling new kinds of understanding.

In the next few posts, I hope to explain those new approaches and why they are revolutionizing our understanding of the microbial world, along with some of the challenges they raise.

Why DNA sequencing?

Aside from a few families of viruses, all living things that we're aware of use deoxyribonucleic acid, or DNA, as their blueprint. This does not mean that DNA itself does much in terms of the day-to-day business of being alive. That business - the chemical reactions that break down food for energy, or build up new biomolecules, or propel cells through their environment - is mostly carried out by proteins. But proteins can't reproduce themselves, and the instructions for making those proteins, as well as the instructions for when to make them and how much, are all carried in DNA.

The instructions are written in four chemical compounds called "bases," usually abbreviated as A, T, G and C. DNA "sequences" are just the order of these bases, so AAATAGGCT means something different than AATGTGCCA. Written human language is actually not a bad analogy for the way that information is coded: the bases are chemical letters that can be combined into words, sentences and paragraphs, all carrying different meanings. An individual microbe is like a single book in a vast library. Over the last 100 years, we've learned to read and interpret, at least to some extent, the language of biological systems. But for most of that time, our investigations have been limited to pulling individual books off of the shelf and investigating them in isolation.
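To make the "chemical letters" idea a little more concrete, here is a minimal Python sketch that treats the two example sequences above as plain strings over a four-letter alphabet. The function names (is_valid_dna, count_differences) are made up for illustration and are not from any sequencing library.

```python
from collections import Counter

# The four DNA "letters" described above.
BASES = set("ATGC")

def is_valid_dna(seq: str) -> bool:
    """Return True if seq is non-empty and uses only the letters A, T, G, C."""
    return len(seq) > 0 and set(seq.upper()) <= BASES

def count_differences(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(1 for x, y in zip(a, b) if x != y)

seq1 = "AAATAGGCT"
seq2 = "AATGTGCCA"

print(is_valid_dna(seq1), is_valid_dna(seq2))  # both are valid DNA strings
print(Counter(seq1))                           # how many of each letter seq1 uses
print(count_differences(seq1, seq2))           # positions where the "spelling" differs
print(4 ** len(seq1))                          # possible sequences of this length
```

The two sequences are the same length and use the same alphabet, but they differ at 5 of their 9 positions. Even at 9 letters there are 4^9 = 262,144 possible "spellings," which hints at how much information a genome millions of letters long can carry.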

What is DNA sequencing?

As you may have surmised, DNA sequencing is essentially just reading the book. In the early days, this was an incredibly laborious process. In the 1970s, when Frederick Sanger was perfecting his technique, it could take months or years to accurately read a few tens of thousands of individual DNA bases. Sanger sequencing relies on DNA polymerase, the enzyme that copies DNA and that also powers another monumental biological technology, the polymerase chain reaction (PCR). With it, we can make new copies of a DNA template in a test tube. Sanger's insight was that, as you make these new copies, you can use tricks of chemistry to read each base sequentially as it's being added to the expanding chain.
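The idea of "reading while copying" can be sketched as a toy model in Python. This is a deliberate simplification and not how the chemistry actually works: it assumes only the standard base-pairing rule (A pairs with T, G pairs with C), ignores strand direction, and uses hypothetical function names (read_while_copying, infer_template).

```python
# Standard base pairing: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def read_while_copying(template: str) -> str:
    """Toy model: build a new strand one base at a time against a template,
    recording ("reading") each base as it is added to the growing chain."""
    new_strand = []
    for base in template:
        added = COMPLEMENT[base]   # the base that pairs with this template position
        new_strand.append(added)   # note it down as it joins the chain
    return "".join(new_strand)

def infer_template(new_strand: str) -> str:
    """Recover the template by applying the pairing rule to the bases we read."""
    return "".join(COMPLEMENT[base] for base in new_strand)

template = "AAATAGGCT"
reads = read_while_copying(template)
print(reads)                  # the bases observed as the copy was made
print(infer_template(reads))  # pairing rule applied in reverse gives the template
```

Because each base has exactly one partner, knowing the order in which bases were added to the new strand is enough to work out the sequence of the original template.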

Sanger sequencing is still being used today for many applications, but it's too slow for modern large-scale genomic investigation. In my next post, I'll describe how Sanger sequencing works, as well as the new technologies broadly referred to as "next generation" sequencing that allow us to sequence DNA on a massive scale. Stay tuned!


Part 1: DNA Sequencing Introduction (Current)

Part 2: Next Generation Sequencing

Part 3: From Genes to Genomes (coming soon!)