10 A.M.—It is hot and sultry in the slums of the Campina Barreto neighborhood on the north side of Recife, in Brazil, and a public health worker named Glaucia has just taken a blood sample from a young, pregnant patient. Glaucia feeds it into a portable sequencer the size of a USB stick, plugs the sequencer into her computer and waits for the results. The device identifies genetic markers of the Zika virus, but flags the fact that this is a mutated strain that could be resistant to existing vaccines. She reports the information to her colleague, Franco, at the nearest hospital and to public health authorities. They need to know that this could signal the start of an outbreak.
This scenario is imaginary, but researchers around the world now use pocket-size genomic sequencers to rapidly detect resistant pathogenic strains in hospitals, explore microbial diversity in Antarctic ice valleys, and diagnose infectious agents in the food supply and aboard spaceships (the device works in microgravity). In 2015, for example, Johanna Rhodes from Imperial College London relied on portable sequencers to identify the genetic makeup of Candida auris, a multidrug-resistant fungal pathogen that had caused an outbreak in a London hospital. The same year, a research team from the University of Birmingham flew to Guinea and used the same technology to detect strains of Ebola in human blood. Within a few months, they had sequenced 142 Ebola genomes on the spot, producing results less than 24 hours after receiving an Ebola-positive sample.
But what if sequencer-equipped researchers were able to transmit what they’re learning directly to others? Imagine students in universities becoming the first “sequencing line of defense” by detecting bacteria resistant to antibiotics and educating their neighbors about them. Imagine the same neighbors equipped with portable sequencers to identify microorganisms in soils capable of fighting resistant pathogens. These new “bio-citizens” would be socially responsible actors who use biology as the main language to understand themselves and the world around them, playing an increasing role in protecting global health and ecosystems.
This is the utopian version of what visionaries call the second genomic revolution, where sequencing our genomes and those of other species becomes a pervasive data market in which DNA is the primary currency. Yet we must remain lucid about who will primarily contribute and who will reap the rewards of streaming our DNA to the cloud. The way forward is to make sure that this trove of data does not benefit only those who already reign over our digital infrastructures, and to build “counter powers”: global commons where citizens can learn to turn their own data into innovations.
The new lab-in-your-hand technology is the product of Oxford Nanopore Technologies, a British company whose ambition is to democratize genomic sequencing. Its sequencer, called MinION, is as small as a USB stick and easy to use for any apprentice scientist who knows how to prepare samples of blood, bodily fluids or water to be fed into the device. Such preparation is easily done by amateur biologists in DIY bio labs. Researchers and clinicians across the world have now adopted these portable sequencers, some to detect foodborne outbreaks in hospitals, others to analyze the DNA of new species in the jungle. As early skepticism fades away, industry giants (Illumina and Roche) and newcomers (Genapsys) alike are showing interest in following Oxford Nanopore’s lead in portable sequencing.
If the ambition is to promote more distributed use of genomic sequencing, users also need a ready-made platform for interpreting genetic data. Oxford Nanopore has designed an intelligent cloud lab, Metrichor, to be used for genomics data storage in conjunction with smartphone apps that interpret the meaning of DNA sequences.
The convergence of automation technologies, intelligent algorithms and cloud computing is progressively making genomics available to less skilled actors. While this does not necessarily ensure democratization, it does enable us to imagine it. And so, what if it actually happens?
The world around us would be equipped with increasingly sophisticated bio-sensing capacity: the ability to identify the genetic composition of our bodily fluids, the species surrounding us and the microorganisms on our skin and in our backyards. Portable genomic sequencers in our pockets and cell phones would become part of our networks of sensors, what we already call the Internet of things (IoT).
The attributes of a new “bio-citizen” then look like this: everyone, from scientists and patients to congressmen and employees, would be monitoring the DNA of their own bodies on shared cloud labs. Portable genomic sequencers, the size of a USB stick and connected to our smartphones, would also be integrated into our most strategic technical systems, including agro-food facilities, airports, battlefields and hospitals. These DNA-reading sensors would identify the nature, transmission paths and mutations of deadly viruses, engineered bacteria and even forgotten lethal pathogens that could one day be freed by the melting permafrost. In their homes, individuals would have access to liquid biopsies, blood tests that could track their most vital biomarkers and identify at an early stage the pieces of DNA shed by a cancer tumor or a viral agent. If millions of citizens were streaming these data to the cloud, they would build the most powerful data set for preventive and precision medicine the world has ever known. The genetic identity of any living thing, then, acquires a new life on the Internet. We enter the age of the Internet of living things (IoLT).
The amount of genomics data to be stored, curated and protected in the digital bio-space will keep growing, requiring powerful and expensive computing platforms. It will create a complex architecture with new needs related to the governance of such an increasingly data-driven society.
Without access to the cloud, as provided by Google and Amazon, many biomedical projects—from J. Craig Venter’s Human Longevity to genome-wide analyses focused on autism and Alzheimer’s—could hardly have taken shape. Google and Amazon offer a deal too tempting to refuse: the most sophisticated cybersecurity strategies as well as analytical speed and power. These services seldom come free; universities, companies—in the future, hospitals, doctors and citizens—will likely keep paying for each genome to be stored, analyzed or transferred to a different repository.
Another hard truth is that most analyses of genomic data are comparative, meaning what can be learned about a new and potentially important genomic sequence is based on some existing point of reference. Yet, genomic sequences of interest risk being held by private databases—think 23andMe—that gain a competitive advantage by selling access to their “genetic gold.”
As a consequence of the growing number of players that may be involved in the process of generating, collecting and processing the data, determining the legal ownership of such data may prove increasingly complex. Like our personal information gathered by the IoT, our genetic secrets might end up trapped by 10,000-word-long consumer agreements.
The powerful and lucrative alliance between genetics and a data-driven society has already made tech giants in Silicon Valley and Seattle the new masters of our digital identities. If we consider the current privatization of consumers’ data and the erosion of digital privacy, it is not difficult to imagine, in the future, large corporations using their vast computing and machine-learning platforms to commodify continuous streams of genetic data about humans and ecosystems. Global conflicts over ownership would have to be balanced by open-source efforts to ensure that research, data and technological tools primarily serve the public good. The Global Alliance for Genomics and Health is one example: a thriving effort to share genomes across disciplinary and geographical boundaries.
For the Internet of living things to realize its promises, U.S. policymakers and regulators, in collaboration with technologists, should have an ambitious conversation about global data commons. How open and resilient should our big data architectures be, in particular those used for monitoring vital public health and environmental factors?
Experts will also need to consider the challenge and cost of ensuring accuracy when dealing with biological and microbial samples. One can imagine an IoLT node monitoring for Ebola virus and sending a positive signal, which, if not substantiated, could cause panic. The potential of monitoring for biological threats is enormous, but methods to validate data and address personal and collective liability issues are needed.
What is more troubling as we slowly enter the age of ubiquitous genomic sequencing is that we face an increasing socioeconomic disparity between the technological elites (Silicon Valley or the new Shenzhen tech Eldorado) and the majority of citizens, the ones who provide the data. While I have no hope that this gap will soon be closed, the next decade will tell us whether the new bio-citizen exists only in our imagination.