Opinion, arguments & analyses from the editors of Scientific American

Candy-maker releases cacao (cocoa) genome sequence online

The views expressed are those of the author and are not necessarily those of Scientific American.

The makers of M&Ms have decoded an essential recipe for some of their most popular products: the cacao tree’s (Theobroma cacao) genome.

The sequence, posted online September 15 and available at no charge to the public, was assembled by Mars, Inc., in partnership with the U.S. Department of Agriculture’s Agricultural Research Service and IBM.

The cacao or cocoa tree (the seeds of which are used to make cocoa powder and chocolate) joins other widely consumed crops, such as wheat, corn and rice, that have already had their genomes sequenced. Many cacao-producing countries are relatively poor and lack the resources necessary for advanced genetic study of cacao, which is among the top 10 most heavily traded crops in the world, the researchers behind the genome note. The cacao plant is grown on some 17 million acres across the globe, and the largest producer, Côte d’Ivoire, hauled in some 1.3 million metric tons of cacao seeds in 2005.

The researchers are refining the data before submitting it for formal peer review and publication, but the online version is "fully functional," according to Howard-Yana Shapiro, a plant scientist at Mars and adjunct professor at the University of California, Davis, College of Agriculture and Environmental Sciences. The public version of the genome is about 92 percent complete and has pinpointed about 35,000 genes, according to researchers working on the project. Mars’ release comes before that of chocolate competitor Hershey, which had also been working to sequence the cacao tree’s genome (with The Pennsylvania State University and French labs), The New York Times reports.

The Mars scientists concede that mapping the cacao genome "was in our interest," says Juan Carlos Motamayor, a genetic researcher at Mars. But, he notes, the data also could help boost the livelihood of workers who grow and process cacao, many of whom live in poverty and work on small farms.

Cacao breeders will be able to start using the sequence data to select for traits such as yield and hardiness, as well as begin puzzling out plant defenses against cocoa blights, the researchers say. Cacao trees take several years to mature and can produce beans for decades. With hopes that the new information will eventually help producers double or triple their yields, Shapiro notes, the genome could go a long way toward creating "an economic model that is sustainable."

Cacao cultivation has more than doubled since the mid-1980s, but most of that growth has come from increased land use, not higher yields. A tree that produces more seedpods would be "a better use of land," Shapiro says. Sturdier, better-producing crops would also likely sweeten chocolate candy makers’ bottom line by upping supply and lowering prices.

Image of cacao tree (Theobroma cacao) with seedpods courtesy of Wikimedia Commons/Medicaster

4 Comments

  1. hotblack 4:59 pm 09/15/2010

    An interesting move.

    Want a better price from your chocolate suppliers? Give away the genome and hope it spawns more competition.

  2. jtdwyer 5:21 am 09/16/2010

    hotblack – Good catch!
    But according to Juan Carlos Motamayor, a genetic researcher at Mars, "the data also could help boost the livelihood of workers who grow and process cacao, many of whom live in poverty and work on small farms," – just as soon as they start using the sequence data to select for traits such as yield and hardiness…

  3. hotblack 12:24 pm 09/16/2010

    Yeah, I don’t see how this helps the existing growers who are already living in poverty like indentured servants. But the new American business ethic seems to be not to run a good, stable business, but an upwardly profitable one at any cost. Regardless of how badly your suppliers have it, keep applying pressure.

    I had read an article on chocolate production recently, which incidentally pointed out that none of the cacao farmers even knew what their product went on to become. None had ever actually seen or tasted chocolate. When you figure a candy bar would have cost them a week’s wages, you can understand why.

  4. news reader 5:59 pm 09/18/2010

    "Mars Inc. Cacao Genome Database claims Open Access, public domain: falls short"
    This initially looked very promising: Mars, along with a number of collaborators (USDA, IBM, Clemson University Genomics Institute; Public Intellectual Property Resource for Agriculture at the University of California-Davis; National Center for Genome Resources; Center for Genomics and Bioinformatics at Indiana University; HudsonAlpha Institute for Biotechnology; and Washington State University), have sequenced the cacao genome and released it "Open Access" and "public domain" for the benefit of all, at a site called the Cacao Genome Project:
    McLean, VA: Today, Mars, Incorporated, the U.S. Department of Agriculture-Agricultural Research Service (USDA-ARS), and IBM released the preliminary findings of their breakthrough cacao genome sequence and made it available in the public domain.
    - From the Mars Inc. press release 15 September 2010
    A quote from the Independent article on the release (First rice, then wheat now cocoa genome unravelled 15 Sep 2010) from one of the collaborators on the project:
    Professor Shapiro, a molecular biologist, said: "We thought: ‘Let’s put this in the public domain so everyone has free access to it for eternity’. It could be patented and it can’t be now. We have full open access."
    "public domain"

    "full open access"

    As this is data, we could also be talking about Open Data.

    Let’s take a look at how ‘open’ this Cacao Genome Project is by examining the fine print (of the license):

    In order to get access to the data, you have to get an account (no anonymous access; obligatory registration is pretty counter-Open Access and arguably not ‘public domain’). In order to get an account, you have to agree to a license.

    Registration & license

    From the license:
    The Provider is making available the information and data found in the cocoa genome databases for general information purposes for scientific research, germplasm conservation and enhancement such as plant breeding, technical training, general education, academic use, or personal use.
    Restricted use, appearing not to include commercial use. So more of a GPL-ish license as opposed to a BSD-ish license (before anyone calls me out: I am not saying the GPL is NOT commercial, just that it is generally viewed as less commercial-friendly than BSD).

    Moving on:
    Anytime the User consults the data base through the cocoa genome database web site, he/she shall be bound to the same obligations under this IAA. Should the User store the information and data for future use he/she shall be bound to the same obligations under this IAA.

    The User shall not transfer the information referred to in this agreement, or any copy of them, to a third party without obtaining written authorization from the Providers which will only be provided subject to the third party user entering into this same IAA.
    Wow. That is particularly extraordinary. A WTF moment.

    Fortunately I didn’t agree to the license so I AM able to talk about it now.

    Not allowing third parties to see a license is inherently incompatible with the idea of Open Access, Open Source, Open Data and public domain.

    It is simply bizarre in these modern times.

    Moving on:
    The User shall not claim legal ownership over the information and data found in the data base nor seek intellectual property protection under any form over these information, data and data base. For clarity, the user agrees not to claim any of the sequences disclosed in these databases in any patent application.

    Translation: Don’t claim legal ownership, because we own the IP for the data AND the sequences, and (maybe) we will be claiming patents, etc. some time in the future. I have not been able to find anything on the site to the contrary (see ‘Deluded or disingenuous?’ below).

    Moving on:
    However, the foregoing shall not prevent the User from releasing, reproducing or seeking intellectual property protection on improved seeds or plants that may be developed using the information for purposes of making such seeds or plants available to farmers for cultivation.
    This appears to allow commercial use of the database ("make available" can include selling the seeds), which seems to conflict with the earlier clause.

    Clearly, this data set has not been released as Open Access and certainly not released into the public domain.

    Instead of Open Access or public domain, they have a restrictive license, which allows gated access for a restricted set of uses.

    They should therefore not be claiming Open Access or public domain for this data.

    Deluded or disingenuous?
    The "About" page of the Cacao Genome Project claims that the license is in place to defensively block patents of the sequences. While this may be true, claiming an Open Access AND public domain release of the data is either disingenuous or deluded.
    Public access to the genome will be available permanently without
    patent via the Cacao Genome Database. Before viewing the data, users
    have to agree that they will not seek any intellectual property
    protection over the data, including gene sequences contained in the
    database. The Information Access Agreement allows any cacao breeders
    and other researchers to freely use the genome information to develop
    new cacao varieties. This allows for a level playing field and a
    healthy competitive environment that will ultimately benefit the
    sustainability of cacao production in the long term.

    ‘Free’ as in ‘beer’ they should have said.
    posted by Glen Newton at 10:30 AM on 17-Sep-2010

    newsreader said…
    Not only that, but during the first day of the website, it also stated that if you use the data you could not publish any articles with it until some unspecified time in the future. Not that I would ever know what to do with it ;) but someone out there must... so where does that leave them? But it changed the next day, so which version is accurate if you entered on the first day?

    18 September, 2010 17:25
