Reference genome of the shea tree (Vitellaria paradoxa), a tool for predictive breeding – Interview with scientists!

Posted by

FTA communications

If you live in the Global North, chances are the word ‘shea’ does not ring many bells.

Best known outside of Africa for shea butter, which is widely used to make chocolate as well as many cosmetics, the shea tree (Vitellaria paradoxa) is an evergreen flowering tree found across the African continent, from Senegal in the west to the foothills of the Ethiopian highlands in the east.

It also provides millions of households with a highly nutritious cooking oil, as well as a vital source of income, with its supply chain almost entirely controlled by women. But shea tree numbers have been declining for decades due to land use conflicts, putting many of these livelihoods in jeopardy.

FTA has conducted genomics research since its founding, as part of Flagship 1 (Tree genetic resources) and several research priorities, including restoration, plantations and tree crop commodities, and enhanced nutrition and food security.

A recent study, published in Frontiers in Plant Science, explores new methods to reverse the decline of shea tree populations by improving the species through the use of genomics.

FTA spoke with two of the paper’s authors: Iago Hale, Associate Professor in Agriculture, Nutrition, and Food Systems at the University of New Hampshire and the paper’s lead author, and Prasad Hendre, a genomics scientist at World Agroforestry (ICRAF).

This interview has been edited for length and clarity.


Why have shea tree numbers been declining?

IH: The main factors have to do with land use change. The shea tree is a dominant species in the parkland ecosystems where it occurs. The problem is that these are very stressful environments: it’s hot, there are long periods of drought, and so these trees take decades to mature.

There’s a substantial opportunity cost in letting a tree grow slowly over decades, especially when there are competing land uses that would pay off much more quickly.

One example is the conversion of parklands to agriculture, for crops like mangoes and cashews, which happen to be largely controlled by men. So, not only does it threaten shea tree numbers, but it also potentially undermines economic opportunities for women.

Another threat lies in the fact that shea wood makes really good charcoal. So, if you’re a farmer, do you wait 20 years in the hope that a volunteer shea tree of unknown quality will eventually produce nuts, or do you cut it down and immediately get some income from turning it into charcoal? In situations of unclear land tenure, the calculus favoring short-term returns becomes even stronger.

PH: Most shea trees occur in the wild and are not planted by humans. As new volunteer seedlings establish themselves, they’re immediately cut because there are better uses for the land. Ultimately, there are much shorter-term economic opportunities that shea simply can’t compete with. So, the motivation behind this project is: how do you improve the performance of shea in the landscape so that it can actually compete?

Your project aims to tackle these challenges by developing improved varieties of shea using genomic analysis. What exactly is genomic analysis, and what are its potential benefits? 

PH: With genomic analysis, we are trying to understand how the traits we see – the phenotype – are linked to or determined by an individual tree’s DNA. There are regions in the genome that control how genes behave and thus directly control these traits – for example, butter quality, butter yield, the number of nuts, and the number of fruits produced by each plant.

And just to be very clear, genetically improved varieties are not genetically modified. We are not talking about taking a gene from one plant and putting it into another. This is a traditional breeding method, but using modern, faster and more efficient tools.

IH: There are countless examples of crops that have been radically improved without the use of genomics: we’ve had centuries of crop improvement through traditional approaches like cross-breeding. You can do that for crops that have very short generation times, but with the shea tree, we are unable to improve it following traditional approaches simply because of the time taken for it to mature.

Where genomics comes in is by assembling collections of diversity of shea and examining their phenotypes. By genetically “fingerprinting” those trees, we can start to make associations between important traits and the underlying DNA. Now, we can walk up to a young seedling, bring it into a lab to look at its DNA, and assess its potential – without having to wait 15 years.

PH: We call it predictive breeding. Genome analysis gives us a tool to predict the performance of an individual even before it is sown in the field.

IH: With predictive breeding tools, we have a way of supporting rational decision making: which seedlings to keep, which ones to cull. We envision a future in which a farmer will be able to say, ‘Actually, I’m going to keep that tree over there because it’s likely to produce significantly more fruit than that one over there.’

These genetic tools also provide a much more accurate way to assess genetic diversity in the landscape because they enable you to be much more strategic in sampling populations.

The idea of a “genetic fingerprint” is fascinating, but what does it actually mean?

IH: When we talk about “genetic fingerprinting” what we mean is using DNA sequencing to “see” what versions of shea’s naturally-occurring genes any given tree possesses. Some of these versions, or alleles, of genes are desirable and some are undesirable, from a production standpoint. Given a nice biodiverse collection of mature shea trees to work with, genetic sequencing and its underlying analytical methods help us see the potential in new seedlings before they mature.

Is there any analogy we could provide to exemplify this type of work better?

IH: Yes! Let’s say you’d like to visit a certain city you’ve never been to before and not spend countless hours driving in random directions in the hope of landing there. For this task, you need a map. Such a map, coupled with landmarks and road signs, is the tool needed to efficiently navigate and reach your destination. In a similar way, if you want to ascertain if a certain tree carries natural versions of genes that are desirable for end uses, you need a reference genome (the map) and genetic markers (the landmarks or signposts). By creating a reference genome for the species Vitellaria paradoxa, the shea tree, we have developed and made available a navigation tool for use by the whole shea improvement community.

How is this technically done?

IH: We extract DNA from plant tissue (usually young leaves) and sequence it. Although DNA exists as very long, coherent molecules in living cells, it gets all chopped up into very small fragments during extraction. So the sequences we obtain are more like pieces of a labyrinthine puzzle that we then have to assemble using computational techniques that fall under the discipline of bioinformatics. Once the pieces are assembled as best we can into the long chromosomes found in living cells, we have the so-called “reference genome”. Further work with the sequencing of expressed genes (mRNA) allows us to annotate the genome, essentially identifying and locating the genes themselves.

In total, this annotated genome would be the map in the analogy before.
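To make the puzzle image concrete, here is a toy sketch of assembling overlapping fragments, written in Python. It is only an illustration and not the method used in the study: real genome assemblers rely on graph-based algorithms and must cope with sequencing errors, repeats and both DNA strands. The function names and the example sequence are invented for this sketch.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 when the overlap is shorter than `min_len`.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len: int = 3):
    """Repeatedly merge the pair of fragments with the largest overlap.

    Toy greedy strategy (an assumption of this sketch, not the study's
    pipeline). Returns the list of sequences left when no overlaps remain.
    """
    # Drop exact duplicates and fragments fully contained in another one.
    frags = list(dict.fromkeys(fragments))
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]

    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to join
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Invented example: three overlapping reads from one short sequence.
reads = ["CGTTAGC", "ATGGCGTA", "GCGTACGTT"]
print(greedy_assemble(reads))
```

With these three made-up reads, the pieces join into a single sequence. In a real genome, the same idea has to scale to billions of fragments, which is why dedicated bioinformatics tools are needed.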

Sticking with the analogy, once you have an accurate map, once you know the full lay of the land, you then have the ability to navigate from one place to another with only a very small subset of information on that map. A simple set of directions: turn left, go 5 miles, turn right, you’ve arrived!

PH: That’s right. In a genome, so-called “molecular markers” serve as the very abridged signposts in the larger map. We may find that only 2 or 3 regions of the genome explain the lion’s share of difference in shea butter quality among trees. Having the full map allows us to identify those regions, “see” their status with a few strategically chosen markers, and then characterize other trees (e.g., trees in a farmer’s field) with just those few markers. In other words, the analytical burden when applying the information in practice is a mere fraction of the analytical burden needed when creating the map in the first place.
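The idea of picking out a few informative markers can be sketched in a few lines of Python. This is a deliberately naive illustration, not the analysis from the paper: real association studies use statistical models, far more trees, and corrections for population structure. The marker names (m1 to m4), tree IDs and quality scores below are all made up.

```python
from statistics import mean


def marker_effects(genotypes, trait):
    """Difference in mean trait value between trees with and without an allele.

    genotypes: {tree_id: {marker: 0 or 1}}  (1 = tree carries the allele)
    trait:     {tree_id: measured value on mature trees}
    """
    markers = sorted({m for g in genotypes.values() for m in g})
    effects = {}
    for m in markers:
        carriers = [trait[t] for t, g in genotypes.items() if g.get(m) == 1]
        others = [trait[t] for t, g in genotypes.items() if g.get(m) == 0]
        if carriers and others:  # need both groups to compare
            effects[m] = mean(carriers) - mean(others)
    return effects


def top_markers(effects, n):
    """The n markers with the largest absolute effect."""
    return sorted(effects, key=lambda m: abs(effects[m]), reverse=True)[:n]


def favorable_count(tree, effects, selected):
    """How many selected markers are in the favorable state for this tree."""
    return sum(
        1 for m in selected
        if tree.get(m) == (1 if effects[m] > 0 else 0)
    )


# Hypothetical mature trees with a made-up butter-quality score.
genotypes = {
    "T1": {"m1": 1, "m2": 0, "m3": 1, "m4": 0},
    "T2": {"m1": 1, "m2": 1, "m3": 1, "m4": 0},
    "T3": {"m1": 1, "m2": 0, "m3": 0, "m4": 1},
    "T4": {"m1": 0, "m2": 1, "m3": 1, "m4": 1},
    "T5": {"m1": 0, "m2": 0, "m3": 0, "m4": 0},
    "T6": {"m1": 0, "m2": 1, "m3": 0, "m4": 1},
}
quality = {"T1": 9, "T2": 8, "T3": 9, "T4": 4, "T5": 5, "T6": 3}

effects = marker_effects(genotypes, quality)
selected = top_markers(effects, 2)

# A young seedling only needs to be checked at the selected markers.
seedling = {"m1": 1, "m2": 0}
print(selected, favorable_count(seedling, effects, selected))
```

The expensive step is building the map and finding the markers on mature trees. Afterwards, a new tree is characterized with just the handful of markers in `selected`.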

So the “tool” generated by this study is the roadmap of the shea genome!

IH: Exactly. Only with this map in hand can we begin characterizing the genetic makeup of any given shea tree. Thus our reference genome is the thing that enables the genetic fingerprinting and predictive breeding mentioned earlier.

You use existing, mature trees to understand the genetic underpinnings of traits of interest. Once this is done, you can select for improved trees, whether in a breeding program or naturally occurring in the landscape. And this selection is possible because it is based on their genetic make-up (something that can be assessed at the time of germination) rather than their phenotype (something you have to wait decades for).

This opens up an opportunity to breed a species that takes a very long time to mature – and to invest in these improved varieties to address urgent challenges like land use change, climate change, and so on.

So then, why has there been so little work on developing genetically improved shea varieties so far – and what can we do about it?

IH: One main reason is the generation time. By and large, there haven’t been the right incentives for long-term investment in this species. But again, with genomics, the hope is to be able to accelerate the time frame in which we can develop the species so that it becomes viable to work on.

PH: There’s also another reason: the donor angle. It’s difficult to attract a donor – an institution, organization or funding agency – that’s ready to invest over a long-term period. Who is ready to invest for that long without seeing the outcome? Unfortunately, there are few donors who want to invest in trees.

IH: This is where shea has a lot of opportunity because there’s a robust export market and economy around it. In the chocolate industry, for example, there is vested private interest in realizing improvement with the shea tree. So, I think there’s an opportunity with public-private partnerships where it isn’t just donors and foundations that are interested, but there’s also business interest and profits to be made from seeing this tree improved.

What are the main implications for local communities in the Sahel, now that the shea tree reference genome is available?

IH: I think it’s premature to talk about impact right now. Ultimately, we’ve created a tool, and it’s really going to fall on national programs throughout the Shea Belt and institutions like CIFOR-ICRAF and other partners in the region to use it.

We put the tool into a bit of preliminary use, looking at shea butter quality and doing some initial positing of candidate genes that probably play an important role in that characteristic. The results were promising. We’re quite confident in the quality of the tool that we’ve created, and we’re seeing good insights into how we may be able to use it to select the traits of interest.

PH: We also have the African Plant Breeding Academy to ensure that there’s a critical mass of early- and mid-career plant breeders working in African institutes who have been empowered to use these tools in their breeding programs.

But above this formal training, we need to be creative and think of innovative ways that have not been taught. It has to involve not just breeders and genomics; it has to happen in the farmers’ fields. That’s something that we’re working on at the moment: how can we incorporate everything and work with farmers in a participatory way?

IH: Those are the folks who are really on the front lines and who are going to see the implementation and impact of these tools. Although I’m proud of the work and I’m happy it’s published, if it is not taken on by the farmers and breeders in the field, it will ultimately have no impact.

The immediate follow up question then is: how transferable is this technology to farmers?

IH: It’s not transferable in the sense of farmers making direct use of the reference genome, looking at gene sequencing and pounding out bioinformatics programming on farm. But it is transferable in the sense that it enables the application of these methods (targeted genetic characterization) in an efficient way to tree populations of interest, whether they are growing on farm or on a research station. We just need to link the people to the technology.

Bear in mind that farmers are already relying on organizations such as FTA and CIFOR-ICRAF to provide them with “plus” materials, so this work is part of an ongoing collaboration. As much as we are sensitive to the disruptions new technologies can have vis-à-vis power and agency of practitioners, I believe it is fair to view our work here as simply providing more information (a better lens) by which to select promising material.

Think about it: at the moment, farmers are selecting trees blindly. For those who actively plant seedlings, genome-enabled breeding and selection is a revolution: it gives them predictive power!

So how can a farmer in the field actually access this technology?

PH: In practice an end-user (farmer, breeder, etc.) would send a small tissue sample (e.g. a hole punch from a leaf) to a central lab and receive the marker data for that sample within 4-6 weeks. This is orders of magnitude faster than waiting for a tree to reach reproductive maturity. Such data could then be used to support decision making around which trees to keep and which to cull. For programs that work to grow and distribute shea seedlings to growers, a strategic marker screen could be used to select against those seedlings with the lowest predicted potential, thereby realizing a net gain in the landscape.
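As a rough illustration of such a marker screen, the Python sketch below ranks seedlings by a predicted score and sets aside the lowest-scoring fraction. The scores, seedling IDs and the 25% cutoff are assumptions made for this example. How predicted potential is actually computed, and what cutoff a program would choose, are not specified in the interview.

```python
def screen_seedlings(predicted, cull_fraction=0.25):
    """Split seedlings into (keep, cull) by predicted potential.

    predicted:     {seedling_id: predicted score, higher is better}
    cull_fraction: share of seedlings with the lowest scores to set aside
                   (a hypothetical parameter for this sketch)
    """
    if not 0 <= cull_fraction <= 1:
        raise ValueError("cull_fraction must be between 0 and 1")
    ranked = sorted(predicted, key=predicted.get)  # lowest score first
    n_cull = int(len(ranked) * cull_fraction)
    return ranked[n_cull:], ranked[:n_cull]


# Made-up marker-based scores for four nursery seedlings.
scores = {"S1": 2, "S2": 0, "S3": 1, "S4": 2}
keep, cull = screen_seedlings(scores, cull_fraction=0.25)
print("keep:", keep)
print("cull:", cull)
```

Selecting against only the weakest candidates, rather than keeping only the best, leaves most of the diversity in the nursery while still shifting the average upward.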

So, the opportunity is now available, but it is critical for us to get the word out and provide the support so that it can be applied and implemented at scale.

Revised by Ming Chun Tang and Fabio Ricci.

This article was produced by the CGIAR Research Program on Forests, Trees and Agroforestry (FTA). FTA is the world’s largest research for development program to enhance the role of forests, trees and agroforestry in sustainable development and food security and to address climate change. CIFOR leads FTA in partnership with ICRAF, The Alliance of Bioversity and CIAT, CATIE, CIRAD, INBAR and TBI. FTA’s work is supported by the CGIAR Trust Fund.

