A brief history of gap analysis for crop diversity conservation

Many thanks to long-time friend-of-the-blog Dr Colin Khoury for this latest contribution.

Conservation gap analysis using Geographic Information System (GIS) tools relies on several sources of biological and environmental data, including in situ species occurrences and climatic and other environmental variables used to conduct species distribution modeling, as well as passport data from ex situ collections. While species distribution modeling and associated methods had been in development since at least the 1970s (see Rebelo, 1994 and Booth et al., 2013), the widespread use of these tools was not possible until such biological and environmental data were more easily and widely accessible, for example through GBIF, WorldClim, and Genesys.

Genebank scientists, often in collaboration with academic researchers, began to apply available GIS-based tools to PGRFA conservation around the turn of the century, proceeding to develop new methods, software, and datasets (for early examples, see Guarino, 1995; Greene and Guarino, 1999; Guarino et al., 2002). Global climate datasets were compiled at relatively high spatial resolution (e.g., Hijmans et al., 2005), providing key inputs for species distribution modeling. Current distribution models for plant genetic resources began to be calculated, for example for wild relatives of potatoes and peanuts, while future distributions under climate change also began to be modeled, for example for wild peanuts, potatoes, and cowpeas. Field collecting was informed through these tools, for example for wild clover and wild chile pepper expeditions.

The focus on wild relatives of food and agricultural crops was not haphazard. These species were receiving increasing conservation attention at the time in recognition of their value as genetic resources for crop breeding, and because many wild relatives were known to be threatened in their natural habitats and were underrepresented in ex situ repositories. International conservation targets for crop wild relatives had been set at the Convention on Biological Diversity (CBD) (for 2011 to 2020 and again for 2020 to 2030) and in the United Nations Sustainable Development Goals (SDGs) (for 2015 to 2030). At the same time, species distribution modeling methods had primarily been developed for wild species, i.e. taxa whose distributions are mainly driven by climatic, edaphic, and other environmental factors rather than by human preferences (which are more difficult to model). Applying these methods to crop wild relatives was therefore relatively straightforward, and a logical starting point for the agricultural research community.

Programs such as DIVA-GIS and FloraMap were created to make the methods more accessible to researchers and practitioners without extensive GIS experience or computing power. Such efforts continue, for example through CAPFITOGEN.

Through an international genebank initiative called the Global Public Goods Project II, which ran from 2007 to 2010, the distributions of the wild relatives of ten CGIAR mandate crops were mapped and priorities for further collecting for ex situ conservation were identified. A major milestone of that project was the publication of a standardized, replicable gap analysis methodology for the ex situ conservation of crop wild relatives. It made use of herbarium and other biodiversity observations acquired through GBIF and other sources, as well as genebank passport data, and embraced recent advancements in species distribution modeling methods.


Himalayan maize: The saga continues

I decided to dig a little deeper into the climatic adaptation of Himalayan maize. You may remember from my last post on this that Genesys has 96 maize accessions from over 2000 masl in the Himalayas, collected at some 50-odd unique localities. When I ran these accessions through the Subsetting Tool in Genesys, I got the following histogram.

What struck me, and surprised me, was the spike of sites way at the left-hand end of the precipitation plot. So I took a closer look at the results of the subsetting analysis. And the clustering algorithm it uses to look for similar sites did in fact identify two climatically quite different groups of locations: 45 of the unique high-altitude maize collecting sites (the blue ones) are indeed drier than the other 7 (in orange).

Much drier. (And also colder actually, but that’s another story.)

They’re the ones mainly collected in Pakistan and Afghanistan.

Now, I don’t know whether these areas really get 135 mm of annual precipitation, which seems really low, and in any case the agriculture there is clearly irrigated.

But those maize samples do seem to have some unique adaptations. Incidentally, they are now mainly conserved at CGN in the Netherlands, and are the result of something called the 1976 Netherlands-Pakistan Expedition by the Stichting voor Plantenveredeling.

Maize location, location, location…

A quick search on Genesys revealed 302 maize accessions from above 1500 masl in the Himalayas, and 62 above 2500 masl. Of course, there are many more maize accessions from high altitudes in Central and South America, but their photoperiod adaptation (among other things) is likely to be quite different.

That’s from a post I put up here a few days ago. Some people said I should back up that “among other things,” so here goes.

I extracted from Genesys lat/longs for 2,388 maize landrace accessions collected above 2000 masl in the Andes, and for 96 in the Himalayas. I then asked ChatGPT to calculate separate averages for the two sets of accession collecting localities for two climate variables, i.e. mean annual temperature and precipitation. It asked me to supply it with the WorldClim data as a zip file, which I duly did.

It told me the Andean sites had a mean annual temperature of about 12°C and the Himalayan ones of about 6°C. Mean annual precipitation was around 750 mm and 640 mm, respectively. So there could well be some significant overall differences in adaptation between the two sets of germplasm.

But…

I used the coarsest WorldClim dataset, which is probably not a great idea in mountain areas. And many accessions were collected at the same sites: those 96 Himalayan maizes for example come from only 52 distinct places. I should probably have only used unique collecting localities to make the calculations. The “Subsetting Tool” in Genesys does do that, and displays nice histograms, but it doesn’t give you average values for the whole subset. Incidentally, when I looked at the histogram for total precipitation for the Himalayan material, there was a suspiciously big spike way at the dry end. Really not sure what’s happening there.

Maybe some climatologists or geographers or GIS jockeys can explain. And do a better analysis. And come up with a really easy way of extracting climate data for a long list of localities.
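In the meantime, here is a minimal sketch of the two steps discussed above: collapsing accessions to unique collecting localities, then looking up a climate value for each locality and averaging. It is an illustration under stated assumptions, not what ChatGPT or the Genesys Subsetting Tool actually do. The function names, the rounding precision, and the toy grid are all hypothetical. The climate layer is assumed to be a regular lat/long grid already loaded into memory as nested lists; in practice you would read it from a WorldClim GeoTIFF with a raster library.

```python
from statistics import mean

def unique_localities(records, ndigits=4):
    """Collapse accessions sharing the same rounded (lat, lon) into one site.

    ndigits is an assumed precision; 4 decimal places is roughly 10 m.
    """
    seen = set()
    sites = []
    for lat, lon in records:
        key = (round(lat, ndigits), round(lon, ndigits))
        if key not in seen:
            seen.add(key)
            sites.append(key)
    return sites

def sample_grid(grid, north, west, cell, lat, lon):
    """Return the value of the grid cell containing (lat, lon).

    grid[0] is the northernmost row and each row runs west to east;
    (north, west) is the top-left corner and cell is the cell size in degrees.
    Returns None if the point falls outside the grid.
    """
    row = int((north - lat) // cell)
    col = int((lon - west) // cell)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None

def mean_at_sites(grid, north, west, cell, records):
    """Average a climate variable over unique localities only."""
    values = [
        sample_grid(grid, north, west, cell, lat, lon)
        for lat, lon in unique_localities(records)
    ]
    values = [v for v in values if v is not None]
    return mean(values) if values else None

# Toy 2 x 2 grid with made-up values, purely to show the mechanics.
toy_grid = [[10, 20],
            [30, 40]]
# Three accessions from one site and one from another: two unique localities.
toy_records = [(35.5, 70.5), (35.5, 70.5), (35.5, 70.5), (34.5, 71.5)]
site_mean = mean_at_sites(toy_grid, north=36.0, west=70.0, cell=1.0,
                          records=toy_records)
```

With the toy inputs, averaging over the two unique sites weights each place once, whereas averaging over all four accessions would count the first site three times. That is exactly the bias to watch for when many accessions come from the same place.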

Stairway to maize diversity

There’s a nice article in Rising Kashmir highlighting that region’s cold-tolerant maize landraces as a unique source of genetic diversity. What I liked about it is that it doesn’t condescend to its audience. It’s unapologetically technical and niche, while successfully (I think) striving to be understood by all. That’s rare. The author, Dr Salika Ramazan, argues that long adaptation to Himalayan environments has produced valuable traits for climate resilience and future maize breeding, and advocates for urgent conservation before this irreplaceable diversity is lost.

A quick search on Genesys revealed 302 maize accessions from above 1500 masl in the Himalayas (yellow on the map below), and 62 above 2500 masl (red). Of course, there are many more maize accessions from high altitudes in Central and South America, but their photoperiod adaptation (among other things) is likely to be quite different.

Distribution of high-altitude maize accessions in the Himalayas (from Genesys).

Brainfood: Spatial data edition