Navigating around germplasm databases can be a frustrating experience. A posting on the CropWildRelativesGroup alerted me to a Science Daily piece on tomato genomics which mentioned the wild relative Lycopersicon pennellii (or Solanum pennellii, but I’m not going there, at least not today). But how many accessions of this species are conserved ex situ? And where is it found in the wild?
OK, so SINGER first, as that’s been much on my mind, and on this blog, of late. SINGER shows 61 accessions of L. pennellii, all from the AVRDC collection. Most of them are from Peru, although 7 accessions have USA, Mexico, Poland (?) or “unknown” as source country. None of these accessions seem to have geo-references, so no nice map from SINGER this time. Pity. But SINGER does give very neat summaries for your query results.
GRIN returns 51 accessions. I can’t find any easy way of working out the duplication between these and the AVRDC material, but I imagine it is significant. Again, most of the accessions are from Peru, but it’s kind of difficult to get summary information across all accessions in GRIN at the moment, though I know they are working on this. Now, tomato germplasm is conserved at the C.M. Rick Tomato Genetic Resources Center (GRIN tells you so), and they have a database of their own. Querying it results in 45 hits, but again there’s no easy way I can see of looking at summary information across all of them. You have to look at each individual accession in turn to find out where it’s from, and if you do you get a little map too. The thing I don’t quite understand is why the accessions are geo-referenced in the Tomato Genetic Resources Center database, but not in GRIN. Maybe they’re upgrading the data gradually at the Center and haven’t passed the latest version on to GRIN? That may also explain why the two databases return different numbers of accessions (45 versus 51). It looks like they’re working on the geo-spatial part of the database, and it may well be possible to get a map of all the accessions of a particular species eventually.
You can of course do that in GBIF right now, but GBIF only has 8 geo-referenced L. pennellii records: from the Missouri Botanical Garden, the Dutch genebank and the European germplasm database, EURISCO. Too bad the Tomato Genetic Resources Center is not a GBIF data provider. And, indeed, that its geo-reference data is not included in GRIN, which is a GBIF provider.
So the answers to the questions I started with are: at least, and probably not much more than, 112 (61 in SINGER plus 51 in GRIN), but that probably includes duplicates; and Peru. But I cannot produce a decent map of the distribution of L. pennellii online. I would have to mess around and download the data from the Tomato Genetic Resources Center database, and then map it myself. Which I may well do, just to show it can be done. But this little exercise does show that there’s a lot of work to be done to improve the data in existing agrobiodiversity databases, and to integrate them fully.
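For what it’s worth, here is roughly what the “download it and map it myself” step might look like. This is only a minimal sketch in Python, not anything the databases themselves provide: it assumes the accession records have somehow been saved as a CSV file, and the column names (accession, country, latitude, longitude) and the sample rows are invented for illustration. It tallies accessions by source country, which is the summary GRIN doesn’t readily give, and pulls out the geo-referenced records ready for plotting.

```python
import csv
import io
from collections import Counter

# Hypothetical export: these column names and rows are made up for illustration
# and do not come from SINGER, GRIN or the Tomato Genetic Resources Center.
SAMPLE = """accession,country,latitude,longitude
ACC001,Peru,-12.05,-76.95
ACC002,Peru,,
ACC003,Peru,-15.80,-74.30
ACC004,Unknown,,
"""


def load_records(text):
    """Parse CSV text into a list of dicts, one per accession."""
    return list(csv.DictReader(io.StringIO(text)))


def summarise_by_country(records):
    """Count accessions per source country; blank countries become 'Unknown'."""
    return Counter((r.get("country") or "").strip() or "Unknown" for r in records)


def georeferenced_points(records):
    """Return (accession, lat, lon) for records with usable coordinates."""
    points = []
    for r in records:
        try:
            lat = float(r["latitude"])
            lon = float(r["longitude"])
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric coordinates: skip the record
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            points.append((r["accession"], lat, lon))
    return points


def bounding_box(points):
    """Return (min_lat, min_lon, max_lat, max_lon), or None if there are no points."""
    if not points:
        return None
    lats = [p[1] for p in points]
    lons = [p[2] for p in points]
    return (min(lats), min(lons), max(lats), max(lons))


if __name__ == "__main__":
    records = load_records(SAMPLE)
    print(summarise_by_country(records))
    points = georeferenced_points(records)
    print(len(points), "of", len(records), "records are geo-referenced")
    print("Map extent:", bounding_box(points))
```

The list of points could then be handed to whatever mapping tool you prefer. Working out duplication between genebanks is a harder problem, since it depends on shared donor or collector identifiers, which this sketch does not attempt to handle.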