Tweeting crops

Jack Grieve is a computational linguist at Aston University in Birmingham, England. I came across him on Twitter, where he occasionally posts fun maps showing the geographic distributions (usually within the USA) of different words, usually dialectal variants, based on their appearance in geocoded tweets. He very kindly ran a couple of crop names through his magic box for us, and this is what he got. I wanted to know if the distribution of crops could be inferred from where people tweet about them more than the average. I’ve placed his map for each crop side by side with the relevant distribution map from USDA.

[Figure: tweet maps for each crop side by side with the USDA distribution maps, USA]

Not a perfect match by any means, but not too bad. Except for cotton, that is. Any ideas why people should be tweeting so much about cotton in the northern Great Plains? They’re certainly not growing it.
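For the curious, the "more than the average" idea can be sketched in a few lines of Python. This is a toy illustration only, not Jack's actual method, which isn't described here. The function name `relative_rates`, the (region, text) input format and the example tweets are all made up for the purpose.

```python
from collections import Counter


def relative_rates(tweets, word):
    """Return, for each region, how over- or under-represented `word` is.

    `tweets` is an iterable of (region, text) pairs (a hypothetical format).
    A value above 1 means the region mentions the word more often than
    the dataset as a whole; below 1 means less often.
    """
    totals = Counter()  # tweets per region
    hits = Counter()    # tweets per region that mention the word
    for region, text in tweets:
        totals[region] += 1
        # Naive whitespace tokenisation: "corn," would not match "corn".
        if word.lower() in text.lower().split():
            hits[region] += 1
    if not totals:
        return {}
    overall = sum(hits.values()) / sum(totals.values())
    if overall == 0:
        return {region: 0.0 for region in totals}
    return {
        region: (hits[region] / totals[region]) / overall
        for region in totals
    }


# Made-up example data, purely for illustration.
toy_tweets = [
    ("Iowa", "Corn harvest today"),
    ("Iowa", "corn prices up"),
    ("Iowa", "nice weather"),
    ("Texas", "cotton fields"),
    ("Texas", "big game tonight"),
    ("Texas", "corn dog"),
]
rates = relative_rates(toy_tweets, "corn")
```

Dividing by the overall rate is what keeps a region from lighting up simply because lots of people tweet there. It does nothing, of course, about people who tweet about cotton for reasons unrelated to growing it.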

Jack’s dataset apparently only covers the US and the UK at the moment, which means I can’t check whether Kenyans, say, are tweeting about maize particularly assiduously where they’re growing it, or indeed about maize lethal necrosis where they’re worried about it. Google famously tried to predict flu outbreaks from search patterns, but that seems to have fizzled out. Could tweeting trends help pinpoint crops (or livestock?) and their pests and diseases in space and time? I don’t see much of that kind of thing in the discussion of ICTs in agriculture.

Featured: Data, data everywhere…

Cindy Cox of HarvestChoice sort of, kind of, maybe agrees with some parts of what we say in a recent post about the usability of data.

Although a new generation of development practitioners and analysts are increasingly taking advantage of so-called alternative data sources like we used in our paper, such resources are still underutilized in socio-economics. And without a story, data are meaningless.

Sure, and who has those stories? I suspect it’s not the people who can do the analysis.

Brainfood: Animal genomics, Konjac diversity, New wild cassava, New wild cowpeas, Saline breeding, Land sparing, Sorghum diversity

Well, since we’re calling for paradigm shifts…

There’s a lot that’s both nice and deliciously ironic about IFPRI’s recent blog post “Granular socioeconomic data are increasingly becoming available in agricultural research.” This summarizes a letter to Nature Climate Change from HarvestChoice scientists which adds some nuance to a previous commentary in that journal calling on socioeconomists to up their data game. The point is a good one, and it’s stated right up front in the post, quoting the letter:

Spatially explicit, harmonized socio-economic data products are increasingly available to the public, such as population and poverty grids, microdata derived from national household surveys, and rasterized sociodemographic indicators. While these products are often overlooked in the economic literature, they are well suited to the study of climate’s impact on human geography across scales.

But, then comes the irony.

First, both the letter and the original article, helpfully linked to in the blog post, are of course behind paywalls. Second, the map included in the post is provided with an incorrect caption. There’s a screen grab here on the left. As you can see, the caption suggests that the map illustrates that childhood wasting is more prevalent in the drier areas of sub-Saharan Africa. But the map shows no such thing, as a glance at the legend, or indeed the map in the original letter, will prove. What the map caption should actually be is “Subnational Demographic and Health Surveys (DHS) data showing centroids of DHS clusters overlaid on Agro-ecological Zones.” Not quite so catchy. The map showing the relationship between wasting and agroecological zones is this one, and it’s in the Supplementary Materials to the letter. (Which, bizarrely, seem to be freely available.) I hope I don’t get into trouble for reproducing it here, but it is pretty cool.

[Figure: map from the letter’s Supplementary Materials showing the relationship between wasting and agroecological zones]

And thirdly, and most importantly, frankly neither the socioeconomic datasets nor the agroecological map which the HarvestChoice researchers cleverly mashed up to make their point about data availability are exactly easy for the average non-GIS geek to use, let alone to combine. Try it and see.

The original paper calls for a “new paradigm in data gathering.” The blog post echoes the follow-up letter in saying “the paradigm shift is alive and kicking already.” Oh good. But I’d like to be able to look at the distribution of stunting and other nutritional indicators together with the distribution of different crops and varieties without having to beg a GIS person to do it for me, or spending half a day putzing around trying to understand what this means:

Data layers are available in comma-separated values format (.csv) suitable for MSExcel, in ESRI ASCII Raster (.asc) and GeoTIFF formats (.tif) suitable for any desktop GIS tool. To view ASCII or GeoTIFF rasters in ArcMap or QuantumGIS simply drag and drop the downloaded files onto the layer pane.
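For what it’s worth, the .asc format itself is just text: a few header lines (`ncols`, `nrows`, `xllcorner`, `yllcorner`, `cellsize` and an optional `NODATA_value`) followed by the cell values, top row first. Here is a minimal Python sketch of reading one without any GIS software. The function names and the tiny sample grid are invented for illustration, and the sketch assumes the `xllcorner`/`yllcorner` variant of the header rather than `xllcenter`/`yllcenter`.

```python
def read_esri_ascii(text):
    """Parse the contents of an ESRI ASCII raster (.asc) file.

    Returns (header, grid): header is a dict with lower-cased keys,
    grid is a list of rows (top row first), with NODATA cells as None.
    """
    tokens = text.split()
    header = {}
    i = 0
    # Header entries are keyword/value pairs; data values are numbers.
    while i < len(tokens) and tokens[i][0].isalpha():
        header[tokens[i].lower()] = float(tokens[i + 1])
        i += 2
    ncols = int(header["ncols"])
    nodata = header.get("nodata_value")
    values = [float(t) for t in tokens[i:]]
    values = [None if v == nodata else v for v in values]
    grid = [values[r:r + ncols] for r in range(0, len(values), ncols)]
    return header, grid


def cell_center(header, row, col):
    """Return the (x, y) centre of a cell; row 0 is the top (northern) row.

    Assumes the header uses xllcorner/yllcorner (lower-left corner).
    """
    size = header["cellsize"]
    x = header["xllcorner"] + (col + 0.5) * size
    y = header["yllcorner"] + (int(header["nrows"]) - row - 0.5) * size
    return x, y


# Invented 3 x 2 sample grid, purely for illustration.
sample = """ncols 3
nrows 2
xllcorner 10.0
yllcorner -5.0
cellsize 0.5
NODATA_value -9999
1 2 -9999
4 5 6
"""
header, grid = read_esri_ascii(sample)
```

That gets you numbers and coordinates. It does not get you a map, or a join with crop distributions, which is rather the point.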

Sure, have your paradigm shift in data gathering. But can I also have one in usability, please?