Jack Grieve is a computational linguist at Aston University in Birmingham, England. I came across him on Twitter, where he occasionally posts fun maps showing the geographic distributions (usually within the USA) of different words, usually dialectal variants, based on their appearance in geocoded tweets. He very kindly ran a couple of crop names through his magic box for us, and this is what he got. I wanted to know if the distribution of crops could be inferred from where people tweet about them more than average. I’ve placed his map for each crop side by side with the relevant distribution map from USDA.
Not a perfect match by any means, but not too bad. Except for cotton, that is. Any ideas why people should be tweeting so much about cotton in the northern Great Plains? They’re certainly not growing it.
Jack’s dataset apparently only covers the US and the UK at the moment, which means I can’t check whether Kenyans, say, are tweeting about maize particularly assiduously where they’re growing it, or indeed about maize lethal necrosis where they’re worried about it. Google famously tried to predict flu outbreaks from search patterns, but that seems to have fizzled out. Could tweeting trends help pinpoint crops (or livestock?) and their pests and diseases in space and time? I don’t see much of that kind of thing in the discussion of ICTs in agriculture.
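For anyone who wants to play with the "more than the average" idea, here is a minimal sketch of one way to score it. I don't know how Jack actually builds his maps, so this is an assumption on my part: divide each region's share of tweets mentioning a word by the overall share, and treat anything above 1 as above average. The function name, the region labels and the counts are all invented for illustration.

```python
def relative_rates(mentions, totals):
    """Return each region's mention rate divided by the overall mention rate.

    mentions: region -> number of tweets containing the word
    totals:   region -> number of all tweets from that region
    A ratio above 1.0 means the word is tweeted more than average there.
    """
    overall = sum(mentions.values()) / sum(totals.values())
    return {
        region: (mentions.get(region, 0) / total) / overall
        for region, total in totals.items()
        if total > 0  # skip regions with no tweets at all
    }


# Invented toy counts, not real data.
mentions = {"A": 30, "B": 10, "C": 10}
totals = {"A": 1000, "B": 1000, "C": 2000}

ratios = relative_rates(mentions, totals)
above_average = sorted(r for r, v in ratios.items() if v > 1.0)
print(above_average)
```

Dividing by each region's total tweets matters: without it, the map would mostly show where people tweet a lot, not where they tweet about the crop.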
A couple of things spring to mind, though I imagine that they are already taken into account.
1 What is the context of the words used in the tweet? Is there any other way “cotton” may be commonly used in these areas? Tweeting about the Cotton Bowl in college football, for example?
2 How does geocoded data deal with the fact that a lot of people now use VPNs? If a VPN is masking your geolocation, could it be that you are picking up the location of the VPN’s IP address instead? Maybe the geolocation tracking is smarter than that, and maybe VPNs don’t have much influence on mobile data usage, but I do wonder….
I don’t think there is enough tweeting about pests and diseases, along with location information, to get anything useful in developing countries. I would be very interested to know if anyone has attempted such an analysis, though. In the past we have looked into GDELT, but again we did not find enough discussion on social media of the crops and pests we are interested in.
Examining the “tweet distribution” on corn, it looks like the South Dakotans, Nebraskans and Kansans are disproportionately interested in corn… is it because they are becoming more “corny” than ever, as a result of climate change? Of course, at U of Nebraska, they are the “Cornhuskers”, so maybe that accounts for the corn talk….
Perhaps “cotton” is a euphemism for something else???