Tuesday, January 19, 2016

Meaningless Data Viz

This Google Trends data visualization is horrible. It does indeed show "top searched candidate by state", I would guess, but that doesn't at all mean what the map implies it means -- that is, positive popularity of that candidate and also a lead over the other candidates. It doesn't even come close to showing that.



The data underlying this map could be any one of these completely different scenarios, using just the first three listed candidates to show the problem:

Some Example Possibilities

Candidate    State A    State B      State C
1. Trump     1          1,000,000    1,000,000
2. Cruz      0          0            999,999
3. Rubio     0          0            999,999

The order of the candidates in the image may be from the data, or it may be from polls, or it may be something else; we don't know.

In theoretical State A, Trump does lead, but the lead is meaningless because almost no one is searching.

In theoretical State B, Trump leads in a statistically meaningful manner, and people are searching (but we don't know exactly on what terms; "Trump liar", "Trump bankruptcy", and "Trump racist" are not endearing search terms).

In theoretical State C, Trump leads, but it's a statistical tie, and lots of people are searching.

Each of these scenarios is massively different, yet they would all result in the same visualization.

There are other numerical combinations; this is just a sample of three.
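To make the point concrete, here is a minimal Python sketch using the theoretical numbers from the three states above. The function names (top_candidate, summarize) are mine, made up for illustration; I have no idea how Google actually builds the map. The sketch only assumes the map colors each state by whichever candidate has the most searches.

```python
# Search counts per candidate, copied from the three theoretical states above.
scenarios = {
    "State A": {"Trump": 1, "Cruz": 0, "Rubio": 0},
    "State B": {"Trump": 1_000_000, "Cruz": 0, "Rubio": 0},
    "State C": {"Trump": 1_000_000, "Cruz": 999_999, "Rubio": 999_999},
}


def top_candidate(counts):
    """Return the candidate with the most searches (assumed to be all the map shows)."""
    return max(counts, key=counts.get)


def summarize(counts):
    """Return (total searches, leader's share of those searches)."""
    total = sum(counts.values())
    leader = top_candidate(counts)
    return total, counts[leader] / total


# What the map would show: one winner per state, nothing else.
map_colors = {state: top_candidate(counts) for state, counts in scenarios.items()}

# What the map hides: how many searches there were and how big the lead is.
summaries = {state: summarize(counts) for state, counts in scenarios.items()}

for state in scenarios:
    total, share = summaries[state]
    print(f"{state}: map shows {map_colors[state]}, "
          f"total searches {total:,}, leader share {share:.1%}")
```

All three states come out as "Trump" on the map, while the totals and the leader's share, which are the numbers that would actually tell you something, differ wildly.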

This visualization also conflates geography with population; that is, it doesn't have any state-level per-capita correction. For this you need, I have learned, a cartogram (I think I've linked to that page before; it's really informative, and here's one for the world with a slightly different approach). It also only considers people who have internet access, who are using Google, and who are actively searching during the debate. That leaves out lots of people.
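As a rough illustration of what a per-capita correction would do, here is a small Python sketch. The two states and all of their numbers are hypothetical, invented purely for the arithmetic; they are not real search or census figures.

```python
# Hypothetical numbers for two made-up states; not real search or census data.
states = {
    "Small State": {"searches": 50_000, "population": 500_000},
    "Large State": {"searches": 100_000, "population": 10_000_000},
}


def searches_per_1000(searches, population):
    """Searches per 1,000 residents."""
    return 1000 * searches / population


rates = {name: searches_per_1000(**data) for name, data in states.items()}

for name, rate in rates.items():
    print(f"{name}: {states[name]['searches']:,} searches, "
          f"{rate:.0f} per 1,000 residents")
```

In raw counts the large state has twice the searches, but per resident the small state is searching ten times as much. A map colored by raw counts (or just by land area) shows none of that.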

It also leaves out anything that isn't a state (such as Puerto Rico), although I assume Washington, DC, is in there (who can tell?). And, as a minor peeve, it makes it look like the top of Minnesota is connected by land (it isn't).

Edit: Apparently, this map is actually from Google, specifically their "Google News Lab", according to one video, which is also where I got this map for the Democrats. It suffers from the exact same problem: