eBird’s new Histograms
Posted in Uncategorized on December 7th, 2008 by Tom
Today, eBird announced a revision of their histograms. And while it's nice to see them working on improvements, this particular one wasn’t the best. Most notably, they added more bar widths to their scheme. Why more widths? In my book, this makes it harder to discern what “category” a given line width is in. Formerly, there were five line widths. I thought this was suitable. Now there are nine, and I think it's harder to read.
If you look at the new legend, it raises the question, “Why not just use an unclassified, stretched legend?” At this point there is little practical difference. It’s doubtful that anyone is going to get much more information out of a bar width when it’s barely 0.5 mm thicker than its neighbor. On my monitor, the largest category is 4 mm and the smallest just under 0.5 mm. Our visual acuity cannot do much better than that, yet this scaling would seem to indicate that we can. I’m sure psychophysicists applying the Just Noticeable Difference threshold would agree.
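To put a rough number on that, here is a minimal sketch of the arithmetic. It assumes the bar widths are evenly spaced between the smallest and largest widths I measured on my monitor, which eBird's actual scheme may not do; the function name `width_step_mm` is just for illustration.

```python
def width_step_mm(smallest_mm, largest_mm, n_classes):
    """Width difference between adjacent classes.

    Assumes evenly spaced widths (an assumption, not eBird's documented scheme).
    """
    if n_classes < 2:
        raise ValueError("need at least two classes to have a step")
    return (largest_mm - smallest_mm) / (n_classes - 1)


# Widths measured on my monitor: roughly 0.5 mm to 4 mm.
nine_class_step = width_step_mm(0.5, 4.0, 9)
five_class_step = width_step_mm(0.5, 4.0, 5)

print(f"nine classes: {nine_class_step} mm between neighbors")
print(f"five classes: {five_class_step} mm between neighbors")
```

Under that assumption, going from five classes to nine halves the difference a reader has to detect between neighboring bars.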
All of this is especially true considering the fact that there are no explicit categories for these bar widths. They range from “rare” to “widespread.” Anyone with any experience in bird distributions knows that the curve is anything but linear. There are vagrants, accidentals, and casual species. There are residents, migrants, transients, and visitors. Some species are ubiquitous throughout, others stage in large groups.
Why not stick with five categories? Rare. Uncommon. Regular. Common. Widespread. This way matching the histogram to the legend has meaning at both the visual and conceptual levels. This is one case where less is better and more just makes things a mess.
As a cartographer, you’re taught to stick to five categories for sequential quantitative schemes. Otherwise, the differences between the classes become too hard to tell apart and the information in the map gets muddled. This addition of bar chart categories seems like a violation of that principle. ColorBrewer doesn’t let you create color schemes for more than nine classes. I’m sure the same thing should apply to bar widths, considering how small the distances used are in the first place.
The smart move, it seems to me, would be to dual-encode the legend, making value pre-attentive through two visual variables. It would look something like these:
Either of these seems like an improvement over the current eBird version. Dual-encoding should make parsing abundance differences considerably easier. Also, they relate the values on the legend to conceptions of the values. This could be extended further by linking these values directly to checklist abundance numbers. Then they would have the same legend as the maps eBird produces, allowing easy comparison and relation.
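For the curious, the dual-encoding idea can be sketched in a few lines. Everything specific here is my own assumption for illustration: the class breaks, the idea of measuring abundance as the fraction of checklists reporting a species, and the green color ramp are hypothetical, not eBird's values. The widths simply span the 0.5 mm to 4 mm range mentioned above.

```python
from bisect import bisect_right

# Five named classes, each encoded twice: by bar width and by color.
CATEGORIES = ["Rare", "Uncommon", "Regular", "Common", "Widespread"]

# Hypothetical class breaks: a frequency at or above BREAKS[i] falls in
# class i + 1. Frequency is assumed to be the fraction of checklists
# that report the species (0.0 to 1.0).
BREAKS = [0.02, 0.10, 0.30, 0.60]

# First visual variable: bar width in millimeters.
WIDTHS_MM = [0.5, 1.375, 2.25, 3.125, 4.0]

# Second visual variable: an illustrative light-to-dark green ramp.
COLORS = ["#edf8e9", "#bae4b3", "#74c476", "#31a354", "#006d2c"]


def encode(frequency):
    """Return (category, width_mm, color) for a checklist frequency."""
    if not 0.0 <= frequency <= 1.0:
        raise ValueError("frequency must be between 0 and 1")
    i = bisect_right(BREAKS, frequency)
    return CATEGORIES[i], WIDTHS_MM[i], COLORS[i]


for freq in (0.01, 0.25, 0.75):
    print(freq, encode(freq))
```

Because one lookup drives both the width and the color, the same table could serve as the legend for the histograms and the maps alike.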
I’ve never understood why the histogram uses two dimensions to map the same amount. That seems like a waste of digital data-ink space. You could put so much more information on the screen if you cut the bar widths in half in the first place. Then these graphs would edge closer to Tufte’s hallowed micro/macro emergence principle.
Why am I noticing this? Because design matters and people purveying data often forget that. It’s also probably because the people purveying data aren’t trained in designing data. Or at least, they’re not letting data designers handle that aspect. And as a result, using the data becomes more difficult, unwieldy and inefficient.


But the real highlight was a first-year male Lark Bunting, present at Art Weaver’s house earlier this week. I made it out on Monday to see the bird and was lucky to get one crappy photo of it. This bird was my 300th species in Michigan, a long-awaited milestone. It was a great bird to hit it with, and I’m still a little bit in awe of the fact that I have actually seen 300 different bird species in Michigan. But they add up fast. Next goal: 300 in the Upper Peninsula. Only seven to go.