If you are not already familiar with eBird and the myriad other data resources available at the Cornell Lab of Ornithology, you should take a few minutes right now to have a quick look around. Among other things, the lab coordinates massive citizen science projects, does interesting research, and creates data-rich investigative curriculum materials. These are all topics I plan to tackle in future posts, but for today my goal is to share some reflections about working with a data visualization of bird migrations.
eBird describes itself as a “real-time, online checklist program” where birders and other enthusiasts can record their bird observation data. Even more importantly, from my perspective, the accumulated citizen science data can be explored through a web interface or downloaded for local analysis. The recently released reference dataset 3.0 contains over 41 million records so we are talking about a rich data resource.
Today I’m focusing on a sub-project of eBird where the observation data were used to build spatial models of species occurrence over time. The models predict occurrence at unsampled locations by combining the available observations for each species with environmental data. The eBird team describes the project and links to all the species maps. What a great set of resources. There is some natural history background for each species, the maps have county boundaries, and expected occurrence is shown for each week of the year.
What follows are my ruminations about some broad “working with data” messages I plan to emphasize when sharing these data with students.
Getting Oriented:
You always have to start out by investing some effort into getting oriented to the data resources. To get past what some people describe as the “look-see ain’t it pretty” level of engagement, ask yourself these kinds of questions:
- What is being shown?
- Where do the data come from?
- In what ways are they limited?
- What are some of the broad patterns in the data?
- What types of similarities and differences are there between datasets (species)?
- What are some biological questions that could be addressed with these data?
- Do I have technical questions about the data or visualization?
Looking at the data more systematically:
At first blush these data may seem difficult to work with – you can’t control the animation, you can only see one species at a time, and you don’t really have values to work with.
Take control of the data. I downloaded the animation (be sure to get the large map version) by right-clicking on the image and saving it to my hard drive. On the Mac you can use the “Preview” application to open the file, and you will see that you have a collection of 52 images that are easy to navigate and manipulate. [I’d be very interested in suggestions for an equivalent software tool for dealing with animated gifs on a PC.]
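For anyone who would rather script this step, here is a minimal sketch in Python that splits an animated GIF into one image per frame. It assumes the third-party Pillow library is installed. The function name `split_gif`, the output file names, and the idea that frame order matches week order are my own assumptions rather than anything documented by the project.

```python
from pathlib import Path

from PIL import Image, ImageSequence  # Pillow, a third-party library


def split_gif(gif_path, out_dir):
    """Save every frame of an animated GIF as a numbered PNG and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    with Image.open(gif_path) as gif:
        for week, frame in enumerate(ImageSequence.Iterator(gif), start=1):
            # Assumption: frame 1 corresponds to week 1, frame 2 to week 2, and so on.
            target = out_dir / f"week_{week:02d}.png"
            # Convert from the GIF palette to plain RGB so pixel values are easy to read later.
            frame.convert("RGB").save(target)
            saved.append(target)
    return saved
```

Calling `split_gif("occurrence_map.gif", "frames")` (a hypothetical file name) should leave one PNG per frame in a `frames` folder, which you can then open in any image viewer.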
Now it should be easier to extract some measurements from these data. Maybe you are interested in when species W arrives in county X? How many weeks is it there? How does that compare to species Y or county Z?
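Once you have written down one reading per week for a county, the arrival and length-of-stay questions become simple bookkeeping. The sketch below uses made-up readings on a 0 to 1 scale and an arbitrary threshold for “present”; the function names, the numbers, and the threshold are all hypothetical.

```python
def weeks_present(weekly_values, threshold):
    """Return the 1-based week numbers whose reading meets the threshold."""
    return [week for week, value in enumerate(weekly_values, start=1) if value >= threshold]


def summarize(weekly_values, threshold):
    """Return (first week present, last week present, number of weeks present).

    Returns None if the species never reaches the threshold.
    """
    present = weeks_present(weekly_values, threshold)
    if not present:
        return None
    return present[0], present[-1], len(present)


# Hypothetical readings for species W in county X, one per weekly frame.
# These are invented numbers for illustration, not real eBird values.
species_w = [0.0] * 15 + [0.2, 0.5, 0.8, 0.9, 0.9, 0.7, 0.4, 0.1] + [0.0] * 29

print(summarize(species_w, threshold=0.3))  # (17, 22, 6)
```

Treating the first and last weeks as arrival and departure only makes sense for a species with a single stay inside the calendar year. A wintering species that is present across the new year would need the weeks handled as a cycle.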
Other ways to quantify the data:
It is unfortunate that we don’t have the raw data underlying these maps but there are still ways to work with what we have. We can make measurements of distance, area, and color intensity pretty easily using an image analysis tool like ImageJ.
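ImageJ does this interactively, but it can help students to see the arithmetic behind an area or intensity measurement. Here is a rough sketch that works on a frame represented as rows of (R, G, B) pixels. It assumes that darker pixels mean higher predicted occurrence (flip the calculation if the map’s color scale runs the other way) and that you have worked out a kilometers-per-pixel scale from a known distance on the map. The function names and the tiny example frame are invented for illustration.

```python
def darkness(rgb):
    """Map an (R, G, B) pixel to a value from 0.0 (white) to 1.0 (black)."""
    r, g, b = rgb
    return 1.0 - (r + g + b) / (3 * 255)


def measure_region(pixels, box, threshold, km_per_pixel):
    """Measure a rectangular region of one frame.

    pixels: list of rows, each row a list of (R, G, B) tuples.
    box: (left, top, right, bottom) in pixel coordinates; right and bottom are exclusive.
    Returns (mean darkness, area in square km of pixels at or above the threshold).
    """
    left, top, right, bottom = box
    values = [darkness(pixels[y][x]) for y in range(top, bottom) for x in range(left, right)]
    mean = sum(values) / len(values)
    shaded = sum(1 for v in values if v >= threshold)
    return mean, shaded * km_per_pixel ** 2


# A made-up 4 x 2 pixel "frame": three shaded pixels on a white background.
WHITE, BLACK = (255, 255, 255), (0, 0, 0)
frame = [
    [WHITE, BLACK, BLACK, WHITE],
    [WHITE, BLACK, WHITE, WHITE],
]

print(measure_region(frame, box=(0, 0, 4, 2), threshold=0.5, km_per_pixel=2.0))  # (0.375, 12.0)
```

Running the same measurement on the same box in each weekly frame gives you a series of numbers to plot or compare across species.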
Getting more data:
This last suggestion involves going beyond the data at hand by seeking out other data sources. These models are based (in part) on observations recorded in eBird. The project provides other tools for extracting and visualizing eBird data that would allow you to see some of the raw data or bring in other species that aren’t included in the occurrence maps project. There are also lots of sources for weather, climate, habitat, and land use data that might allow you to pursue more sophisticated research.
By going to the NOAA Satellite and Information Service site and entering a zip code I was able to quickly identify 127 weather stations within 30 minutes of Hawk Mountain – a great place to observe hawk migrations in central PA.
Do you have strategies you use to help students get engaged with data? Can you think of other ways to dig information out of the occurrence maps to address interesting biological questions?
You can learn a lot more about eBird here:
Wood, C., Sullivan, B., Iliff, M., Fink, D., & Kelling, S. (2011). eBird: Engaging Birders in Science and Conservation. PLoS Biology, 9(12), e1001220. Public Library of Science. Retrieved from http://dx.plos.org/10.1371/journal.pbio.1001220
I was reminded of these maps by a recent post on Flowing Data. “FlowingData explores how designers, statisticians, and computer scientists are using data to understand ourselves better – mainly through data visualization.”