lesson plans & tutorials
Exploring water quality in Durham’s Ellerbe Creek
An intro to cleaning and visualizing “dirty” data in R, with Jonny Behrens
In ecology and watershed sciences, large datasets often come from a variety of sources, such as continuous automated sensors, water grab samples, and community-collected scientific data. Merging and cleaning such disparate sources is challenging, and overcoming these challenges is critical to exploring the prevalence, persistence, and impact of degraded water quality on human society and wildlife. This project exposes students to approaches for merging and cleaning two disparate data sources, basic tools for statistical analysis, and data visualization. This project was conducted with funding from Duke University's Data Expeditions Initiative.

Understanding coastal bird responses to hurricanes
An introduction to Big Data and R, with C. Lane Scher
Many scientific fields are currently experiencing a heyday of Big Data. Although citizen science datasets can help us answer many ecological questions, they also present unique challenges for analysis. This tutorial leads students through an analysis of eBird data, introducing basic tools for exploratory data analysis, manipulation of large datasets, and data visualization. This project was conducted with funding from Duke University's Data Expeditions Initiative.

Introduction to R
R is a versatile, open-source, and user-friendly tool used to answer complex data science questions in fields from ecology to statistics to biology. This three-part tutorial walks students through the basics of R; the layout of RStudio, a widely used interface for R; and some basic programming and troubleshooting techniques.

Generalized Joint Attribute Modeling (GJAM)
Although long-term ecological datasets promise to open the door to complex, long-term community analysis, ecologists still struggle to model species interactions at the scale of the data. GJAM is a structure for analyzing large datasets, which are often zero-inflated or contain varying data types, in a format meaningful to the researcher. GJAMTime is an extension of the original GJAM package, used specifically to analyze dynamic, time-series datasets. I apply GJAMTime to an abstracted example dataset of large ungulates in Kruger National Park, South Africa.