2020/01/26 TODOs and suggestions from @smroecker
Core R data manipulation functional themes:
- loading/fetching data
- filtering
- transforming
- aggregating
- iterating
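As a rough sketch of how these five themes could look on an ordinary data.frame, here is a base R example. The `hz` table, its column names, and all values are made up for illustration; building it inline stands in for the loading/fetching step.

```r
# Hypothetical horizon-level data; names and values are invented for illustration.
# Constructing the data.frame here stands in for loading/fetching.
hz <- data.frame(
  pedon_id = c("P1", "P1", "P2", "P2", "P3"),
  top      = c(0, 20, 0, 15, 0),
  bottom   = c(20, 50, 15, 60, 30),
  clay     = c(12, 25, 18, 32, 9)
)

# filtering: keep horizons with more than 10% clay
hz_sub <- subset(hz, clay > 10)

# transforming: add a horizon thickness column
hz_sub$thickness <- hz_sub$bottom - hz_sub$top

# aggregating: mean clay content per pedon
clay_mean <- aggregate(clay ~ pedon_id, data = hz_sub, FUN = mean)

# iterating: apply a function to each pedon's horizons (maximum depth)
max_depth <- sapply(split(hz$bottom, hz$pedon_id), max)
```

The aqp functions listed in the TODO (subset, slab, profileApply, etc.) follow the same pattern on SoilProfileCollection objects; this sketch only uses base R so it runs without extra packages.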
TODO
Top priorities:
- 1. Move mapView and sf examples to Spatial Chapter
- 2. simple example comparing the diagnostic slot to information parsed from the horizon slot
Demo reproducible examples for functions:
- "filter" (subset)
- "transform" (mutate, slice, segment, spc2mpspline)
- "aggregate" (slab)
- "iterate" (profileApply)
- soilDB get_extended_data_from_NASIS_db, get_vegplot_from_NASIS_db
- fetch functions: fetchOSD, fetchSDA, fetchNASISWebReport, fetchHenry
- New taxonomic functions
- (maybe; these interfaces are a bit fluid right now, but students are typically very interested in taxonomic information...)
The changes described below are from the original PR: #21
Re-organization of the data chapter, and the Part 1 chapters, into bookdown format. Most of the changes separate portions of the data chapter and move them elsewhere, or occasionally the reverse.
This is the proposed order of sections:
Precourse, Intro, Data, EDA, Spatial, Sampling
Intro
Mostly same content.
There is currently no exposition on the materials in the "appendix" for the Data chapter -- but we need to spend some time focusing on that type of material. I would think it could be part of chapter 1 as more of a "basic R syntax and concepts" section.
Data
"Data" contains the same essential elements/content; really motivating the discussion around pedon data specifically. I would like to see some more references to ecosite data in this chapter eventually.
The example code still includes exercises like simple plots of point locations, but no detailed exposition on spatial data types. That now comes after EDA. I leave some stubs to allude to future Spatial sections (and also EDA with dplyr?) but my thought is we do not get too prescriptive -- just say that there are many ways of doing these things once you have access to the data in a data.frame.
Data now features Soil Reports at the end (previously at the end of EDA). My thought is these are, or can be, fun exercises that motivate the need for understanding distributions, descriptive statistics etc. and maybe get people "excited" about what they can do with existing R tools.
EDA
EDA is mostly unchanged except for moving the Soil Reports stuff into Data.
I think we can have them running reports and looking at the output before we really get into the details of the stats. It is nice to have a hard example of something in front of you when learning something new -- that way they can really get a jump start thinking about how they can apply it to their own data / final project etc.
Spatial
Spatial data after that. Emphasize the data.frame skills covered in Precourse/external AgLearn courses/Intro/Data by focusing on sf data.frame objects first. Then cover their interop/conversion to sp objects. This will prepare the students better for sf stuff they will encounter in the wild, as sf is the only interface to many packages. They still need essential sp context, links, and demos for examples, existing code etc., but sf is a subclass of data.frame, so that sticks with some central themes for the class.
Interactive maps, along with the new exactextractr example, are our shiny R spatial examples, both featuring sf interfaces. Spatial chapter now really tries to draw parallels to data.frames, and between the methods used for reading/writing, setting CRS etc. across sf, sp and raster objects.
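A minimal sketch of the "sf is a data.frame" point and the sf/sp round trip, assuming the sf and sp packages are installed. The `sites` table, its coordinates, and the `slope` column are invented for illustration.

```r
library(sf)

# Hypothetical site data; coordinates and attribute values are invented
sites <- data.frame(
  site_id = c("A", "B", "C"),
  x       = c(-120.5, -120.3, -120.1),
  y       = c(37.9, 38.0, 38.1),
  slope   = c(5, 12, 30)
)

# promote the data.frame to an sf object (WGS84 geographic coordinates)
pts <- st_as_sf(sites, coords = c("x", "y"), crs = 4326)

# an sf object is still a data.frame, so ordinary indexing works
is_df <- inherits(pts, "data.frame")
steep <- pts[pts$slope > 10, ]

# convert to an sp object and back (the sp package must be installed)
pts_sp   <- as(pts, "Spatial")
pts_back <- st_as_sf(pts_sp)
```

Row subsetting keeps the geometry column attached, which is one reason starting from sf fits the data.frame theme.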
Sampling
Finally, the sampling chapter has examples of using sp objects for spatial sampling. I think this section could be enhanced significantly as a resource, not so much as something covered in detail in class. I would like to provide subsections so we have identical sf st_sample and sp spsample examples to draw parallels. We have the sampling presentation and other materials, so we can spend as much time on applying the code examples as is interesting to the group -- but the thought is this chapter should mostly be a self-contained set of reference examples of different sampling strategies applied to simple, but realistic, data. I consider the specific details in this chapter to be more like end matter for Part 1, something that fits well after discussing details of data, describing data, and how to describe data in space.
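One possible shape for the paired examples, assuming the sf and sp packages are installed. The square study area, its coordinates, and the choice of EPSG:32610 are placeholders, not data from the course materials.

```r
library(sf)
library(sp)

set.seed(1)

# Hypothetical 1 km square study area in a projected CRS
# (EPSG:32610 is an arbitrary choice for illustration)
x0 <- 500000
y0 <- 4200000
coords <- rbind(
  c(x0, y0), c(x0 + 1000, y0), c(x0 + 1000, y0 + 1000),
  c(x0, y0 + 1000), c(x0, y0)
)
area_sf <- st_sf(id = 1, geometry = st_sfc(st_polygon(list(coords)), crs = 32610))

# sf: simple random sample of about 10 points within the polygon
s_sf <- st_sample(area_sf, size = 10, type = "random")

# sp: same strategy on the same polygon, converted to an sp object
area_sp <- as(area_sf, "Spatial")
s_sp <- spsample(area_sp, n = 10, type = "random")
```

Both functions take a `type` argument (for example "regular" in place of "random"), so the same pair of calls could be repeated for each sampling strategy covered in the chapter.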