S4SS - Statistics for Soil Survey Part 1 Revisions 2021

2020/01/26 TODOs and suggestions from @smroecker 

## Core R data manipulation functional themes:
 - loading/fetching data
 - filtering
 - transforming 
 - aggregating 
 - iterating 

### TODO

 Top priorities:
  - [x] 1. Move mapView and sf examples to Spatial Chapter
  - [ ] 2. simple example comparing the diagnostic slot to information parsed from the horizon slot

 Demo reproducible examples for functions:
  - [x] "filter" (`subset`)
  - [ ] "transform" (`mutate`, `slice`, `segment`, `spc2mpspline`)
  - [ ] "aggregate" (`slab`)
  - [ ] "iterate" (`profileApply`)  
  - [x] soilDB 
     - `get_extended_data_from_NASIS_db`, `get_vegplot_from_NASIS_db` 
     - fetch functions: `fetchOSD`, `fetchSDA`, `fetchNASISWebReport`, `fetchHenry`
  - [ ] New taxonomic functions 
     - (_maybe_; these interfaces are a bit fluid right now, but students are typically very interested in taxonomic information...)

----

The changes described below are from the original PR: https://github.com/ncss-tech/stats_for_soil_survey/pull/21

This PR reorganizes the Data chapter and the other Part 1 chapters into bookdown format. Most of the changes split portions of the Data chapter out and move them elsewhere; in a few cases, material moves from other chapters into Data.

This is the proposed order of sections:

Precourse, Intro, Data, EDA, Spatial, Sampling

### Intro
Mostly the same content.

There is currently no exposition on the materials in the "appendix" for the Data chapter, but we need to spend some time on that type of material. I think it could become part of Chapter 1 as a "basic R syntax and concepts" section.

### Data
"Data" contains the same essential elements and content; it mainly motivates the discussion around pedon data specifically. I would like to see more references to ecosite data in this chapter eventually.

The example code still includes exercises such as simple plots of point locations, but no detailed exposition on spatial data types; that now comes after EDA. I left some stubs that allude to the future Spatial sections (and also EDA with dplyr?), but my thought is that we should not get too prescriptive: just say that there are many ways of doing these things once you have the data in a data.frame.

Data now features Soil Reports at the end (previously at the end of EDA). My thought is that these are, or can be, a fun exercise that motivates the need for understanding distributions, descriptive statistics, etc., and maybe gets people "excited" about what they can do with existing R tools.

### EDA
EDA is mostly unchanged except for moving the Soil Reports stuff into Data.

I think we can have students running reports and looking at the output before we really get into the details of the stats. It is nice to have a concrete example in front of you when learning something new; that way students can get a jump start on thinking about how to apply it to their own data, final project, etc.

### Spatial
Spatial data comes after that. Emphasize the data.frame skills covered in Precourse, the external AgLearn courses, Intro, and Data by focusing on sf data.frame objects first, then cover their interop with and conversion to sp objects. This will better prepare the students for the sf code they will encounter in the wild, since sf is the only interface to many packages. They still need essential sp context, links, and demos for working with examples, existing code, etc., but sf objects inherit from data.frame, so leading with sf stays consistent with the central themes of the class.
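
As a rough sketch of the sf-first idea (not the chapter's actual example), the snippet below builds a small sf object from a plain data.frame, shows that data.frame-style indexing still works, and converts to sp and back. The point data, column names, and CRS are made up for illustration, and the sp package is assumed to be installed for the conversion step.

```r
library(sf)

# hypothetical point data; coordinates and clay values are invented
pts <- data.frame(
  id   = c("a", "b", "c"),
  clay = c(18, 25, 32),
  x    = c(-120.5, -120.4, -120.3),
  y    = c(37.9, 38.0, 38.1)
)

# promote the data.frame to an sf object (assumed WGS84 coordinates)
pts_sf <- st_as_sf(pts, coords = c("x", "y"), crs = 4326)

# sf objects inherit from data.frame, so row indexing works as usual
is_df    <- inherits(pts_sf, "data.frame")
high_cly <- pts_sf[pts_sf$clay > 20, ]

# convert to an sp object (requires the sp package), then back to sf
pts_sp  <- as(pts_sf, "Spatial")
pts_sf2 <- st_as_sf(pts_sp)
```

The same `as(x, "Spatial")` and `st_as_sf()` pair applies to line and polygon geometries, which is one way to connect existing sp-based code to sf workflows.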

Interactive maps, along with the new exactextractr example, are our showcase R spatial examples; both feature sf interfaces. The Spatial chapter now tries to draw parallels to data.frames, and between the methods used for reading/writing, setting the CRS, etc. across sf, sp, and raster objects.

### Sampling
Finally, the Sampling chapter has examples of using sp objects for spatial sampling. I think this section could be enhanced significantly as a resource, not so much as something covered in detail in class. I would like to provide subsections with parallel `sf::st_sample` and `sp::spsample` examples so the two interfaces can be compared directly. We have the sampling presentation and other materials, so we can spend as much time applying the code examples as is interesting to the group. The thought, though, is that this chapter should mostly be a self-contained set of reference examples of different sampling strategies applied to simple but realistic data. I consider the specific details in this chapter to be more like end matter for Part 1, something that fits well after discussing the details of data, describing data, and describing data in space.
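
A minimal sketch of what a pair of parallel sampling examples might look like is shown below. The square polygon, its CRS (UTM zone 10N), the seed, and the sample size are all hypothetical; this illustrates the two interfaces side by side and is not the chapter's proposed content.

```r
library(sf)
library(sp)

# hypothetical 1 km square in a projected CRS, invented for illustration
ring <- rbind(c(0, 0), c(1000, 0), c(1000, 1000), c(0, 1000), c(0, 0))
poly_sf <- st_sf(
  id = 1,
  geometry = st_sfc(st_polygon(list(ring)), crs = 32610)
)

set.seed(1)

# sf: simple random sample of point locations within the polygon
s_sf <- st_sample(poly_sf, size = 10, type = "random")

# sp: the same idea using an sp polygon object and sp::spsample()
poly_sp <- as(poly_sf, "Spatial")
s_sp <- spsample(poly_sp, n = 10, type = "random")
```

Both functions take a `type` argument, so other strategies (for example `"regular"`) could be compared the same way.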

