learner-python-R
diff --git a/‎CS_11_ParallelProcessing.Rmd‎
Lines changed: 93 additions & 0 deletions b/‎CS_11_ParallelProcessing.Rmd‎
diff --git a/‎CS_11_ParallelProcessing.md‎
Lines changed: 73 additions & 0 deletions b/‎CS_11_ParallelProcessing.md‎
diff --git a/‎CS_11_ParallelProcessing_files/figure-html/unnamed-chunk-3-1.png‎
184 KB b/‎CS_11_ParallelProcessing_files/figure-html/unnamed-chunk-3-1.png‎
diff --git a/‎CS_11_ParallelProcessing_files/figure-html/unnamed-chunk-4-1.png‎
184 KB b/‎CS_11_ParallelProcessing_files/figure-html/unnamed-chunk-4-1.png‎
diff --git a/‎Schedule.md‎
Lines changed: 1 addition & 1 deletion b/‎Schedule.md‎
diff --git a/‎TK_11.Rmd‎
Lines changed: 24 additions & 2 deletions b/‎TK_11.Rmd‎
diff --git a/‎TK_11.md‎
Lines changed: 22 additions & 2 deletions b/‎TK_11.md‎
diff --git a/‎TK_12.Rmd‎
Lines changed: 35 additions & 17 deletions b/‎TK_12.Rmd‎
@@ -0,0 +1,93 @@
+---
+title: "Parallel Computing with R"
+subtitle: Write a parallel for loop
+week: 11
+type: Case Study
+reading:
+   - CRAN Task View [High-Performance and Parallel Computing with R](http://cran.r-project.org/web/views/HighPerformanceComputing.html)
+   - Parallel [Computing with the R Language in a Supercomputing Environment](https://link.springer.com/chapter/10.1007/978-3-642-13872-0_64)
+tasks:
+   - Write parallel for loops to speed up computation time.
+---
+
+```{r setup, include=FALSE, purl=F}
+source("functions.R")
+source("knitr_header.R")
+```
+
+# Reading
+
+```{r reading,results='asis',echo=F,purl=F}
+md_bullet(rmarkdown::metadata$reading)
+```
+
+
+# Tasks
+
+```{r tasks,results='asis',echo=F, purl=F}
+md_bullet(rmarkdown::metadata$tasks)
+```
+
+## Background
+
+```{r cache=F, message=F,warning=FALSE}
+library(tidyverse)
+library(spData)
+library(sf)
+
+## New Packages
+library(foreach)
+library(doParallel)
+registerDoParallel()
+getDoParWorkers() # check registered cores
+```
+
+
+Write an Rmd script that:
+
+* Loads the `world` dataset in the `spData` package
+* Runs a parallel `foreach()` to loop over countries (`name_long`) and:
+   * `filter` the world object to include only one country at a time.
+   * use `st_is_within_distance` to find which countries in the `world` object are within 100000 m of that country. Set `sparse=F` to return a simple logical array with `TRUE` for countries within the distance.
+   * set `.combine=rbind` to return a simple matrix.
+* Confirm that you get the same answer without using foreach:
+   * simply use `st_is_within_distance` with the transformed `world` object as both the `x` and `y` objects.
+   * compare the results with `identical()`
+   * you can also check the time difference with `system.time()`.
+   
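The workflow in the list above can be tried on a toy problem before applying it to `world`. This is only a sketch: `row_for()` is a hypothetical stand-in for the per-country `st_is_within_distance()` call, and `cores = 2` assumes at least two cores are available.

```r
library(foreach)
library(doParallel)

registerDoParallel(cores = 2)  # assumption: two cores are available

# Hypothetical stand-in for the per-country computation:
# each call returns a single-row matrix, like one country's result.
row_for <- function(i) matrix(i * 1:3, nrow = 1)

# Parallel version: rbind stacks the one-row results into a matrix.
t_par <- system.time(
  x_par <- foreach(i = 1:5, .combine = rbind) %dopar% row_for(i)
)

# Sequential version of the same computation.
t_seq <- system.time(
  x_seq <- do.call(rbind, lapply(1:5, row_for))
)

# Compare the values (dimnames dropped so only the contents are compared).
same <- isTRUE(all.equal(unname(x_par), unname(x_seq)))

stopImplicitCluster()  # release the workers started by registerDoParallel()
```

Printing `t_par` and `t_seq` shows the elapsed times. For a task this small, the overhead of starting workers may well outweigh any gain from running in parallel.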
+```{r, echo=F, purl=F}
+data("world")
+proj="+proj=robin +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs "
+dist=100000 # distance in m
+world2=st_transform(world,proj)
+
+#system.time(
+  x_seq<-world2%>%
+  st_is_within_distance(world2,dist,sparse=F)
+#)
+
+#system.time(
+  x_par <- foreach(i=unique(world$name_long),.combine=rbind) %dopar% {
+  world2%>%
+    filter(name_long==i)%>%
+    st_is_within_distance(world2,dist=dist,sparse = F)
+  }
+#)
+
+#identical(x_seq,x_par)
+```
+
+This approach could be used to identify which countries were 'close' to others.  For example, these countries are within `r dist`m of Costa Rica:
+```{r}
+i=which(world2$name_long=="Costa Rica")
+# neighbor countries
+world2[x_par[i,],]$name_long
+```
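The indexing used above relies on a logical row selecting the matching elements. A minimal sketch with made-up data, where `countries` and `near` are hypothetical stand-ins for `world2$name_long` and `x_par`:

```r
# Hypothetical names and a small logical matrix standing in for x_par:
# entry [r, c] is TRUE when country c is within the distance of country r.
countries <- c("A", "B", "C", "D")
near <- matrix(c(TRUE,  TRUE,  FALSE, FALSE,
                 TRUE,  TRUE,  TRUE,  FALSE,
                 FALSE, TRUE,  TRUE,  FALSE,
                 FALSE, FALSE, FALSE, TRUE),
               nrow = 4, byrow = TRUE)

i <- which(countries == "B")        # row index of the focal country
neighbors <- countries[near[i, ]]   # the logical row selects nearby countries
```

The focal country appears in its own result because its distance to itself is zero, which is why "Costa Rica" is listed among its own neighbors.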
+
+```{r echo=F}
+ggplot()+
+  geom_sf(data=world2[x_par[i,],])+
+  geom_sf(data=world2[i,],col="red")
+```
+
+Note that in this example the sequential version typically runs faster than the 
@@ -0,0 +1,73 @@
+---
+title: "Parallel Computing with R"
+subtitle: Write a parallel for loop
+week: 11
+type: Case Study
+reading:
+   - CRAN Task View [High-Performance and Parallel Computing with R](http://cran.r-project.org/web/views/HighPerformanceComputing.html)
+   - Parallel [Computing with the R Language in a Supercomputing Environment](https://link.springer.com/chapter/10.1007/978-3-642-13872-0_64)
+tasks:
+   - Write parallel for loops to speed up computation time.
+---
+
+
+
+# Reading
+
+- CRAN Task View [High-Performance and Parallel Computing with R](http://cran.r-project.org/web/views/HighPerformanceComputing.html)
+- Parallel [Computing with the R Language in a Supercomputing Environment](https://link.springer.com/chapter/10.1007/978-3-642-13872-0_64)
+
+
+# Tasks
+
+- Write parallel for loops to speed up computation time.
+
+## Background
+
+
+```r
+library(tidyverse)
+library(spData)
+library(sf)
+
+## New Packages
+library(foreach)
+library(doParallel)
+registerDoParallel()
+getDoParWorkers() # check registered cores
+```
+
+```
+## [1] 2
+```
+
+
+Write an Rmd script that:
+
+* Loads the `world` dataset in the `spData` package
+* Runs a parallel `foreach()` to loop over countries (`name_long`) and:
+   * `filter` the world object to include only one country at a time.
+   * use `st_is_within_distance` to find which countries in the `world` object are within 100000 m of that country. Set `sparse=F` to return a simple logical array with `TRUE` for countries within the distance.
+   * set `.combine=rbind` to return a simple matrix.
+* Confirm that you get the same answer without using foreach:
+   * simply use `st_is_within_distance` with the transformed `world` object as both the `x` and `y` objects.
+   * compare the results with `identical()`
+   * you can also check the time difference with `system.time()`.
+   
+
+
+This approach could be used to identify which countries were 'close' to others.  For example, these countries are within 100000 m of Costa Rica:
+
+```r
+i=which(world2$name_long=="Costa Rica")
+# neighbor countries
+world2[x_par[i,],]$name_long
+```
+
+```
+## [1] "Panama"     "Costa Rica" "Nicaragua"
+```
+
+![](CS_11_ParallelProcessing_files/figure-html/unnamed-chunk-4-1.png)<!-- -->
+
+Note that in this example the sequential version typically runs faster than the 
@@ -32,7 +32,7 @@ Homeworks are due at 5pm on the Friday of the week specified below.
  |  8 |  10/16/18 |  [<i class='fas fa-desktop'>    </i>](presentations/PS_08_repro.html){target='_blank'} |  [Create Final Project Webpage](./TK_08.html) |  [One Script, Many Products](./CS_08.html) |  6 |
  |  9 |  10/23/18 |  [<i class='fas fa-desktop'>    </i>](presentations/PS_09_weather.html){target='_blank'} |  [APIs, time-series, and weather Data](./TK_09.html) |  [Tracking Hurricanes!](./CS_09.html) |  7 |
  |  10 |  10/30/18 |  [<i class='fas fa-desktop'>    </i>](presentations/PS_10_RS.html){target='_blank'} |  [Remote Sensing](./TK_10.html) |   -  |  8 |
- |  11 |  11/6/18 |   |  [Project First Draft](./TK_11.html) |   -  |  9 |
+ |  11 |  11/6/18 |  [<i class='fas fa-desktop'>    </i>](presentations/PS_11_ParallelProcessing.html){target='_blank'} |  [Project First Draft](./TK_11.html) |  [Parallel Computing with R](./CS_11_ParallelProcessing.html) |  9 |
  |  12 |  11/13/18 |  [<i class='fas fa-desktop'>    </i>](presentations/PS_12.html){target='_blank'} |  [Project Peer Review](./TK_12.html) |  [Dynamic HTML graph of Daily Temperatures](./CS_12.html) |  10 |
  |  13 |  11/20/18 |   |  [Thanksgiving Week (Tuesday Class Optional)](./TK_13.html) |   -  |   |
  |  14 |  11/27/18 |   |  [Final Project 2nd Draft / Building and summarizing models](./TK_14.html) |   -  |   |
 
@@ -1,10 +1,11 @@
 ---
 title:  Project First Draft
-subtitle:  Review project drafts from your peers
+subtitle:  Submit the first draft of your project for peer review
 week: 11
 type: Task
+presentation: PS_11_ParallelProcessing.html
 reading:
-  - GitHub [Pull Requests](https://help.github.com/articles/about-pull-requests/)
+  - Documentation for [RMarkdown Websites](https://rmarkdown.rstudio.com/rmarkdown_websites.htm)
 tasks:
    - Commit your first draft of your project to GitHub
 ---
@@ -22,6 +23,12 @@ source("knitr_header.R")
 md_bullet(rmarkdown::metadata$reading)
 ```
 
+# Tasks
+
+```{r tasks,results='asis',echo=F}
+md_bullet(rmarkdown::metadata$tasks)
+```
+
 ### First Draft
 
 The first draft of your project will be assessed by your peers in GitHub. The objectives of the peer evaluation are:
@@ -30,3 +37,18 @@ The first draft of your project will be assessed by your peers in GitHub. The ob
 * Provide an opportunity to share your knowledge to improve their project
 
 You should use the project website template (or similar) to generate an HTML version of your project report. If your project requires any data not available in public repositories, you should put it in a folder called `/data` in your project's home directory and then import it into R with `read.csv('data/filename.csv')` or similar so that anyone with a copy of the repository can re-create the HTML output.
+
+## Required components of first draft
+
+1) **Introduction**  [~ 200 words]: Clearly stated background and questions / hypotheses / problems being addressed. Sets up the analysis in an interesting and compelling way.
+2) **Data**: Script downloads at least one dataset automatically through the internet or loads the data from the `data/` folder.  This could use a direct download (e.g. download.file()) or an API (e.g. anything from ROpenSci).
+3) **Figure**: The HTML file includes at least one figure of the data.
+4) **Reproducibility**: The .Rmd should generate the HTML output when "Build Website" is clicked.
+
+### Confirming 'reproducibility'
+
+After pushing the files to GitHub, try downloading the repository as a zip file, opening it in RStudio, and clicking `Build Website`. The site should build without errors.
+
+## Common issues
+
+1) Importing data from somewhere on your computer.  You should not have any commands such as `read.csv("~/projects/inputdata.csv")` that read any data from your computer other than the `data/` folder in your repository.
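A minimal sketch of the relative-path pattern described above. The file name `example.csv` is hypothetical, and the small data frame written here only stands in for a dataset that a real project would download or commit to `data/`.

```r
# Create the data/ folder inside the project directory if it does not exist.
dir.create("data", showWarnings = FALSE)

# Hypothetical file name, built as a path relative to the project root.
data_file <- file.path("data", "example.csv")

# Stand-in for a real dataset; in a project this file would come from
# download.file() or be committed to the repository.
write.csv(data.frame(id = 1:3, value = c(2.5, 3.1, 4.8)),
          data_file, row.names = FALSE)

# Read it back with the relative path so the script runs on any computer.
d <- read.csv(data_file)
```

Because the path never mentions a home directory or drive letter, the same script should run unchanged for anyone who downloads the repository.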
@@ -1,10 +1,11 @@
 ---
 title:  Project First Draft
-subtitle:  Review project drafts from your peers
+subtitle:  Submit the first draft of your project for peer review
 week: 11
 type: Task
+presentation: PS_11_ParallelProcessing.html
 reading:
-  - GitHub [Pull Requests](https://help.github.com/articles/about-pull-requests/)
+  - Documentation for [RMarkdown Websites](https://rmarkdown.rstudio.com/rmarkdown_websites.htm)
 tasks:
    - Commit your first draft of your project to GitHub
 ---
@@ -17,6 +18,10 @@ tasks:
 
 - GitHub [Pull Requests](https://help.github.com/articles/about-pull-requests/)
 
+# Tasks
+
+- Commit your first draft of your project to GitHub
+
 ### First Draft
 
 The first draft of your project will be assessed by your peers in GitHub. The objectives of the peer evaluation are:
@@ -25,3 +30,18 @@ The first draft of your project will be assessed by your peers in GitHub. The ob
 * Provide an opportunity to share your knowledge to improve their project
 
 You should use the project website template (or similar) to generate an HTML version of your project report. If your project requires any data not available in public repositories, you should put it in a folder called `/data` in your project's home directory and then import it into R with `read.csv('data/filename.csv')` or similar so that anyone with a copy of the repository can re-create the HTML output.
+
+## Required components of first draft
+
+1) **Introduction**  [~ 200 words]: Clearly stated background and questions / hypotheses / problems being addressed. Sets up the analysis in an interesting and compelling way.
+2) **Data**: Script downloads at least one dataset automatically through the internet or loads the data from the `data/` folder.  This could use a direct download (e.g. download.file()) or an API (e.g. anything from ROpenSci).
+3) **Figure**: The HTML file includes at least one figure of the data.
+4) **Reproducibility**: The .Rmd should generate the HTML output when "Build Website" is clicked.
+
+### Confirming 'reproducibility'
+
+After pushing the files to GitHub, try downloading the repository as a zip file, opening it in RStudio, and clicking `Build Website`. The site should build without errors.
+
+## Common issues
+
+1) Importing data from somewhere on your computer.  You should not have any commands such as `read.csv("~/projects/inputdata.csv")` that read any data from your computer other than the `data/` folder in your repository.
@@ -8,7 +8,7 @@ reading:
   - GitHub [Pull Requests](https://help.github.com/articles/about-pull-requests/)
   - Chapter [28 in R4DS](http://r4ds.had.co.nz)
 tasks:
-  - Review at least two other students' projects and make comments via a _pull request_ in GitHub before next class next week. 
+  - Review at least two other students' projects and make comments via a _pull request_ in GitHub. 
   - Browse the [Leaflet website](http://rstudio.github.io/leaflet/) and take notes in your readme.md about potential uses in your project. What data could you use?  How would you display it?
   - Browse the [HTML Widgets page](http://gallery.htmlwidgets.org/) for many more examples. Take notes in your readme.md about potential uses in your project.
 ---
@@ -25,29 +25,47 @@ source("knitr_header.R")
 ```{r reading,results='asis',echo=F}
 md_bullet(rmarkdown::metadata$reading)
 ```
+
 # Tasks
 
-```{r reading,results='asis',echo=F}
+```{r tasks,results='asis',echo=F}
 md_bullet(rmarkdown::metadata$tasks)
 ```
 
-## Evaluation Instructions
+# Project Peer Evaluation
+
+## Instructions
+
+Select two repositories and evaluate them according to the instructions listed in the [Project First Draft task](TK_11.html).
+
+![](project_assets/project_evaluation.png)
 
-Select two repositories and evaluate them according to the instructions and rubric below.  
+### Download and reproduce the project
 
-1) Explore the final projects in the [class repositor](https://github.com/AdamWilsonLabEDU)
-2) Open the repository and check if there have already been two reviews by checking if there are 2 (or more) "Pull Requests".  For example, in the image below, there are 0 pull requests, so this repository would be available for you to review.  If there are already 2 pull requests, select another repository. ![](assets/pull_reqeust.png)
-2) Go to the github page linked in the assignment and download the repository as a zip file (click on the <img src='assets/download.png' width=100> button).
+1) Explore the final projects in the [class repository](https://github.com/AdamWilsonLabEDU?q=finalproject)
+2) Select two projects that do not already have two evaluations (pull requests). For example, in the image above, there are 0 pull requests, so this repository would be available for you to review.  If there are already 2 pull requests, select another repository.
+2) Go to the github page linked in the assignment and download the repository as a zip file (click on the <img src='project_assets/download.png' width=100> button).
 3) Unzip the file after it downloads
-4) Open the project or `index.Rmd` in RStudio and click `knit` or  `Build Website` in the `Build` tab in the upper right.
-
-Evaluate the following provide any feedback via pull request.
-1) Website
-  1) **Introduction**  [~ 200 words]: Clearly stated background and questions / hypotheses / problems being addressed. Sets up the analysis in an interesting and compelling way.
-  2) **Data**: Script downloads at least one dataset automatically through the internet.  This could use a direct download (e.g. download.file()) or an API (anything from ROpenSci).
-  3) **Figure**: The HTML file includes at least one figure of the data.
-2) **Output:** The .Rmd produces HTML output with
-  1) section headers for all the major sections of the paper
-  2) a draft of the complete introduction.  
+4) Open the project or `index.Rmd` in RStudio and click `Build Website` in the `Build` tab in the upper right.
+5) Evaluate whether the project meets the specifications listed in the [Project First Draft task](TK_11.html)
+
+
+### Provide feedback and evaluation via pull request
+
+After you reproduce the project, you will provide feedback via pull request.
+
+The following video will walk you through the steps of providing feedback via a pull request.
+<iframe width="560" height="315" src="https://www.youtube.com/embed/wy9EggBhC-M" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
+
+1) In the "Code" tab of the github page for the project, click on the file you want to provide feedback on (typically this will be `index.Rmd`)
+2) Click the pencil icon on the right side to edit the file
+3) You can make changes or comment on the code
+   * To make changes, simply edit the text
+   * To comment, you must still make some sort of change on the lines where you want to comment.  The easiest thing is simply to add a space at the end of the line (as I do in the video above).
+4) At the bottom of the file, in the section called "Commit Changes", select the option **Create a new branch for this commit and start a pull request** and name the new branch `project_feedback_githubusername`.
+5) Click "Propose File change"
+6) Click on the button "Files Changed #1" near the middle of the next page
+7) Hover over lines you would like to comment on and click the little blue plus button.  Then enter your comment and select "Add single comment"
+8) Repeat steps 2-6 for any additional files you want to comment on
 
 Be sure to install any required libraries (do not complain if it fails because you don't have a library installed).