Commit 96f7a74

🔨 update automated reporting 2 with environment info
1 parent a40e8c3 commit 96f7a74

2 files changed: +27 -10 lines changed

tutorials/p-automating_rmd_reports/p-automating_rmd_reports_part_2.Rmd

Lines changed: 9 additions & 3 deletions
@@ -29,22 +29,28 @@ This tutorial follows from [an earlier one](https://github.com/erikaduan/r_tips/
 
 Creating an automated reporting workflow requires the following setup:
 
-1. A consistent file structure to store code, data and analytical outputs.
+1. A consistently named and reproducible file structure to store code, library dependencies, data and analytical outputs.
 2. A data ingestion and data cleaning script that can be automatically refreshed.
 3. An R Markdown template report that uses yaml parameters instead of hard coded variables.
 4. A report automation script for all parameters of interest.
 5. (Optional) A CI/CD pipeline using GitHub Actions, which is packaged inside a separate GitHub repository.
 
 
-# Step 1: Create a consistent project structure
+# Step 1: Create a project environment
+
+## Use consistent names
 
 There is no best way to organise your project structure. I recommend starting with a simple naming structure that everyone easily understands. For this tutorial, I have created a separate GitHub repository named [`abs_labour_force_report`](https://github.com/erikaduan/abs_labour_force_report) which contains the folders `code` to store my R scripts and Rmd documents, `data` to store my data, and `output` to store my analytical outputs.
 
 ```{r, echo=FALSE, results='hold', out.width="60%"}
 knitr::include_graphics("../../figures/p-automating_rmd_reports-project_structure.png")
 ```
 
-**Note:** The `data` folder contains subfolders `raw_data` and `clean_data` to maintain separation between the raw versus cleaned dataset used for further analysis.
+**Note:** The `data` folder contains subfolders `raw_data` and `clean_data` to maintain separation between the raw (read only) and clean dataset used for further analysis.
+
+## Environment reproducibility
+
+Besides your code, data inputs and outputs, a reproducible virtual environment also needs to be created to support your project workflow. In R, a simple way of managing project specific R package dependencies is to use the `renv` package.
 
 
 # Step 2: Create data ingestion and data cleaning R script
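
Reviewer illustration (not part of this commit): the folder layout described under "Use consistent names" could be created with a few lines of base R. This is only a sketch. The helper `create_project_dirs()` and the vector `project_dirs` are hypothetical names, and the temporary root folder is used purely for demonstration. The folder names `code`, `data/raw_data`, `data/clean_data` and `output` come from the tutorial text.

```r
# Folder layout taken from the tutorial text; the object name is hypothetical.
project_dirs <- c(
  "code",
  file.path("data", "raw_data"),
  file.path("data", "clean_data"),
  "output"
)

# Hypothetical helper: create each folder under `root` if it does not exist yet.
create_project_dirs <- function(root = ".") {
  paths <- file.path(root, project_dirs)
  for (p in paths) {
    # recursive = TRUE also creates the parent `data` folder when needed
    dir.create(p, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(paths)
}

# Demonstration in a temporary folder so that no real project is touched.
root <- file.path(tempdir(), "abs_labour_force_report")
create_project_dirs(root)
list.dirs(root, full.names = FALSE, recursive = TRUE)
```

Keeping the folder names in one vector means the same list can be reused by later scripts that read from `data` or write to `output`.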

tutorials/p-automating_rmd_reports/p-automating_rmd_reports_part_2.md

Lines changed: 18 additions & 7 deletions
@@ -4,8 +4,10 @@ Erika Duan
 2022-05-22
 
 - [Introduction](#introduction)
-- [Step 1: Create a consistent project
-structure](#step-1-create-a-consistent-project-structure)
+- [Step 1: Create a project
+environment](#step-1-create-a-project-environment)
+- [Use consistent names](#use-consistent-names)
+- [Environment reproducibility](#environment-reproducibility)
 - [Step 2: Create data ingestion and data cleaning R
 script](#step-2-create-data-ingestion-and-data-cleaning-r-script)
 - [Step 3: Create an R Markdown template
@@ -34,8 +36,8 @@ describing the preliminary steps towards automated reporting in R.
 
 Creating an automated reporting workflow requires the following setup:
 
-1. A consistent file structure to store code, data and analytical
-outputs.
+1. A consistently named and reproducible file structure to store code,
+library dependencies, data and analytical outputs.
 2. A data ingestion and data cleaning script that can be automatically
 refreshed.
 3. An R Markdown template report that uses yaml parameters instead of
@@ -44,7 +46,9 @@ Creating an automated reporting workflow requires the following setup:
 5. (Optional) A CI/CD pipeline using GitHub Actions, which is packaged
 inside a separate GitHub repository.
 
-# Step 1: Create a consistent project structure
+# Step 1: Create a project environment
+
+## Use consistent names
 
 There is no best way to organise your project structure. I recommend
 starting with a simple naming structure that everyone easily
@@ -58,8 +62,15 @@ outputs.
 <img src="../../figures/p-automating_rmd_reports-project_structure.png" width="60%" style="display: block; margin: auto;" />
 
 **Note:** The `data` folder contains subfolders `raw_data` and
-`clean_data` to maintain separation between the raw versus cleaned
-dataset used for further analysis.
+`clean_data` to maintain separation between the raw (read only) and
+clean dataset used for further analysis.
+
+## Environment reproducibility
+
+Besides your code, data inputs and outputs, a reproducible virtual
+environment also needs to be created to support your project workflow.
+In R, a simple way of managing project specific R package dependencies
+is to use the `renv` package.
 
 # Step 2: Create data ingestion and data cleaning R script
 
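
Reviewer illustration (not part of this commit): the new "Environment reproducibility" section names `renv` but the diff does not show how it is used. A typical `renv` workflow might look like the sketch below. It assumes the calls are run interactively from the project root and that `renv` can be installed from CRAN. The commit does not confirm which of these calls the author actually uses.

```r
# One-off install if renv is not yet available (assumption: CRAN is reachable).
# install.packages("renv")

# Run once in the project root: sets up a project-specific package library
# and writes an renv.lock lockfile plus an renv/ folder.
renv::init()

# After installing or updating packages, record the versions in renv.lock.
renv::snapshot()

# Report whether the project library and renv.lock are in sync.
renv::status()

# On another machine or in a CI job, reinstall the versions recorded in renv.lock.
renv::restore()
```

Committing `renv.lock` alongside the `code`, `data` and `output` folders lets collaborators, or an automated pipeline such as the optional GitHub Actions step, rebuild the same package versions before rendering reports.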