This repository contains R programming tips covering topics across data cleaning, data visualisation, machine learning, statistical theory and data productionisation.
Many kudos to [Dr Chuanxin Liu](https://github.com/codetrainee), my former PhD student and code editor, for teaching me how to code in R in my past life as an immunologist.
-
-
-# Content summary
-
-| Legend | Category |
-|--------|----------|
-| 📚 | Data cleaning |
-| 🎨 | Data visualisation |
-| 🔮 | Machine learning |
-| 🔨 | Productionisation |
-| 🔢 | Statistical theory |
-
-
-# Tutorials
+
 ## 🎨 Data visualisation
-
 +[An introduction to `ggplot2` using volcano plots](https://github.com/erikaduan/r_tips/blob/master/tutorials/dv-volcano_plots_with_ggplot/dv-volcano_plots_with_ggplot.md) (Updated)
 +[Using `DiagrammeR` to draw flow charts](https://github.com/erikaduan/r_tips/blob/master/tutorials/dv-using_diagrammer/dv-using_diagrammer.md) (Updated)
 
 ## 📚 Data cleaning
-
 +[Data cleaning using `data.table` or `tidyverse` (or Python `Pandas`)](https://github.com/erikaduan/r_tips/blob/master/tutorials/dc-data_table_vs_dplyr/dc-data_table_vs_dplyr.md) (Updated)
-+[Cleaning strings with regular expressions using`stringr`](https://github.com/erikaduan/r_tips/blob/master/tutorials/dc-cleaning_strings/dc-cleaning_strings.md) (Updated)
++[Cleaning strings using regular expressions with base R or`stringr`](https://github.com/erikaduan/r_tips/blob/master/tutorials/dc-cleaning_strings/dc-cleaning_strings.md) (Updated)
 
 ## 🔨 Productionisation
 +[Creating SQL <> R workflows - Part 1](https://github.com/erikaduan/r_tips/blob/master/tutorials/p-sql_to_r_workflows/p-sql_to_r_workflows_part_1.md) (Updated)
 +[Creating SQL <> R workflows - Part 2](https://github.com/erikaduan/r_tips/blob/master/tutorials/p-sql_to_r_workflows/p-sql_to_r_workflows_part_2.md) (Updated)
 +[Automating R Markdown report generation - Part 1](https://github.com/erikaduan/r_tips/blob/master/tutorials/p-automating_rmd_reports/p-automating_rmd_reports_part_1.md) (Updated)
-+[Automating R Markdown report generation - Part 2](https://github.com/erikaduan/r_tips/blob/master/tutorials/p-automating_rmd_reports/p-automating_rmd_reports_part_2.md) (updated)
-
-## 🔮 Machine learning
-+[Working with dummy variables and factors](https://github.com/erikaduan/r_tips/blob/master/tutorials/2020-04-23_dummy-variables-and-factors/2020-04-23_dummy-variables-and-factors.md)
++[Automating R Markdown report generation - Part 2](https://github.com/erikaduan/r_tips/blob/master/tutorials/p-automating_rmd_reports/p-automating_rmd_reports_part_2.md) (updated)
 
-## 🔢 Statistical theory
+## 🔢 Statistical modelling
 +[Introduction to expectation and variance](https://github.com/erikaduan/r_tips/blob/master/tutorials/st-expectations_and_variance/st-expectation_and_variance.md)
 +[Beyond expectations: centrality measures in statistics](https://github.com/erikaduan/r_tips/blob/master/tutorials/2020-07-26_many-roads-to-the-middle/2020-07-26_many-roads-to-the-middle.md)
-+[Introduction to the normal distribution](https://github.com/erikaduan/r_tips/blob/master/tutorials/st-normal_distribution/st-normal_distribution.md)
-+[Introduction to the Chi-squared and F distribution](https://github.com/erikaduan/r_tips/blob/master/tutorials/st-chi_squared_and_f_distributions/st-chi_squared_and_f_distributions.md)
-+[Introduction to binomial distributions](https://github.com/erikaduan/R_tips/blob/master/tutorials/2020-09-12_binomial_distribution/2020-09-12_binomial-distribution.md)
-+[Introduction to hypergeometric, geometric, negative binomial and multinomial distributions](https://github.com/erikaduan/R_tips/blob/master/tutorials/2020-09-22_hypergeometric-and-other-discrete-distributions/2020-09-22_hypergeometric-and-other-discrete-distributions.md)
+
+
+## 🔮 Machine learning
+
++[Working with dummy variables and factors](https://github.com/erikaduan/r_tips/blob/master/tutorials/2020-04-23_dummy-variables-and-factors/2020-04-23_dummy-variables-and-factors.md)
 
 
 # Other resources
@@ -61,9 +32,9 @@ The resources below also cover a comprehensive range of practical R tutorials.
 
 # Tutorial style guide
 
-A painful form of technical debt is inconsistent code style. This repository now contains the following file naming and code style rules.
+This repository now contains the following file naming and code style rules.
 
-+ Folders are no longer ordered with a numerical prefix and names are no longer case sensitive e.e.g `r_tips\tutorials\...` and `r_tips\figures\...`
++ Folders are no longer ordered with a numerical prefix and names are no longer case sensitive e.g `r_tips\tutorials\...` and `r_tips\figures\...`
 + Tutorial subtopics share the same prefix e.g. `r_tips\tutorials\dv-...` and `r_tips\tutorials\st-...`
 + File names contain `-` to separate file name prefixes and `_` instead of other white space e.g. `r_tips\figures\dv-using_diagrammer-simple_flowchart.svg`
 + Comments are styled according to the [tidyverse style guide](https://style.tidyverse.org/functions.html?q=comments#comments-1):
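
The file naming rules in this hunk can be checked mechanically. The sketch below is one hypothetical way to do that in base R. The helper `is_valid_name()` and its regular expression are assumptions made for illustration and are not part of the repository: it accepts lowercase letters, digits and `_` within each part, `-` between parts, and a file extension at the end.

```r
# Hypothetical helper (not from the repository): test whether a file name
# follows the naming rules listed in the style guide.
# Assumed pattern: two or more parts made of lowercase letters, digits and
# underscores, joined by "-", followed by a file extension.
is_valid_name <- function(file_name) {
  grepl("^[a-z0-9_]+(-[a-z0-9_]+)+\\.[a-z0-9]+$", file_name)
}

is_valid_name("dv-using_diagrammer-simple_flowchart.svg")  # TRUE
is_valid_name("DV using diagrammer.svg")                   # FALSE
```

Because `grepl()` is vectorised, the same helper could be applied to the output of `list.files()` to flag every non-conforming file in a folder. Names without a `-` separated prefix are rejected by this pattern, which may be stricter than the author intends.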
@@ -73,9 +44,9 @@ A painful form of technical debt is inconsistent code style. This repository now
 + Comments should not be followed by a blank line, unless the comment is a stand-alone paragraph containing in-depth rationale or an alternative solution
 + R code chunks are styled as follows:
 + Each R chunk should be named with a short unique description written in the active voice e.g. `create basic plot` and `modify plot labels`
-+ Arguments inside code chunks should not contain white space and boolean argument options should be written in capitals e.g. `{r load libraries, message=FALSE, warning = FALSE}`
++ Arguments inside code chunks should not contain white space and boolean argument options should be written in capitals e.g. `{r load libraries, message=FALSE, warning=FALSE}`
 + To render the github document, results are generally suppressed using `results='hide'` and manually entered in a new line beneath the code.
-+ To render the github document, figures are generally outputed using `fig.show='hold'` and figure outputs can then be suppressed at the local chunk level using `fig.show='hide'`
++ To render the github document, figures are generally outputed using `fig.show='markdown'` and figure outputs can then be suppressed at the local chunk level using `fig.show='hide'`
 + Set a margin of 80 characters length in RStudio through `Tools\Global options --> Code --> Display --> Show margin` and use this margin as the cut-off for code and comments length
 
 # Citations
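
The chunk option rules in this hunk could be applied once per document rather than per chunk. The sketch below assumes the `knitr` package is installed and that the call sits in the first (setup) chunk of an R Markdown file; neither assumption is confirmed by this diff. It uses `fig.show = "hold"`, the value on the removed line, because knitr's documented values for `fig.show` are `'asis'`, `'hold'`, `'animate'` and `'hide'`, so the `'markdown'` value on the added line may not behave as intended.

```r
# A minimal sketch, assuming knitr is installed: set document-wide defaults
# that match the style guide, so individual chunks only override exceptions.
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::opts_chunk$set(
    results = "hide",   # suppress printed results in every chunk
    fig.show = "hold",  # hold figures until the end of each chunk
    message = FALSE,    # boolean options written in capitals
    warning = FALSE
  )
}
```

A single chunk can still opt out locally, for example with a header such as `{r create basic plot, fig.show='hide'}`.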
@@ -88,4 +59,13 @@ Citing packages is a good practice when you are publishing research papers. To d
 1686, https://doi.org/10.21105/joss.01686
 + H. Wickham. `ggplot2`: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.
 + Matt Dowle and Arun Srinivasan (2021). `data.table`: Extension of `data.frame`. R package
-version 1.14.2. https://CRAN.R-project.org/package=data.table
+version 1.14.2. https://CRAN.R-project.org/package=data.table
+
+# Acknowledgements
+
+Many kudos to [Dr Chuanxin Liu](https://github.com/codetrainee), my former PhD student and code editor, for teaching me how to code in R in my past life as an immunologist.
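
The package references in the Citations hunk follow the format that R itself reports. The sketch below shows one way to retrieve such entries with the base `citation()` function. Whether the README uses this exact approach is not visible in the diff, and the `utils` package stands in here only because it ships with every R installation; a call such as `citation("data.table")` would need that package to be installed.

```r
# Citation for R itself; works in any R session.
r_citation <- citation()
print(r_citation, style = "text")

# Citation for a package. The package must be installed, so "utils" (bundled
# with R) is used here as a stand-in for packages such as data.table.
pkg_citation <- citation("utils")

# Convert the entry to BibTeX for use in a reference manager.
toBibtex(pkg_citation)
```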