## Today's date and time
analysis_data <- format(Sys.time(), "%Y-%m-%d-%H-%M")
## R Packages
library(tidyverse)
library(magrittr)
## Functions
source("fxs.R")
Rcpp::sourceCpp("fxs.cpp")
## Data
df1 <- read_csv("file.csv")
# load() returns the name of the loaded object; get() retrieves the object itself
df2 <- load("file.RData") %>% get
4 Scripting and Piping in R
4.1 Commenting
A comment is used to describe your code within an R script. To comment your code in R, use the # key; R will not execute anything that comes after the symbol on that line. The # can be placed anywhere in the line, from the beginning to midway through, and only the code before it will be executed.
Additionally, commenting is a great way to debug long scripts or functions. You can comment out certain lines to see which ones are producing errors. This lets you test code line by line without having to delete anything.
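As a brief illustration of the points above, the sketch below shows a full-line comment, a trailing comment, and a line commented out while debugging. The vector x and its values are made up for the example.

```r
# A full-line comment: R ignores everything after the #
x <- c(2, 4, 6)   # a trailing comment after working code

x_mean <- mean(x)
# x_mean <- mean(x, trim = 0.5)   # commented out while debugging; not executed

print(x_mean)
```

Removing the # at the start of the commented-out line brings it back into the script without retyping it.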
4.2 Scripting
When writing a script, it is important to follow a basic structure so that you can follow your own code. While this structure can be anything, the sections below have my main recommendations for writing a script. The most important part is the Beginning of the Script section.
4.2.1 Beginning of the Script
Load any R packages, functions/scripts, and data that you will need for the analysis. I always like to get the date and time as well.
4.2.2 Middle of the Script
Run the analysis, including any pre- and post-analysis steps.
## Pre Analysis
df1_prep <- Prep_data(df1)
df2_prep <- Prep_data(df2)
## Analysis
df1_analysis <- analyze(df1_prep)
df2_analysis <- analyze(df2_prep)
## Post Analysis
df1_post <- Prep_post(df1_analysis)
df2_post <- Prep_post(df2_analysis)
4.2.3 End of the Script
Save your results in an R Data file:
## Save Results
res <- list(df1 = list(pre = df1_prep,
                       analysis = df1_analysis,
                       post = df1_post),
            df2 = list(pre = df2_prep,
                       analysis = df2_analysis,
                       post = df2_post))
file_name <- paste0("results_", analysis_data, ".RData")
save(res, file = file_name)
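To confirm that the results were written correctly, the file can be read back with load(). The sketch below is self-contained and rests on two assumptions that are not part of the script above: it writes to a temporary directory, and it uses a small placeholder list in place of the real results.

```r
# Timestamp used to label the output file (same format as in the script)
analysis_data <- format(Sys.time(), "%Y-%m-%d-%H-%M")

# Placeholder results (assumption); a real script would store the analysis output
res <- list(df1 = list(pre = 1:3, analysis = mean(1:3)))

# Write to a temporary directory so the example does not touch the project folder
file_name <- file.path(tempdir(), paste0("results_", analysis_data, ".RData"))
save(res, file = file_name)

# load() restores objects under their saved names and returns those names
rm(res)
loaded <- load(file_name)
loaded
res$df1$analysis
```

Because load() restores objects under the names they were saved with, keeping everything in a single list such as res makes it easier to see what a results file contains.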
4.3 Pipes
In R, pipes are used to transfer the output from one function to the input of another function. Piping then allows you to chain functions to run an analysis. Since R 4.1.0, there are two families of pipes: the base R pipe and the pipes from the magrittr package. The table below provides a brief description of each type of pipe:
Pipe | Name | Package | Description |
---|---|---|---|
\|> | R Pipe | Base | Uses the output of the previous function as the first argument of the following function. |
%>% | Forward Pipe | magrittr | Uses the output of the previous function as the input of the following function. |
%$% | Exposition Pipe | magrittr | Exposes the named elements of an R object (or output) to the following function. |
%T>% | Tee Pipe | magrittr | Evaluates the next function using the output of the previous function, but discards the result of the next function and passes along the output of the previous function instead. |
%<>% | Assignment Pipe | magrittr | Pipes the object into the next function and overwrites the object with the result. |
When choosing between Base or magrittr’s pipes, I recommend using magrittr’s pipes due to the extended functionality. However, when writing production code or developing an R package, I recommend using the Base R pipe.
Lastly, when using the pipe, I recommend stringing together only a limited number of functions (~10) to keep the code readable and concise. Longer chains can become hard to follow.
If you plan to use magrittr’s pipes, I recommend loading the magrittr package itself instead of relying on the tidyverse package, which only makes %>% available.
library(magrittr)
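The assignment pipe listed in the table is not covered in its own section, so here is a minimal sketch of how it behaves. The vector x is a made-up example.

```r
library(magrittr)

x <- c(16, 25, 36)

# Pipe x into sqrt() and overwrite x with the result
x %<>% sqrt()

# The line above is shorthand for: x <- x %>% sqrt()
x
```

Since %<>% silently replaces the original object, it is worth using sparingly in scripts where the unmodified data are needed later.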
4.3.1 |>
The base pipe takes the output from the first function and uses it as the first argument of the second function. Below, we obtain the mpg variable from mtcars and pipe it into the mean() function.
mtcars$mpg |> mean()
[1] 20.09062
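To show how piping chains several functions, the sketch below compares nested calls with the same steps written using the base pipe. The extra sqrt() and round() steps are added purely for illustration.

```r
# Nested calls are read from the inside out
nested <- round(sqrt(mean(mtcars$mpg)), 2)

# The same steps with the base pipe are read left to right (requires R >= 4.1.0)
piped <- mtcars$mpg |>
  mean() |>
  sqrt() |>
  round(2)

# Both versions perform the same computation
identical(nested, piped)
```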
4.3.2 %>%
4.3.2.1 Uses
Magrittr’s pipe is the equivalent of Base R’s pipe, with some extra functionality. Below we repeat the same code as before:
mtcars$mpg %>% mean()
[1] 20.09062
Alternatively, we do not have to type the parentheses after the second function:
mtcars$mpg %>% mean
[1] 20.09062
Below is another example where we pipe the value 3 into rep() with times = 5, which repeats the value 3 five times:
3 %>% rep(5)
[1] 3 3 3 3 3
If we want to pipe the output into an argument other than the first, we can use the placeholder (.) in the second function to indicate which argument should take the previous output. Below, we repeat the vector c(1, 2) three times because the . is in the second argument:
3 %>% rep(c(1,2), .)
[1] 1 2 1 2 1 2
4.3.2.2 Creating Unary Functions
You can use %>% and . to create a unary function, that is, a function with one argument. The following code creates a new function called logsqrt(), which evaluates \(\sqrt{\log_{10}(x)}\):
logsqrt <- . %>% log(base = 10) %>% sqrt
logsqrt(10000)
[1] 2
sqrt(log10(10000))
[1] 2
4.3.3 %$%
The exposition pipe will expose the named elements of an object or output to the following function. For example, we will pipe mtcars into the lm() function. However, we will use the %$% pipe to access the variables in the data frame for the formula= argument without having to specify the data= argument:
mtcars %$% lm(mpg ~ hp)
Call:
lm(formula = mpg ~ hp)
Coefficients:
(Intercept) hp
30.09886 -0.06823
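The exposition pipe is most useful with functions that have no data= argument at all. As a small sketch, the code below computes a correlation with %$% and compares it to the base R equivalent using with().

```r
library(magrittr)

# cor() has no data argument, so the columns must be exposed first
r_pipe <- mtcars %$% cor(mpg, hp)

# Base R equivalent using with()
r_with <- with(mtcars, cor(mpg, hp))

# Both approaches give the same correlation
all.equal(r_pipe, r_with)
```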
4.3.4 %T>%
The Tee pipe will pipe the output of the previous function into the following function, but it passes along the previous function’s output instead of the following function’s output. For example, we use the Tee pipe to send the results of the lm() function to print out the summary table, then use the same lm() results to print out the residual standard error:
x_lm <- mtcars %$% lm(mpg ~ hp) %T>%
  (\(x) print(summary(x))) %T>%
  (\(x) print(sigma(x)))
Call:
lm(formula = mpg ~ hp)
Residuals:
Min 1Q Median 3Q Max
-5.7121 -2.1122 -0.8854 1.5819 8.2360
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 30.09886 1.63392 18.421 < 2e-16 ***
hp -0.06823 0.01012 -6.742 1.79e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 3.863 on 30 degrees of freedom
Multiple R-squared: 0.6024, Adjusted R-squared: 0.5892
F-statistic: 45.46 on 1 and 30 DF, p-value: 1.788e-07
[1] 3.862962
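A smaller sketch can make clearer what the Tee pipe passes along. Here str() is called only for its printed side effect; it returns NULL, so a regular %>% in its place would hand NULL to sqrt() and fail. The input vector is made up for the example.

```r
library(magrittr)

# str() prints a description and returns NULL; %T>% discards that NULL
# and passes the original vector on to sqrt()
roots <- c(4, 9, 16) %T>%
  str() %>%
  sqrt()

roots
```

This is the same pattern as in the lm() example above: the side-effect function runs in the middle of the chain, but the object that continues down the chain is unchanged.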
4.4 Keyboard Shortcuts
Below is a list of recommended keyboard shortcuts:
Shortcut | Windows/Linux | Mac |
---|---|---|
%>% | Ctrl+Shift+M | Cmd+Shift+M |
Run Current Line | Ctrl+Enter | Cmd+Return |
Run Current Chunk | Ctrl+Shift+Enter | Cmd+Shift+Enter |
Knit Document | Ctrl+Shift+K | Cmd+Shift+K |
Add Cursor Below | Ctrl+Alt+Down | Cmd+Alt+Down |
Comment Line | Ctrl+Shift+C | Cmd+Shift+C |
I also recommend setting up these keyboard shortcuts in RStudio:
Shortcut | Windows/Linux | Mac |
---|---|---|
%in% | Ctrl+Shift+I | Cmd+Shift+I |
%$% | Ctrl+Shift+D | Cmd+Shift+D |
%T>% | Ctrl+Shift+T | Cmd+Shift+T |
Note that you will need to install the extraInserts package:
remotes::install_github('konradzdeb/extraInserts')