When I was implementing RcppArmadillo, there were a couple of scenarios where I couldn’t find an easy function for a certain result. This was partly because I stuck strictly to the Armadillo library for all my needs, so I built my own functions to resolve these issues. Eventually, though, I reached a point where I couldn’t rely on Armadillo at all: I needed a value from the inverse cumulative distribution function (iCDF) of a standard normal distribution. Armadillo is a library for linear algebra and scientific computing, and it did not have any distribution functions¹. Therefore, I needed to find another solution.
Hadley Wickham’s book describes how to call R functions from cpp code. However, I worried that switching from one language to another might be computationally taxing. Therefore, I searched for other ways to compute the iCDF of a normal distribution, which brought me to this incredible website: Rcpp for everyone.
Rcpp for everyone is an excellent website that covers the basics of using cpp in R. It is comprehensive enough to answer the majority² of your questions when implementing Rcpp. I highly recommend reading it before you use Rcpp in your code.
To obtain the iCDF, I needed to use the R Math Library³. This library seems to be loaded when you include the Rcpp.h file. However, you will need to either add the line using namespace R; or prefix each call with R:: in your cpp code. Below is an example of making qnorm callable from R:
#include <Rcpp.h>
// [[Rcpp::export]]
double qnorm_cpp(double p, double mean, double sd, int lower, int log){
  double val = R::qnorm(p, mean, sd, lower, log);
  return val;
}
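Now it can be used to obtain the iCDF of a standard normal distribution. Setting lower to 1 matches R’s default of lower.tail = TRUE (at p = 0.5 both tails give the same answer):
qnorm_cpp(0.5, 0, 1, 1, 0)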
[1] 0
qnorm(0.5, 0, 1)
[1] 0
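To make concrete what an iCDF call returns, here is a minimal sketch in plain C++ that does not depend on R or Rcpp. It is not how R::qnorm works internally; it swaps in simple bisection on the normal CDF (written with std::erfc). The names pnorm_sketch and qnorm_sketch are hypothetical, and the search bracket of 40 standard deviations is an assumption chosen for illustration.

```cpp
#include <cassert>
#include <cmath>

// Lower-tail CDF of Normal(mean, sd), written with the complementary error function.
double pnorm_sketch(double q, double mean, double sd) {
    return 0.5 * std::erfc(-(q - mean) / (sd * std::sqrt(2.0)));
}

// Hypothetical helper: invert the CDF by bisection (not R's algorithm).
double qnorm_sketch(double p, double mean, double sd, bool lower) {
    assert(p > 0.0 && p < 1.0 && sd > 0.0);
    if (!lower) p = 1.0 - p;          // convert an upper-tail probability to lower-tail
    double lo = mean - 40.0 * sd;     // assumed bracket, wide enough for any double p
    double hi = mean + 40.0 * sd;
    for (int i = 0; i < 200; ++i) {   // fixed iteration count, so the loop always ends
        double mid = 0.5 * (lo + hi);
        if (pnorm_sketch(mid, mean, sd) < p) {
            lo = mid;
        } else {
            hi = mid;
        }
    }
    return 0.5 * (lo + hi);
}
```

Calling qnorm_sketch(0.5, 0.0, 1.0, true) returns a value within rounding error of 0, in line with the R output above. In real Rcpp code the R Math Library call remains the better choice, since it is both faster and more accurate in the tails.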
For my research, I didn’t need to create a function callable from R. However, I did worry about compatibility issues with the Armadillo library. Fortunately, the function returns a double value, making it easy to work with.
Once I learned about these different functions, I decided to clean up my cpp code with a couple of other functions. For example, I needed to take the factorial of a number. The Armadillo library does not provide a function for this, so I had written my own:
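double fact(int x){
  if (x == 0){
    return 1;
  } else if (x == 1){
    return 1;
  } else {
    int ide = x - 1;                        // index of the last element
    arma::vec ff = arma::linspace(1, x, x); // 1, 2, ..., x
    arma::vec ccpp = arma::cumprod(ff);     // running products
    double post = ccpp[ide];                // last running product is x!
    return post;
  }
}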
As you can see, it isn’t a pretty function. However, Rcpp provides the following function: factorial⁴. Below is the implementation of the function to obtain a double value:
double fact_cpp(double x){
  Rcpp::NumericVector y = Rcpp::NumericVector::create(x); // the sugar function takes a vector
  double val = Rcpp::factorial(y)[0];                      // extract the single result
  return val;
}
Below is code to benchmark the two functions:
bench::mark(
  fact(5),
  fact_cpp(5)
)
# A tibble: 2 × 6
expression min median `itr/sec` mem_alloc `gc/sec`
<bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
1 fact(5) 476ns 541ns 1330868. 0B 133.
2 fact_cpp(5) 543ns 608ns 1364437. 0B 0
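Notice that the two functions are essentially tied: the Rcpp version has a slightly higher median time (608ns versus 541ns) but completes slightly more iterations per second. The real gain from the Rcpp function is cleaner code, not speed.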
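For comparison, here is a minimal sketch of two ways to get a factorial in plain C++ without Armadillo or Rcpp. R documents factorial(x) as gamma(x + 1), and std::tgamma gives the same relationship in the standard library. The names fact_gamma and fact_loop are hypothetical, and neither function is claimed to be what Rcpp uses internally.

```cpp
#include <cassert>
#include <cmath>

// Factorial through the gamma function: n! = gamma(n + 1).
double fact_gamma(int n) {
    assert(n >= 0);
    return std::tgamma(static_cast<double>(n) + 1.0);
}

// Plain loop: 0! and 1! fall out of the empty product, so no special cases are needed.
double fact_loop(int n) {
    assert(n >= 0);
    double result = 1.0;
    for (int i = 2; i <= n; ++i) {
        result *= i;
    }
    return result;
}
```

Both return a double, so either would slot into the same places as fact or fact_cpp. The loop version is exact for small n, while the gamma version may differ in the last few bits.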
For more accurate information, I highly recommend looking at Chapter 23 and Chapter 31 of Rcpp for everyone. Below, I cover the functions I think are most important to know.
As in R, the R Math library and Rcpp provide four functions for each distribution. The letter at the beginning of the function name indicates what the function does.
Letter | Functionality |
---|---|
dXXX | returns the height of the probability density function |
pXXX | returns the cumulative distribution function value |
qXXX | returns the inverse cumulative distribution function (percentiles) |
rXXX | returns a randomly generated number |
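To illustrate how the four prefixes relate to one another, here is a minimal sketch for the exponential distribution in plain C++, where each function has a closed form. The _sketch names are hypothetical, and parameterising by rate is an assumption made for this example; check the parameterisation of the library function you actually call.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// d: height of the density, rate * exp(-rate * x) for x >= 0.
double dexp_sketch(double x, double rate) {
    return x < 0.0 ? 0.0 : rate * std::exp(-rate * x);
}

// p: cumulative probability P(X <= q) = 1 - exp(-rate * q).
double pexp_sketch(double q, double rate) {
    return q < 0.0 ? 0.0 : 1.0 - std::exp(-rate * q);
}

// q: inverse of p, the value whose cumulative probability is p.
double qexp_sketch(double p, double rate) {
    assert(p >= 0.0 && p < 1.0 && rate > 0.0);
    return -std::log(1.0 - p) / rate;
}

// r: one random draw, using the standard library generator passed in.
double rexp_sketch(double rate, std::mt19937& gen) {
    std::exponential_distribution<double> dist(rate);
    return dist(gen);
}
```

The p and q functions undo each other: pexp_sketch(qexp_sketch(p, rate), rate) gives back p up to rounding error.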
The Rcpp distribution functions are vectorized: they accept and return vectors. The tables below provide more details about each function.
dXXX
Argument | Data Type | Description |
---|---|---|
x | NumericVector | values of the random variable |
par | double | distribution parameters |
log | bool | if true, return the density on the log scale |

pXXX
Argument | Data Type | Description |
---|---|---|
q | NumericVector | values of the random variable |
par | double | distribution parameters |
lower | bool | if true, return the lower-tail probability, P(X ≤ q) |
log | bool | if true, return the probability on the log scale |

qXXX
Argument | Data Type | Description |
---|---|---|
p | NumericVector | probabilities |
par | double | distribution parameters |
lower | bool | if true, treat p as a lower-tail probability |
log | bool | if true, treat p as a log probability |

rXXX
Argument | Data Type | Description |
---|---|---|
n | int | number of random values to generate |
par | double | distribution parameters |
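The difference between a vectorized and a scalar interface can be sketched in plain C++. In this sketch std::vector<double> stands in for NumericVector, which is an assumption made so the example compiles without Rcpp, and dnorm_scalar and dnorm_vector are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Scalar interface: one value in, one density out.
double dnorm_scalar(double x, double mean, double sd) {
    assert(sd > 0.0);
    const double pi = std::acos(-1.0);
    double z = (x - mean) / sd;
    return std::exp(-0.5 * z * z) / (sd * std::sqrt(2.0 * pi));
}

// Vectorized interface: apply the scalar function to every element.
std::vector<double> dnorm_vector(const std::vector<double>& x, double mean, double sd) {
    std::vector<double> out(x.size());
    std::transform(x.begin(), x.end(), out.begin(),
                   [=](double xi) { return dnorm_scalar(xi, mean, sd); });
    return out;
}
```

This is why fact_cpp above wraps its argument in a one-element vector and reads element 0 of the result: the vectorized function expects a vector even when you only have one number.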
The R Math library distribution functions return scalar values. The tables below describe the functions in more detail.
dXXX
Argument | Data Type | Description |
---|---|---|
x | double | value of the random variable |
par | double | distribution parameters |
log | int | if nonzero, return the density on the log scale |

pXXX
Argument | Data Type | Description |
---|---|---|
q | double | value of the random variable |
par | double | distribution parameters |
lower | int | if nonzero, return the lower-tail probability, P(X ≤ q) |
log | int | if nonzero, return the probability on the log scale |

qXXX
Argument | Data Type | Description |
---|---|---|
p | double | probability |
par | double | distribution parameters |
lower | int | if nonzero, treat p as a lower-tail probability |
log | int | if nonzero, treat p as a log probability |

rXXX
Argument | Data Type | Description |
---|---|---|
par | double | distribution parameters |
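The lower and log flags are easy to mix up, so here is a minimal sketch of what they mean for a normal CDF, written in plain C++ with std::erfc. The name pnorm_flags is hypothetical and this is not the R Math Library implementation; it only mirrors the argument order described in the pXXX table.

```cpp
#include <cassert>
#include <cmath>

// Scalar normal CDF with int flags, in the same argument order as the pXXX table.
double pnorm_flags(double q, double mean, double sd, int lower, int log_p) {
    assert(sd > 0.0);
    double z = (q - mean) / (sd * std::sqrt(2.0));
    // lower != 0: P(X <= q); lower == 0: P(X > q)
    double p = lower ? 0.5 * std::erfc(-z) : 0.5 * std::erfc(z);
    // log_p != 0: return log(p) instead of p
    return log_p ? std::log(p) : p;
}
```

For any q, the lower-tail and upper-tail results add up to 1, and turning on the log flag returns the natural log of the same probability.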
The table below describes a selection of the distribution functions available in R and Rcpp.
code | Distributions |
---|---|
unif | Uniform Distribution |
norm | Normal Distribution |
chisq | \(\chi^2\) Distribution |
t | t Distribution |
f | F Distribution |
exp | Exponential Distribution |
binom | Binomial Distribution |
pois | Poisson Distribution |
I don’t know