Unofficial Rcpp API Documentation

programming
cpp
Author

TheCoatlessProfessor

Published

October 21, 2016

Warning: This post is a work in progress. It will periodically be updated as time permits.

Introduction

The following unofficial API documentation for Rcpp is based on personal notes and teaching materials that I have prepared over years of working with Rcpp. I’ve attempted to reformat the notes in the style of Armadillo’s API documentation, which I think is some of the best documentation out there for a C++ matrix library. At some point, when the documentation becomes a bit more stable or if there is larger contributor interest, I will likely attempt to merge this into the Rcpp project so that a docs subdomain could hopefully be added to http://rcpp.org.

Please note: The post is written using RMarkdown for maximum flexibility.

API Documentation for Rcpp 0.12.11

Preamble

  • The goal of the API documentation is to provide a concise, public-facing view of Rcpp features. Even so, the documentation is somewhat long. To help with navigation, it has been split into different sections. Furthermore, one can use the built-in search functionality to find keywords using either CTRL + F on Windows and Linux or CMD + F on macOS.

  • Presently, contributions to this document may be made by pull request (PR) on GitHub.

  • Please report any bugs to the Rcpp Core Team.

Overview


Vector, Matrix, List, and DataFrame Classes

Vector<RTYPE>, NumericVector, IntegerVector, … Vector class
Matrix<RTYPE>, NumericMatrix, IntegerMatrix, … Matrix class
List List typedef
DataFrame Data frame class
RObject RObject class

Member functions

Operators Mathematical (add, subtract, etc) and logical (inequalities)
Dimensional Information Size attribute information
Element Access Retrieving element values with and without bounds check
Subset Views Subset data structures
Iterators Random access iterators
STL-style Container Functions Standard Library styled functions
Static Member Functions Set of member functions persistent across instances

Exception Handling

External Exception Classes

stop Stop execution
warning Send Warning to console
Rcout<< Write message to console

Internal Exception Classes

Simple Exceptions

not_a_matrix Object is not a matrix
parse_error Unable to parse values
not_s4 Not a valid S4 class
not_reference Object not a reference
not_initialized Object not initialized
no_such_slot S4 Object lacks slot
no_such_field Exception not used
not_a_closure Object not a closure
no_such_function No such function
unevaluated_promise Promise not yet evaluated

Exceptions

S4_creation_error Error creating object of S4 class
reference_creation_error Exception not used
no_such_binding No such binding
binding_not_found Binding not found
binding_is_locked Binding is locked
no_such_namespace No such namespace
function_not_exported Function not exported
eval_error Evaluation error

Advanced exceptions

not_compatible Not a compatible transformation
index_out_of_bounds Requested index is out of bounds

Sugar

Logical Operations

ifelse Vectorized If-else
is_false Is Value False?
is_true Is Value True?
any At Least One Value is True
all All Values Must be True

Complex Operators

Re Real Values of Complex Number
Im Imaginary Values of Complex Number
Mod Modulus (r)
Arg Arg (theta)
Conj Complex Conjugate

Data Operations

head View the First n Values
tail View the Last n Values
abs Absolute Value
sqrt Square Root
pow Raise to the nth Power
sum Summation
sign Extract the Sign of Values
diff Lagged Difference
cumsum Cumulative Sum
cumprod Cumulative Product
cummin Cumulative Minimum
cummax Cumulative Maximum
sin Sine
cos Cosine
tan Tangent
asin Arc Sine
acos Arc Cosine
atan Arc Tangent
sinh Hyperbolic Sine
cosh Hyperbolic Cosine
tanh Hyperbolic Tangent
log Natural Logarithms
exp Exponential
log10 Base 10 Logarithm
log1p Natural Logarithm \(\log(1+x)\)
expm1 \(\exp(x) - 1\)
sample Randomly sample values

Rounding of Numbers

ceiling,ceil Smallest integer greater than or equal to x
trunc Truncates the values in x toward 0
floor Largest integer less than or equal to x
round Round values to specified decimal place
signif Rounds values to number of significant digits

Finite, Infinite and NaN Detection

pre-defined Pre-defined NA/NaN/Inf Constants
is_na Detects if values are missing
is_nan Detects if values are not a number (NaN)
is_finite Detects if value is finite
is_infinite Detects if value is infinite
na_omit Remove NA and NaN values
noNA Assert that the object is NA free

The Apply Family

sapply Apply a function to one input and store results in vector
lapply Apply a function to one input and store results in list
mapply Apply a function to multiple inputs and store results in vector

Special Functions of Mathematics

factorial Factorial
lfactorial Factorial Logarithm
choose Combination
lchoose Combination Logarithm
beta Beta Function
lbeta Natural Log Beta Function
gamma Gamma Function
lgamma Natural Log Gamma Function
psigamma General Gamma Derivative
digamma First Derivative of Log Gamma
trigamma Second Derivative of Log Gamma
tetragamma Third Derivative of Log Gamma
pentagamma Fourth Derivative of Log Gamma

Statistical Summaries

min Minimum Value
max Maximum Value
range Range
mean Mean Value
median Median Value
var Variance
sd Standard Deviation

Special Operators

rev Reverse ordering of a vector
pmax Parallel maximum value
pmin Parallel minimum value
clamp Values between a minimum and maximum
which_max Index of the maximum value
which_min Index of the minimum value

Uniqueness Operators

match Find indices of the first value in a separate vector
self_match Find indices of the first occurrence of each value in a vector
in Determine if a match was located for each element of A in B
unique Obtain the unique values
duplicated Obtain a logical vector indicating the duplicate values
sort_unique Obtain the unique values and sort them
table Create a frequency table of occurrences

Set Operations

setequal Equality of Set Values
intersect Intersection of Set Values
union_ Union of Set Values
setdiff Asymmetric Difference of Set Values

Matrix Operations

colSums Column Sums of a Matrix
rowSums Row Sums of a Matrix
colMeans Column Means of a Matrix
rowMeans Row Means of a Matrix
outer Outer Product of Arrays on a Function
lower_tri Extract the Lower Triangle Part of a Matrix
upper_tri Extract the Upper Triangle Part of a Matrix
diag Extract the Diagonal Portion of a Matrix
row Create a matrix of Row Indexes
col Create a matrix of Column Indexes

Object Creation

cbind Create matrix by combining column vectors
seq_along Generate an R index sequence given a vector
seq_len Generate an R index sequence given an integer
rep Replicate vector \(N\) times
rep_each Replicate each element in line \(N\) times
rep_len Replicate values until vector is of length \(N\)

String Operations

collapse Collapse multiple strings into one string
trimws Trim leading and/or trailing whitespace from strings

Statistical Distributions

Discrete Distributions

p/d/q/rbinom Binomial
p/d/q/rgeom Geometric
p/d/q/rhyper Hypergeometric
p/d/q/rnbinom Negative Binomial
p/d/q/rpois Poisson
p/d/q/rwilcox Wilcoxon
p/d/q/rsignrank Wilcoxon Signed Rank

Continuous Distributions

p/d/q/rbeta Beta
p/d/q/rcauchy Cauchy
p/d/q/rchisq Chi-square
p/d/qnchisq Non-central Chi-square
p/d/q/rexp Exponential
p/d/q/rf F
p/d/qnf Non-central F
p/d/q/rgamma Gamma
p/d/q/rnorm Normal
p/d/q/rlnorm Log Normal
p/d/q/rlogis Logistic
p/d/q/rt Student’s T
p/d/q/runif Uniform
p/d/q/rweibull Weibull

Vector, Matrix, List, and DataFrame Classes

Vector

  • The templated Vector class is a one-dimensional array-like structure providing storage for homogeneous data types, with an interface similar to that of std::vector. Being an implementation of policy-based design, much of the behavior of Vector is determined by the policy classes it inherits from

    • RObjectMethods
    • StoragePolicy
    • SlotProxyPolicy
    • AttributeProxyPolicy
    • NamesProxyPolicy

    as well as the CRTP base class VectorBase. This type is instantiated as Vector<RTYPE>, where RTYPE is one of the following valid SEXPTYPEs:

    • REALSXP
    • INTSXP
    • CPLXSXP
    • LGLSXP
    • STRSXP
    • VECSXP
    • RAWSXP
    • EXPRSXP
  • For convenience, the following typedefs have been defined in the Rcpp namespace:

    • NumericVector = Vector<REALSXP>
    • DoubleVector = Vector<REALSXP>
    • RawVector = Vector<RAWSXP>
    • IntegerVector = Vector<INTSXP>
    • ComplexVector = Vector<CPLXSXP>
    • LogicalVector = Vector<LGLSXP>
    • CharacterVector = Vector<STRSXP>
    • StringVector = Vector<STRSXP>
    • GenericVector = Vector<VECSXP>
    • List = Vector<VECSXP>
    • ExpressionVector = Vector<EXPRSXP>
  • Within this documentation, the default type used will be NumericVector unless another data type is required to show a specific feature.

  • Constructors:

    • Vector()
    • Vector(SEXP x)
    • Vector(const int &size, const stored_type &u)
    • Vector(const std::string &st)
    • Vector(const char *st)
    • Vector(const Vector &other)
    • Vector(const int &size)
    • Vector(const Dimension &dims)
    • Vector(const Dimension &dims, const U &u)
  • By default, the vectors constructed from dimensions will always be initialized with all entries being zero (0) or an empty string ("").

  • For the majority of cases, the interface being used is the R to C++ interface, which relies upon the Vector(SEXP x) constructor to establish a pointer to the underlying data. That is, the Vector object points to the memory location of the SEXP R object in order to avoid copying the data into C++. The only exception to this rule is when the data passed is of a different type, in which case a deep copy is performed before a pointer is established. For example, if numeric() data is passed to a NumericVector, the handoff occurs without a copy. However, if integer() data were passed to a NumericVector, a copy converted to type numeric() would be made, and the pointer to that copy would then be assigned to the NumericVector.

  • Examples:

SEXP A;
NumericVector B(A); // from a SEXP

// create a vector of length 5 filled with 0
NumericVector C(5);
// Output: 0 0 0 0 0

// construct a filled vector of size 3 with 2.0
NumericVector D = NumericVector(3, 2.0); 
// Output: 2 2 2

// allocate a numeric vector of size 5 without initializing its values
NumericVector D2 = no_init(5);

// fill vector with 3.0
D2.fill(3.0);
// Output: 3 3 3 3 3

// cloning (deep copy)
NumericVector E = clone(D);
// Output: 2 2 2
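
The difference between sharing memory and cloning can be mimicked without Rcpp. The following is a minimal standalone sketch in standard C++, not Rcpp code: Handle and deep_copy are hypothetical names, with Handle standing in for an Rcpp vector that shares its buffer and deep_copy playing the role that clone() plays above.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical handle type: copying the handle shares one buffer,
// in the way an Rcpp vector built from a SEXP shares R's memory.
struct Handle {
    std::shared_ptr<std::vector<double>> data;

    explicit Handle(std::vector<double> v)
        : data(std::make_shared<std::vector<double>>(std::move(v))) {}

    // Element access into the shared buffer (no bounds check).
    double& operator[](std::size_t i) { return (*data)[i]; }
};

// Deep copy: allocate a new buffer holding the same values.
inline Handle deep_copy(const Handle& h) { return Handle(*h.data); }
```

Copying a Handle (Handle b = a;) leaves both names pointing at the same values, so a write through b is visible through a. A handle produced by deep_copy(a) is unaffected by later writes to a or b.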

Matrix

  • The main class for matrices is the templated Matrix class, which derives from both the Vector and MatrixBase types. Like the Vector class, Matrix uses the policy-based design pattern to manage the lifetime of its underlying SEXP via the template parameter StoragePolicy, which uses the PreserveStorage policy class by default. Matrices are instantiated as Matrix<RTYPE>, where the value RTYPE is one of the following SEXPTYPEs:

    • REALSXP
    • INTSXP
    • CPLXSXP
    • LGLSXP
    • STRSXP
    • VECSXP
    • RAWSXP
    • EXPRSXP
  • For convenience, the following typedefs have been defined in the Rcpp namespace:

    • NumericMatrix = Matrix<REALSXP>
    • RawMatrix = Matrix<RAWSXP>
    • IntegerMatrix = Matrix<INTSXP>
    • ComplexMatrix = Matrix<CPLXSXP>
    • LogicalMatrix = Matrix<LGLSXP>
    • CharacterMatrix = Matrix<STRSXP>
    • StringMatrix = Matrix<STRSXP>
    • GenericMatrix = Matrix<VECSXP>
    • ListMatrix = Matrix<VECSXP>
    • ExpressionMatrix = Matrix<EXPRSXP>
  • Within this documentation, the default type used will be NumericMatrix unless another data type is required to show a specific feature.

  • Constructors:

    • Matrix()
    • Matrix(SEXP x)
    • Matrix(const int& nrows_, const int& ncols)
    • Matrix(const int& nrows_, const int& ncols, Iterator start)
    • Matrix(const int& n)
    • Matrix(const Matrix& other)
  • By default, the matrices constructed from dimensions will always be initialized with all entries being zero (0) or empty strings ("").

  • For the majority of cases, the interface being used is the R to C++ interface, which relies upon the Matrix(SEXP x) constructor to establish a pointer to the underlying data. That is, the Matrix object points to the memory location of the SEXP R object in order to avoid copying the data into C++. The only exception to this rule is when the data passed is of a different type, in which case a deep copy is performed before a pointer is established. For example, if numeric() data is passed to a NumericMatrix, the handoff occurs without a copy. However, if integer() data were passed to a NumericMatrix, a copy converted to type numeric() would be made, and the pointer to that copy would then be assigned to the NumericMatrix.

  • Examples:

SEXP A;
NumericMatrix B(A); // from a SEXP

// create a square matrix (all elements set to 0.0)
NumericMatrix C(2);
// Output:
// 0 0
// 0 0

// of a given size (all elements set to 0.0)
NumericMatrix D(2, 3);
// Output:
// 0 0 0
// 0 0 0

// of a given size with dimensions (all elements set to 0.0)
NumericMatrix D2(Dimension(3, 2));
// Output:
// 0 0
// 0 0
// 0 0

// initialize empty numeric matrix
NumericMatrix D3 = no_init(2, 1);

// fill matrix with 3.0
D3.fill(3.0);
// Output:
// 3.0
// 3.0

// fill matrix using a vector
NumericVector E = NumericVector(15, 2.0); 
NumericMatrix F = NumericMatrix(3, 5, E.begin());
// Output:
// 2 2 2 2 2
// 2 2 2 2 2
// 2 2 2 2 2

// cloning (explicit deep copy)
NumericMatrix G = clone(F);

List

  • The List data structure is a typedef of the templated Vector class with an RTYPE of VECSXP, which provides heterogeneous storage. As a result, the List class acts as a generic storage object that can simultaneously hold multiple different RTYPE structures.

  • Constructors:

    • List()
    • List(SEXP x)
    • List(const int &size, const stored_type &u)
    • List(const std::string &st)
    • List(const char *st)
    • List(const Vector &other)
    • List(const int &size)
    • List(const Dimension &dims)
    • List(const Dimension &dims, const U &u)
  • Unlike the Vector class, the List constructed from dimensions will have a NULL value set for each element unless a value is otherwise assigned.

  • Examples:

SEXP A;
List B(A); // from a SEXP

// create an empty List of size 2 
List C(2);
// Output: 
// [[1]]
// NULL
// [[2]]
// NULL

// construct a List of size 3 with each element containing 2.0
List D = List(3, 2.0); 
// Output: 
// [[1]]
// [1] 2
// [[2]]
// [1] 2
// [[3]]
// [1] 2

// initialize empty list of size 3
List D2 = no_init(3);

// fill each list element with 3.0
D2.fill(3.0);
// Output: 
// [[1]]
// [1] 3
// [[2]]
// [1] 3
// [[3]]
// [1] 3

// cloning (deep copy)
List E = clone(D);
// Output: 
// [[1]]
// [1] 2
// [[2]]
// [1] 2
// [[3]]
// [1] 2


// Create the elements for a named list
NumericVector F = NumericVector::create(1.2, 3.5);
CharacterVector G = CharacterVector::create("a", "b", "c");
LogicalVector H = LogicalVector::create(true, false, false, true);

// Create named list (a new variable name avoids redefining F)
List I = List::create(Named("v1") = F,
                      Named("v2") = G,
                          _["v3"] = H); // Shorthand for Named("v3")
// Output:
// $v1
// [1] 1.2 3.5
// $v2
// [1] "a" "b" "c"
// $v3
// [1]  TRUE FALSE FALSE  TRUE

DataFrame

  • The DataFrame data structure is a typedef of the DataFrame_Impl class, which is a special extension of the templated Vector class that allows for a collection of heterogeneous Vectors of the same length. As the implementation follows policy-based design, much of the behavior of DataFrame is determined by the policy classes it inherits from

    • RObjectMethods
    • StoragePolicy
    • SlotProxyPolicy
    • AttributeProxyPolicy
    • NamesProxyPolicy

    as well as the CRTP base class VectorBase.

  • Constructors:

    • DataFrame()
    • DataFrame(SEXP x)
    • DataFrame(const DataFrame &other)
    • DataFrame(const T &obj)
  • Caveat: All DataFrame columns must be named. Failure to name the columns will result in the run time error of:

    not compatible with STRSXP

    since Rcpp uses an internal call to R to create the DataFrame.

  • Examples:

SEXP A;
DataFrame B(A); // from a SEXP

// Create Data
NumericVector C = NumericVector::create(5.8, 9.1, 3.2); 
CharacterVector D = CharacterVector::create("a", "b", "c");
LogicalVector E = LogicalVector::create(true, false, false);

// Create a new dataframe
DataFrame G = DataFrame::create(Named("C") = C,
                                    _["D"] = D, // shorthand for Named("D")
                                Named("E") = E);

RObject

  • The RObject data structure is a typedef of the RObject_Impl class. Principally, the class can be viewed as the glue of Rcpp due to policy-based design principles. In turn, the RObject class really acts as a “shell” that stores properties of the following policies:

    • PreserveStorage: Member functions that provide the SEXP R object alongside ways to modify and update the object.
    • SlotProxyPolicy: Member functions related to only manipulating S4 objects.
    • AttributeProxyPolicy: Member functions that modify attribute information of the R object.
    • RObjectMethods: Member functions that provide descriptors of the R object such as type, object oriented programming (S3/S4) system, and NULL status.
  • Constructors:

    • RObject()
    • RObject(const RObject &other)
    • RObject(const GenericProxy<Proxy> &proxy)
  • Examples:

// Extract attribute information via AttributeProxyPolicy
RObject A;
RObject B = A.attr("dim");

Member Functions

Operators

  • Operators allow for operations to take place between two different Vector or Matrix objects. The operations are defined to work in an element-wise fashion where applicable.

  • Supported mathematical operations:

Operation  Definition                Vector-Vector  Vector-Scalar  Vector-Matrix  Matrix-Matrix  Matrix-Scalar
+          Addition                  Yes            Yes            No             No             Yes
-          Subtraction               Yes            Yes            No             No             Yes
/          Division                  Yes            Yes            No             No             Yes
*          Multiplication            Yes            Yes            No             No             Yes

  • Logical operations:

Operation  Definition                Vector-Vector  Vector-Scalar  Vector-Matrix  Matrix-Matrix  Matrix-Scalar
==         Equality                  Yes            Yes            No             No             Yes
!=         Non-equality              Yes            Yes            No             No             Yes
>=         Greater than or equal to  Yes            Yes            No             No             Yes
<=         Less than or equal to     Yes            Yes            No             No             Yes
<          Less than                 Yes            Yes            No             No             Yes
>          Greater than              Yes            Yes            No             No             Yes
!          Negate                    Yes            Yes            No             No             Yes

  • Examples:

// Sample data
NumericVector A = NumericVector::create(1, 2, 3, 4);
NumericVector B = NumericVector::create(2, 3, 4, 5);

// --- Addition

// Add a vector and scalar
NumericVector H = A + 2.0;
// Output: 3 4 5 6

// Add a vector and another vector
NumericVector I = A + B;
// Output: 3 5 7 9

// --- Subtraction

// Subtract a vector from a scalar
NumericVector J = 2.0 - A;
// Output: 1 0 -1 -2

// Subtract vectors
NumericVector K = A - B;
// Output: -1 -1 -1 -1

// --- Multiplication

// Multiply by scalar
NumericVector L = 3 * A;
// Output: 3 6 9 12

// Multiply vectors
NumericVector M = A * B;
// Output: 2 6 12 20

// --- Division

// Divide scalar by vector
NumericVector N = 1 / A;
// Output: 1.0000000 0.5000000 0.3333333 0.2500000

// Divide vectors
NumericVector O = A / B;
// Output: 0.5000000 0.6666667 0.7500000 0.8000000

// --- All together
NumericVector res = 3.0 * A - 1.0 / B + A + B + 5.0;
// Output: 10.50000 15.66667 20.75000 25.80000

Dimensional Information

.nrow(), .rows()     number of rows in a Matrix, DataFrame
.ncol(), .cols()     number of columns in a Matrix, DataFrame
.size(), .length()   number of items in a Matrix, Vector
  • Return type is that of an int, unsigned int, or R_xlen_t

  • Note: As of Rcpp 0.13.0, new size attribute accessors were added to the DataFrame class that mimic those available in Matrix. Previously, to obtain the number of columns, one would have to use the .size() or .length() member function. In addition, the number of observations previously had to be obtained by .nrows().

  • Examples:

// --- Vector Example
NumericVector X(3);

int nelem = X.size();   // Output: 3
int nlens = X.length(); // Output: 3

Rcout << "Vector X has " << nelem << " elements." << std::endl;

// --- Matrix Example
NumericMatrix Y(4, 5);

int rows  = Y.nrow();   // Output: 4
int cols  = Y.ncol();   // Output: 5
int elems = Y.size();   // Output: 20

Rcout << "Matrix Y has " << rows << " rows and " << cols << " columns." << std::endl;

Element Access

  • Described within this section is the ability to access elements using the position, categorical, and logical indexing systems.

  • The access system provides two retrieval methods for all classes that differ in computational time to obtain values from objects.

    • The preferred method to access elements is (), which takes slightly longer because a bounds check verifies that the requested element is within the access scope. If the requested element is out of bounds, an exception is raised and the program stops.
    • The other method uses [], which does not perform a bounds check and assumes that the access scope is valid. If an element is out of bounds, the behavior is undefined and may cause havoc in later parts of a procedure. Only use this form of accessor if the procedure has been thoroughly tested and debugged.
  • Note: Using accessors without a bounds check is not recommended unless the code has been thoroughly tested, as undefined behavior (UB) may emerge. UB is problematic because the program may crash, silently corrupt memory, or appear to work while returning wrong results.
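
The same trade-off exists in the C++ standard library, where std::vector::at() checks the index and operator[] does not. The sketch below is a standalone analogy in standard C++ rather than Rcpp code, and checked_get is a hypothetical helper name introduced only for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Checked read: stores v[i] in out and returns true when i is in range,
// and returns false (leaving out untouched) when it is not.
inline bool checked_get(const std::vector<double>& v, std::size_t i, double& out) {
    try {
        out = v.at(i);  // at() throws std::out_of_range when i >= v.size()
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

An unchecked read such as v[i] with i >= v.size() would compile just as well, but its behavior is undefined, which is why the checked form is the safer default while code is still being developed.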

Position Access

  • Access a single element or object using a positional index.

    • (i) provides the ith element or object in addition to performing a bounds check that ensures the requested index is a valid location.
    • [i] similar to the previous case, but does so without a bounds check.
    • at(i,j) provides the i,jth element of a Matrix with a bounds check.
    • (i,j) provides the i,jth element of a Matrix without a bounds check.
  • Caveat: Using either [i] or (i) on List and DataFrame, provides the object (e.g. Vector) at position i whereas the use on Vector or Matrix will provide a scalar element (e.g. double).

  • Note: Unlike R, there is no [] subset operator for matrices in C++. The lack of a two-index operator[] stems from how C++ treats the comma: in an expression such as m[x, y], the built-in comma operator evaluates the first coordinate x, discards the result, and only then evaluates the second coordinate y, so the expression is equivalent to m[y]. Compiling such code with -Wall yields the following warning:

    left operand of comma operator has no effect.

    Therefore, the only viable matrix subset operators within C++ are operator() and at().
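
The effect of the comma operator can be reproduced in a few lines of standard C++. This is a standalone sketch, not Rcpp code: TinyMatrix is a hypothetical column-major stand-in for a matrix class, and comma_result is a hypothetical helper that returns what the built-in comma operator yields for a pair of indices.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a column-major matrix; not the Rcpp Matrix class.
struct TinyMatrix {
    std::size_t nrow, ncol;
    std::vector<double> data;  // column-major storage

    TinyMatrix(std::size_t r, std::size_t c) : nrow(r), ncol(c), data(r * c, 0.0) {}

    // Two-index access goes through operator().
    double& operator()(std::size_t i, std::size_t j) { return data[i + j * nrow]; }

    // Single-index access into the underlying storage.
    double& operator[](std::size_t k) { return data[k]; }
};

// Value of the built-in comma expression (i, j): i is discarded, j is returned.
// The cast to void makes the discarded operand explicit and avoids the warning.
inline std::size_t comma_result(std::size_t i, std::size_t j) {
    return (static_cast<void>(i), j);
}
```

With this sketch, an access intended as row 0, column 1 but written with a comma inside [] would read m[comma_result(0, 1)], that is m[1], which is the element at row 1, column 0 in column-major storage.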

  • Examples:

// Create data 
NumericVector A = NumericVector::create(1, 2, 3, 4);
CharacterVector C = CharacterVector::create("B", "D", "E", "F");

// --- Vector 

// Retrieve the first value from A. (C++ indices start at 0 not 1!)
double a = A(0);
// Output: 1

// Modify the last value using the unchecked accessor
// Warning: Make sure the index is valid!
A[3] = 5;

// --- Matrix

// Create matrix with elements in A
NumericMatrix B(2, 2, A.begin());

// Output:
// 1  3
// 2  5

// Extract Value at 2, 1
double val_r1c0 = B(1, 0);
// Output: 2

// Modify value at 1, 2
B(0, 1) = 4;

// Output:
// 1  4
// 2  5

// The following would throw an out-of-bounds error (column index 2 is invalid)
// B.at(1, 2) = 4;

// --- List

// Create a List
List D = List::create(Named("A") = A,
                          _["C"] = C); // shorthand for Named("C")

// Extract A from List
NumericVector E = D[0];

double val2 = E[1];
// Output: 2

// Extract C from List
CharacterVector F = D[1];

// --- Data Frame

// Create a DataFrame
DataFrame G = DataFrame::create(Named("A") = A,
                                    _["C"] = C); // shorthand for Named("C")

// Extract A from DataFrame
NumericVector H = G[0];

Categorical Access

  • Access element by name within a Vector, List, or DataFrame.
    • (NAME) provides the element associated with the NAME in addition to performing a bounds check that ensures the requested NAME exists.
    • [NAME] similar to the previous case, but does so without a bounds check.
  • Examples:
// Create sample data
NumericVector A = NumericVector::create(Named("Go") = 3,
                                        Named("To") = 4,
                                            _["My"] = 5, // shorthand Named("My")
                                            _["Pi"] = 1);
// Output: (names only printed in R!) 
// Go To My Pi 
//  3  4  5  1

NumericVector B = NumericVector::create(2, 5, 8, 9, 3, 7);

// Alternative way to set name values
B.names() = CharacterVector::create("Bears","Lions","and","Tigers", "Oh", "My");
// Output: (names only printed in R!) 
// Bears  Lions    and Tigers     Oh     My 
//     2      5      8      9      3      7 

// --- Vector

double val_num = A["To"];
// Output: 4

// Subset by vector
CharacterVector C = CharacterVector::create("My", "Pi");

NumericVector D = A[C];
// Output: (names only printed in R!) 
// My Pi 
//  5  1

// Modify values in vector
NumericVector E = NumericVector::create(1, 2);
A[C] = E;
// Output:
// Go To My Pi 
//  3  4  1  2

// --- List

// Create a List
List F = List::create(Named("A") = A,
                          _["B"] = B); // shorthand for Named("B")

// Extract A from List
NumericVector G = F["A"];

double val_go = G["Go"];
// Output: 3

// --- Data Frame

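// Create a DataFrame
// Note: all columns must have the same length, so the 4-element A
// cannot be combined with the 6-element B
DataFrame H = DataFrame::create(Named("B") = B);

// Extract B vector from DataFrame
NumericVector I = H["B"];

// Element names become row names of the data frame,
// so the column is indexed by position here
double val_lions = I[1];
// Output: 5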

Logical Access

  • Access element by boolean values within a Vector, List, or DataFrame.

    • (BOOL) provides only the elements associated with a true logical condition, in addition to performing a bounds check that ensures the requested element at the BOOL location exists.
    • [BOOL] similar to the previous case, but does so without a bounds check.
  • Caveat: The BOOL must be a LogicalVector of equal size to the object being subset.

  • Examples:

// Sample data
NumericVector A = NumericVector::create(4, 3, 1, 2);
// Output: 4 3 1 2

// Logical Subset
LogicalVector B = LogicalVector::create(true, false, false, true);
// Output: TRUE FALSE FALSE TRUE

NumericVector C = A[B];
// Output: 4 2

// Replace Values
NumericVector D = NumericVector::create(8, 6);

// Logical Replacement
A[B] = D;
// Output: 8 3 1 6

Subset Views

Iterators

  • C++ Standard Template Library (STL) styled random access iterators exist underlying the Vector and Matrix classes.

  • Iterator accessor:

.begin() pointer to the start of the vector
.end() pointer to one past the end of the vector
  • Kinds of Iterators:
NumericVector::iterator allows for read/write access to elements (stored by column)
ComplexVector::iterator
IntegerVector::iterator
LogicalVector::iterator
CharacterVector::iterator
RawVector::iterator
ExpressionVector::iterator
GenericVector::iterator
NumericMatrix::iterator
ComplexMatrix::iterator
IntegerMatrix::iterator
RawMatrix::iterator
LogicalMatrix::iterator
CharacterMatrix::iterator
StringMatrix::iterator
ExpressionMatrix::iterator
GenericMatrix::iterator
ListMatrix::iterator
NumericVector::const_iterator allows for read access to elements (stored by column)
ComplexVector::const_iterator
IntegerVector::const_iterator
LogicalVector::const_iterator
CharacterVector::const_iterator
RawVector::const_iterator
ExpressionVector::const_iterator
GenericVector::const_iterator
NumericMatrix::const_iterator
ComplexMatrix::const_iterator
IntegerMatrix::const_iterator
RawMatrix::const_iterator
LogicalMatrix::const_iterator
CharacterMatrix::const_iterator
StringMatrix::const_iterator
ExpressionMatrix::const_iterator
GenericMatrix::const_iterator
ListMatrix::const_iterator
  • Examples:
// Sample Data
NumericVector A = NumericVector::create(-1, 3.2, 0, 14.2, 38.6);
    
// --- Use iterators to compute sum
double val_total = 0;

for(NumericVector::iterator iter = A.begin(); iter != A.end(); ++iter) {
  val_total += *iter;
}

Rcpp::Rcout << "Sum Value is " << val_total << std::endl;
// Output: Sum Value is 55
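
Because the iterators follow the STL style, an explicit loop like the one above can usually be replaced by a standard algorithm. The sketch below uses a std::vector<double> as a stand-in for NumericVector so that it compiles without Rcpp. sum_values is a hypothetical helper, and the assumption is that A.begin() and A.end() can be handed to std::accumulate in the same way.

```cpp
#include <numeric>
#include <vector>

// Sum all elements between begin() and end() with a standard algorithm.
// The initial value 0.0 (rather than 0) keeps the accumulation in double.
inline double sum_values(const std::vector<double>& x) {
    return std::accumulate(x.begin(), x.end(), 0.0);
}
```

Passing 0 instead of 0.0 as the initial value would make std::accumulate sum in int and truncate every partial result, which is a common source of wrong totals.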

STL-style container functions

  • There exists a special class of member functions that mimic the member functions of C++ Standard Template Library (STL) containers such as vector, deque, and list.

  • Member functions that do not alter the size of the Rcpp object

Member               Description
operator()           Access elements with range checking
operator[]           Access elements without range checking
.length(), .size()   Number of elements in the collection
.fill(u)             Fill the collection with element u

  • Member functions that do alter the size of the Rcpp object and result in the object being recreated.

Member               Description
.push_back(x)        Insert x at the end of the vector, grows vector
.push_front(x)       Insert x at the beginning of the vector, grows vector
.insert(i, x)        Insert x at the ith position, grows vector
.erase(i)            Remove element at the ith position, shrinks vector
  • Warning: Using any function to grow or shrink an Rcpp object results in the data being copied from the original object into a new object. As a result, there will be a severe degradation of performance. Therefore, if the sample size is not known in advance, it is highly recommended to convert the Rcpp object to an STL object, which can easily be grown or shrunk.
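
The cost described in the warning can be made concrete with a small model in standard C++. This is a sketch under stated assumptions, not Rcpp code: CopyOnGrow is a hypothetical type whose push_back copies every existing element into a new buffer, as the warning describes, and collect is a hypothetical helper that gathers values in a std::vector instead.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of a container that is recreated on every push_back.
struct CopyOnGrow {
    std::vector<double> buf;
    std::size_t copies = 0;  // element copies caused by growth

    void push_back(double x) {
        std::vector<double> next;
        next.reserve(buf.size() + 1);  // exactly one slot more than before
        for (double v : buf) {         // copy every existing element
            next.push_back(v);
            ++copies;
        }
        next.push_back(x);
        buf.swap(next);
    }
};

// Gather n values in a std::vector, whose push_back is amortized constant time.
inline std::vector<double> collect(std::size_t n) {
    std::vector<double> out;
    out.reserve(n);  // optional, useful when an upper bound is known
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(static_cast<double>(i));
    }
    return out;
}
```

Appending n values to CopyOnGrow performs n(n-1)/2 element copies in total, so the work grows quadratically. In Rcpp code, the finished std::vector<double> would typically be handed back to R once at the end, for example with Rcpp::wrap().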

Static Member Functions

create

::create(X, Y)
::create(X, Y, …)


  • Initializes a Vector, List, or DataFrame by combining objects together sequentially in a manner similar to c(1, 2) in R.

  • In the case of a Vector, the values X and Y must be atomic values of the same underlying type T, where T is one of the following:

    • int
    • double
    • std::complex<double> / Rcomplex
    • bool
  • For either a DataFrame or List, X, Y are allowed to be any combination of Vector, Matrix, and the previously mentioned supported atomic types.

  • create is defined for any number of arguments between 1 and 20, inclusive.

  • When constructing examples, it is often preferable to use the create method to build a vector.

  • Examples:

// Construct a vector
NumericVector A = NumericVector::create(4.2, 1.9, 2, 3.5);
// Output: 4.2 1.9 2.0 3.5

IntegerVector B = IntegerVector::create(1, 2);
// Output: 1 2

CharacterVector C = CharacterVector::create("a", "b", "c", "d");
// Output: "a" "b" "c" "d"


// Unnamed list creation
List D = List::create(1.5, 2.3, 4.5);
// Must be returned into R for output!
// Output:
// [[1]]
// [1] 1.5
// [[2]]
// [1] 2.3
// [[3]]
// [1] 4.5

// Named list creation
List E = List::create(Named("B") = B,           
                       _["nval"] = 2.5);         // shorthand for Named("nval")
// Must be returned into R for output!
// Output:
// $B
// [1] 1 2
// 
// $nval
// [1] 2.5

// DataFrame creation
// The number of elements in vectors must _match_
DataFrame F = DataFrame::create(Named("A") = A,  
                                    _["C"] = C); // shorthand for Named("C")
// Must be returned into R for output!
// Output:
//   A C
// 4.2 a
// 1.9 b
// 2.0 c
// 3.5 d

get_na()

  • Obtain the correct missing value constant for the given RTYPE associated with the Rcpp data structure.

  • Examples:

// Construct a vector
NumericVector A = NumericVector::create(NA_REAL, -1, NumericVector::get_na(), 0);
// Output: NA -1 NA 0

A[3] = NumericVector::get_na();
// Output: NA -1 NA NA

is_na()

  • Determine whether an element within the Rcpp data structure matches a missing constant of the given RTYPE.

  • Examples:

// Construct a vector
NumericVector A = NumericVector::create(NA_REAL, -1, NumericVector::get_na(), 0);

int n = A.size();
LogicalVector B(n);
  
for (int i = 0; i < n; ++i) {
    B[i] = NumericVector::is_na(A[i]);
}

Rcout << "NA Presence: " << B << std::endl;
// Output: NA Presence: 1 0 1 0

import(InputIterator first, InputIterator last)

import_transform(InputIterator first, InputIterator last, F f)

diag(int size, const U &diag_value)

  • This method is available only for the Matrix class.

  • Examples:

eye(int n)

  • This method is available only for the Matrix class.

  • Examples:

ones(int n)

  • This method is available only for the Matrix class.

  • Examples:

zeros(int n)

  • This method is available only for the Matrix class.

  • Examples:

Environment

Function

Language

XPtr

S4 Classes

Exception Handling

External Exception Classes

Internal Exception Classes

Simple Exceptions

Exceptions

Advanced exceptions

Sugar

  • The objective behind Rcpp sugar is to provide a subset of the high-level R syntax in C++. For instance, part of the functionality behind the table( X ) function can be

  • Unless otherwise noted, if a Matrix is supplied, then the Matrix is coerced into a vector in column-major order before the sugar function is applied. For example, given a 2x2 matrix, the results of the function call will go R1C1, R2C1, R1C2, and R2C2, where R stands for row and C for column.

Logical Operations

  • Boolean functions that provide a way to analyze the data are provided within.

ifelse( CONDITION, TRUEVAL, FALSEVAL)

  • Vectorized if-else assignment of elements dependent on the CONDITION being true (TRUEVAL) or false (FALSEVAL).

  • To use the vectorized if-else the following criteria must be met:

    • CONDITION: A LogicalVector or a sugar expression that evaluates to a LogicalVector
    • TRUEVAL/FALSEVAL: either
      1. two compatible sugar expressions (same RTYPE, same length), OR
      2. one sugar expression and one compatible primitive (same RTYPE)
  • Caveat: Unlike the R equivalent, no recycling occurs if the vectors are of different lengths. In such cases, an error will be raised at runtime indicating the length difference.

// Create data
NumericVector A = NumericVector::create(5, 1, 8, 3);
NumericVector B = NumericVector::create(2, 4, 6, 11);

// a.) Vectors of the same length and type
NumericVector C = ifelse(A < B, A, (A + B)*B);
// Output: 14 1 84 3

// b.) One vector and one constant
NumericVector D = ifelse(A > B, A, 2);
// Output: 5 2 8 2
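
  • The no-recycling rule can be illustrated outside of Rcpp with a small standard C++ sketch. The helper name ifelse_strict is hypothetical and this is not how the sugar function is implemented; it only mirrors the behavior described above: select element-wise and refuse inputs whose lengths differ.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of Rcpp): element-wise selection that,
// like the behavior described for sugar ifelse, never recycles its inputs.
std::vector<double> ifelse_strict(const std::vector<bool>& condition,
                                  const std::vector<double>& true_val,
                                  const std::vector<double>& false_val) {
    // Reject inputs of different lengths instead of recycling them
    if (condition.size() != true_val.size() ||
        condition.size() != false_val.size()) {
        throw std::invalid_argument("ifelse_strict: lengths differ");
    }

    // Pick from true_val where the condition holds, else from false_val
    std::vector<double> out(condition.size());
    for (std::size_t i = 0; i < condition.size(); ++i) {
        out[i] = condition[i] ? true_val[i] : false_val[i];
    }
    return out;
}
```

  • Called with the condition A < B and the vectors from the example above, the sketch yields 14 1 84 3; passing a shorter vector throws instead of recycling.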

Single Logical Result

is_true( X ) is_false( X )
  • Convert the result of any( CONDITION ) or all( CONDITION ) from the SingleLogicalResult template class to a bool. is_true( X ) returns true only when the result is true, and is_false( X ) returns true only when the result is false.

  • For example, if any( CONDITION ) evaluates to false, then is_true( X ) will return false but is_false( X ) will return true.

  • Examples:

IntegerVector A = seq_len(3);
// Output: 1 2 3
IntegerVector B = clone(A) - 1;
// Output: 0 1 2

bool check_state_true  = is_true( any(A > B) );
// Output: true

bool check_state_false = is_false( any(A > B) );
// Output: false

// Without using the above functions, a compile time error will trigger
// on assignment to bool.
// bool check_state_error = any( A > B );
// Error: invalid use of incomplete type 'class Rcpp::sugar::forbidden_conversion<false>'
// class conversion_to_bool_is_forbidden

all( X )

  • Tests if all elements in a LogicalVector or LogicalMatrix are true.

  • The actual return type of all(X) is an instance of the SingleLogicalResult template class, but the functions is_true and is_false may be used to convert the return value to bool.

  • Examples:

NumericVector A = NumericVector::create(1.0, 2.0, 3.0, 4.0);

Rcout
    << std::boolalpha
    << "all(A < 5): " << is_true(all(A < 5)) << "\n"
    << "all(A < 4): " << is_true(all(A < 4)) << "\n"
    << "all(!is_na(A)): " << is_true(all(!is_na(A))) 
    << "\n\n";
// Output:
// all(A < 5): true
// all(A < 4): false
// all(!is_na(A)): true

NumericMatrix B(2, 2, A.begin());

Rcout
    << std::boolalpha
    << "all(B < 5): " << is_true(all(B < 5)) << "\n"
    << "all(B < 4): " << is_true(all(B < 4)) << "\n"
    << "all(!is_na(B)): " << is_true(all(!is_na(B))) 
    << "\n\n";
// Output:
// all(B < 5): true
// all(B < 4): false
// all(!is_na(B)): true

Rcout
    << std::boolalpha
    << "all({true, true, true}): "
    << is_true(all(LogicalVector::create(true, true, true))) << "\n"
    << "all({true, true, false}): "
    << is_true(all(LogicalVector::create(true, true, false))) 
    << std::endl;
// Output: 
// all({true, true, true}): true
// all({true, true, false}): false

any( X )

  • Tests if any elements in a LogicalVector or LogicalMatrix are true.

  • The actual return type of any(X) is an instance of the SingleLogicalResult template class, but the functions is_true and is_false may be used to convert the return value to bool.

  • Examples:

NumericVector A = NumericVector::create(1.0, 2.0, 3.0, 4.0);

Rcout
    << std::boolalpha
    << "any(A > 3): " << is_true(any(A > 3)) << "\n"
    << "any(A > 4): " << is_true(any(A > 4)) << "\n"
    << "any(is_na(A)): " << is_true(any(is_na(A))) 
    << "\n\n";
// Output:    
// any(A > 3): true
// any(A > 4): false
// any(is_na(A)): false

NumericMatrix B(2, 2, A.begin());
B[0] = NumericMatrix::get_na();

Rcout
    << std::boolalpha
    << "any(B > 3): " << is_true(any(B > 3)) << "\n"
    << "any(B > 4): " << is_true(any(B > 4)) << "\n"
    << "any(is_na(B)): " << is_true(any(is_na(B))) 
    << "\n\n";
// Output:    
// any(B > 3): true
// any(B > 4): false
// any(is_na(B)): true

Rcout
    << std::boolalpha
    << "any({false, false, false}): "
    << is_true(any(LogicalVector::create(false, false, false))) << "\n"
    << "any({true, false, false}): "
    << is_true(any(LogicalVector::create(true, false, false))) 
    << std::endl;
// Output:
// any({false, false, false}): false
// any({true, false, false}): true
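
  • The any(B > 4) result above is false even though B contains a missing value, because the comparison with NA yields NA and is_true( X ) is true only for a definite TRUE. The following plain C++ sketch models that three-valued logic. The names Tri, any3, is_true3, and is_false3 are hypothetical and are not part of Rcpp; the sketch assumes sugar follows R's rules for any().

```cpp
#include <vector>

// Hypothetical stand-in for R's three-valued logical (TRUE, FALSE, NA).
enum class Tri { False, True, NA };

// any() over three-valued input: a TRUE wins outright,
// otherwise an NA wins over FALSE.
Tri any3(const std::vector<Tri>& x) {
    bool seen_na = false;
    for (Tri v : x) {
        if (v == Tri::True) return Tri::True;
        if (v == Tri::NA) seen_na = true;
    }
    return seen_na ? Tri::NA : Tri::False;
}

// Analogues of is_true() and is_false(): both return false for NA.
bool is_true3(Tri v)  { return v == Tri::True; }
bool is_false3(Tri v) { return v == Tri::False; }
```

  • With this model, an input of {NA, false, false, false} gives NA, so both is_true3 and is_false3 return false.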

Complex Operators

Complex Components

Re( X ) Im( X )
  • Extract the real or imaginary component of a complex number.

  • Definition: \[z = x + i y\], where \(x, y \in \mathbb{R}\).

  • X must be stored within a Vector or Matrix of type Complex. The return type is a NumericVector regardless of whether a Vector or Matrix is supplied.

  • Example:

// Create complex numbers
Rcomplex x, y;

// Assign real, imaginary values
x.r = 5.0; x.i = 12.0;
y.r = 9.2; y.i = -4.0;

// Make a complex vector
ComplexVector A = ComplexVector::create(x, y);
// Output: 5+12i 9.2+-4i

NumericVector B = Re(A);
// Output: 5 9.2

NumericVector C = Im(A);
// Output: 12 -4

Mod( X )

  • Compute the modulus of a complex number or the length from the origin to the point represented in the complex plane (radius in polar coordinates).

  • Definition: \[r = \operatorname{Mod}(z) = \sqrt{x^2 + y^2}\]

  • X must be stored within a Vector or Matrix of type Complex. The return type is a NumericVector regardless of whether a Vector or Matrix is supplied.

  • Example:

// Create complex numbers
Rcomplex x, y;

// Assign real, imaginary values
x.r = 5.0; x.i = 12.0;
y.r = 9.2; y.i = -4.0;

// Make a complex vector
ComplexVector A = ComplexVector::create(x, y);
// Output: 5+12i 9.2+-4i

NumericVector B = Mod(A);
// Output: 13.00000 10.031949
  • See also:

Arg( X )

  • Compute the argument of a complex number or the angle from the positive side of the real axis to the line segment connecting the origin and the point in the complex plane (theta in polar coordinates).

  • Definition: \[\theta = \operatorname{Arg}(z) = \operatorname{atan2}\left(y, x\right)\], which equals \(\arctan\left(\frac{y}{x}\right)\) when \(x > 0\).

  • X must be stored within a Vector or Matrix of type Complex. The return type is a NumericVector regardless of whether a Vector or Matrix is supplied.

  • Example:

// Create complex numbers
Rcomplex x, y;

// Assign real, imaginary values
x.r = 5.0; x.i = 12.0;
y.r = 9.2; y.i = -4.0;

// Make a complex vector
ComplexVector A = ComplexVector::create(x, y);
// Output: 5+12i 9.2+-4i

NumericVector B = Arg(A);
// Output: 1.1760052 -0.4101273
  • See also:

Conj( X )

  • Compute the complex conjugate of a complex number, that is, a number with the same real component but a negated imaginary component.

  • X must be stored within a Vector or Matrix of type Complex. The return type is a ComplexVector regardless of whether a Vector or Matrix is supplied.

  • Example:

// Create complex numbers
Rcomplex x, y;

// Assign real, imaginary values
x.r = 5.0; x.i = 12.0;
y.r = 9.2; y.i = -4.0;

// Make a complex vector
ComplexVector A = ComplexVector::create(x, y);
// Output: 5+12i 9.2+-4i

ComplexVector B = Conj(A);
// Output: 5-12i 9.2+ 4i
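
  • To cross-check the values in this section outside of R, the standard <complex> header offers counterparts: std::abs for Mod, std::arg for Arg, and std::conj for Conj. The wrapper names below are hypothetical, and the sketch is a plain C++ analogue rather than Rcpp code.

```cpp
#include <complex>

// Standard-library counterparts of Mod, Arg, and Conj for a single value.
// The wrapper names are illustrative only.
double mod_of(const std::complex<double>& z) { return std::abs(z); }
double arg_of(const std::complex<double>& z) { return std::arg(z); }
std::complex<double> conj_of(const std::complex<double>& z) {
    return std::conj(z);
}
```

  • std::arg behaves as if computed by atan2 on the imaginary and real parts, so it also returns the correct angle when the real part is negative; for example, the argument of -1 + 0i is \(\pi\).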

Data Operations

First or Last Elements

head( X , n ) tail( X , n )
  • Obtain the first or last n observations using head or tail.

  • All types of a Vector or Matrix are supported.

  • Example:

NumericVector A = NumericVector::create(1, 3, 5, 7, 9, 11);

// Retrieve the first two elements
NumericVector B = head(A, 2);
// Output: 1 3

// Retrieve the last two elements
NumericVector C = tail(A, 2);
// Output: 9 11

abs( X )

  • Obtain the absolute value of all elements.

  • Definition: \[\left| X \right| = \begin{cases} X, &\text{if } X \ge 0 \\ -X, &\text{if } X < 0 \end{cases}\]

  • Supported types to perform the operation are Numeric or Integer of a Vector or Matrix.

  • Example:

// Sample data
NumericVector A = NumericVector::create(-2.8, 5.3, 7, -4, 0);

NumericVector B = abs(A);
// Output: 2.8, 5.3, 7, 4, 0

sqrt( X )

  • Compute the square root of all elements.

  • Definition: \[\sqrt{X} = X^{1/2}\]

  • Supported types to perform the operation are Numeric, Integer, or Complex of a Vector or Matrix.

  • Example:

// Sample data
NumericVector A = NumericVector::create(0.0, 1.0, 2.5, 5.0, 9.0);

NumericVector B = sqrt(A);
// Output: 0 1 1.58114 2.23607 3

pow( X , n)

  • Raise each element of X to the nth power.

  • Definition: \[Y = X^{n}\]

  • Supported types to perform the operation are Numeric or Integer of a Vector or Matrix.

  • Caveat: Only X is able to be vectorized. The value of n must be a scalar of int or double type.

  • Example:

// Sample data
NumericVector A = NumericVector::create(0.0, 1.0, 2.5, 5.0, 9.0);

NumericVector B = pow( A , 3 );
// Output: 0 1 15.625 125 729

sum( X )

  • Calculate the overall summation of all elements.

  • Definition: \[Y = \sum\limits_{i = 1}^n { {X_i} } \]

  • Supported types to perform the operation are Numeric or Integer of a Vector or Matrix.

  • Example:

// Sample data
NumericVector A = NumericVector::create(3.2, 8.1, 9.5, 8.6, 5.7);

double val_summed = sum(A);
// Output: 35.1

sign( X )

  • Determine the sign of a number or whether a number is positive, negative, or zero.

  • Definition: \[\operatorname{sgn} \left( X \right) = \begin{cases} -1, &\text{if } X < 0 \\ 0, &\text{if } X = 0 \\ 1, &\text{if } X > 0 \end{cases}\]

  • Supported types are Numeric or Integer of a Vector or Matrix.

  • Example:

// Create some sample data
NumericVector A = NumericVector::create(-1, 0, 1, -3.4, 42);

// --- Obtain values of the sign

NumericVector B = sign(A);
// Output: -1  0  1 -1  1

diff( X )

  • Obtain the difference between successive Vector elements, i.e. the \((i+1)\)th element minus the \(i\)th element.

  • Definition: \[\nabla {X_i} = {X_{i + 1} } - {X_i} \]

  • Supported types are Numeric or Integer of a Vector or Matrix.

  • Example:

// Create some sample data
NumericVector A = NumericVector::create(-2, 0, 0.5, -1, 2.5, 4.75);

// --- Obtain difference

NumericVector B = diff(A);
// Output: 2.00  0.50 -1.50  3.50  2.25
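
  • The definition can be written as a short loop in standard C++. The function name diff_sketch is hypothetical; it is a minimal analogue of the lag-one difference, not the sugar implementation. Note that the result has one element fewer than the input.

```cpp
#include <cstddef>
#include <vector>

// Lag-one difference: out[i] = x[i + 1] - x[i].
// The result is one element shorter than x (empty for fewer than 2 inputs).
std::vector<double> diff_sketch(const std::vector<double>& x) {
    std::vector<double> out;
    for (std::size_t i = 0; i + 1 < x.size(); ++i) {
        out.push_back(x[i + 1] - x[i]);
    }
    return out;
}
```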

Cumulative Arithmetic

cumsum( X ) cumprod( X )
  • Calculates the cumulative sum (cumsum) or cumulative product (cumprod) of a Vector or Matrix X.

  • If an NA value is encountered, it will be propagated throughout the remaining elements in the result vector.

  • For cumsum, X should be an Integer or Numeric Vector or Matrix.

  • For cumprod, X should be an Integer, Numeric, or Complex Vector or Matrix.

  • In either case, the return type is a Vector of the same underlying SEXPTYPE as the input.

  • Caveat: at the time of writing (Rcpp version 0.12.11), not all Sugar expressions are directly compatible with Vector::operator=, as many of these functions return intermediate template classes which require an explicit conversion to
    Vector, rather than directly returning a Vector. In such cases the user may need to “help” the compiler with the conversion by

    • Constructing a Vector from the result, and assigning that to the target Vector; or
    • Calling an explicit conversion member function of the Sugar class, if such a function exists.

    See the examples below for a demonstration.

  • Examples:

NumericVector x = NumericVector::create(1, 2, 3, 4, 5);
NumericVector cs = cumsum(x), cp = cumprod(x);

Rcout
    << "cumsum(x): \n" << cs << "\n\n"
    << "cumprod(x): \n" << cp << "\n\n";
// Output:
// cumsum(x): 
// 1 3 6 10 15
// 
// cumprod(x): 
// 1 2 6 24 120

x[3] = NumericVector::get_na();

cs = cumsum(x).get();
cp = cumprod(x).get();

// These print as `nan`, but are actually `NA` values
Rcout
    << "cumsum(x): \n" << cs << "\n\n"
    << "cumprod(x): \n" << cp 
    << std::endl;
// Output: 
// cumsum(x): 
// 1 3 6 nan nan
// 
// cumprod(x): 
// 1 2 6 nan nan

// As noted above: 
NumericVector y = NumericVector::create(1, 2, 3, 4, 5);

NumericVector cs = cumsum(y);   // OK, calls a converting *constructor*, not 
                                // the assignment operator

// cs = cumsum(y);      compiler error: no viable conversion from 
//                      'const Rcpp::sugar::Cumsum<14, true, Rcpp::Vector<14, PreserveStorage> >' 
//                      to 'SEXP' (aka 'SEXPREC *')

cs = cumsum(y).get();           // OK, `get()` returns a VectorBase, which is 
                                // assignable to Vector
                                
cs = NumericVector(cumsum(y));  // OK, but requires an additional Vector 
                                // to be created

Cumulative Extremum

cummax( X ) cummin( X )
  • Calculates the cumulative maximum (cummax) or cumulative minimum (cummin) of a Vector or Matrix X.

  • If an NA value is encountered, it will be propagated throughout the remaining elements in the result vector.

  • Supported types are Integer or Numeric of a Vector or Matrix.

  • Caveat: at the time of writing (Rcpp version 0.12.11), not all Sugar expressions are directly compatible with Vector::operator=, as many of these functions return intermediate template classes which require an explicit conversion to
    Vector, rather than directly returning a Vector. In such cases the user may need to “help” the compiler with the conversion by

    • Constructing a Vector from the result, and assigning that to the target Vector; or
    • Calling an explicit conversion member function of the Sugar class, if such a function exists.

    See the examples below for a demonstration.

  • Examples:

NumericVector x = NumericVector::create(1, -2, 5, 10, -4);
NumericVector cmax = cummax(x), cmin = cummin(x);

Rcout
    << "cummax(x): \n" << cmax << "\n\n"
    << "cummin(x): \n" << cmin << "\n\n";
// Output:
// cummax(x): 
// 1 1 5 10 10
// 
// cummin(x): 
// 1 -2 -2 -2 -4

x[3] = NumericVector::get_na();

cmax = cummax(x).get();
cmin = cummin(x).get();

// These print as `nan`, but are actually `NA` values
Rcout
    << "cummax(x): \n" << cmax << "\n\n"
    << "cummin(x): \n" << cmin 
    << std::endl;
// Output: 
// cummax(x): 
// 1 1 5 nan nan
// 
// cummin(x): 
// 1 -2 -2 nan nan

// As noted above: 
NumericVector y = NumericVector::create(1, 2, 3, 4, 5);

NumericVector cmax = cummax(y);   // OK, calls a converting *constructor*, not 
                                  // the assignment operator

// cmax = cummax(y);      compiler error: no viable conversion from 
//                        'const Rcpp::sugar::cummax<14, true, Rcpp::Vector<14, PreserveStorage> >' 
//                        to 'SEXP' (aka 'SEXPREC *')

cmax = cummax(y).get();           // OK, `get()` returns a VectorBase, which is 
                                  // assignable to Vector
                                
cmax = NumericVector(cummax(y));  // OK, but requires an additional Vector 
                                  // to be created

Trigonometric Element-wise Functions

sin( X ) asin( X ) sinh( X )
cos( X ) acos( X ) cosh( X )
tan( X ) atan( X ) tanh( X )
  • Compute the trigonometric value for each element in a given structure.

  • Usage:

    • vector_type Y = func(X)
    • X and Y must be of the same vector_type/matrix_type.
    • where func is one of the following trigonometric functions:
      • sin family: sin(X), asin(X), sinh(X)
      • cos family: cos(X), acos(X), cosh(X)
      • tan family: tan(X), atan(X), tanh(X)
  • Supported types are Integer, Numeric, or Complex of a Vector or Matrix.

  • Examples:

// Generate Values
NumericVector X  = rnorm(10);

// Compute trigonometric values
NumericVector Y  =  cos(X);
NumericVector Y2 = acos(X);
NumericVector Y3 = cosh(X);

Logarithms and Exponentials

log( X ) log10( X ) log1p( X )
exp( X ) expm1( X )
  • Compute the logarithm or exponential of each element.

  • Usage:

    • vector_type Y = func(X)
    • X and Y must be of the same vector_type/matrix_type.
  • Traditional use case:

    • log( X ) computes the natural logarithm, sometimes represented as \(\ln(X)\)
    • log10( X ) computes the base 10 logarithm.
    • exp( X ) computes the exponential function given by: \[\exp \left( X \right) = \mathop {\lim }\limits_{n \to \infty } {\left( {1 + \frac{X}{n} } \right)^n} = \sum\limits_{k = 0}^\infty {\frac{ { {X^k} } }{ {k!} } } \]
  • Special use cases:

    • log1p( X ) computes \(\log( 1 + X )\) accurately for \(\left|X\right| \ll 1\).
    • expm1( X ) computes \(\exp( X ) - 1\) accurately for \(\left|X\right| \ll 1\).
  • Examples:

// Create some sample data
NumericVector A = NumericVector::create(-1, 0, 1, 2.3);

// --- Log and Exp Functions

// Obtain the exponential

NumericVector B = exp(A);
// Output: 0.3678794 1.0000000 2.7182818 9.9741825

// Obtain the natural log, which should recover the initial values

NumericVector C = log(B);
// Output: -1.0  0.0  1.0  2.3

// --- Compare implementations of log1p and expm1 vs. generic

// Create input vector
NumericVector D = no_init(10);

// Fill the vector
for(int i = 0; i < D.length(); ++i){
    D[i] = std::pow(10.0, -1.0*( 1.0 + 2.0*(i+1.0) ) );
}

// Compute values according to definition
NumericVector E = log(1 + D), F = log1p(D), G = exp(D)-1, H = expm1(D);

// Bound values together
NumericMatrix I = cbind(D, E, F, G, H);

// Label columns
colnames(I) = CharacterVector::create("X", "log(1+X)","log1p(X)", "exp(X)-1", "expm1(X)");

// Output:
//           X     log(1+X)     log1p(X)     exp(X)-1     expm1(X)
//  [1,] 1e-03 9.995003e-04 9.995003e-04 1.000500e-03 1.000500e-03
//  [2,] 1e-05 9.999950e-06 9.999950e-06 1.000005e-05 1.000005e-05
//  [3,] 1e-07 1.000000e-07 1.000000e-07 1.000000e-07 1.000000e-07
//  [4,] 1e-09 1.000000e-09 1.000000e-09 1.000000e-09 1.000000e-09
//  [5,] 1e-11 1.000000e-11 1.000000e-11 1.000000e-11 1.000000e-11
//  [6,] 1e-13 9.992007e-14 1.000000e-13 9.992007e-14 1.000000e-13
//  [7,] 1e-15 1.110223e-15 1.000000e-15 1.110223e-15 1.000000e-15
//  [8,] 1e-17 0.000000e+00 1.000000e-17 0.000000e+00 1.000000e-17
//  [9,] 1e-19 0.000000e+00 1.000000e-19 0.000000e+00 1.000000e-19
// [10,] 1e-21 0.000000e+00 1.000000e-21 0.000000e+00 1.000000e-21

sample

sample( n , size , replace , probs )
sample( X , size , replace , probs )
  • Obtain a random sample of elements from either the integers 1 to n or the data contained within X.

  • All Vector and Matrix types are supported.

  • The parameters available for sample are defined as:

    • size, the number of items to sample
    • replace = false, whether to sample with replacement, i.e. whether a picked element is returned to the pool and may be picked again.
    • probs = R_NilValue, vector containing probability weights
  • Examples:

/*** R
# in R set a seed for reproducibility
set.seed(111)
*/

// Create some sample data
NumericVector A = NumericVector::create(-3.5, 2, 2.2, 0.1, -.4, -1, 4.75);

// --- Sampling approaches

// Using a positive number
IntegerVector C = sample(10, 4);
// Output: 6 4 9 1

// Sample from a vector
NumericVector B = sample(A, 3);
// Output: -0.4 4.75 2

Rounding of Numbers

Ceiling

ceiling( X ) ceil( X )
  • Compute the smallest integer value not less than the corresponding element of X.

  • Definition: \[\left\lceil X \right\rceil = \min \left\{ n \in \mathbb{Z} \mid n \ge X \right\}\]

  • Supported types are Numeric or Integer of a Vector or Matrix.

  • Note: ceil is a mapping to ceiling.

  • Examples:

// Create some sample data
NumericVector A = NumericVector::create(-3.5, 2, 2.2, 0.1, -.4, -1, 4.75);

// --- Obtain the ceiling

NumericVector B = ceiling(A);
// Output: -3  2  3  1  0 -1  5

// Same result 
NumericVector C = ceil(A);
// Output: -3  2  3  1  0 -1  5

floor( X )

  • Compute the largest integer value not greater than the corresponding element of X.

  • Definition: \[\left\lfloor X \right\rfloor = \max \left\{ n \in \mathbb{Z} \mid n \le X \right\}\]

  • Supported types are Numeric or Integer of a Vector or Matrix.

  • Examples:

// Create some sample data
NumericVector A = NumericVector::create(-3.5, 2, 2.2, 0.1, -.4, -1, 4.75);

// --- Obtain the floor

NumericVector B = floor(A);
// Output: -4  2  2  0 -1 -1  4

trunc( X )

  • Obtain the integers formed by truncating the values in X toward 0.

  • Definition: \[\operatorname{trunc}\left( X \right) = \begin{cases} \left\lfloor X \right\rfloor, &\text{if } X \ge 0 \\ \left\lceil X \right\rceil, &\text{if } X < 0 \end{cases} \]

  • Supported types are Numeric or Integer of a Vector or Matrix.

  • Examples:

// Create some sample data
NumericVector A = NumericVector::create(-3.5, 2, 2.2, 0.1, -.4, -1, 4.75);

// --- Obtain the truncated value

NumericVector B = trunc(A);
// Output: -3  2  2  0  0 -1  4

round( X , digits )

  • Obtain a rounded number to specified number of decimal places.

  • There is no default value for digits. This parameter must be specified with an int.

  • Supported types are Numeric or Integer of a Vector or Matrix.

  • Examples:

// Create some sample data
NumericVector A = NumericVector::create(-3.5, 2, 2.2, 0.1, -.4, -1, 4.75);

// --- Obtain the round values

// Round to no decimal places (digits must always be supplied)
NumericVector B = round(A, 0);
// Output: -4  2  2  0  0 -1  5

NumericVector C = round(A, 1);
// Output: -3.5  2.0  2.2  0.1 -0.4 -1.0  4.8

signif( X, digits )

  • Round the number to the appropriate number of significant digits.

  • There is no default value for digits. This parameter must be specified with an int.

  • Supported types are Numeric or Integer of a Vector or Matrix.

  • Examples:

// Create some sample data
NumericVector A = NumericVector::create(11252, 59622, 764, 94512, 4121.5);

// --- Obtain the significant digits

// Keep two significant digits
NumericVector B = signif(A, 2);
// Output: 11000 60000   760 95000  4100

NumericVector C = signif(A, 3);
// Output: 11300 59600   764 94500  4120
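
  • Rounding to significant digits can be sketched in standard C++ as follows. The name signif_sketch is hypothetical, and the helper relies on std::round, which rounds halves away from zero, so it may differ from R on exact ties. It is meant only to show the scaling idea: shift the last kept digit to the units place, round, and shift back.

```cpp
#include <cmath>
#include <cstdlib>

// Hypothetical helper: round x to `digits` significant digits.
// std::round rounds halves away from zero, which may differ from R on ties.
double signif_sketch(double x, int digits) {
    if (x == 0.0 || !std::isfinite(x)) return x;

    // Decimal position of the last digit to keep (e.g. 3 means thousands)
    const int p =
        static_cast<int>(std::ceil(std::log10(std::fabs(x)))) - digits;

    // Build 10^|p| by repeated multiplication
    double scale = 1.0;
    for (int i = 0; i < std::abs(p); ++i) scale *= 10.0;

    // Shift, round to an integer, and shift back
    return (p >= 0) ? std::round(x / scale) * scale
                    : std::round(x * scale) / scale;
}
```

  • For the sample data above, signif_sketch(x, 2) reproduces 11000 60000 760 95000 4100.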

Finite, Infinite, Missingness, and NaN Detection

  • Finite numerical representations take the form of base 10 numbers like 1, 2, …, 42, and so on. These values can be operated upon, so a collection of them can be reduced to a statistical summary. However, values that hold special meanings, such as infinity, NaN, or NA, do not fit this representation. Therefore, a set of tools exists to detect such values.

  • From Kevin Ushey’s post on StackOverflow, we have a set of truth tables indicating whether each value is detected by a given function. The tables cover the R interpreter, Rcpp, and the R/C API. Note that Rcpp by default follows the behavior of the R interpreter.

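  • R interpreter:

| Function | NaN | NA |
|----------|-----|----|
| is.na    | T   | T  |
| is.nan   | T   | F  |

  • Rcpp:

| Function     | NaN | NA |
|--------------|-----|----|
| Rcpp::is_na  | T   | T  |
| Rcpp::is_nan | T   | F  |

  • R/C API:

| Function | NaN | NA |
|----------|-----|----|
| ISNAN    | T   | T  |
| R_IsNaN  | T   | F  |
| ISNA     | F   | T  |
| R_IsNA   | F   | T  |

  • Note: The R/C API is highly inconsistent when detecting values.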
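
  • The R/C API rows can be understood from how the two values are stored. The sketch below assumes that R encodes the numeric NA as a NaN whose low 32-bit word holds 1954, so a general NaN test matches both values while a payload check separates them. All names are hypothetical, the code does not call the R API, and it assumes an IEEE 754 double with a matching integer byte order.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Assumption: R marks the numeric NA as a NaN whose low 32 bits hold 1954.
const std::uint32_t kNaLowWord = 1954;

// Build an NA-like double (hypothetical helper, for illustration only).
double make_na_like() {
    const std::uint64_t bits = (0x7FF00000ULL << 32) | kNaLowWord;
    double x;
    std::memcpy(&x, &bits, sizeof x);
    return x;
}

// Extract the low 32 bits of a double's bit pattern.
std::uint32_t low_word(double x) {
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return static_cast<std::uint32_t>(bits & 0xFFFFFFFFULL);
}

// Mirrors the ISNAN row: true for both NaN and NA.
bool is_nan_or_na(double x) { return std::isnan(x); }

// Mirrors the ISNA and R_IsNA rows: true only for the NA payload.
bool is_na_only(double x) {
    return std::isnan(x) && low_word(x) == kNaLowWord;
}

// Mirrors the R_IsNaN row: true only for a NaN that is not NA.
bool is_nan_only(double x) {
    return std::isnan(x) && low_word(x) != kNaLowWord;
}
```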

Setting Infinite, Missingness, and NaN Values

  • To indicate missingness or NA values, the following pre-defined constants have been made available for specific Rcpp data types:

| Rcpp Data Type | Rcpp Value | Description           |
|----------------|------------|-----------------------|
| Numeric        | NA_REAL    | Numeric Missing Value |
| Integer        | NA_INTEGER | Integer Missing Value |
| Logical        | NA_LOGICAL | Logical Missing Value |
| Character      | NA_STRING  | String Missing Value  |

  • To set a missing value for any type of Vector or Matrix, regardless of whether a pre-defined constant exists, one can use the static member function ::get_na(), e.g. ComplexVector::get_na() creates an NA value for a complex vector.

  • The Numeric and double data types also support the following special constant values:

| Rcpp Value | Value | Description       |
|------------|-------|-------------------|
| R_PosInf   | Inf   | Positive Infinity |
| R_NegInf   | -Inf  | Negative Infinity |
| R_NaN      | NaN   | Not a Number      |

// Create an NA value for each type
NumericVector A   = NumericVector::create(NA_REAL);
IntegerVector B   = IntegerVector::create(NA_INTEGER);
LogicalVector C   = LogicalVector::create(NA_LOGICAL);
CharacterVector D = CharacterVector::create(NA_STRING);
// Output: NA

// Group all of the above together
List E = List::create(A, B, C, D);

Finiteness

is_finite( X ) is_infinite( X )
  • Determines whether each element of X is finite (is_finite) or infinite (is_infinite).

  • Support exists for only the Numeric type of a Vector or Matrix.

  • Note: Infinite detection is only applicable to Numeric types as these values are only defined for double types.

  • Caveat: Not a Number is considered neither finite nor infinite. If this value exists within the object, it must be detected with is_nan or is_na.

  • Examples:

NumericVector A = NumericVector::create(R_NegInf, -5, 0, 12, R_PosInf, 42,  R_NaN) ;

LogicalVector B = is_finite( A );
// Output: FALSE, TRUE, TRUE, TRUE, FALSE, TRUE, FALSE

LogicalVector C = is_infinite( A );
// Output: TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE

Missing Values and NaN Detection

is_na( X ) is_nan( X )
  • Determines the missing values (is_na) or not a number (is_nan).

  • All Vector and Matrix types are supported.

  • Note: The difference between the two functions is explained by the truth table:

| Function     | NaN | NA |
|--------------|-----|----|
| Rcpp::is_na  | T   | T  |
| Rcpp::is_nan | T   | F  |

  • Examples:
NumericVector X = NumericVector::create(R_NaN, 1, NA_REAL, 3 ) ;

LogicalVector result_na = is_na( X );
// Output: TRUE, FALSE, TRUE, FALSE

LogicalVector result_nan = is_nan( X );
// Output: TRUE, FALSE, FALSE, FALSE

na_omit( X )

  • Removes values that are either NA or NaN.

  • All Vector and Matrix types are supported.

  • Example:

// Create some sample data
NumericVector A = NumericVector::create(R_NaN, 1, NA_REAL, 3 , NA_REAL) ;

// Remove NA and NaN Values
NumericVector B = na_omit(A);
// Output: 1 3

noNA( X )

  • Assert the object is NA-free to avoid checking whether each value is not missing.

  • All Vector and Matrix types are supported.

  • Warning: Using noNA with a Matrix converts the underlying class from a Matrix to VectorBase! Thus, the matrix dimensional information is lost.

  • Example:

// Create some sample data
NumericVector A = NumericVector::create(1, 2, 3, 4) ;

// Assert vector is NA-free
NumericVector B = noNA(A);
// Output: 1 2 3 4

The Apply Family

sapply

  • Apply a C++ function or functor to each element of an object and receive a Vector back.

  • The return type is automatically detected by the supplied C++ function or functor of the Vector class.

  • All Vector and Matrix types are supported.


// --- C++ Function Approach
template <typename T>
T square( const T& x){
    return x * x ;
}

IntegerVector test_sapply_function() {
  // Create sample data
  IntegerVector A = seq_len(10);

  // Call C++ Function
  IntegerVector B = sapply(A, square<int>);

  return B;
}

test_sapply_function();
// Output: 1   4   9  16  25  36  49  64  81 100


// --- C++ Functor Approach
template <typename T>
struct square : std::unary_function<T,T> {
  T operator()(const T& x) const { return x * x; }
};
// std::unary_function is deprecated in C++11

IntegerVector test_sapply_functor() {
  // Create sample data
  IntegerVector A = seq_len(10);

  // Call C++ functor
  IntegerVector C = sapply(A, square<int>());

  return C;
}

test_sapply_functor();
// Output: 1   4   9  16  25  36  49  64  81 100

lapply

  • Apply a C++ function or functor to each element of an object and receive a List back.

  • All Vector and Matrix types are supported.

// --- C++ Function Approach
template <typename T>
T square( const T& x){
    return x * x ;
}

List test_lapply_function() {
  // Create sample data
  IntegerVector A = seq_len(3);

  // Call C++ Function
  List B = lapply(A, square<int>);

  return B;
}

test_lapply_function();
// Output:
// [[1]]
// [1] 1
// [[2]]
// [1] 4
// [[3]]
// [1] 9

// --- C++ Functor Approach
template <typename T>
struct square : std::unary_function<T,T> {
  T operator()(const T& x) const { return x * x; }
};
// std::unary_function is deprecated in C++11

List test_lapply_functor() {
  // Create sample data
  IntegerVector A = seq_len(3);

  // Call C++ functor
  List C = lapply(A, square<int>());

  return C;
}

test_lapply_functor();
// Output: 
// [[1]]
// [1] 1
// [[2]]
// [1] 4
// [[3]]
// [1] 9

mapply

  • Apply a C++ function or functor on up to three input objects and receive a Vector.

  • The return type is automatically detected by the supplied C++ function or functor of the Vector class.

  • All Vector and Matrix types are supported.

  • Example:

// --- C++ Function Example
template <typename T>
T sum_val(T x, T y, T z) {
     return x + y + z ;
}

// Sample Data
NumericVector A = NumericVector::create(1, 2, 3);
NumericVector B = NumericVector::create(2, 3, 4);
NumericVector C = NumericVector::create(3, 4, 5);

NumericVector D = mapply(A, B, C, sum_val<double>);
// Output: 6 9 12

Special Functions of Mathematics

Beta

beta( A, B ) lbeta( A, B )
  • Compute value of the beta function, \(B \left(a,b\right)\), and the natural logarithm of the beta function, \(\log \left( {B \left(a,b\right)} \right)\).

  • Definition:

\[\begin{aligned} B\left( {a,b} \right) &= \frac{ {\Gamma \left( a \right)\Gamma \left( b \right)} }{ {\Gamma \left( {a + b} \right)} } \\ & = \int\limits_0^1 { {t^{\left( {a - 1} \right)} }{ {\left( {1 - t} \right)}^{\left( {b - 1} \right)} }dt} \\ \log \left( { B\left( {a,b} \right) } \right) &= \log \left( { \int\limits_0^1 { {t^{\left( {a - 1} \right)} }{ {\left( {1 - t} \right)}^{\left( {b - 1} \right)} }dt} } \right) \end{aligned}\]

  • Supported types are Integer or Numeric of a Vector or Matrix.

  • Examples:

// Sample Data
NumericVector A = NumericVector::create(10, 9, 8, 7, 6);
NumericVector B = NumericVector::create(5, 4, 3, 2, 1);

// --- Sample Vectorized Calls

NumericVector C = beta(A, B);
// Output: 9.99001e-005 0.000505051 0.00277778 0.0178571 0.166667

NumericVector D = lbeta(A, B);
// Output: -9.21134 -7.59085 -5.8861 -4.02535 -1.79176

// --- Optional Scalar

NumericVector E = beta(10, B);
// Output: 9.99001e-005 0.00034965 0.00151515 0.00909091 0.1

NumericVector F = beta(A, 5);
// Output: 9.99001e-005 0.0001554 0.000252525 0.0004329 0.000793651
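
  • The first identity in the definition suggests why lbeta is useful: working with logarithms of the gamma function avoids the overflow that \(\Gamma\) itself hits for large arguments. A minimal standard C++ sketch, with hypothetical names lbeta_sketch and beta_sketch and assuming a, b > 0, is:

```cpp
#include <cmath>

// log B(a, b) from the log-gamma identity; assumes a > 0 and b > 0.
double lbeta_sketch(double a, double b) {
    return std::lgamma(a) + std::lgamma(b) - std::lgamma(a + b);
}

// B(a, b) recovered from its logarithm, so Gamma() is never evaluated directly.
double beta_sketch(double a, double b) {
    return std::exp(lbeta_sketch(a, b));
}
```

  • For example, beta_sketch(10, 5) agrees with the first element of C above, 1/10010, while lbeta_sketch(200, 200) stays finite even though std::tgamma(400) overflows to infinity.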

Gamma

gamma( X ) lgamma( X )
  • Compute value of the gamma function, \(\Gamma \left(x\right)\), and the natural logarithm of the absolute value of the gamma function, \(\log \left( {\left| {\Gamma \left( x \right)} \right|} \right)\).

  • Definition:

\[\begin{aligned} \Gamma \left( x \right) &= \int\limits_0^\infty { {t^{\left( {x - 1} \right)} }\exp \left( { - t} \right)dt} \\ \log \left( {\left| {\Gamma \left( x \right)} \right|} \right) &= \log \left( {\left| { \int\limits_0^\infty { {t^{\left( {x - 1} \right)} }\exp \left( { - t} \right)dt} } \right|} \right) \end{aligned}\]

  • Supported types are Integer or Numeric of a Vector or Matrix.

  • Examples:

// Sample Data
NumericVector A = NumericVector::create(1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5);

NumericVector B = gamma(A);
// Output: 1 0.886227 1 1.32934 2 3.32335 6 11.6317 24

NumericVector C = lgamma(A);
// Output: 0 -0.120782 0 0.284683 0.693147 1.20097 1.79176 2.45374 3.17805

Gamma Derivatives

psigamma( X , deriv )
digamma( X ) trigamma( X )
tetragamma( X ) pentagamma( X )
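  • Obtain derivatives of the logarithm of the gamma function. psigamma returns the nth derivative of the digamma function, that is, the (n + 1)th derivative of the logarithm of the gamma function. For convenience, the first (digamma) through fourth (pentagamma) derivatives of the logarithm of the gamma function have been defined.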

  • Definition: \[\begin{aligned} {\psi _n}\left( x \right) &= \frac{ { {d^{n + 1} } } }{ {d{x^{n + 1} } } }\ln \Gamma \left( x \right) \\ &= \frac{ { {d^n} } }{ {d{x^n} } }\frac{ {\Gamma '\left( x \right)} }{ {\Gamma \left( x \right)} } \\ \end{aligned}\]

  • The deriv parameter of the psigamma function specifies n, the order of the derivative taken of the digamma function. For example, deriv = 0 gives digamma and deriv = 1 gives trigamma.

  • For the psigamma function, only the Numeric type of the Vector and Matrix class is supported. The convenience derivative functions have support for both Integer and Numeric type of the Vector and Matrix class.

  • Examples:

NumericVector A = NumericVector::create(1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5);

NumericVector B = digamma(A);
// Output: -0.577216 0.03649 0.422784 0.703157 0.922784 1.10316 1.25612 1.38887 1.50612

NumericVector C = trigamma(A);
// Output: 1.64493 0.934802 0.644934 0.490358 0.394934 0.330358 0.283823 0.248725 0.221323

NumericVector D = tetragamma(A);
// Output: -2.40411 -0.828797 -0.404114 -0.236204 -0.154114 -0.108204 -0.0800397 -0.0615568 -0.0487897

NumericVector E = pentagamma(A);
// Output: 6.49394 1.40909 0.493939 0.223906 0.118939 0.0703058 0.0448653 0.0303225 0.0214278

// --- psigamma with deriv = 1 matches trigamma
NumericVector F = psigamma(A, 1.0);
// Output: 1.64493 0.934802 0.644934 0.490358 0.394934 0.330358 0.283823 0.248725 0.221323
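
To make the relationship between lgamma, digamma, and trigamma concrete, the derivatives can be approximated numerically. This is only a sketch in standard C++ under stated assumptions: the helper names are hypothetical, finite differences stand in for the exact routines, and x is assumed to be positive and much larger than the step size.

```cpp
#include <cmath>

// Hypothetical helper: digamma(x) is the first derivative of log(Gamma(x)),
// approximated here with a central difference of std::lgamma.
double digamma_fd_sketch(double x, double h = 1e-5) {
    return (std::lgamma(x + h) - std::lgamma(x - h)) / (2.0 * h);
}

// Hypothetical helper: trigamma(x) is the second derivative of
// log(Gamma(x)), approximated with a second-order central difference.
double trigamma_fd_sketch(double x, double h = 1e-4) {
    return (std::lgamma(x + h) - 2.0 * std::lgamma(x) + std::lgamma(x - h))
           / (h * h);
}
```

Evaluated at x = 1, the two approximations land close to the first entries of `B` and `C` above (roughly -0.5772 and 1.6449). The exact functions should be preferred in real code since finite differences lose accuracy quickly.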

Factorials

factorial( X ) lfactorial( X )
  • Compute the product of all positive integers less than or equal to n using factorial and the natural logarithm of the absolute value of the factorial with lfactorial.

  • Definition: \[\begin{aligned} n! &= \prod\limits_{k = 1}^n { k } = 1 \times 2 \times \cdots \times \left( {n - 1} \right) \times n \\ &= \begin{cases} 1, &\text{if } n = 0 \\ n\left( {n - 1} \right)!, &\text{if } n > 0 \end{cases} \\ &= \Gamma\left(n+1\right) \\ \log \left( {\left| {n!} \right|} \right) &= \log \left( {\left| {\Gamma \left( {n + 1} \right)} \right|} \right) \end{aligned}\]

  • Only the Numeric type of the Vector and Matrix class is supported.

  • Examples:

// Sample Data
NumericVector A = NumericVector::create(1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5);

NumericVector B = factorial(A);
// Output: 1 1.32934 2 3.32335 6 11.6317 24 52.3428 120

NumericVector C = lfactorial(A);
// Output: 0 0.284683 0.693147 1.20097 1.79176 2.45374 3.17805 3.95781 4.78749
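
The non-integer results above follow from the gamma identity in the definition. A minimal standard C++ sketch of that identity is given below; the helper names are hypothetical and this is not how Rcpp implements the functions.

```cpp
#include <cmath>

// Hypothetical helper: n! through the identity n! = Gamma(n + 1), which
// also covers non-integer n. Assumes n > -1.
double factorial_sketch(double n) {
    return std::tgamma(n + 1.0);
}

// Hypothetical helper: log(|n!|) computed directly on the log scale.
double lfactorial_sketch(double n) {
    return std::lgamma(n + 1.0);
}
```

For large n the log form stays finite after the plain factorial has overflowed a double, which is the usual reason to reach for lfactorial.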

Combinatorics

choose( N , K ) lchoose( N , K)
  • Compute the binomial coefficients for all real numbers n and integer k using choose and the natural logarithm of the absolute value of the binomial coefficients with lchoose.

  • Definition: \[\begin{aligned} {n \choose k} &= \frac{n!}{k!\left( {n - k} \right)!} = \frac{n\left( {n - 1} \right) \cdots \left( {n - k + 1} \right)}{k!} \\ \log \left( {\left| {n \choose k} \right|} \right) &= \log \left( {\left| {\frac{n\left( {n - 1} \right) \cdots \left( {n - k + 1} \right)}{k!} } \right|} \right) \end{aligned}\]

  • Only the Numeric type of the Vector and Matrix class is supported.

  • Examples:

// Sample Data
NumericVector A = NumericVector::create(10, 9, 8, 7, 6);
NumericVector B = NumericVector::create(5, 4, 3, 2, 1);

// --- Vectorize Choose

NumericVector C = choose(A, B);
// Output: 252 126 56 21 6

NumericVector D = lchoose(A, B);
// Output: 5.52943 4.83628 4.02535 3.04452 1.79176

// --- Scalar Choose

NumericVector E = choose(10.0, B);
// Output: 252 210 120 45 10

NumericVector F = choose(A, 5.0);
// Output: 252 126 56 21 6
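
The falling-product form in the definition is what allows n to be any real number. A small standard C++ sketch of that form follows; `choose_sketch` and `lchoose_sketch` are hypothetical names, and k is assumed to be a non-negative integer.

```cpp
#include <cmath>

// Hypothetical helper: generalized binomial coefficient for real n and
// integer k, using n (n - 1) ... (n - k + 1) / k!.
double choose_sketch(double n, int k) {
    if (k < 0) return 0.0;
    double result = 1.0;
    for (int i = 1; i <= k; ++i) {
        // Multiply by one numerator term, then divide by one denominator
        // term, so intermediate values stay small.
        result = result * (n - k + i) / i;
    }
    return result;
}

// Hypothetical helper: natural log of the absolute binomial coefficient.
double lchoose_sketch(double n, int k) {
    return std::log(std::fabs(choose_sketch(n, k)));
}
```

With integer n the sketch reproduces the values above, for example `choose_sketch(10.0, 5)` is 252, while a fractional n such as `choose_sketch(0.5, 2)` gives -0.125.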

Statistical Summaries

Minimum and Maximum

min( X ) max( X )
  • Obtain the minimum (min) or maximum (max) value within a Vector or Matrix.

  • Examples:

// Sample Data
NumericVector X = NumericVector::create(3, 4, 9, 5, 1, 2);

// Obtain max value for X
double X_max = max(X);
// Output: 9

// Obtain min value for X
double X_min = min(X);
// Output: 1

range( X )

  • Computes the range of the sample, returned as its minimum and maximum values.

  • Supported types are Integer or Numeric of a Vector or Matrix.

// Sample Data
NumericVector X = NumericVector::create(3, 4, 9, 5, 1, 2);

// Obtain range value for X
NumericVector X_range = range(X);
// Output: 1 9

mean( X )

  • Computes the overall sample mean by summing each observation and dividing by the total number of observations.

  • Definition: \[\bar{X} = \frac{1}{n}\sum\limits_{i = 1}^n { {X_i} } \]

  • Supported types are Integer, Numeric, Complex, or Logical of a Vector or Matrix.

// Sample Data
NumericVector X = NumericVector::create(3, 4, 9, 5, 1, 2);

// Obtain mean value for X
double X_mean = mean(X);
// Output: 4

median( X , na_rm)

  • Computes the sample median by ordering elements from largest to smallest and then selecting the middle element. If an even number of elements is present, then the median consists of an average between the two middle numbers.

  • All types of a Vector or Matrix are supported.

  • Examples:

// Sample Data
NumericVector X = NumericVector::create(3,4,9,5,NA_REAL,1,2);

// Obtain the median value for X by removing NAs
double X_median = median(X, true);
// Output: 3.5

// By default, NA is not removed
double X_median_na = median(X);
// Output: NaN

Variance

var( X ) sd( X )
  • Computes the sample variance by dividing the corrected sum of squares by \(n-1\); the sample standard deviation is the square root of that value.

  • Definition: \[\begin{aligned} \operatorname{var}\left( X \right) &= \frac{1}{n-1}\sum\limits_{i = 1}^n { { {\left( { {X_i} - \bar X} \right)}^2} } \\ \operatorname{sd}\left( X \right) &= \sqrt{\operatorname{var}\left( X \right)} \end{aligned}\]

  • Note: No parameter support exists to switch between population (\(\frac{1}{n}\)) and sample (\(\frac{1}{n-1}\)) definitions. If necessary, multiplying the sample variance by double(n-1)/n provides the appropriate conversion to the population variance; take the square root of the result for the population standard deviation.

  • Supported types are Integer, Numeric, Complex, or Logical of a Vector or Matrix.

// Sample data
NumericVector X = NumericVector::create(5.3, 1.9, 7.4, 4.5, 2.5);

// --- Compute the Variance
double var_val = var(X);
// Output: 4.912

// --- Compute the SD
double sd_val = sd(X);
// Output: 2.216303

Special Operations

rev( X )

  • Reverse the position of the elements within the vector.

  • All types are supported of a Vector or Matrix.

  • Example:

// Sample data
NumericVector A = NumericVector::create(1, 10, 100, 1000, 10000);

// --- Reverse the vector
NumericVector B = rev(A);
// Output: 10000  1000   100    10     1

Parallel Extremum

pmax( X, Y ) pmin( X, Y )
  • Obtain the parallel (element-wise) maximum or minimum of either a scalar and a Vector or two Vector objects.

  • For instance, the parallel minimum of X = 0, 1 and Y = 2, -3 would be Z = 0, -3.

  • Examples:

// Sample Data
NumericVector A = NumericVector::create(1,3,2,4,6,5);
NumericVector B = NumericVector::create(2,1,3,5,4,6);

// --- Parallel Maximum

// Scalar and Vector
NumericVector C = pmax(2, A);
// Output: 2 3 2 4 6 5

// Two vectors
NumericVector D = pmax(A, B);
// Output: 2 3 3 5 6 6

// --- Parallel Minimum

// Scalar and Vector
NumericVector E = pmin(2, A);
// Output: 1 2 2 2 2 2

// Two vectors
NumericVector F = pmin(A, B);
// Output: 1 1 2 4 4 5

clamp( min , X , max )

  • Bound elements of X between min and max by replacing the element by the boundary if \(X < min\) or \(X > max\).

  • An alternate version of this function can be obtained with pmax(min, pmin(X, max) ).

  • Example:

// Sample Data
NumericVector A = NumericVector::create(-3.4, 5.1, -8.1, 10.8, 2.9, 4.3, 15.5);

// --- Clamp values

// Replace negative values with 0 and any value above 10 with 10
NumericVector C = clamp(0.0, A, 10.0);
// Output: 0.0 5.1 0.0 10.0 2.9 4.3 10.0
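
The equivalence with pmax(min, pmin(X, max)) can be sketched element by element. The version below is standard C++ using `std::vector<double>` rather than an Rcpp type; `clamp_sketch` is a hypothetical name and the bounds are assumed to satisfy lo <= hi.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: element-wise max(lo, min(x, hi)), mirroring the
// pmax(min, pmin(X, max)) formulation. Assumes lo <= hi.
std::vector<double> clamp_sketch(double lo, const std::vector<double>& x,
                                 double hi) {
    std::vector<double> out;
    out.reserve(x.size());
    for (double v : x) {
        // The inner min caps the value from above, the outer max from below.
        out.push_back(std::max(lo, std::min(v, hi)));
    }
    return out;
}
```

Applied to the sample data with bounds 0 and 10, the sketch returns the same seven values as the clamp call above.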

Extremum Index

which_min( X ) which_max( X )
  • Provides the position in the vector of the minimum or maximum value.

  • All types are supported of a Vector or a Matrix.

  • Examples:

// Sample data
NumericVector A = NumericVector::create(3.2, 5.2, -9.7, 4.3, 8.8);

// Return index for min value
int min_idx = which_min(A);
// Output: 2 (because C++ indices start at 0!)

// Return index for max value
int max_idx = which_max(A);
// Output: 4 (because C++ indices start at 0!)

Uniqueness Operators

match( X , Y )

  • Obtain, for each element of X, the 1-based position of its first match in Y.

  • All types are supported of a Vector or a Matrix.

  • Example:

CharacterVector A = CharacterVector::create("a", "b");
CharacterVector B = CharacterVector::create("c", "a", "b", "c", "e", "a", "b", "d");

IntegerVector C = match(A, B);
// Output: 2 3

self_match( X )

  • Obtain the locations of where each element occurs in the object.

  • All types are supported of a Vector or a Matrix.

  • Note: This operation is similar to the R command match(x, unique(x)).

  • Example:

CharacterVector A = CharacterVector::create("a", "b", "c", "c", "e", "b", "d");

IntegerVector B = self_match(A);
// Output: 1 2 3 3 4 2 5
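
The match(x, unique(x)) behaviour described in the note can be sketched with a hash map that remembers the order in which distinct values first appear. This is standard C++ with a hypothetical helper name; it illustrates the result rather than Rcpp's internal hashing.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: for each element, report the 1-based index of its
// value among the distinct values, ordered by first appearance.
std::vector<int> self_match_sketch(const std::vector<std::string>& x) {
    std::unordered_map<std::string, int> first_seen;
    std::vector<int> out;
    out.reserve(x.size());
    for (const std::string& v : x) {
        // emplace only inserts when v has not been seen before; either way
        // the returned iterator points at the stored index for v.
        auto it = first_seen.emplace(
            v, static_cast<int>(first_seen.size()) + 1).first;
        out.push_back(it->second);
    }
    return out;
}
```

Run on the seven strings above, the sketch yields 1 2 3 3 4 2 5.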

in( X , Y )

  • Determine whether elements in X are found in Y.

  • All types are supported of a Vector or a Matrix.

  • Example:

CharacterVector A = CharacterVector::create("a", "b", "c", "c", "e", "b", "d");

CharacterVector B = CharacterVector::create("a", "b");

LogicalVector C = in(A, B);
// Output: TRUE TRUE FALSE FALSE FALSE TRUE FALSE

Unique

duplicated( X ) unique( X )
  • Determine whether duplicates exist within a Vector or what the unique values are.

  • Duplicated values are identified by a LogicalVector in which the first occurrence of a value is labeled FALSE and every later repeat of that value is labeled TRUE.

  • Unique provides only the first occurrence of a given value. In essence, only the values that appear as FALSE within the duplicated function.

  • Examples:

// Sample data
CharacterVector A = CharacterVector::create("a","b","c","a","b","c","a");

// Detect duplicates within a character vector
LogicalVector B = duplicated(A);
// Output: FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE

// Obtain only unique values
CharacterVector C = unique(A);
// Output: "a", "b", "c"
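
The link between the two functions, where unique keeps exactly the elements that duplicated marks FALSE, can be sketched with a set of already-seen values. The code is standard C++ with hypothetical helper names. It keeps values in order of first appearance, which is an assumption of this sketch and not a statement about the ordering Rcpp returns.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: true for every element whose value appeared earlier.
std::vector<bool> duplicated_sketch(const std::vector<std::string>& x) {
    std::unordered_set<std::string> seen;
    std::vector<bool> out;
    out.reserve(x.size());
    for (const std::string& v : x) {
        // insert().second is true only the first time a value is inserted.
        out.push_back(!seen.insert(v).second);
    }
    return out;
}

// Hypothetical helper: keep only the elements that are not duplicates.
std::vector<std::string> unique_sketch(const std::vector<std::string>& x) {
    std::unordered_set<std::string> seen;
    std::vector<std::string> out;
    for (const std::string& v : x) {
        if (seen.insert(v).second) out.push_back(v);
    }
    return out;
}
```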

sort_unique( X )

  • Determine the unique elements within an object and sort them in increasing order.

  • All types are supported of a Vector or a Matrix.

  • Examples:

// Sample data
CharacterVector A = CharacterVector::create("a","b","c","a","b","c","a");
NumericVector B = NumericVector::create(1.1, 1, 1, 2.5, 2.5, 3, 3);

// --- Find unique values and sort them

CharacterVector C = sort_unique(A);
// Output: "a" "b" "c"

NumericVector D = sort_unique(B);
// Output: 1 1.1 2.5 3

table( X )

  • Given a Vector of either NumericVector, IntegerVector, LogicalVector, or CharacterVector, compute an IntegerVector that contains a count of each value.

  • Examples:

// The value counted at each position is carried by the names attribute
// of the returned IntegerVector.

// Create Numerical Data
NumericVector A = NumericVector::create(3.2, 1.2, -0.5, NA_REAL, 1.2, 2.0, NA_REAL);

// Tabulate
IntegerVector B = table(A);

// Create Integer Data
IntegerVector C = IntegerVector::create(2, -2, NA_INTEGER, NA_INTEGER, 200, 2);

// Tabulate the integer data
IntegerVector D = table(C);

// Create Character Data
CharacterVector E = CharacterVector::create("a", "a", "b", NA_STRING, "c", NA_STRING);

// Tabulate
IntegerVector F = table(E);

// Create Logical Data
LogicalVector G = LogicalVector::create(true, NA_LOGICAL, false, true, NA_LOGICAL);

// Tabulate
IntegerVector H = table(G);

Set Operations

setequal(X, Y)

  • Determine if two objects contain the same set of values, regardless of order or repetition.

  • Definition: \[A = B \iff \forall x:\left( {x \in A \Leftrightarrow x \in B} \right)\]

  • All types are supported of a Vector or Matrix.

  • Examples:

// Sample Data for equal case
CharacterVector A = CharacterVector::create("a", "b", "c");
CharacterVector B = CharacterVector::create("c", "b", "a");

// Sample Data for unequal case
CharacterVector C = CharacterVector::create("a", "b");
CharacterVector D = CharacterVector::create("c");

bool val_equal_set = setequal(A, B);
// Output: TRUE

bool val_unequal_set = setequal(C, D);
// Output: FALSE

intersect(X, Y)

  • Find all elements two objects have in common.

  • Definition: \[A \cap B = \left\{ {x:x \in A \wedge x \in B} \right\}\]

  • All types are supported of a Vector or Matrix.

  • Examples:

// Sample Data for overlapping elements
CharacterVector A = CharacterVector::create("a", "b", "c");
CharacterVector B = CharacterVector::create("c", "b", "d");

// Sample Data for empty set
CharacterVector C = CharacterVector::create("a", "b");
CharacterVector D = CharacterVector::create("c");

CharacterVector E = intersect(A, B);
// Output: "b" "c"

CharacterVector F = intersect(C, D);
// Output: 
// (Empty set/None)

union_(X, Y)

  • Obtain one copy of every element that exists in either set.

  • Definition: \[A \cup B = \left\{ {x:x \in A \vee x \in B} \right\}\]

  • All types are supported of a Vector or Matrix.

  • Note: Union has a postfix of an underscore (_) because union is a keyword in C++.

  • Examples:

// Sample Data for overlapping elements
CharacterVector A = CharacterVector::create("a", "b", "c");
CharacterVector B = CharacterVector::create("c", "b", "d");

// Sample Data for no shared elements
CharacterVector C = CharacterVector::create("a", "b");
CharacterVector D = CharacterVector::create("c");

// --- Calling union_

CharacterVector E = union_(A, B);
// Output: "d" "a" "c" "b"

CharacterVector F = union_(C, D);
// Output: "a" "c" "b"

setdiff(X, Y)

  • Obtain the elements in A not in B or the intersection of A with the complement of B.

  • Definition: \[A\backslash B = A - B = A \cap {B^C} = \left\{ {x:x \in A \wedge x \notin B} \right\}\]

  • All types are supported of a Vector or Matrix.

  • Examples:

// Sample Data for overlapping elements
CharacterVector A = CharacterVector::create("a", "b", "c");
CharacterVector B = CharacterVector::create("c", "b", "d");

// Sample Data for no shared elements
CharacterVector C = CharacterVector::create("a", "b");
CharacterVector D = CharacterVector::create("c");

// --- Calling setdiff

CharacterVector E = setdiff(A, B);
// Output: "a"

// Note order results in different values!
CharacterVector F = setdiff(B, A);
// Output: "d"

CharacterVector G = setdiff(C, D);
// Output: "a" "b"
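
The four set operations in this section map directly onto the standard library's sorted-range algorithms. The sketch below is plain C++ built on `std::set<std::string>`; the function names and the `StringSet` alias are hypothetical. Because `std::set` stores its keys sorted, results come back in sorted order, which may differ from the element order shown in the Rcpp outputs above.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>

using StringSet = std::set<std::string>;  // hypothetical alias

// Elements found in either set, stored once.
StringSet union_sketch(const StringSet& a, const StringSet& b) {
    StringSet out;
    std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                   std::inserter(out, out.end()));
    return out;
}

// Elements found in both sets.
StringSet intersect_sketch(const StringSet& a, const StringSet& b) {
    StringSet out;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::inserter(out, out.end()));
    return out;
}

// Elements of a that are not in b; the argument order matters.
StringSet setdiff_sketch(const StringSet& a, const StringSet& b) {
    StringSet out;
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        std::inserter(out, out.end()));
    return out;
}

// Equality of the underlying sets, independent of insertion order.
bool setequal_sketch(const StringSet& a, const StringSet& b) {
    return a == b;
}
```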

Matrix Operations

Row and Column Sums

colSums( X , na_rm) rowSums( X , na_rm )
  • Computes the summation of elements either by column (colSums) or row (rowSums) of a Matrix.

  • Definition:

\[\begin{aligned} \text{(Row) } { {X}_{i} } &= \sum\limits_{j = 1}^J { {X_{i,j} }} \\ \text{(Column) } { {X}_{j} } &= \sum\limits_{i = 1}^I { {X_{i,j} }} \\ \end{aligned} \]

  • Supported types are Numeric, Integer or Complex of the Matrix class.

  • The return type is the equivalent input type but of the Vector class.

  • Examples:

NumericVector A = NumericVector::create(1.0, 2.0, 3.0, 4.0);
NumericMatrix B = NumericMatrix(2, 2, A.begin());
// Output:
// 1.00000 3.00000
// 2.00000 4.00000

// --- Various matrix sums

NumericVector C = colSums(B);
// Output: 3 7

NumericVector D = rowSums(B);
// Output: 4 6

Row and Column Means

colMeans( X , na_rm) rowMeans( X , na_rm)
  • Computes the means of elements either by column (colMeans) or row (rowMeans) of a Matrix.

  • Definition:

\[\begin{aligned} \text{(Row) } { {\bar X}_{i} } &= \frac{1}{J}\sum\limits_{j = 1}^J { {X_{i,j} }} \\ \text{(Column) }{ {\bar X}_{j} } &= \frac{1}{I}\sum\limits_{i = 1}^I { {X_{i,j} }} \\ \end{aligned} \]

  • Supported types are Numeric, Integer or Complex of the Matrix class.

  • The return type is the equivalent input type but of the Vector class.

  • Examples:

NumericVector A = NumericVector::create(1.0, 2.0, 3.0, 4.0);
NumericMatrix B = NumericMatrix(2, 2, A.begin());
// Output:
// 1.00000 3.00000
// 2.00000 4.00000

// --- Various matrix means

NumericVector C = colMeans(B);
// Output: 1.5 3.5

NumericVector D = rowMeans(B);
// Output: 2 3

outer( X , Y , Function)

  • Applies a Function to every pair of elements drawn from two Vector objects, \(\vec u\) of length n and \(\vec v\) of length m, to obtain a generalized outer product. When the Function is multiplication, the result is the usual outer product \(\vec u{ {\vec v}^T}\).

  • Definition: \[\operatorname{outer}\left( {\vec u,\vec v,f} \right) = \left[ {\begin{array}{*{20}{c} } {f\left( { {u_1},{v_1} } \right)}&{f\left( { {u_1},{v_2} } \right)}& \cdots &{f\left( { {u_1},{v_m} } \right)} \\ {f\left( { {u_2},{v_1} } \right)}&{f\left( { {u_2},{v_2} } \right)}& \cdots &{f\left( { {u_2},{v_m} } \right)} \\ \vdots & \vdots & \ddots & \vdots \\ {f\left( { {u_n},{v_1} } \right)}&{f\left( { {u_n},{v_2} } \right)}& \cdots &{f\left( { {u_n},{v_m} } \right)} \end{array} } \right]\]

  • All types of the Vector class are supported. However, the supplied Function must have the correct parameter input types to receive values from each Vector.

  • The returned value is a Matrix class with its type given by the value returned by the Function.

  • The Function parameter also accepts a C++ functor in place of an Rcpp Function. In such cases, the underlying type T must be defined. Select one of the following arguments for T:

    • int
    • double
    • std::complex<double> / Rcomplex
    • bool
  • Numeric Output

Functor Meaning
std::plus<T>() x + y
std::minus<T>() x - y
std::multiplies<T>() x * y
std::divides<T>() x / y
std::modulus<T>() x % y
  • Logical Output
Functor Meaning
std::equal_to<T>() x == y
std::not_equal_to<T>() x != y
std::greater<T>() x > y
std::less<T>() x < y
std::greater_equal<T>() x >= y
std::less_equal<T>() x <= y
  • Examples:
// Sample Data
NumericVector A = NumericVector::create(0.5, 1.0, 1.5, 2.0);
NumericVector B = NumericVector::create(0.0, 0.5, 1.0, 1.5);

// --- Applying outer with functors

NumericMatrix C = outer(A, B, std::plus<double>());
// Output:
//  0.5  1.0  1.5  2.0
//  1.0  1.5  2.0  2.5
//  1.5  2.0  2.5  3.0
//  2.0  2.5  3.0  3.5

LogicalMatrix D = outer(A, B, std::less<double>());
// Output:
// FALSE FALSE  TRUE  TRUE
// FALSE FALSE FALSE  TRUE
// FALSE FALSE FALSE FALSE
// FALSE FALSE FALSE FALSE
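
What outer does with a functor can be sketched as a double loop that applies the functor to each (row, column) pair. The version below is standard C++ over `std::vector<double>` with a nested vector standing in for a Matrix; `outer_sketch` and its `Out` template parameter are hypothetical and not part of Rcpp.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: m[i][j] = f(u[i], v[j]) for every pair of elements.
// Out is the element type produced by the functor (double, bool, ...).
template <typename Out, typename F>
std::vector<std::vector<Out>> outer_sketch(const std::vector<double>& u,
                                           const std::vector<double>& v,
                                           F f) {
    std::vector<std::vector<Out>> m(u.size(), std::vector<Out>(v.size()));
    for (std::size_t i = 0; i < u.size(); ++i) {
        for (std::size_t j = 0; j < v.size(); ++j) {
            m[i][j] = f(u[i], v[j]);
        }
    }
    return m;
}
```

Calling `outer_sketch<double>(A, B, std::plus<double>())` or `outer_sketch<bool>(A, B, std::less<double>())` on the sample data gives the same entries as the two matrices above.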

Triangle Matrix Views

lower_tri( X , diag) upper_tri( X , diag )
  • Creates a Matrix of Logical values of the same dimensions as the supplied Matrix with values in the lower or upper triangle portion being true.

  • All types are supported of the Matrix class.

  • By default, the diag parameter is false and, therefore, does not include the major diagonal (upper left to lower right).

  • Note: Prior to Rcpp 0.13.0, neither the upper_tri nor the lower_tri function worked.

  • Examples:

NumericVector A = NumericVector::create(1, 2, 3, 4, 5, 6, 7, 8, 9);
NumericMatrix B = NumericMatrix(3, 3, A.begin());
// Output:
// 1.00000 4.00000 7.00000
// 2.00000 5.00000 8.00000
// 3.00000 6.00000 9.00000

// --- Lower Triangular Matrix

LogicalMatrix E = lower_tri(B);
// Output:
// FALSE FALSE FALSE
//  TRUE FALSE FALSE
//  TRUE  TRUE FALSE

LogicalMatrix F = lower_tri(B, true);
// Output:
// TRUE FALSE FALSE
// TRUE  TRUE FALSE
// TRUE  TRUE  TRUE

// --- Upper Triangular Matrix

LogicalMatrix C = upper_tri(B);
// Output:
// FALSE  TRUE  TRUE
// FALSE FALSE  TRUE
// FALSE FALSE FALSE

LogicalMatrix D = upper_tri(B, true);
// Output:
//  TRUE  TRUE  TRUE
// FALSE  TRUE  TRUE
// FALSE FALSE  TRUE
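
The effect of the diag parameter comes down to whether the row and column comparison is strict. A minimal standard C++ sketch of the two masks is shown here; the helper names are hypothetical and a nested `std::vector<bool>` stands in for a LogicalMatrix.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: true below the major diagonal (row > column), plus
// the diagonal itself when diag is true.
std::vector<std::vector<bool>> lower_tri_sketch(std::size_t nrow,
                                                std::size_t ncol,
                                                bool diag = false) {
    std::vector<std::vector<bool>> mask(nrow, std::vector<bool>(ncol, false));
    for (std::size_t i = 0; i < nrow; ++i) {
        for (std::size_t j = 0; j < ncol; ++j) {
            mask[i][j] = diag ? (i >= j) : (i > j);
        }
    }
    return mask;
}

// Hypothetical helper: true above the major diagonal (row < column), plus
// the diagonal itself when diag is true.
std::vector<std::vector<bool>> upper_tri_sketch(std::size_t nrow,
                                                std::size_t ncol,
                                                bool diag = false) {
    std::vector<std::vector<bool>> mask(nrow, std::vector<bool>(ncol, false));
    for (std::size_t i = 0; i < nrow; ++i) {
        for (std::size_t j = 0; j < ncol; ++j) {
            mask[i][j] = diag ? (i <= j) : (i < j);
        }
    }
    return mask;
}
```

For a 3 x 3 input with diag left at false, the last row of the lower mask is true, true, false, since the diagonal element is excluded.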

diag( X )

  • Extracts the major diagonal going from the upper left to lower right of a Matrix.

  • All types of the Matrix class are supported.

  • The return type is a Vector of an equivalent type to the input Matrix.

  • Example:

NumericVector A = NumericVector::create(1, 2, 3, 4, 5, 6, 7, 8, 9);
NumericMatrix B = NumericMatrix(3, 3, A.begin());
// Output:
// 1.00000 4.00000 7.00000
// 2.00000 5.00000 8.00000
// 3.00000 6.00000 9.00000

NumericVector C = diag(B);
// Output: 1 5 9

Matrix Indexes

col( X ) row( X )
  • Creates a Matrix where each element contains either a 1-based index for its column (col) or row (row).

  • All types of the Matrix class are supported.

  • The return type is an IntegerMatrix.

  • Examples:

NumericVector A = NumericVector::create(1.0, 2.0, 3.0, 4.0);
NumericMatrix B = NumericMatrix(2, 2, A.begin());
// Output:
// 1.00000 3.00000
// 2.00000 4.00000

// --- Various Matrix Indexes

IntegerMatrix C = col(B);
// Output:
// 1 2
// 1 2


IntegerMatrix D = row(B);
// Output:
// 1 1
// 2 2

Object Creation

cbind

cbind(X, Y)
cbind(X, Y, …)

  • Creates a Matrix by joining objects together in a column-wise manner.

  • X, Y may be any combination of Vector, Matrix, or atomic value of the same underlying type T, where T is one of

    • int
    • double
    • std::complex<double> / Rcomplex
    • bool
  • cbind is defined for any number of arguments between 2 and 50, inclusive.

  • Let S1 and S2 be scalar (atomic) values, V be a Vector with length k, and M be a Matrix with m rows and n columns. The cbind function exhibits the following behavior:

    • cbind(S1, S2) returns a 1 x 2 Matrix.
    • cbind(S1, V) and cbind(V, S1) return a k x 2 Matrix, where S1 is recycled k times.
    • cbind(S1, M) and cbind(M, S1) return an m x (n + 1) Matrix, where S1 is recycled m times.
    • If k and m are equal, cbind(V, M) and cbind(M, V) return an m x (n + 1) Matrix.
    • If k and m are not equal, cbind(V, M) and cbind(M, V) will throw an exception at runtime.
    • S1 and S2 may be consecutive arguments in a cbind expression if and only if:
      • they are the only arguments used; or
      • all other arguments are also scalars; or
      • non-scalar, adjacent arguments are vectors of length one, or matrices with one row.
    • All other cases involving consecutive arguments S1 and S2 will generate a runtime error.
  • Examples:

double d = 1.0;
NumericVector v(3, 2.0);
NumericMatrix m(3, 2); 
m.fill(3.0);

Rcout 
    << std::setprecision(2) 
    << "cbind(1.5, 2.5):\n" << cbind(1.5, 2.5) << "\n"
    << "cbind(d, v):\n" << cbind(d, v) << "\n"
    << "cbind(v, d):\n" << cbind(v, d) << "\n"
    << "cbind(d, v, m, v, d):\n" << cbind(d, v, m, v, d) 
    << std::endl;
// Output:
// cbind(1.5, 2.5):
// 1.5 2.5
// 
// cbind(d, v):
// 1.0 2.0
// 1.0 2.0
// 1.0 2.0
// 
// cbind(v, d):
// 2.0 1.0
// 2.0 1.0
// 2.0 1.0
// 
// cbind(d, v, m, v, d):
// 1.0 2.0 3.0 3.0 2.0 1.0
// 1.0 2.0 3.0 3.0 2.0 1.0
// 1.0 2.0 3.0 3.0 2.0 1.0

Sequence Generation

seq_along( X ) seq_len( n )
  • Generate an Integer sequence with the index beginning at 1 either based on an object (seq_along) or length (seq_len).

  • All types are supported of a Vector or Matrix.

  • The return type is that of an IntegerVector.

  • Examples:

NumericVector A = NumericVector::create(-1, 0, 1);

// By default, seq_along returns R indices
IntegerVector B = seq_along(A);
// Output: 1 2 3

// Generates a vector of specified length
IntegerVector C = seq_len(5);
// Output: 1 2 3 4 5

Replicate Elements

rep( X, n ) rep_each( X, n ) rep_len( X, n )
  • Replicate elements in three flavors:

    • rep: duplicate the object n times retaining the initial element order.
    • rep_each: each element is repeated n times consecutively.
    • rep_len: duplicate object until the new object has length of n.
  • All types are supported of a Vector or Matrix.

  • Examples:

NumericVector A = NumericVector::create(-1, 0, 1);

NumericVector B = rep(A, 3);
// Output: -1 0 1 -1 0 1 -1 0 1

NumericVector C = rep_each(A, 3);
// Output: -1 -1 -1 0 0 0 1 1 1

NumericVector D = rep_len(A, 5);
// Output: -1 0 1 -1 0
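
The difference between the three flavors is easiest to see as three small loops. The sketch below is standard C++ over `std::vector<double>`; the function names are hypothetical, and `rep_len_sketch` assumes a non-empty input whenever n is positive.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: the whole vector repeated n times, order retained.
std::vector<double> rep_sketch(const std::vector<double>& x, std::size_t n) {
    std::vector<double> out;
    out.reserve(x.size() * n);
    for (std::size_t r = 0; r < n; ++r) {
        out.insert(out.end(), x.begin(), x.end());
    }
    return out;
}

// Hypothetical helper: each element repeated n times consecutively.
std::vector<double> rep_each_sketch(const std::vector<double>& x,
                                    std::size_t n) {
    std::vector<double> out;
    out.reserve(x.size() * n);
    for (double v : x) {
        out.insert(out.end(), n, v);
    }
    return out;
}

// Hypothetical helper: elements recycled until the output has length n.
// Assumes x is non-empty when n > 0.
std::vector<double> rep_len_sketch(const std::vector<double>& x,
                                   std::size_t n) {
    std::vector<double> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(x[i % x.size()]);
    }
    return out;
}
```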

String Operations

collapse( X )

  • Collapse multiple string values into a single string.

  • Only the Character and String types of a Vector or Matrix are supported.

  • Note: The function is equivalent to paste(c('a', 'b'), collapse = "").

  • Example:

CharacterVector A = CharacterVector::create("w","o","r","l","d");

std::string val_str = collapse(A);
// Output: world

trimws( X, which )

  • Trim leading and/or trailing whitespace from strings.

  • The which argument controls how the trimming is performed. This is required and can be specified with either "l" (leading), "r" (trailing), or "b" (both).

  • Only the Character and String types of a Vector or Matrix are supported.

  • This function provides support for returning either a String, Vector, or Matrix depending on the supplied parameter.

  • Definition: Whitespace in the context of this function is considered to be either: space ( ), horizontal tab (\t), line feed (\n), or carriage return (\r).

  • Note: The function is equivalent to matching the following regular expression (regex) patterns:

    • ^[ \t\r\n]+ (left/leading)
    • [ \t\r\n]+$ (right/trailing)
    • ^[ \t\r\n]+|[ \t\r\n]+$ (both)
  • Example:

// -- Example Data
CharacterVector A = CharacterVector::create("  a b c", "a b c  ", "  a b c  ");

Rcpp::String B =  "  \t\r\na b c\t\r\n  ";

CharacterMatrix C = cbind(A, B);

// -- Vectors 

CharacterVector D = trimws(A, "r");
// Output: "  a b c" "a b c" "  a b c"

CharacterVector E = trimws(A, "l");
// Output: "a b c" "a b c  " "a b c  "

CharacterVector F = trimws(A, "b");
// Output: "a b c" "a b c" "a b c"

// -- Single Rcpp String object

Rcpp::String G = trimws(B, "b");
// Output: "a b c"


// -- Matrix

CharacterMatrix H = trimws(C, "b");
// Output:
//      [,1]    [,2]   
// [1,] "a b c" "a b c"
// [2,] "a b c" "a b c"
// [3,] "a b c" "a b c"
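
The whitespace definition and the three which modes can be sketched for a single string using the standard library's search functions. This is plain C++ on `std::string`, not the Rcpp implementation; `trimws_sketch` is a hypothetical name and the mode is passed as a single character for brevity.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: trim space, tab, carriage return, and line feed from
// the left ('l'), the right ('r'), or both ends ('b') of a string. Any
// other mode character returns the input unchanged.
std::string trimws_sketch(const std::string& s, char which) {
    const char* ws = " \t\r\n";
    std::size_t first = 0;
    std::size_t last = s.size();  // one past the final kept character

    if (which == 'l' || which == 'b') {
        first = s.find_first_not_of(ws);
        if (first == std::string::npos) return "";  // only whitespace
    }
    if (which == 'r' || which == 'b') {
        const std::size_t pos = s.find_last_not_of(ws);
        if (pos == std::string::npos) return "";  // only whitespace
        last = pos + 1;
    }
    return s.substr(first, last - first);
}
```

Interior whitespace is never touched, so trimming both ends of the String `B` above leaves the text a b c with its single spaces intact.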

Statistical Distributions

  • There exist two approaches for working with statistical distribution functions within Rcpp. The approaches differ in how the result is returned. Specifically, statistical distribution functions within the Rcpp:: namespace return a NumericVector, whereas functions within the R:: namespace return a single double scalar value.

  • For drawing large samples with fixed distribution parameters, one should sample under the Rcpp:: namespace to obtain a NumericVector. An added benefit of this scheme is that the log probability and lower tail parameters have defaults akin to the traditional R versions.

  • For drawing samples with changing distribution parameters, sampling under the R:: namespace with a for loop is preferred as parameters for each draw can be customized.
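
The scalar-in-a-loop pattern from the last bullet can be illustrated without R. The sketch below is a standard-library analogue only: it swaps in `std::binomial_distribution` and `std::mt19937` from `<random>` in place of R's generator and the R:: functions, and `draw_binomials_sketch` is a hypothetical name. The point is the shape of the loop, where each draw receives its own parameter.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: one binomial draw per entry of probs, all with the
// same number of trials. Assumes size >= 0 and every prob lies in (0, 1).
std::vector<int> draw_binomials_sketch(int size,
                                       const std::vector<double>& probs,
                                       unsigned int seed) {
    std::mt19937 engine(seed);  // standard-library generator, not R's RNG
    std::vector<int> draws;
    draws.reserve(probs.size());
    for (std::size_t i = 0; i < probs.size(); ++i) {
        // The distribution is rebuilt on every pass because prob changes.
        std::binomial_distribution<int> dist(size, probs[i]);
        draws.push_back(dist(engine));
    }
    return draws;
}
```

In Rcpp code the body of the loop would instead call a scalar function from the R:: namespace with the parameter for that draw.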

Discrete Distributions

Binomial Distribution

dbinom(X, size, prob, log_p)
pbinom(Q, size, prob, lower_tail, log_p)
qbinom(P, size, prob, lower_tail, log_p)
rbinom(n, size, prob)
// Consider X ~ Bin(n = 100, p = 0.5)

IntegerVector xx = IntegerVector::create(46, 47, 48, 49, 50, 51, 52, 53, 54);

// Vector returns
NumericVector densities = Rcpp::dbinom(xx, 100, .5, false);
NumericVector probs     = Rcpp::pbinom(xx, 100, .5, true, false);
NumericVector qvals     = Rcpp::qbinom(probs, 100, .5, true, false);
NumericVector rsamples  = Rcpp::rbinom(20, 100, .5); 

// Scalar Returns
double dval  = R::dbinom(46, 100, .5, false);
double pval  = R::pbinom(46, 100, .5, true, false);
double qval  = R::qbinom(0.242, 100, .5, true, false);
int rdraw    = R::rbinom(100, .5);

Geometric Distribution

dgeom(X, prob, log_p)
pgeom(Q, prob, lower_tail, log_p)
qgeom(P, prob, lower_tail, log_p)
rgeom(n, prob)
// Consider X ~ Geom(p = 0.25)
IntegerVector xx = seq_len(5);

// Vector returns
NumericVector densities = Rcpp::dgeom(xx, .25, false);
NumericVector probs     = Rcpp::pgeom(xx, .25, true, false);
IntegerVector qvals     = Rcpp::qgeom(probs, .25, true, false);
IntegerVector rsamples  = Rcpp::rgeom(20, .25); 

// Scalar Returns
double dval  = R::dgeom(2, .25, false);
double pval  = R::pgeom(2, .25, true, false);
int qval     = R::qgeom(0.578125, .25, true, false);
int rdraw    = R::rgeom(.25);

Hypergeometric Distribution

dhyper( X, m, n, k, log_p)
phyper( Q, m, n, k, lower_tail, log_p)
qhyper( P, m, n, k, lower_tail, log_p)
rhyper(nn, m, n, k)
// Consider X ~ Hypergeo(m = 10, n = 7, k = 8)
IntegerVector xx = IntegerVector::create(3, 4, 5, 6, 7);

// Vector returns
NumericVector densities = Rcpp::dhyper(xx, 10, 7, 8, false);
NumericVector probs     = Rcpp::phyper(xx, 10, 7, 8, true, false);
NumericVector qvals     = Rcpp::qhyper(probs, 10, 7, 8, true, false);
NumericVector rsamples  = Rcpp::rhyper(20, 10, 7, 8); 

// Scalar Returns
double dval  = R::dhyper(4, 10, 7, 8, false);
double pval  = R::phyper(4, 10, 7, 8, true, false);
int qval     = R::qhyper(0.4193747, 10, 7, 8, true, false);
int rdraw    = R::rhyper(10, 7, 8);

Negative Binomial Distribution

dnbinom(X, size, prob, log_p)
pnbinom(Q, size, prob, lower_tail, log_p)
qnbinom(P, size, prob, lower_tail, log_p)
rnbinom(n, size, prob)
// Consider X ~ NegBin(n = 100, p = 0.5)
IntegerVector xx = IntegerVector::create(46, 47, 48, 49, 50, 51, 52, 53, 54);

// Vector returns
NumericVector densities = Rcpp::dnbinom(xx, 100, .5, false);
NumericVector probs     = Rcpp::pnbinom(xx, 100, .5, true, false);
NumericVector qvals     = Rcpp::qnbinom(probs, 100, .5, true, false);
NumericVector rsamples  = Rcpp::rnbinom(20, 100, .5); 

// Scalar Returns
double dval  = R::dnbinom(46, 100, .5, false);
double pval  = R::pnbinom(46, 100, .5, true, false);
double qval  = R::qnbinom(0.242, 100, .5, true, false);
int rdraw    = R::rnbinom(100, .5);

Poisson Distribution

dpois(X, lambda, log_p)
ppois(Q, lambda, lower_tail, log_p)
qpois(P, lambda, lower_tail, log_p)
rpois(n, lambda)
  • Computes either the density (d), probability (p), quantile (q), or a random (r) sample of a vector or scalar from a Poisson distribution.

  • When simulating a vector or scalar from a Poisson distribution, the value returned is a non-negative integer (e.g. \(0, 1, 2, \ldots\)).

  • Under vectorization, e.g. Rcpp::, the default distribution parameters are as follows:

    • log_p = FALSE, probabilities and densities are returned as \(p\) rather than \(\log(p)\).
    • lower_tail = TRUE, probabilities are calculated by \(P(X \le x)\) instead of \(P(X > x)\).
  • Examples:

// Consider X ~ Pois(4)
IntegerVector xx = seq_len(10);

// Vector returns
NumericVector densities = Rcpp::dpois(xx, 4, false);
NumericVector probs     = Rcpp::ppois(xx, 4, true, false);
NumericVector qvals     = Rcpp::qpois(probs, 4, true, false);
NumericVector rsamples  = Rcpp::rpois(20, 4); 

// Scalar Returns
double dval  = R::dpois(46, 4, false);
double pval  = R::ppois(46, 4, true, false);
double qval  = R::qpois(0.242, 4, true, false);
int rdraw    = R::rpois(4);
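
To show what the log_p and lower_tail flags change, the Poisson density and distribution function can be written out directly. This is an illustrative standard C++ sketch with hypothetical helper names, using direct summation for the distribution function; it is not how R or Rcpp compute these quantities and it assumes lambda > 0 with non-negative integer arguments.

```cpp
#include <cmath>

// Hypothetical helper: Poisson density at integer x >= 0 for lambda > 0.
// With log_p = true the log density is returned instead of the density.
double dpois_sketch(int x, double lambda, bool log_p = false) {
    // log density: x log(lambda) - lambda - log(x!)
    const double log_density =
        x * std::log(lambda) - lambda - std::lgamma(x + 1.0);
    return log_p ? log_density : std::exp(log_density);
}

// Hypothetical helper: P(X <= q) when lower_tail is true, P(X > q)
// otherwise, obtained by summing the density from 0 through q.
double ppois_sketch(int q, double lambda, bool lower_tail = true,
                    bool log_p = false) {
    double lower = 0.0;
    for (int x = 0; x <= q; ++x) {
        lower += dpois_sketch(x, lambda);
    }
    const double p = lower_tail ? lower : 1.0 - lower;
    return log_p ? std::log(p) : p;
}
```

The lower and upper tail values for the same q always sum to one, and the log_p = true result is the natural logarithm of the log_p = false result.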

Wilcox Distribution

dwilcox( X, m, n, log_p)
pwilcox( Q, m, n, lower_tail, log_p)
qwilcox( P, m, n, lower_tail, log_p)
rwilcox(nn, m, n)
  • Computes either the density (d), probability (p), quantile (q), or a random (r) sample of a vector or scalar from a Wilcox distribution.

  • When simulating a vector or scalar from a Wilcox distribution, the value returned is a non-negative integer (e.g. \(0, 1, 2, \ldots\)).

  • Under vectorization, e.g. Rcpp::, the default distribution parameters are as follows:

    • log_p = FALSE, probabilities and densities are returned as \(p\) rather than \(\log(p)\).
    • lower_tail = TRUE, probabilities are calculated by \(P(X \le x)\) instead of \(P(X > x)\).
  • Examples:

Wilcoxon Signed Rank Distribution

dsignrank( X, n, log_p)
psignrank( Q, n, lower_tail, log_p)
qsignrank( P, n, lower_tail, log_p)
rsignrank(nn, n)

Continuous Distributions

Beta Distribution

dbeta(X, shape1, shape2, log_p)
pbeta(Q, shape1, shape2, lower_tail, log_p)
qbeta(P, shape1, shape2, lower_tail, log_p)
rbeta(n, shape1, shape2)
  • Computes either the density (d), probability (p), quantile (q), or a random (r) sample of a vector or scalar from a Beta distribution.

  • When simulating a vector or scalar from a Beta distribution, the value returned is within [0, 1].

  • For the vectorized functions, e.g. those in the Rcpp:: namespace, the default distribution parameters are as follows:

    • log_p = FALSE, probabilities and densities are returned as \(p\); when TRUE, they are returned as \(\log(p)\).
    • lower_tail = TRUE, probabilities are calculated as \(P(X \le x)\); when FALSE, as \(P(X > x)\).
  • Examples:

// Consider X ~ Beta(1,1)
NumericVector xx = NumericVector::create(0.0, 0.25, 0.5, 0.75, 1.0);

// Vector returns
NumericVector densities = Rcpp::dbeta(xx, 1.0, 1.0, false);
NumericVector probs     = Rcpp::pbeta(xx, 1.0, 1.0, true, false);
NumericVector qvals     = Rcpp::qbeta(probs, 1.0, 1.0, true, false);
NumericVector rsamples  = Rcpp::rbeta(5, 1.0, 1.0);

// Scalar Returns
double dval  = R::dbeta(0.5, 1.0, 1.0, false);
double pval  = R::pbeta(0.5, 1.0, 1.0, true, false);
double qval  = R::qbeta(0.85, 1.0, 1.0, true, false);
double rdraw = R::rbeta(1.0, 1.0);

Cauchy Distribution

dcauchy(X, location, scale, log_p)
pcauchy(Q, location, scale, lower_tail, log_p)
qcauchy(P, location, scale, lower_tail, log_p)
rcauchy(n, location, scale)
  • Computes either the density (d), probability (p), quantile (q), or a random (r) sample of a vector or scalar from a Cauchy distribution.

  • When simulating a vector or scalar from a Cauchy distribution, the value returned is within \(\left(-\infty, \infty\right)\).

  • For the vectorized functions, e.g. those in the Rcpp:: namespace, the default distribution parameters are as follows:

    • log_p = FALSE, probabilities and densities are returned as \(p\); when TRUE, they are returned as \(\log(p)\).
    • lower_tail = TRUE, probabilities are calculated as \(P(X \le x)\); when FALSE, as \(P(X > x)\).
  • Examples:

// Consider X ~ Cauchy(loc = 0, scale = 1)
NumericVector xx = NumericVector::create(0.0, 0.25, 0.5, 0.75, 1.0);

// Vector returns
NumericVector densities = Rcpp::dcauchy(xx, 0.0, 1.0, false);
NumericVector probs     = Rcpp::pcauchy(xx, 0.0, 1.0, true, false);
NumericVector qvals     = Rcpp::qcauchy(probs, 0.0, 1.0, true, false);
NumericVector rsamples  = Rcpp::rcauchy(20, 0.0, 1.0); 

// Scalar Returns
double dval  = R::dcauchy(0.25, 0.0, 1.0, false);
double pval  = R::pcauchy(0.25, 0.0, 1.0, true, false);
double qval  = R::qcauchy(0.578, 0.0, 1.0, true, false);
double rdraw = R::rcauchy(0.0, 1.0);
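
The Cauchy distribution has closed forms for its density, distribution function, and quantile function, so the d, p, and q values can be cross-checked by hand. A minimal sketch in plain C++ follows; it does not use Rcpp, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

const double kPi = std::acos(-1.0);

// Density: 1 / (pi * scale * (1 + z^2)) with z = (x - location) / scale.
double cauchy_density(double x, double location, double scale) {
    assert(scale > 0.0);
    double z = (x - location) / scale;
    return 1.0 / (kPi * scale * (1.0 + z * z));
}

// Lower-tail distribution function: 1/2 + atan(z) / pi.
double cauchy_cdf(double x, double location, double scale) {
    assert(scale > 0.0);
    return 0.5 + std::atan((x - location) / scale) / kPi;
}

// Quantile function: the inverse of cauchy_cdf for p in (0, 1).
double cauchy_quantile(double p, double location, double scale) {
    assert(scale > 0.0 && p > 0.0 && p < 1.0);
    return location + scale * std::tan(kPi * (p - 0.5));
}
```

With location 0 and scale 1, cauchy_cdf(1.0, 0.0, 1.0) is 0.75, and feeding a probability from cauchy_cdf back into cauchy_quantile recovers the original point up to rounding.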

Chi-square Distribution

dchisq(X, df, log_p)
pchisq(Q, df, lower_tail, log_p)
qchisq(P, df, lower_tail, log_p)
rchisq(n, df)
  • Computes either the density (d), probability (p), quantile (q), or a random (r) sample of a vector or scalar from a Chi-squared distribution.

  • When simulating a vector or scalar from a Chi-squared distribution, the value returned is within \(\left[0, \infty\right)\).

  • For the vectorized functions, e.g. those in the Rcpp:: namespace, the default distribution parameters are as follows:

    • log_p = FALSE, probabilities and densities are returned as \(p\); when TRUE, they are returned as \(\log(p)\).
    • lower_tail = TRUE, probabilities are calculated as \(P(X \le x)\); when FALSE, as \(P(X > x)\).
  • Examples:

// Consider X ~ Chi-squared(df = 2)
NumericVector xx = NumericVector::create(0.0, 0.25, 0.5, 0.75, 1.0);

// Vector returns
NumericVector densities = Rcpp::dchisq(xx, 2, false);
NumericVector probs     = Rcpp::pchisq(xx, 2, true, false);
NumericVector qvals     = Rcpp::qchisq(probs, 2, true, false);
NumericVector rsamples  = Rcpp::rchisq(20, 2); 

// Scalar Returns
double dval  = R::dchisq(0.25, 2, false);
double pval  = R::pchisq(0.5, 2, true, false);
double qval  = R::qchisq(0.22, 2, true, false);
double rdraw = R::rchisq(2);

Non-central Chi-square Distribution

dnchisq(X, df, ncp, log_p)
pnchisq(Q, df, ncp, lower_tail, log_p)
qnchisq(P, df, ncp, lower_tail, log_p)
  • Computes either the density (d), probability (p), quantile (q), or a random (r) sample of a vector or scalar from a Non-central Chi-squared distribution.

  • When simulating a vector or scalar from a Non-central Chi-squared distribution, the value returned is within \(\left[0, \infty\right)\).

  • For the vectorized functions, e.g. those in the Rcpp:: namespace, the default distribution parameters are as follows:

    • log_p = FALSE, probabilities and densities are returned as \(p\); when TRUE, they are returned as \(\log(p)\).
    • lower_tail = TRUE, probabilities are calculated as \(P(X \le x)\); when FALSE, as \(P(X > x)\).
  • Examples:

// Consider X ~ Non-central Chi-squared(df = 2, ncp = 2.5)
NumericVector xx = NumericVector::create(0.0, 0.25, 0.5, 0.75, 1.0);

// Vector returns
NumericVector densities = Rcpp::dnchisq(xx, 2, 2.5, false);
NumericVector probs     = Rcpp::pnchisq(xx, 2, 2.5, true, false);
NumericVector qvals     = Rcpp::qnchisq(probs, 2, 2.5, true, false);

// Scalar Returns
double dval  = R::dnchisq(0.25, 2, 2.5, false);
double pval  = R::pnchisq(0.5, 2, 2.5, true, false);
double qval  = R::qnchisq(0.22, 2, 2.5, true, false);

Exponential Distribution

dexp(X, rate, log_p)
pexp(Q, rate, lower_tail, log_p)
qexp(P, rate, lower_tail, log_p)
rexp(n, rate)
  • Computes either the density (d), probability (p), quantile (q), or a random (r) sample of a vector or scalar from an Exponential distribution under the lambda parameterization: \(f(v) = \lambda \exp(-\lambda v)\).

  • When simulating a vector or scalar from an Exponential distribution, the value returned is within \(\left[0, \infty\right)\).

  • For the vectorized functions, e.g. those in the Rcpp:: namespace, the default distribution parameters are as follows:

    • rate = 1, the \(\lambda\) parameter of the exponential distribution.
    • log_p = FALSE, probabilities and densities are returned as \(p\); when TRUE, they are returned as \(\log(p)\).
    • lower_tail = TRUE, probabilities are calculated as \(P(X \le x)\); when FALSE, as \(P(X > x)\).
  • Examples:

// Consider X ~ Exp(Rate = 1)
NumericVector xx = NumericVector::create(0.0, 0.25, 0.5, 0.75, 1.0);

// Vector returns
NumericVector densities = Rcpp::dexp(xx, 1.0, false);
NumericVector probs     = Rcpp::pexp(xx, 1.0, true, false);
NumericVector qvals     = Rcpp::qexp(probs, 1.0, true, false);
NumericVector rsamples  = Rcpp::rexp(20, 1.0); 

// Scalar Returns
double dval  = R::dexp(0.25, 1.0, false);
double pval  = R::pexp(0.5, 1.0, true, false);
double qval  = R::qexp(0.393, 1.0, true, false);
double rdraw = R::rexp(1.0);
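
The lambda parameterization, with density lambda * exp(-lambda * v), leads to simple closed forms for the distribution and quantile functions. The sketch below writes them out in plain C++ so the flags can be traced step by step; it does not use Rcpp, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Density f(v) = lambda * exp(-lambda * v) for v >= 0, and 0 otherwise.
double exp_density(double v, double lambda, bool log_p) {
    assert(lambda > 0.0);
    if (v < 0.0) return log_p ? -INFINITY : 0.0;
    double log_density = std::log(lambda) - lambda * v;
    return log_p ? log_density : std::exp(log_density);
}

// Distribution function: P(X <= v) = 1 - exp(-lambda * v).
double exp_cdf(double v, double lambda, bool lower_tail, bool log_p) {
    assert(lambda > 0.0);
    double upper = (v < 0.0) ? 1.0 : std::exp(-lambda * v);  // P(X > v)
    double p = lower_tail ? 1.0 - upper : upper;
    return log_p ? std::log(p) : p;
}

// Quantile for a lower-tail probability p: v = -log(1 - p) / lambda.
double exp_quantile(double p, double lambda) {
    assert(lambda > 0.0 && p >= 0.0 && p < 1.0);
    return -std::log1p(-p) / lambda;
}
```

With lambda = 1, exp_cdf(0.5, 1.0, true, false) is about 0.393, and passing that value to exp_quantile returns 0.5 again up to rounding.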

F Distribution

df(X, df1, df2, log_p)
pf(Q, df1, df2, lower_tail, log_p)
qf(P, df1, df2, lower_tail, log_p)
rf(n, df1, df2)
  • Computes either the density (d), probability (p), quantile (q), or a random (r) sample of a vector or scalar from an F distribution.

  • When simulating a vector or scalar from an F distribution, the value returned is within \(\left[0, \infty\right)\).

  • For the vectorized functions, e.g. those in the Rcpp:: namespace, the default distribution parameters are as follows:

    • log_p = FALSE, probabilities and densities are returned as \(p\); when TRUE, they are returned as \(\log(p)\).
    • lower_tail = TRUE, probabilities are calculated as \(P(X \le x)\); when FALSE, as \(P(X > x)\).
  • Examples:

// Consider X ~ F(1, 5)
NumericVector xx = NumericVector::create(0.0, 0.25, 0.5, 0.75, 1.0);

// Vector returns
NumericVector densities = Rcpp::df(xx, 1.0, 5.0, false);
NumericVector probs     = Rcpp::pf(xx, 1.0, 5.0, true, false);
NumericVector qvals     = Rcpp::qf(probs, 1.0, 5.0, true, false);
NumericVector rsamples  = Rcpp::rf(20, 1.0, 5.0); 

// Scalar Returns
double dval  = R::df(0.25, 1.0, 5.0, false);
double pval  = R::pf(0.5, 1.0, 5.0, true, false);
double qval  = R::qf(0.49, 1.0, 5.0, true, false);
double rdraw = R::rf(1.0, 5.0);

Non-central F Distribution

Gamma Distribution

dgamma(X, shape, rate, log_p)
pgamma(Q, shape, rate, lower_tail, log_p)
qgamma(P, shape, rate, lower_tail, log_p)
rgamma(n, shape, rate)
  • Computes either the density (d), probability (p), quantile (q), or a random (r) sample of a vector or scalar from a Gamma distribution.

  • When simulating a vector or scalar from a Gamma distribution, the value returned is within \(\left[0, \infty\right)\).

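  • For the vectorized functions, e.g. those in the Rcpp:: namespace, the default distribution parameters are as follows:

    • rate = 1, the rate parameter of the distribution (a rate of 1 is equivalent to a scale of 1).
    • log_p = FALSE, probabilities and densities are returned as \(p\); when TRUE, they are returned as \(\log(p)\).
    • lower_tail = TRUE, probabilities are calculated as \(P(X \le x)\); when FALSE, as \(P(X > x)\).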
  • Examples:

// Consider X ~ Gamma(1, 1)
NumericVector xx = NumericVector::create(0.0, 0.25, 0.5, 0.75, 1.0);

// Vector returns
NumericVector densities = Rcpp::dgamma(xx, 1.0, 1.0, false);
NumericVector probs     = Rcpp::pgamma(xx, 1.0, 1.0, true, false);
NumericVector qvals     = Rcpp::qgamma(probs, 1.0, 1.0, true, false);
NumericVector rsamples  = Rcpp::rgamma(20, 1.0, 1.0); 

// Scalar Returns
double dval  = R::dgamma(0.25, 1.0, 1.0, false);
double pval  = R::pgamma(0.5, 1.0, 1.0, true, false);
double qval  = R::qgamma(0.393, 1.0, 1.0, true, false);
double rdraw = R::rgamma(1.0, 1.0);

Normal Distribution

dnorm(X, mean, sd, log_p)
pnorm(Q, mean, sd, lower_tail, log_p)
qnorm(P, mean, sd, lower_tail, log_p)
rnorm(n, mean, sd)
  • Computes either the density (d), probability (p), quantile (q), or a random (r) sample of a vector or scalar from a Normal distribution.

  • When simulating a vector or scalar from a Normal distribution, the value returned is within \(\left(-\infty, \infty\right)\).

  • For the vectorized functions, e.g. those in the Rcpp:: namespace, the default distribution parameters are as follows:

    • mean = 0, the mean of the distribution.
    • sd = 1, the standard deviation of the distribution.
    • log_p = FALSE, probabilities and densities are returned as \(p\); when TRUE, they are returned as \(\log(p)\).
    • lower_tail = TRUE, probabilities are calculated as \(P(X \le x)\); when FALSE, as \(P(X > x)\).
  • Examples:

// Consider X ~ Norm(0, 1)
NumericVector xx = NumericVector::create(0.0, 0.25, 0.5, 0.75, 1.0);

// Vector returns
NumericVector densities = Rcpp::dnorm(xx, 0.0, 1.0, false);
NumericVector probs     = Rcpp::pnorm(xx, 0.0, 1.0, true, false);
NumericVector qvals     = Rcpp::qnorm(probs, 0.0, 1.0, true, false);
NumericVector rsamples  = Rcpp::rnorm(20, 0.0, 1.0); 

// Scalar Returns
double dval  = R::dnorm(0.25, 0.0, 1.0, false);
double pval  = R::pnorm(1.96, 0.0, 1.0, true, false);
double qval  = R::qnorm(0.95, 0.0, 1.0, true, false);
double rdraw = R::rnorm(0.0, 1.0);
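
The p functions take a quantile and return a probability, while the q functions take a probability and return a quantile. As a standalone illustration in plain C++ (not Rcpp; the names norm_density and norm_cdf are hypothetical), the normal density and distribution function can be written with the standard library's complementary error function:

```cpp
#include <cassert>
#include <cmath>

// Normal density with the given mean and standard deviation.
double norm_density(double x, double mean, double sd) {
    assert(sd > 0.0);
    const double inv_sqrt_2pi = 0.3989422804014327;  // 1 / sqrt(2 * pi)
    double z = (x - mean) / sd;
    return inv_sqrt_2pi / sd * std::exp(-0.5 * z * z);
}

// Normal distribution function. lower_tail = true gives P(X <= x),
// lower_tail = false gives P(X > x).
double norm_cdf(double x, double mean, double sd, bool lower_tail) {
    assert(sd > 0.0);
    double z = (x - mean) / sd;
    double arg = lower_tail ? -z : z;
    return 0.5 * std::erfc(arg / std::sqrt(2.0));
}
```

For the standard normal, norm_cdf(1.96, 0.0, 1.0, true) is approximately 0.975, which is why 1.96 is a quantile and 0.975 is a probability.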

Log Normal Distribution

dlnorm(X, meanlog, sdlog, log_p)
plnorm(Q, meanlog, sdlog, lower_tail, log_p)
qlnorm(P, meanlog, sdlog, lower_tail, log_p)
rlnorm(n, meanlog, sdlog)
  • Computes either the density (d), probability (p), quantile (q), or a random (r) sample of a vector or scalar from a Log Normal distribution.

  • When simulating a vector or scalar from a Log Normal distribution, the value returned is within \(\left(0,\infty\right)\).

  • For the vectorized functions, e.g. those in the Rcpp:: namespace, the default distribution parameters are as follows:

    • meanlog = 0, the mean of the distribution on the log scale.
    • sdlog = 1, the standard deviation of the distribution on the log scale.
    • log_p = FALSE, probabilities and densities are returned as \(p\); when TRUE, they are returned as \(\log(p)\).
    • lower_tail = TRUE, probabilities are calculated as \(P(X \le x)\); when FALSE, as \(P(X > x)\).
  • Examples:

// Consider X ~ LogNorm(0, 1)
NumericVector xx = NumericVector::create(0.0, 0.25, 0.5, 0.75, 1.0);

// Vector returns
NumericVector densities = Rcpp::dlnorm(xx, 0.0, 1.0, false);
NumericVector probs     = Rcpp::plnorm(xx, 0.0, 1.0, true, false);
NumericVector qvals     = Rcpp::qlnorm(probs, 0.0, 1.0, true, false);
NumericVector rsamples  = Rcpp::rlnorm(20, 0.0, 1.0); 

// Scalar Returns
double dval  = R::dlnorm(0.25, 0.0, 1.0, false);
double pval  = R::plnorm(0.5, 0.0, 1.0, true, false);
double qval  = R::qlnorm(0.0452, 0.0, 1.0, true, false);
double rdraw = R::rlnorm(0.0, 1.0);

Logistic Distribution

dlogis(X, location, scale, log_p)
plogis(Q, location, scale, lower_tail, log_p)
qlogis(P, location, scale, lower_tail, log_p)
rlogis(n, location, scale)
  • Computes either the density (d), probability (p), quantile (q), or a random (r) sample of a vector or scalar from a Logistic distribution.

  • When simulating a vector or scalar from a Logistic distribution, the value returned is within \(\left(-\infty,\infty\right)\).

  • For the vectorized functions, e.g. those in the Rcpp:: namespace, the default distribution parameters are as follows:

    • location = 0, the shift component of the distribution.
    • scale = 1, the dispersion parameter that controls the spread of the distribution, e.g. a small scale gives a concentrated distribution.
    • log_p = FALSE, probabilities and densities are returned as \(p\); when TRUE, they are returned as \(\log(p)\).
    • lower_tail = TRUE, probabilities are calculated as \(P(X \le x)\); when FALSE, as \(P(X > x)\).
  • Examples:

// Consider X ~ Logis(0, 1)
NumericVector xx = NumericVector::create(0.0, 0.25, 0.5, 0.75, 1.0);

// Vector returns
NumericVector densities = Rcpp::dlogis(xx, 0.0, 1.0, false);
NumericVector probs     = Rcpp::plogis(xx, 0.0, 1.0, true, false);
NumericVector qvals     = Rcpp::qlogis(probs, 0.0, 1.0, true, false);
NumericVector rsamples  = Rcpp::rlogis(20, 0.0, 1.0); 

// Scalar Returns
double dval  = R::dlogis(0.25, 0.0, 1.0, false);
double pval  = R::plogis(0.5, 0.0, 1.0, true, false);
double qval  = R::qlogis(0.0452, 0.0, 1.0, true, false);
double rdraw = R::rlogis(0.0, 1.0);
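
The roles of location and scale are easiest to see in the closed-form expressions for the logistic distribution. The following plain C++ sketch writes them out; it does not use Rcpp, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Density: exp(-z) / (scale * (1 + exp(-z))^2) with z = (x - location) / scale.
// The density is symmetric in z, so |z| is used to avoid overflow.
double logis_density(double x, double location, double scale) {
    assert(scale > 0.0);
    double e = std::exp(-std::fabs((x - location) / scale));
    return e / (scale * (1.0 + e) * (1.0 + e));
}

// Lower-tail distribution function: 1 / (1 + exp(-z)).
double logis_cdf(double x, double location, double scale) {
    assert(scale > 0.0);
    return 1.0 / (1.0 + std::exp(-(x - location) / scale));
}

// Quantile function: location + scale * log(p / (1 - p)) for p in (0, 1).
double logis_quantile(double p, double location, double scale) {
    assert(scale > 0.0 && p > 0.0 && p < 1.0);
    return location + scale * std::log(p / (1.0 - p));
}
```

With location 0 and scale 1, logis_cdf(0.0, 0.0, 1.0) is 0.5 and logis_density(0.0, 0.0, 1.0) is 0.25. Shrinking the scale raises the peak of the density, which matches the description of a more concentrated distribution.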

Student’s T Distribution

rt()

Uniform Distribution

runif()

Weibull Distribution

rweibull()

Appendix

RTYPES

RTYPE SEXPTYPE Description
0 NILSXP NULL
1 SYMSXP symbols
2 LISTSXP pairlists
3 CLOSXP closures
4 ENVSXP environments
5 PROMSXP promises
6 LANGSXP language objects
7 SPECIALSXP special functions
8 BUILTINSXP builtin functions
9 CHARSXP internal character strings
10 LGLSXP logical vectors
13 INTSXP integer vectors
14 REALSXP numeric vectors
15 CPLXSXP complex vectors
16 STRSXP character vectors
17 DOTSXP dot-dot-dot object
18 ANYSXP make “any” args work
19 VECSXP list (generic vector)
20 EXPRSXP expression vector
21 BCODESXP byte code
22 EXTPTRSXP external pointer
23 WEAKREFSXP weak reference
24 RAWSXP raw vector
25 S4SXP S4 classes not of simple type
99 FUNSXP functions of type CLOSXP, SPECIALSXP and BUILTINSXP
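
A lookup from the integer codes in the table to their names can be sketched in plain C++. The function name sexptype_name is hypothetical, and the codes are hard-coded from the table above rather than read from R's headers.

```cpp
#include <cassert>
#include <string>

// Map an RTYPE code from the table above to its SEXPTYPE name.
// Hypothetical helper: codes not listed in the table yield "unknown".
std::string sexptype_name(int rtype) {
    assert(rtype >= 0);
    switch (rtype) {
        case 0:  return "NILSXP";
        case 1:  return "SYMSXP";
        case 2:  return "LISTSXP";
        case 3:  return "CLOSXP";
        case 4:  return "ENVSXP";
        case 5:  return "PROMSXP";
        case 6:  return "LANGSXP";
        case 7:  return "SPECIALSXP";
        case 8:  return "BUILTINSXP";
        case 9:  return "CHARSXP";
        case 10: return "LGLSXP";
        case 13: return "INTSXP";
        case 14: return "REALSXP";
        case 15: return "CPLXSXP";
        case 16: return "STRSXP";
        case 17: return "DOTSXP";
        case 18: return "ANYSXP";
        case 19: return "VECSXP";
        case 20: return "EXPRSXP";
        case 21: return "BCODESXP";
        case 22: return "EXTPTRSXP";
        case 23: return "WEAKREFSXP";
        case 24: return "RAWSXP";
        case 25: return "S4SXP";
        case 99: return "FUNSXP";
        default: return "unknown";
    }
}
```

For example, sexptype_name(14) returns "REALSXP", the type behind numeric vectors, while the unused codes 11 and 12 fall through to "unknown".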