R and RStudio in a Secured Environment on CentOS 7

Tags: programming, r
Author: TheCoatlessProfessor
Published: July 21, 2018
Editor’s Note: A big thank you goes out to Kevin Ushey, an IDE developer for RStudio, who helped in isolating some issues that arose when taking RStudio offline (for public-facing discussion see rstudio/rstudio#2266 and rstudio/rstudio#2288).

Intro

The following write-up is a memory dump of how we established the R and RStudio configuration in the College of Engineering at the University of Illinois’ Computer-based Testing Facility (CBTF).

For those who are not at Illinois, the CBTF provides faculty and staff with the ability to assess students individually using a secured computational environment. The logistics and administrative backend have also shifted from a faculty member to a permanent coordinator, which enables the faculty member to focus on generating a high-quality exam question pool. Moreover, exams can now be held weekly, with the option of retakes, instead of being limited to two tests per semester by the nightmare of writing and grading multiple test variants.

You can see the configuration used in the CBTF here:

Procedure

The procedure here for enabling R and RStudio is very much a traditional installation with a few caveats at the end of each section. In particular, the caveats relate to installing packages once, updating packages once, and disabling calls to CRAN to prevent console crashes.

Code for the procedure can be found at:

https://github.com/coatless/r-centos7/

Treat the repository as the up-to-date copy; the code contained within this post may be outdated.

R Installation on CentOS 7

Installing R on the CentOS 7 environment mirrors that of prior installation tutorials and manuals. In particular, we have the following routine:

############## Add Development Tools

# Install development tools
sudo yum groupinstall -y "Development Tools"

############## Install Latest R Version

# Add the latest release of Extra Packages for Enterprise Linux
sudo yum install -y epel-release

# Install R from the repository
sudo yum install -y R

############## Add additional system libraries

# Required for png and jpeg packages used with mapping
sudo yum install -y libpng-devel  \
                    libjpeg-turbo-devel

# These libraries relate to web scraping technology
# They are primarily used by rvest + devtools
sudo yum install -y libxml2-devel \
                    libcurl-devel \
                    openssl-devel \
                    libssh2-devel
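
Before moving on, it can help to confirm that the system packages above actually landed. The following is a minimal sketch, assuming an RPM-based system where `rpm -q` exits non-zero for a package that is not installed. The helper name `missing_rpms` is hypothetical and is not part of the original setup scripts.

```shell
# Sketch: print every package from the argument list that is NOT installed.
# `missing_rpms` is a hypothetical helper name used only for illustration.
missing_rpms() {
    local pkg
    for pkg in "$@"; do
        # `rpm -q` exits non-zero when the package is not installed
        if ! rpm -q "$pkg" >/dev/null 2>&1; then
            echo "$pkg"
        fi
    done
}

# Example usage on the CentOS 7 machine (prints nothing when all are present):
# missing_rpms R libpng-devel libjpeg-turbo-devel \
#              libxml2-devel libcurl-devel openssl-devel libssh2-devel
```

An empty result suggests the compile-time dependencies for the R packages installed later are in place.
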

Installation and Updating of R Packages

At this point, there are three parts to adding in external packages:

  1. Use Rscript to run an R script that installs each package desired;
  2. check RStudio’s package dependency manager for any changes; and,
  3. write a custom .Rprofile that disables CRAN by redirecting the repos parameter in install.packages() to a local directory that is not set up to act like CRAN.
    • Note: You can actually implement a mini-CRAN using drat and miniCRAN.

For those who have been through this rodeo before, the second step might throw you. Why do we need to look at RStudio’s dependency manager? Well, the dependency manager is hard-coded and not accessible from R. As a result, RStudio may check for a requirement that is not listed in an R package’s DESCRIPTION file. Thus, you may inadvertently run into trouble when working with a feature in RStudio.

Without further ado, let’s quickly write a script that tracks what packages are being installed onto the system.

cat <<- EOF > rpkg-install.R
# R Code Ahead to install packages!
# The following are packages used in STAT 385
pkg_list = c(
             # EDA tools
             'tidyverse', 'rmarkdown', 'shiny', 'flexdashboard', 'shinydashboard',
             # Development tools
             'devtools', 'testthat', 'roxygen2', 'profvis', 'RSQLite',
             # C++ packages
             'RcppArmadillo', 'rbenchmark', 'microbenchmark',
             # Time series packages
             'zoo', 'xts', 'forecast',
             # Mapping and graphing packages
             'maps', 'maptools', 'mapproj', 'mapdata', 'ggmap',
             'GGally', 'ggrepel', 'ggraph',
             'cowplot', 'gridExtra', 'patchwork',
             # Text Mining
             'tidytext',
             # Data packages
             'survey', 'fivethirtyeight', 'nycflights13',
             'babynames', 'neiss', 'ggplot2movies',
             # Dependencies that are out of date for rmarkdown
             'caTools', 'bitops',
             'PKI', 'RCurl', 'RJSONIO', 'packrat', 'rstudioapi', 'rsconnect',
             'miniUI'
             )
             
# Install the package list
install.packages(pkg_list,
                 repos = "https://cloud.r-project.org",
                 # Run installation with a level of parallelization
                 Ncpus = 2)
                 
# Install some data packages on GitHub
devtools::install_github("kjhealy/socviz")
devtools::install_github("coatless/uiucdata")
devtools::install_github("coatless/ucidata")
EOF

# Run the script with sudo to write to `/usr/lib64/R/library`
sudo Rscript rpkg-install.R

# Clean up by removing the script
rm -rf rpkg-install.R
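
Since packages can only be installed once before the machine goes offline, a quick check that nothing failed silently is worthwhile. The sketch below relies on the fact that an installed R package is a directory inside the library that contains a `DESCRIPTION` file. The helper name `missing_r_pkgs` is hypothetical, and the library path in the usage line is the one mentioned in the comment above; adjust it if your library lives elsewhere.

```shell
# Sketch: print each package name that has no <library>/<package>/DESCRIPTION.
# `missing_r_pkgs` is a hypothetical helper name used only for illustration.
missing_r_pkgs() {
    local lib pkg
    lib="$1"
    shift
    for pkg in "$@"; do
        # An installed package always ships a DESCRIPTION file
        [ -f "$lib/$pkg/DESCRIPTION" ] || echo "$pkg"
    done
}

# Example usage (prints nothing when every package is present):
# missing_r_pkgs /usr/lib64/R/library tidyverse rmarkdown shiny socviz
```

Any name printed points at a package whose installation should be retried while the network is still available.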

From here, we check to make sure all system packages are up to date. We update the packages in R’s site-wide library using the RStudio CRAN CDN at https://cloud.r-project.org.

# Request updates if available
# sudo is needed to write to the site-wide library, as with the install above
sudo Rscript -e "update.packages(ask = FALSE, checkBuilt = TRUE, repos = 'https://cloud.r-project.org')"

Lastly, we need to stop R from querying CRAN, as those queries have been shown to make the RStudio client lag when install.packages() breaks.

# Create directory for a local CRAN
mkdir ~/fakecran

# Write the location of the local repository to .Rprofile
echo "options(repos = c(CRAN = \"file://${HOME}/fakecran/\"))" >> ~/.Rprofile
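
Because the line is appended with `>>`, re-running the setup adds a duplicate entry each time. A minimal sketch of an idempotent variant follows. The function name `disable_cran` and its two arguments are hypothetical. It writes an absolute path on the assumption that R does not expand `~` inside a `file://` URL.

```shell
# Sketch: point the CRAN repository at a local placeholder directory,
# adding the option to the profile only if it is not already there.
# `disable_cran` and its arguments are hypothetical names for illustration.
disable_cran() {
    local repo_dir profile line
    repo_dir="$1"   # placeholder directory, e.g. "$HOME/fakecran"
    profile="$2"    # R profile to edit, e.g. "$HOME/.Rprofile"
    line="options(repos = c(CRAN = \"file://${repo_dir}/\"))"

    mkdir -p "$repo_dir"
    touch "$profile"
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -qxF "$line" "$profile"; then
        printf '%s\n' "$line" >> "$profile"
    fi
}

# Example usage:
# disable_cran "$HOME/fakecran" "$HOME/.Rprofile"
```

Calling the function a second time leaves the profile unchanged.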

Installation of RStudio Desktop

Within this script, we automatically fetch the latest RStudio version using information detailed in Downloading and Installing RStudio Desktop and RStudio FAQ: Getting the newest RStudio builds (https://support.rstudio.com/hc/en-us/articles/203842428-Getting-the-newest-RStudio-builds). After that, we add an icon to the CentOS 7 desktop based on the SuperUser post How to Make a Desktop Icon on CentOS 7.

# Download latest 64 bit version
wget -O rstudio-latest-x86_64.rpm https://www.rstudio.org/download/latest/stable/desktop/redhat64/rstudio-latest-x86_64.rpm

# Install it
sudo yum install -y --nogpgcheck rstudio-latest-x86_64.rpm

# Remove the installer
rm rstudio-latest-x86_64.rpm

# Add icon to desktop
# Based on: https://superuser.com/questions/806448/how-to-make-a-desktop-icon-on-centos-7
cat <<- EOF > ~/Desktop/RStudio.desktop
[Desktop Entry]
Version=1.0
Type=Application
Terminal=true
Exec=/usr/lib/rstudio/bin/rstudio %F
Name=rstudio
Comment=
Icon=/usr/lib/rstudio/rstudio.png
Comment[en_US.utf8]=
Name[en_US]=RStudio
EOF

# Make the launcher executable
chmod 755 ~/Desktop/RStudio.desktop
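
A launcher that points at a missing binary or icon fails quietly, so a small sanity check can save a trip back to the machine. The sketch below reads the `Exec=` and `Icon=` lines from a `.desktop` file and verifies that both paths exist. It assumes both values are absolute paths without spaces, as they are in the entry above; in general, `Icon=` may instead be a theme icon name, which this check does not handle. The function name `check_desktop_entry` is hypothetical.

```shell
# Sketch: verify that the Exec and Icon paths named in a .desktop file exist.
# `check_desktop_entry` is a hypothetical helper name used for illustration.
check_desktop_entry() {
    local entry exec_path icon_path rc
    entry="$1"
    rc=0
    # Take the first Exec= line, drop the key, then drop arguments such as %F
    exec_path="$(sed -n 's/^Exec=//p' "$entry" | head -n 1 | cut -d ' ' -f 1)"
    # Take the first Icon= line and drop the key
    icon_path="$(sed -n 's/^Icon=//p' "$entry" | head -n 1)"

    [ -x "$exec_path" ] || { echo "missing executable: $exec_path"; rc=1; }
    [ -f "$icon_path" ] || { echo "missing icon: $icon_path"; rc=1; }
    return "$rc"
}

# Example usage:
# check_desktop_entry ~/Desktop/RStudio.desktop && echo "launcher looks fine"
```

The function returns a non-zero status and names the offending path when either file is absent.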

Fin

That’s it. R and RStudio should play nicely offline now.