Caution when installing Oracle R Distribution in Oracle Linux using Yum

Christos - Iraklis Tsatsoulis · Oracle R

Last week we tried to install Oracle R Distribution (ORD) on Oracle Linux 7.1 using Yum, which is the installation method recommended by Oracle. After closely following the instructions provided in the documentation, we found ourselves with the latest version of GNU R (3.2.3) installed instead of Oracle R Distribution 3.2.0.

What had happened is that in our /etc/yum.repos.d, apart from the Oracle public yum repo, we also had Fedora’s Extra Packages for Enterprise Linux (EPEL) repo listed, so the command

# yum install R.x86_64

was pulling the latest R version which could be found in our available repos, i.e. GNU R 3.2.3 from EPEL.
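
Before installing, it can help to see which enabled repos offer a package called R. What follows is a minimal sketch, assuming the usual three-column output of `yum --showduplicates list` (package, version, repo). The helper names `repos_offering` and `check_single_source` are hypothetical, and entries that yum wraps over several lines are simply skipped:

```shell
# Print the distinct repos that offer a package, reading the output of
# `yum --showduplicates list <pkg>` on stdin.
# Assumes three whitespace-separated columns: name.arch, version-release, repo.
repos_offering() {
    pkg="$1"
    # Installed packages show up as "@repo"; strip the leading "@".
    awk -v pkg="$pkg" '$1 == pkg && NF == 3 { sub(/^@/, "", $3); print $3 }' | sort -u
}

# Warn when more than one repo provides the package.
# Returns 0 if at most one repo offers it, 1 otherwise.
check_single_source() {
    pkg="$1"
    repos=$(repos_offering "$pkg")
    count=$(printf '%s\n' "$repos" | awk 'NF { n++ } END { print n + 0 }')
    if [ "$count" -gt 1 ]; then
        # $repos is left unquoted on purpose so the names join on one line.
        echo "WARNING: $pkg is offered by $count repos:" $repos >&2
        return 1
    fi
    return 0
}

# Usage (requires yum):
#   yum --showduplicates list R.x86_64 | check_single_source R.x86_64
```

In a setup like ours, with both ol7_addons and EPEL enabled, a check of this kind would be expected to print a warning before the install goes ahead.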

But why did we have the EPEL repo available in the first place? Simply because RStudio recommends it for installing RStudio Server on Red Hat/CentOS systems (there are no special instructions for Oracle Linux).

So, given our experience, and also that:

  1. ORD’s latest version always lags behind the latest version of GNU R
  2. ORD package name is indistinguishable from that of GNU R in Yum repos
  3. Having more than one Yum repo available is the rule rather than the exception
  4. Such installation tasks are normally performed by administrators, who are not necessarily aware of the (subtle) differences between ORD and GNU R

I think that there should be a relevant warning in the documentation, pointing out the possible issue and advising accordingly.

So what should we do to install ORD from Oracle’s yum repo? We can use some specific options in the command line, so as to instruct Yum to use only a specific repo:

# yum --disablerepo "*" --enablerepo "ol7_addons" install R.x86_64

The option --disablerepo "*" above disables all repos, while --enablerepo enables only the repo given, in our case ol7_addons. This is a quick approach that does not require changing the yum configuration, even temporarily.
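
After installing, one way to confirm that the package really came from Oracle's repo is to look at the "From repo" field that `yum info` reports for installed packages. This is a minimal sketch. The helper name `installed_from_repo` is hypothetical:

```shell
# Print the value of the "From repo" field from `yum info` output on stdin.
# Only the first match is printed, i.e. the installed package entry.
installed_from_repo() {
    awk -F' *: *' '!found && $1 == "From repo" { print $2; found = 1 }'
}

# Usage (requires yum and an installed R package):
#   repo=$(yum info installed R.x86_64 | installed_from_repo)
#   [ "$repo" = "ol7_addons" ] || echo "R did not come from ol7_addons: $repo" >&2
```

The check relies only on the field layout visible in the `yum info` output; entries for packages that are merely available have no "From repo" line, so the function prints nothing for them.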

More generally, and expanding on point #2 above, I am not sure how good an idea it is to have a distinct Yum package (i.e. ORD here) with the exact same name and package info as an already existing and more widely used package (i.e. GNU R):

# yum info R.x86_64
Installed Packages
Name        : R
Arch        : x86_64
Version     : 3.2.0
Release     : 2.el7
Size        : 0.0
Repo        : installed
From repo   : ol7_addons
Summary     : A language for data analysis and graphics
URL         : http://www.r-project.org
License     : GPLv2+
Description : This is a metapackage that provides both core R userspace and
            : all R development components.
            :
            : R is a language and environment for statistical computing and graphics.
            : R is similar to the award-winning S system, which was developed at
            : Bell Laboratories by John Chambers et al. It provides a wide
            : variety of statistical and graphical techniques (linear and
            : nonlinear modelling, statistical tests, time series analysis,
            : classification, clustering, ...).
            :
            : R is designed as a true computer language with control-flow
            : constructions for iteration and alternation, and it allows users to
            : add additional functionality by defining new functions. For
            : computationally intensive tasks, C, C++ and Fortran code can be linked
            : and called at run time.

Available Packages
Name        : R
Arch        : x86_64
Version     : 3.2.3
Release     : 1.el7
Size        : 24 k
Repo        : epel/x86_64
Summary     : A language for data analysis and graphics
URL         : http://www.r-project.org
License     : GPLv2+
Description : This is a metapackage that provides both core R userspace and
            : all R development components.
            :
            : R is a language and environment for statistical computing and graphics.
            : R is similar to the award-winning S system, which was developed at
            : Bell Laboratories by John Chambers et al. It provides a wide
            : variety of statistical and graphical techniques (linear and
            : nonlinear modelling, statistical tests, time series analysis,
            : classification, clustering, ...).
            :
            : R is designed as a true computer language with control-flow
            : constructions for iteration and alternation, and it allows users to
            : add additional functionality by defining new functions. For
            : computationally intensive tasks, C, C++ and Fortran code can be linked
            : and called at run time.

What I would suggest is that ORD become a distinct Yum package, different and distinguishable from GNU R, to avoid such confusion in the future.

Christos - Iraklis Tsatsoulis

Christos - Iraklis is one of our resident Data Scientists. He holds advanced graduate degrees in applied mathematics, engineering, and computing. He has been awarded both Chartered Engineer and Chartered Manager status in the UK, as well as Master status in Kaggle.com due to "consistent and stellar results" in predictive analytics contests.