Oracle R Enterprise issues in Oracle Big Data Lite VM 4.1.0

Christos - Iraklis Tsatsoulis

In the previous post, we examined some configuration issues with Cloudera Manager and Hadoop services in the latest release of Oracle Big Data Lite VM (4.1.0). In this post we report issues with Oracle R Enterprise, and the remedies we applied.

It turns out that if we load the ORE package in R, we subsequently cannot use the help system at all:


> library(ORE)

Attaching package: ‘OREbase’

The following objects are masked from ‘package:base’:

cbind, data.frame, eval, interaction, order, paste, pmax, pmin, rbind, table

Loading required package: OREembed
Loading required package: OREstats
Loading required package: MASS
Loading required package: OREgraphics
Loading required package: OREeda
Loading required package: OREmodels
Loading required package: OREdm
Loading required package: lattice
Loading required package: OREpredict
Loading required package: ORExml
> help(ore.connect) # ORE function
Error in readRDS(f) : unknown input format
> help(median)   # base R function
Error in readRDS(f) : unknown input format

The message Error in readRDS(f) : unknown input format is a rather cryptic one that pops up occasionally in the R universe. The solution usually proposed is to delete the directory holding the downloaded packages, which we would certainly like to avoid. The error is probably due to some corrupted file(s) in the loaded ORE packages: for example, OREbase does not trigger the problem, but OREstats does (if you are following along on your machine, be sure to restart R between the code snippets presented here in order to reproduce the results):


> library(OREbase)
> help(ore.connect) # works OK
> library(OREstats)
Loading required package: MASS
> help(ore.connect)
Error in readRDS(f) : unknown input format

And to make things somewhat more complicated, if we load OREstats from another location (most ORE and ORCH packages exist in two locations – more on this in a second), the problem does not appear at all:


> library("OREstats", lib.loc="/usr/lib64/R/library")
Loading required package: MASS
Loading required package: OREbase
> help(ore.connect) # works OK!

What’s happening?

Locating the error cause

First, the careful user might have already noticed from the Packages tab of RStudio that all ORCH and most ORE packages show up two times in the package list:

Fig. 1: List of available packages in RStudio – notice the duplicate entries for all ORCH and most ORE packages

Why the duplicates, and where are these packages located? We get the answer to the second question from the R function .libPaths:

> .libPaths()
[1] "/u01/app/oracle/product/12.1.0.2/dbhome_1/R/library"
[2] "/usr/lib64/R/library"
[3] "/usr/share/R/library"

The first of these directories belongs to user oracle, while the second one belongs to root (the third one is empty).

Now, it takes little effort to verify that the same ORE & ORCH packages exist in both directories numbered ‘1’ and ‘2’ above, hence the duplicate entries in the RStudio package list.
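One way to run this check from the shell is sketched below. The function name list_duplicate_packages is made up for illustration; it simply prints every package directory found in both of the library locations passed to it.

```shell
# List package directories that exist in both R library locations.
# Usage: list_duplicate_packages FIRST_LIB SECOND_LIB
# (hypothetical helper, not part of ORE or of the VM)
list_duplicate_packages() {
    first_lib=$1
    second_lib=$2
    for pkg_dir in "$first_lib"/*/; do
        # Skip the unexpanded glob when the first library is empty
        [ -d "$pkg_dir" ] || continue
        pkg=$(basename "$pkg_dir")
        # Report the name only if the second library has it as well
        if [ -d "$second_lib/$pkg" ]; then
            printf '%s\n' "$pkg"
        fi
    done
}

# Example call with the first two paths reported by .libPaths() in the VM:
# list_duplicate_packages "$ORACLE_HOME/R/library" /usr/lib64/R/library
```

The comparison is by directory name only, so it says nothing about whether the two copies are identical or healthy.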

By default, the R command library, if not provided with a location, loads the requested package from the first directory listed by the .libPaths function; only if R cannot find the package in that directory does it proceed to look for it in the rest of the directories listed. So, in our case, it turns out that the OREstats package in the “default” location /u01/app/oracle/product/12.1.0.2/dbhome_1/R/library is corrupted, while the same package in the second location /usr/lib64/R/library is OK.
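The first-match behaviour described above can be imitated from the shell. This is only an illustration of the search order, not R's actual lookup code; the helper name first_library_with is hypothetical, and it relies on the fact that every installed R package directory contains a DESCRIPTION file.

```shell
# Print the first library directory that contains PACKAGE, walking the
# directories in the order given (an imitation of the search order,
# not R's own implementation).
# Usage: first_library_with PACKAGE LIB_DIR...
first_library_with() {
    pkg=$1
    shift
    for lib in "$@"; do
        # An installed R package directory contains a DESCRIPTION file
        if [ -f "$lib/$pkg/DESCRIPTION" ]; then
            printf '%s\n' "$lib"
            return 0
        fi
    done
    return 1   # not found in any of the given libraries
}

# Example call, with the directories in the order shown by .libPaths():
# first_library_with OREstats "$ORACLE_HOME/R/library" /usr/lib64/R/library /usr/share/R/library
```

Called with the three directories in .libPaths() order, it should point at whichever copy of OREstats comes first in the list, which in the VM as shipped is the copy that triggers the error.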

It takes just a little experimentation to verify that the copy of OREstats in the first directory listed above is the single point of failure producing the error; indeed, if we bypass the default setting and load OREstats from the second directory before loading ORE, the help system works properly:

> library("OREstats", lib.loc="/usr/lib64/R/library")
Loading required package: MASS
Loading required package: OREbase

Attaching package: ‘OREbase’

The following objects are masked from ‘package:base’:

cbind, data.frame, eval, interaction, order, paste, pmax, pmin, rbind, table

> library(ORE)
Loading required package: OREembed
Loading required package: OREgraphics
Loading required package: OREeda
Loading required package: OREmodels
Loading required package: OREdm
Loading required package: lattice
Loading required package: OREpredict
Loading required package: ORExml
> help(ore.connect) # works OK

The reason why this is so should be obvious by now: library(ORE) loads its dependencies from the “default” directory, where the copy of  OREstats  is corrupted; by forcing  OREstats  to be loaded from the second available directory (where the copy is OK), we end up with no corrupted packages loaded and no errors.

A (very) simple workaround

In order to restore the help system functionality, the only thing we have to do is delete the OREstats package from the first directory listed in .libPaths (we can use the ORACLE_HOME environment variable for brevity):


[oracle@bigdatalite ~]$ cd $ORACLE_HOME/R/library
[oracle@bigdatalite library]$ pwd
/u01/app/oracle/product/12.1.0.2/dbhome_1/R/library
[oracle@bigdatalite library]$ ls
arules      ORCHcore    OREcommon   OREmodels   ROracle
Cairo       ORCHstats   OREdm       OREpredict  rstudio
DBI         ORCHtestkit OREeda      OREserver   statmod
manipulate  ORE         OREembed    OREstats    png
ORCH        OREbase     OREgraphics ORExml     
[oracle@bigdatalite library]$ rm -rf OREstats
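If you would rather keep the suspect copy around for later inspection, a more cautious variant is to move it out of the library instead of deleting it. The sketch below is an assumption-laden alternative, not what was done in this post: the helper name set_aside_package and the backup location are hypothetical, and the backup directory must not be one of the directories listed by .libPaths.

```shell
# Move a package directory out of an R library into a backup directory
# instead of deleting it (hypothetical helper).
# Usage: set_aside_package LIB_DIR PACKAGE BACKUP_DIR
set_aside_package() {
    lib=$1
    pkg=$2
    backup=$3
    if [ ! -d "$lib/$pkg" ]; then
        echo "no such package directory: $lib/$pkg" >&2
        return 1
    fi
    mkdir -p "$backup" || return 1
    # Refuse to clobber an earlier backup of the same package
    if [ -e "$backup/$pkg" ]; then
        echo "backup already exists: $backup/$pkg" >&2
        return 1
    fi
    mv "$lib/$pkg" "$backup/"
}

# Example call (the backup path is an arbitrary choice):
# set_aside_package "$ORACLE_HOME/R/library" OREstats "$HOME/ore_backup"
```

Either way, the effect on R is the same: the package is no longer found in the first library directory.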

Now OREstats will be automatically loaded from the second directory listed in .libPaths, where it is already available and not corrupted:


> library(ORE)

Attaching package: ‘OREbase’

The following objects are masked from ‘package:base’:

cbind, data.frame, eval, interaction, order, paste, pmax, pmin, rbind, table

Loading required package: OREembed
Loading required package: OREstats
Loading required package: MASS
Loading required package: OREgraphics
Loading required package: OREeda
Loading required package: OREmodels
Loading required package: OREdm
Loading required package: lattice
Loading required package: OREpredict
Loading required package: ORExml
> help(ore.connect) # ORE function - works OK
> help(median)   # base R function - works OK

Tidying up the package directories

Although the problem has been solved, there is still an issue: personally, I don’t like this situation with duplicated packages as shown in Fig. 1 above. So, we will proceed to move all ORE & ORCH packages to the directory $R_HOME/library (of root ownership), where the default R packages (i.e. the packages shipped along with any R distribution) are also kept. We need superuser privileges, since the target directory belongs to root. Also, since the mv command refuses to overwrite existing directories (setting the --force flag has no effect), we answer ‘no’ to the overwrite questions, and afterwards we simply remove the remaining OR* directories, since they already exist in $R_HOME/library (CAUTION: be sure that you have first removed OREstats as described above, otherwise you will end up having deleted the healthy copy and kept the corrupted one!):


[oracle@bigdatalite library]$ pwd
/u01/app/oracle/product/12.1.0.2/dbhome_1/R/library
[oracle@bigdatalite library]$ su
[root@bigdatalite library]# mv OR* $R_HOME/library
mv: overwrite `/usr/lib64/R/library/ORCH'? n
mv: overwrite `/usr/lib64/R/library/ORCHcore'? n
mv: overwrite `/usr/lib64/R/library/ORCHstats'? n
mv: overwrite `/usr/lib64/R/library/ORCHtestkit'? n
mv: overwrite `/usr/lib64/R/library/OREbase'? n
mv: overwrite `/usr/lib64/R/library/OREcommon'? n
mv: overwrite `/usr/lib64/R/library/OREserver'? n
[root@bigdatalite library]# exit
exit
[oracle@bigdatalite library]$ ls
arules manipulate  ORCHstats   OREcommon ROracle
Cairo   ORCH       ORCHtestkit OREserver rstudio
DBI     ORCHcore   OREbase     png       statmod
[oracle@bigdatalite library]$ rm -rf OR*
[oracle@bigdatalite library]$ ls
arules Cairo DBI manipulate png ROracle rstudio statmod
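The move-then-remove procedure above can also be collapsed into a single loop that needs no interactive answers. This is a minimal sketch, assuming the copy already present in the destination is the one to keep; the function name consolidate_packages is hypothetical. It does not handle privileges itself, so with a root-owned destination it would have to be run as root.

```shell
# For every OR* directory in SRC_LIB: move it to DEST_LIB when DEST_LIB
# has no entry of that name, otherwise delete the SRC_LIB copy and keep
# the one already in DEST_LIB (hypothetical helper).
# Usage: consolidate_packages SRC_LIB DEST_LIB
consolidate_packages() {
    src=$1
    dest=$2
    for path in "$src"/OR*; do
        # Skip the unexpanded glob and anything that is not a directory
        [ -d "$path" ] || continue
        pkg=$(basename "$path")
        if [ -e "$dest/$pkg" ]; then
            rm -rf "$path"
            echo "removed duplicate: $pkg"
        else
            mv "$path" "$dest/" && echo "moved: $pkg"
        fi
    done
}

# Example call (as root, from any directory):
# consolidate_packages /u01/app/oracle/product/12.1.0.2/dbhome_1/R/library /usr/lib64/R/library
```

Because the loop always keeps the destination copy, it is only appropriate when that copy is known to be the healthy one, as is the case for OREstats here.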

Now only the six ORE “supporting” packages remain in $ORACLE_HOME/R/library, along with the two RStudio-related packages rstudio and manipulate. And there are no more duplicate entries in the RStudio package list:

Fig. 2: The new package list in RStudio, without duplicated ORE or ORCH entries

Directory $ORACLE_HOME/R/library is the one where we (i.e. user oracle) will normally download any additional packages we may need, and it makes sense to keep it separate from the $R_HOME/library directory, which will contain only the R default packages, along with ORE and ORCH. We wouldn’t like to move any more packages into $R_HOME/library, since the packages in our home directory $ORACLE_HOME/R/library are much more straightforward to update from RStudio (we’ll cover the updating of R default packages in the $R_HOME/library directory in a subsequent post).

Since we touched on the subject, let us close this post by updating the existing packages: from RStudio select Tools -> Check for package updates…:

Fig. 3: Packages to be updated

Do not bother with manipulate (it needs to match the version of your RStudio installation, which is currently 0.98.1062), and just select the other three as shown in Fig. 3.

That was it! Despite the unexpected problem with ORE, you are now ready to use Oracle R Enterprise in the VM. For better overall performance, be sure to check also our previous post, where we address some issues with Cloudera Manager…

Christos - Iraklis Tsatsoulis

Christos - Iraklis is one of our resident Data Scientists. He holds advanced graduate degrees in applied mathematics, engineering, and computing. He has been awarded both Chartered Engineer and Chartered Manager status in the UK, as well as Master status in Kaggle.com due to "consistent and stellar results" in predictive analytics contests.
 
Sherry LaMonica

The problem where the help system breaks with the OREstats package in two places in the search path is indeed a problem with the OREstats package in $ORACLE_HOME/R/library. Somehow, the NAMESPACE became corrupted and cannot be accessed during help system lookup. Reinstalling the package resolves the problem. The fact that OREbase, OREcommon, OREserver and OREstats are installed in two locations on BigDataLite is by design. This will happen only if both Oracle R Enterprise and Oracle R Advanced Analytics for Hadoop are in use on the same system. Oracle Database looks for these packages in $ORACLE_HOME/R/library, as this path is…