While working with some data in Hive recently using the Oracle R Connector for Hadoop (ORCH), I tried to use the ore.make.names function (of package OREbase). The function creates valid column names for ore.frame objects. Here is a reproducible example, copied straight from the function documentation:
> library(ORCH)
> xnames <- c("col1", "Col.2", "COL_3", "col 4", "col1", "L_A_S_T C.O.L.U.M.N abcdefghijklmnopqrstuvwxyz")
> ore.make.names(xnames)
Error in ore.make.names(xnames) : attempt to apply non-function
Experimenting a little, I discovered that ore.make.names becomes functional after executing ore.connect. Indeed, using Hive:
> ore.connect(type="HIVE")
> ore.make.names(xnames)
 "col1"  "col_2"  "col_3"  "col_4"  "col1_1"  "l_a_s_t_c_o_l_u_m_n_abcdefghijklmnopqrstuvwxyz"
This behavior (i.e., the requirement for a prior call to ore.connect) is not mentioned in the function documentation.
Furthermore, if we disconnect from the database and try to run ore.make.names, we again get an error message, albeit a different one this time:
> ore.disconnect()
> ore.make.names(xnames)
Error: '$' is not implemented yet'.ore.QueryEnv' is not implemented yet'makeDBnames' is not implemented yet
I posted a question in the Oracle R Technologies forum, where it was confirmed that the misleading error messages have been logged as a bug, and that future releases will return an appropriate error message indicating that a connection via ore.connect is required.
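Until such a fix is available, one possible workaround is to wrap the call so that any failure carries a hint about the missing connection. The following is only a minimal sketch in base R: with_connection_hint and fake_make_names are hypothetical names introduced here for illustration, not part of ORCH or OREbase. The stand-in function simply reproduces the first error shown above, so the snippet runs without an ORCH installation.

```r
# Hypothetical helper (not part of ORCH/OREbase): run fun(...) and, if it
# fails for any reason, re-raise the error with a hint about ore.connect.
with_connection_hint <- function(fun, ..., hint = "call ore.connect() first") {
  tryCatch(
    fun(...),
    error = function(e) {
      stop(sprintf("%s (original error: %s)", hint, conditionMessage(e)),
           call. = FALSE)
    }
  )
}

# Stand-in for ore.make.names without an active connection (assumption:
# it fails with the message observed in the first example).
fake_make_names <- function(x) stop("attempt to apply non-function")

# Capture the rewritten error message instead of aborting the script.
msg <- tryCatch(
  with_connection_hint(fake_make_names, c("col1", "Col.2")),
  error = function(e) conditionMessage(e)
)
print(msg)
```

In a real session the call would presumably look like with_connection_hint(ore.make.names, xnames). Note that this sketch attaches the hint to every error, not only to those caused by a missing connection, so the original message is kept in parentheses for diagnosis.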
> sessionInfo() # runs in Oracle BD Lite VM 4.0.1
Oracle Distribution of R version 3.1.1 (--)
Platform: x86_64-unknown-linux-gnu (64-bit)
[...]
other attached packages:
 ORCH_2.4.1 ORCHstats_2.4.1 ORCHcore_2.4.1 OREstats_1.4.1
 MASS_7.3-33 OREbase_1.4.1