How to use SparkR in Cloudera Hadoop

Christos - Iraklis Tsatsoulis Big Data, R, Spark

Suppose you are an avid R user, and you would like to use SparkR in Cloudera Hadoop; unfortunately, as of the latest CDH version (5.7), SparkR is still not supported (and, according to a recent discussion in the Cloudera forums, we shouldn’t expect this to happen anytime soon). Is there anything you can do? Well, indeed there is. In this …

Bulk load data to HBase in Oracle Big Data Appliance

Christos - Iraklis Tsatsoulis Big Data, HBase

I recently ran into an issue while trying to bulk load some data to HBase in Oracle Big Data Appliance. What follows is a reproducible description and solution using the current version of Oracle Big Data Lite VM (4.4.0). Enabling HBase in Oracle Big Data Lite VM (Feel free to skip this section if you do not use Oracle Big Data …