Enabling the Green-Marl compiler for Parallel Graph Analytics in Oracle Big Data Lite VM

Panagiotis Konstantinidis Oracle Big Data Spatial & Graph

Recently, I began working with Parallel Graph Analytics (PGX) on my Oracle Big Data Lite (BDL) VM, version 4.7.0.1. I was especially curious about the capabilities of a PGX component called Green-Marl (GM), a domain-specific language designed for graph data analysis. It is said to extend PGX’s capabilities and to let you “implement algorithms with no limit”. That last claim in particular sounds pretty exciting, wouldn’t you say?

The issue

Naturally, I started with the “Hello World” Green-Marl example following this tutorial. The code is the following:

/*
* Copyright (C) 2013 - 2016 Oracle and/or its affiliates. All rights reserved.
*/
procedure hello_world() {
    println("Hello World");
}

So I saved the code above in a GM file (hello_world.gm) and started the PGX shell. When I tried to compile the Green-Marl code, however, I got a “compiler not found” error:

$ $PGX_HOME/bin/pgx
...
PGX Shell 2.4.0
type :help for available commands
06:32:48,568 INFO Ctrl$2 - >>> PGX engine running.
variables instance, session and analyst ready to use
pgx> p = session.compileProgram("/opt/oracle/oracle-spatial-graph/property_graph/data/hello_world.gm")
06:28:47,149 ERROR Task - >> [ERROR] UnsupportedOperationException on fast-track-analysis-pool: CREATE_ANALYSIS failed
06:28:47,150 ERROR Task - compiler for language GM not found; make sure you have all required GM JAR files on the classpath
java.lang.UnsupportedOperationException: compiler for language GM not found; make sure you have all required GM JAR files on the classpath
	at oracle.pgx.compilers.Compilers.findCompiler(Compilers.java:39)
	at oracle.pgx.engine.invocation.InvocationManagerImpl.compile(InvocationManagerImpl.java:58)
	at oracle.pgx.engine.CoreAnalysisImpl$1.doCall(CoreAnalysisImpl.java:52)
	at oracle.pgx.engine.CoreAnalysisImpl$1.doCall(CoreAnalysisImpl.java:45)
	at oracle.pgx.engine.exec.Task.call(Task.java:214)
	at oracle.pgx.engine.exec.Task.run(Task.java:159)
	at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
ERROR: java.lang.UnsupportedOperationException: compiler for language GM not found; make sure you have all required GM JAR files on the classpath

Investigation

I tried upgrading to the latest version of the VM (4.8), but the problem persisted, so I posted a question to the relevant Oracle forum.

The answer from Oracle was surprising: it turns out that the Big Data Spatial and Graph (BDSG) distribution of PGX, which is the one installed in BDL, does not include the Green-Marl compiler; you must download the Oracle Technology Network (OTN) version of PGX. I was pointed to a cryptic reference in the documentation stating that this feature (i.e. the GM compiler) is only available in the OTN version of PGX, although you can see some counter-arguments in the discussion that followed my original question.

The answer from Oracle proved only partly useful, though: after downloading and installing the recommended version, I got a different error when running the same command:

$ cd /opt/oracle/oracle-spatial-graph/property_graph/pgx-2.4.1
$ ./bin/pgx
...
...
PGX Shell 2.4.1
type :help for available commands
05:31:49,606 INFO Ctrl$2 - >>> PGX engine running.
variables instance, session and analyst ready to use
pgx> p = session.compileProgram("/opt/oracle/oracle-spatial-graph/property_graph/data/hello_world.gm")
06:59:55,771 ERROR Task - >> [ERROR] MalformedProgramException on caller-thread: CREATE_ANALYSIS failed
06:59:55,773 ERROR Task - /tmp/PGX_ENGINE_g606uht5i328bsgc0vkj089bkm/GM_gm_comp_k9f0kk6kh1ht1afibdlchal454: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.15' not found (required by /tmp/PGX_ENGINE_g606uht5i328bsgc0vkj089bkm/GM_gm_comp_k9f0kk6kh1ht1afibdlchal454)

oracle.pgx.api.MalformedProgramException: /tmp/PGX_ENGINE_g606uht5i328bsgc0vkj089bkm/GM_gm_comp_k9f0kk6kh1ht1afibdlchal454: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.15' not found (required by /tmp/PGX_ENGINE_g606uht5i328bsgc0vkj089bkm/GM_gm_comp_k9f0kk6kh1ht1afibdlchal454)

	at oracle.pgx.engine.invocation.InvocationManagerImpl.compile(InvocationManagerImpl.java:80)
	at oracle.pgx.engine.CoreAnalysisImpl$1.doCall(CoreAnalysisImpl.java:54)
	at oracle.pgx.engine.CoreAnalysisImpl$1.doCall(CoreAnalysisImpl.java:47)
	at oracle.pgx.engine.exec.Task.call(Task.java:249)
	...
	...
	ERROR: oracle.pgx.api.MalformedProgramException: /tmp/PGX_ENGINE_g606uht5i328bsgc0vkj089bkm/GM_gm_comp_k9f0kk6kh1ht1afibdlchal454: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.15' not found (required by /tmp/PGX_ENGINE_g606uht5i328bsgc0vkj089bkm/GM_gm_comp_k9f0kk6kh1ht1afibdlchal454)

Snooping around in /usr/lib64 revealed the following:

$ cd /usr/lib64
$ ll libstdc++.so.6
lrwxrwxrwx. 1 root root 30 Jun  9 04:10 libstdc++.so.6 -> /usr/lib64/libstdc++.so.6.0.13
$ strings libstdc++.so.6.0.13 | grep GLIBCXX_3.4.1
GLIBCXX_3.4.10
GLIBCXX_3.4.11
GLIBCXX_3.4.12
GLIBCXX_3.4.13

So, the error made sense: GLIBCXX_3.4.15 is a symbol version that the library must provide, and the installed libstdc++.so.6.0.13 only goes up to GLIBCXX_3.4.13.

Searching the file system, I found a newer version of the library inside /usr/lib/impala/lib/, named libstdc++.so.6.0.20. Repeating the strings command on this file showed that the required version is indeed included there:

$ cd /usr/lib/impala/lib
$ strings libstdc++.so.6.0.20 | grep GLIBCXX_3.4.1
GLIBCXX_3.4.10
GLIBCXX_3.4.11
GLIBCXX_3.4.12
GLIBCXX_3.4.13
GLIBCXX_3.4.14
GLIBCXX_3.4.15
GLIBCXX_3.4.16
GLIBCXX_3.4.17
GLIBCXX_3.4.18
GLIBCXX_3.4.19
GLIBCXX_3.4.20
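
The manual check above can be wrapped in a small script. What follows is only a sketch: the function names has_glibcxx and find_glibcxx are made up for illustration, and tr is used as a rough stand-in for the strings command so that the match can be exact (a plain grep for GLIBCXX_3.4.1 also matches GLIBCXX_3.4.15, as the listings above show).

```shell
# Hypothetical helper: succeed if FILE ($1) advertises the version GLIBCXX_<VERSION> ($2).
has_glibcxx() {
    # tr turns every run of bytes outside [A-Za-z0-9_.] into a single newline,
    # which roughly mimics strings(1). grep -x then requires a whole-line match
    # and -F takes the version literally, so 3.4.1 does not match 3.4.15.
    tr -cs '[:alnum:]_.' '\n' < "$1" | grep -qxF "GLIBCXX_$2"
}

# Hypothetical helper: print every libstdc++.so.6* regular file under DIR ($1)
# that advertises VERSION ($2). Symlinks are skipped by -type f.
find_glibcxx() {
    find "$1" -type f -name 'libstdc++.so.6*' 2>/dev/null | while IFS= read -r lib; do
        if has_glibcxx "$lib" "$2"; then
            printf '%s\n' "$lib"
        fi
    done
}
```

For example, find_glibcxx /usr 3.4.15 would list the candidate files under /usr, and has_glibcxx /usr/lib64/libstdc++.so.6.0.13 3.4.15 returns a non-zero status when the version is missing.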

Resolution

So, here are the complete steps I followed to make the Green-Marl compiler work, thereby unlocking the full capabilities of PGX:

  1. Download the OTN PGX server package.
  2. Unpack the downloaded zip file into a directory of your choice. I unzipped mine into the same folder that contains the preinstalled version of PGX, /opt/oracle/oracle-spatial-graph/property_graph.
  3. Change the $PGX_HOME environment variable to the directory set in step 2.
  4. Go into the $PGX_HOME/bin directory, and make sure that PGX boots by starting the PGX shell.
  5. Exit the PGX shell.
  6. Copy the libstdc++.so.6.0.20 file from /usr/lib/impala/lib/ into /usr/lib64/.
  7. Remove the existing symbolic (soft) link libstdc++.so.6.
  8. Create a new symbolic link pointing to the newer version of the file.

And here are the corresponding commands, to be run after you have downloaded the pgx-2.4.1-server.zip file:

 
$ unzip /home/oracle/pgx-2.4.1-server.zip -d /opt/oracle/oracle-spatial-graph/property_graph 
$ export PGX_HOME=/opt/oracle/oracle-spatial-graph/property_graph/pgx-2.4.1
$ cd $PGX_HOME
$ ./bin/pgx
...
...
PGX Shell 2.4.1
type :help for available commands
05:31:49,606 INFO Ctrl$2 - >>> PGX engine running.
variables instance, session and analyst ready to use
pgx> :q
$ sudo cp /usr/lib/impala/lib/libstdc++.so.6.0.20 /usr/lib64/
$ sudo rm /usr/lib64/libstdc++.so.6
$ sudo ln -s /usr/lib64/libstdc++.so.6.0.20 /usr/lib64/libstdc++.so.6

After that, your Green-Marl compiler should work just fine!
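
As a variation on steps 7 and 8, the link can be replaced with a single ln call instead of a separate rm and ln. The sketch below is not part of the procedure I actually ran; relink_lib is a hypothetical name, and the function only adds a sanity check and prints the old target so that the change can be reverted by hand.

```shell
# Hypothetical helper: point the symlink LINK ($2) at TARGET ($1).
relink_lib() {
    target="$1"
    link="$2"
    # Refuse to leave a dangling link behind if the new library file is missing.
    [ -f "$target" ] || { echo "missing: $target" >&2; return 1; }
    # Report where the link pointed before, in case it has to be restored.
    if [ -L "$link" ]; then
        echo "previous target: $(readlink "$link")"
    fi
    # -s: symbolic link, -f: replace an existing LINK,
    # -n: treat LINK itself as the thing to replace, even if it points to a directory
    ln -sfn "$target" "$link"
}
```

On the VM this would be called from a root shell (sudo cannot invoke a shell function directly) as relink_lib /usr/lib64/libstdc++.so.6.0.20 /usr/lib64/libstdc++.so.6.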

I made a little change in the “Hello World” script to make sure that data is being returned, because the println statement did not work as expected:

procedure hello_world(): string {
    return "Hello World";
}

Finally, I tested it inside the PGX shell:

$ $PGX_HOME/bin/pgx
...
...
PGX Shell 2.4.1
type :help for available commands
05:31:49,606 INFO Ctrl$2 - >>> PGX engine running.
variables instance, session and analyst ready to use
pgx> p = session.compileProgram("/opt/oracle/oracle-spatial-graph/property_graph/data/hello_world.gm")
==> CompiledProgram[name=hello_world]
pgx> g = p.run()
==> {
  "success" : true,
  "canceled" : false,
  "exception" : null,
  "returnValue" : "Hello World",
  "executionTimeMs" : 0
}
pgx> g.returnValue
==> Hello World

To permanently set the PGX_HOME environment variable to the destination of the unzipped PGX package, add the following line at the end of your .bashrc file:

export PGX_HOME=/opt/oracle/oracle-spatial-graph/property_graph/pgx-2.4.1
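
If you prefer to script that last step, the line can be appended only when it is not already present, so that running the script twice does not duplicate it. This is a minimal sketch; append_once is a hypothetical helper name, not something that ships with PGX or the VM.

```shell
# Hypothetical helper: append LINE ($1) to FILE ($2) unless an identical line is already there.
append_once() {
    line="$1"
    file="$2"
    # -x: whole-line match, -F: literal string, -q: quiet.
    # A missing file makes grep fail, so the line is then written to a new file.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

A possible call, using the path from this post: append_once 'export PGX_HOME=/opt/oracle/oracle-spatial-graph/property_graph/pgx-2.4.1' "$HOME/.bashrc".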

Panagiotis Konstantinidis

Panagiotis is our resident Big Data Engineer. He holds a Bachelor's degree in Computing, and he is an Oracle Certified Implementation Specialist for Big Data, ADF (11g & 12c), and Linux.