Using R to Power Analytics In Java Apps

R is a powerful, open source platform for statistical computation and data analysis. Visualizations and visual-analytical tools can benefit from the capabilities of R and the numerous computational packages written for it.

This post is a brief overview of some of the existing ways in which the power of R can be leveraged from Java applications.

Integration of R and Java: APIs and Libraries

The following is an overview of those APIs, discussed in no particular order.

RCaller

RCaller is a library for calling R from Java. The following are excerpts from a research paper on RCaller:

RCaller converts data structures to R code, sends them to an externally created R process, returns the generated results as XML which is the universal way of storing data. XML structure is then parsed and returned values are accessed directly in Java.  RCaller depends on a single jar file and no more setting up procedure is required.

Example of a computation in R called through a Java application:

Basic Interactions
RCaller wraps complex interactions in an easy way. Since calculations are handled at R side, the full path to Rscript executable file can be defined correctly using setRscriptExecutable method. Java arrays can also be passed to R in an easy way. Suppose that matrix is a matrix with dimensions 2 × 2. In the example below, matrix is passed to R and inverse of this matrix is calculated at R side and result is handled in Java again.

RCaller caller = new RCaller();
caller.setRscriptExecutable("path/to/Rscript.exe");

double[][] matrix = new double[][]{{6, 4}, {9, 8}};

RCode code = new RCode();

// Passing Java objects to R
code.addDoubleMatrix("x", matrix);
code.addRCode("s <- solve(x)");
caller.setRCode(code);

// Performing Calculations
caller.runAndReturnResult("s");

// Passing R object to Java
double[][] inverse = caller.getParser().getAsDoubleMatrix("s", 2, 2);
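
To make the idea of "converting data structures to R code" more concrete, here is a minimal sketch of how a Java matrix could be turned into an R assignment. It uses only the Java standard library. The class name RMatrixLiteral and the method toRAssignment are hypothetical names chosen for illustration; this is not RCaller's actual implementation, which may generate different R code.

```java
import java.util.StringJoiner;

public class RMatrixLiteral {

    /**
     * Builds an R assignment that recreates a rectangular Java array as an R matrix.
     * Hypothetical helper for illustration only; not part of RCaller.
     */
    public static String toRAssignment(String name, double[][] matrix) {
        if (matrix.length == 0 || matrix[0].length == 0) {
            throw new IllegalArgumentException("matrix must not be empty");
        }
        int rows = matrix.length;
        int cols = matrix[0].length;

        // Collect the values row by row into an R vector literal: c(v1, v2, ...)
        StringJoiner values = new StringJoiner(", ", "c(", ")");
        for (double[] row : matrix) {
            if (row.length != cols) {
                throw new IllegalArgumentException("matrix must be rectangular");
            }
            for (double v : row) {
                if (!Double.isFinite(v)) {
                    // NaN and infinities are spelled differently in R; not handled in this sketch
                    throw new IllegalArgumentException("only finite values are supported");
                }
                values.add(Double.toString(v));
            }
        }

        // byrow = TRUE because Java nested arrays are naturally row-major
        return name + " <- matrix(" + values + ", nrow = " + rows
                + ", ncol = " + cols + ", byrow = TRUE)";
    }

    public static void main(String[] args) {
        double[][] matrix = {{6, 4}, {9, 8}};
        System.out.println(toRAssignment("x", matrix));
    }
}
```

The resulting string can be followed by further R code such as s <- solve(x), which is roughly the division of labour the excerpt describes: Java prepares the data as R code, and R does the computation.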

Rserve

Rserve is a TCP/IP server which allows other programs to use facilities of R (see www.r-project.org) from various languages without the need to initialize R or link against R library. Every connection has a separate workspace and working directory. Client-side implementations are available for popular languages such as C/C++, PHP and Java. Rserve supports remote connection, authentication and file transfer. Typical use is to integrate R backend for computation of statistical models, plots etc. in other applications.

The following Java code illustrates the easy integration of Rserve:


RConnection c = new RConnection();
double d[] = c.eval("rnorm(10)").asDoubles();

rJava

rJava is a simple R-to-Java interface. It is comparable to the .C/.Call C interface. rJava provides a low-level bridge between R and Java (via JNI). It allows you to create objects, call methods and access fields of Java objects from R.

JRI – Java/R Interface

JRI is a Java/R Interface that allows running R inside Java applications as a single thread. Basically, it loads the R dynamic library into Java and provides a Java API to R functionality. It supports both simple calls to R functions and a full running REPL (read-eval-print loop).

Most functions are documented by JavaDoc (see `org.rosuda.JRI` and `org.rosuda.REngine`).

Examples of using JRI with Java are documented here, and one is shown below, with a caveat!

import java.io.IOException;

import org.rosuda.JRI.Rengine;
import org.rosuda.JRI.REXP;
import org.springframework.core.io.ClassPathResource;

public class HelloRWorld2 {
   Rengine rengine; // initialized in constructor or autowired

   // getFile() throws IOException if the resource is not a regular file
   public void helloRWorld() throws IOException {
      ClassPathResource rScript = new ClassPathResource("helloWorld.R");
      // source() the script so that the variables it defines become visible to JRI
      rengine.eval(String.format("source('%s')",
         rScript.getFile().getAbsolutePath()));
      REXP result = rengine.eval("greeting");
      System.out.println("Greeting from R: "+result.asString());
   }
}

Any .R script can be executed like this and all variables it adds to the context will be accessible via JRI. However, this code does not work if the Java application is packaged as a JAR or WAR archive because the .R script will not have a valid absolute path. In this case, copying the script to a regular folder (e.g. java.io.tmpdir) at runtime and passing the temporary file to R is a feasible workaround.
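
The workaround can be sketched as follows, using only the Java standard library. The class name RScriptExtractor and its method names are hypothetical and chosen for illustration. The sketch assumes the script is on the classpath and that the temporary path contains no single quotes, since the path is embedded in a single-quoted R string.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class RScriptExtractor {

    /** Copies a classpath resource (e.g. "helloWorld.R") to a temporary file and returns its path. */
    public static Path extractToTempFile(String resourceName) {
        try (InputStream in = RScriptExtractor.class.getClassLoader()
                .getResourceAsStream(resourceName)) {
            if (in == null) {
                throw new IllegalArgumentException("Resource not found on classpath: " + resourceName);
            }
            return copyToTempFile(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes the stream to a new file under java.io.tmpdir; the file is deleted when the JVM exits. */
    public static Path copyToTempFile(InputStream in) {
        try {
            Path tmp = Files.createTempFile("script-", ".R");
            tmp.toFile().deleteOnExit();
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds the R command; forward slashes avoid backslash escapes inside the R string on Windows. */
    public static String sourceCommand(Path script) {
        String path = script.toAbsolutePath().toString().replace('\\', '/');
        return "source('" + path + "')";
    }
}
```

With this in place, the eval call in the example above could become rengine.eval(RScriptExtractor.sourceCommand(RScriptExtractor.extractToTempFile("helloWorld.R"))), which works the same way whether the application runs from a folder or from a JAR or WAR archive.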

Another post will cover the specifics of actually integrating Java with the JRI package.

Renjin

Excerpts from http://www.renjin.org and docs.renjin.org:

  • Renjin is a JVM-based interpreter for the R language for statistical computing. This project is an initiative of BeDataDriven, a company providing consulting in analytics and decision support systems.
  • R on the JVM: Over the past two decades, the R language for statistical computing has emerged as the de facto standard for analysts, statisticians, and scientists. Today, a wide range of enterprises –from pharmaceuticals to insurance– depend on R for key business uses. Renjin is a new implementation of the R language and environment for the Java Virtual Machine (JVM), whose goal is to enable transparent analysis of big data sets and seamless integration with other enterprise systems such as databases and application servers.
  • Renjin is still under development, but it is already being used in production for a number of our client projects, and supports most CRAN packages, including some with C/Fortran dependencies.

Available Renjin packages can be explored here.

[From Renjin documentation:]

  • Renjin is an interpreter for the R programming language for statistical computing written in Java much like JRuby and Jython are for the Ruby and Python programming languages. The official R project, hereafter referred to as GNU R, is the reference implementation for the R language and Renjin’s current code base is derived from GNU R version 2.14.2.
  • The goal of Renjin is to eventually be compatible with GNU R such that most existing R language programs will run in Renjin without the need to make any changes to the code. Needless to say, Renjin is currently not 100% compatible with GNU R so your mileage may vary.

Example code showing how to use the Renjin library:


import java.util.*;
import javax.script.*;

public class TryRenjin {

    public void Test () throws Exception {
        
        // create a script engine manager
        ScriptEngineManager manager = new ScriptEngineManager();
        
        // create a Renjin engine:
        ScriptEngine engine = manager.getEngineByName("Renjin");
        
        // check if engine has loaded correctly:
        if (engine == null) {
            throw new RuntimeException ("Renjin Script Engine not found on the classpath.");
        }
        else {
            System.out.println ("Renjin Script Engine initialized!");
        }
        
        // run R script coded manually
        TestLinearRegression (engine);
        
        // run R script from an external .r script file (located in /src/scripts/.)
        TestLinearRegressionFromRFile (engine);
    }

    private void TestLinearRegression (ScriptEngine engine) {
        
        try {
            engine.eval ("df <- data.frame(x=1:10, y=(1:10)+rnorm(n=10))");
            engine.eval ("print (df)");
            
            // NOTE: The ScriptEngine won’t print everything to standard out like the
            // interactive REPL does, so if you want to output something, you’ll need
            // to call the R print() command explicitly.
            engine.eval ("print(lm(y ~ x, df))");
        }
        catch (Exception e) {
            System.out.println (e);
        }
    }
    
    private void TestLinearRegressionFromRFile (ScriptEngine engine) {
        
        try {
            System.out.println ("here " + (new Date ()));
            engine.eval (new java.io.FileReader ("./bin/scripts/scriptLR.r"));
        }
        catch (Exception e) {
            System.out.println (e);
        }
    }
}
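
If getEngineByName returns null, it can help to see which script engines javax.script actually discovered on the classpath. The following is a small diagnostic sketch that uses only the standard javax.script API. The class name ScriptEngineDiagnostics and its methods are hypothetical, and the list it prints depends entirely on the JDK version and the JARs present at runtime.

```java
import java.util.ArrayList;
import java.util.List;
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;

public class ScriptEngineDiagnostics {

    /** Returns one line per discovered engine: engine name, version, and the names it answers to. */
    public static List<String> describeEngines() {
        List<String> lines = new ArrayList<>();
        for (ScriptEngineFactory f : new ScriptEngineManager().getEngineFactories()) {
            lines.add(f.getEngineName() + " " + f.getEngineVersion()
                    + " -> names " + f.getNames());
        }
        return lines;
    }

    /** True if some discovered engine registers the given short name (the lookup is case-sensitive). */
    public static boolean isAvailable(String shortName) {
        for (ScriptEngineFactory f : new ScriptEngineManager().getEngineFactories()) {
            if (f.getNames().contains(shortName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Print every engine found, then report on the name used in the example above
        describeEngines().forEach(System.out::println);
        System.out.println("Renjin available: " + isAvailable("Renjin"));
    }
}
```

An empty list, or a list without Renjin, suggests that the Renjin JAR is missing from the runtime classpath rather than that the R code is at fault.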

 

Comparison of APIs

Anecdotal information on comparison of the different libraries to integrate R with Java is given below

2 thoughts on “Using R to Power Analytics In Java Apps”

  1. I am working on an application that integrates Java with R. I have created a JFrame and I want to add the graphics generated by R commands to the JFrame window. Does anybody have an idea how to execute R commands in a Java file, what JAR files are needed, or some example code? Kindly send it to my email. I am having a lot of trouble, please reply…

    1. Bibhuti,
      I used Eclipse IDE to set up the Java and R environments and compile my code. rJava worked in my case. You will need to add the following JAR files to your project: JRI.jar, JRIEngine.jar, and REngine.jar. The following piece of code shows how to set up rJava and call it from within Java. I found the example below here: http://www.rforge.net/DATRAS/files/org/rosuda/JRI/examples/

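      import java.util.Enumeration;

      import org.rosuda.JRI.REXP;
      import org.rosuda.JRI.RList;
      import org.rosuda.JRI.RVector;
      import org.rosuda.JRI.Rengine;

      public class rtest {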
      
      	void RunRTest (String[] args) {
      		// just making sure we have the right version of everything
      		if (!Rengine.versionCheck ()) {
      			System.err.println ("** Version mismatch - Java files don't match library version.");
      			System.exit (1);
      		}
      		System.out.println ("Creating Rengine (with arguments)");
      		// 1) we pass the arguments from the command line
      		// 2) we won't use the main loop at first, we'll start it later
      		// (that's the "false" as second argument)
      		// 3) the callbacks are implemented by the TextConsole class
      		// (defined in the linked example, not reproduced here)
      		Rengine re = new Rengine (args, false, new TextConsole ());
      		System.out.println ("Rengine created, waiting for R");
      		// the engine creates R in a new thread, so we should wait until it's
      		// ready
      		if (!re.waitForR ()) {
      			System.out.println ("Cannot load R");
      			return;
      		}
      
      		/*
      		 * High-level API - do not use RNI methods unless there is no other way
      		 * to accomplish what you want
      		 */
      		try {
      			REXP x;
      			
      			System.out.println ("Loading library: forecast");
      			// Holt-Winters
      			x=re.eval("library('forecast')");
      			
      			x=re.eval ("births <- scan(\"http://robjhyndman.com/tsdldata/data/nybirths.dat\")");
      			System.out.println ("births: " + (x=re.eval ("births")));
      			x=re.eval ("bts <- ts(births, frequency=12, start=c(1946,1))");
      			x=re.eval("hw <- HoltWinters(bts)");
      
      			System.out.println(x=re.eval("hw"));
      			System.out.println(x=re.eval("hw$coef; hw$SSE"));
      
      			System.out.println("alpha " + (x=re.eval("hw$alpha")));
      			System.out.println("beta " + (x=re.eval("hw$beta")));
      			System.out.println("gamma " + (x=re.eval("hw$gamma")));
      
      			System.out.println ("******************************************");
      			System.out.println ("Performing stats on IRIS dataset");
      			
      			re.eval ("data(iris)", false);
      			System.out.println (x = re.eval ("iris"));
      			// generic vectors are RVector to accommodate names
      			RVector v = x.asVector ();
      			if (v.getNames () != null) {
      				System.out.println ("has names:");
      				for (Enumeration e = v.getNames ().elements (); e
      						.hasMoreElements ();) {
      					System.out.println (e.nextElement ());
      				}
      			}
      			// for compatibility with Rserve we allow casting of vectors to
      			// lists
      			RList vl = x.asList ();
      			String[] k = vl.keys ();
      			if (k != null) {
      				System.out.println ("and once again from the list:");
      				int i = 0;
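      				while (i < k.length)
      					System.out.println (k[i++]);
      			}
      
      			// get boolean array
      			System.out.println (x = re.eval ("iris[[1]]>mean(iris[[1]])"));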
      			// R knows about TRUE/FALSE/NA, so we cannot use boolean[] this way
      			// instead, we use int[] which is more convenient (and what R uses
      			// internally anyway)
      			int[] bi = x.asIntArray ();
      			{
      				int i = 0;
      				while (i < bi.length) {
      					System.out.print (bi[i] == 0 ? "F " : (bi[i] == 1 ? "T "
      							: "NA "));
      					i++;
      				}
      				System.out.println ("");
      			}
      
      			// push a boolean array
      			boolean by[] = { true, false, false };
      			re.assign ("bool", by);
      			System.out.println (x = re.eval ("bool"));
      			// asBool returns the first element of the array as RBool
      			// (mostly useful for boolean arrays of the length 1). it should
      			// return true
      			System.out.println ("isTRUE? " + x.asBool ().isTRUE ());
      
      			// now for a real dotted-pair list:
      			System.out.println (x = re.eval ("pairlist(a=1,b='foo',c=1:5)"));
      			RList l = x.asList ();
      			if (l != null) {
      				int i = 0;
      				String[] a = l.keys ();
      				System.out.println ("Keys:");
      				while (i < a.length)
      					System.out.println (a[i++]);
      				System.out.println ("Contents:");
      				i = 0;
      				while (i < a.length)
      					System.out.println (l.at (i++));
      			}
      			System.out.println (re.eval ("sqrt(36)"));
      		} catch (Exception e) {
      			System.out.println ("EX:" + e);
      			e.printStackTrace ();
      		}
              }
      }
      
