Part IV
Interfaces
In implementing extensions to R, the INTERFACE principle says we should look widely to find good computational techniques to achieve our goals. If an effective solution has been implemented in a form other than R code, providing an interface from R may be the best approach. Part IV of the book looks at how such interfaces can be implemented and made an integral part of an R-based project.
The principle has always been central to R, and to S before it. An interface to subroutines was the way to extend the first version of S, and subroutine interfaces have remained central to R, although the approach to them has changed. Chapter 16 discusses current subroutine interfaces, emphasizing an approach that provides convenience and generality through the widely used Rcpp package.
But our options are hugely extended today, because of the immense growth and great diversity of available software. Important software for extending R can come from languages focussing on computations (Python, Julia, C++, ⋯), on data organization and management (relational DBs, Excel, ⋯) or on display and user interactions (Java, JavaScript, ⋯). Chapter 12 reviews interfaces in general, cites some existing packages and discusses concepts for interface programming.
Chapters 13 to 15 present a unified approach to language interfaces, which I refer to as the XR structure. The goals of the approach are convenience, generality and consistency. Application packages can use features of the unified approach to hide the actual interface programming from their users, who program in a natural mix of functions and classes in R. Arbitrary computations and objects in the server language are potential candidates for an interface. The structure is language-independent, with interfaces to a particular language specialized by methods and by functional extensions.
The unified approach is relatively new, having been developed during the writing of this book. Chapter 13 presents the approach in general. Chapters 14 and 15 describe two interfaces, to the Python and Julia languages. If your interest is specifically in one of these languages, the corresponding chapter can be read independently, referring back to Chapter 13 for details. The packages described in these chapters are available from the GitHub site github.com/johnmchambers.
Chapter 12
Understanding Interfaces
12.1 Introduction
The INTERFACE principle suggests that non-R software should be considered a potential resource for extending R. If suitable software exists, using it rather than programming something equivalent from scratch can save time and, more importantly, can improve the quality of the final result.
To have a convenient term, I will refer to the “other language” as a server language. This doesn’t imply an actual client-server interface, which may or may not be suitable; it means simply that we view the non-R software as supplying us with something.
Section 12.2 lists some likely languages and existing interface packages for these, with comments on the sort of applications that tend to use each.
In the remainder of the chapter, we discuss various aspects of interfaces and the steps that applications will likely need to take in order to use the server language software effectively.
A basic distinction is between interfaces to individual subroutines and interfaces to other language evaluators (Section 12.3). Chapter 16 describes subroutine interfaces and in particular the Rcpp interface to C++.
For language evaluator interfaces, Sections 12.4 to 12.7 discuss techniques for the inclusion of server language software, for expressing computations, for managing objects and for converting data between the languages.
These sections also motivate and introduce a proposed unified structure for language interfaces. Chapter 13 presents the structure, which is incorporated in the XR package.
Chapters 14 and 15 present interfaces to the Python and Julia languages using the XR structure.
12.2 Available Interfaces
Many languages and programming systems have been used to implement a huge range of computations. The principle encourages us to browse widely. There are many useful forms for the other software, even beyond languages in the usual sense. The common ingredient is some mechanism for programming and carrying out computations that goes beyond the R process and evaluation model discussed in Chapter 3.
Some likely candidates:
C, Fortran: These languages were and are the basic implementation languages for S and R. Interfaces to them are still fundamental, and in particular the .Call() interface to C is the basic entry point for any software linked into the R process.
C++: Programming with C++ functions and classes supports a large body of important algorithmic software. The use of object-oriented structure and some modern programming techniques has produced a general and widely used interface package, Rcpp (Chapter 16).
Python, Perl, JavaScript, Julia: These are interactive languages with libraries and capabilities that may be complementary to R. Each provides a general programming environment, in which substantial application software has been implemented, with some tendency to specialize; for example, web-based software in JavaScript, numerical software in Julia.
Java: This was traditionally used for serious design of web-based and other graphical interfaces. Its relatively pure OOP structure and thorough facilities for self-describing objects and classes make it natural for similar interfaces from R.
Haskell: The most actively used functional programming language.
Excel, XML, JSON, Relational DBMS: These systems and formats are particularly important for many projects as repositories for data and, in the case of XML and JSON, as a general mechanism for representing objects to be communicated between languages.
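To illustrate the last item, here is a minimal sketch, written in Python as one possible server language, of how an object might be represented in JSON for communication with another language. The `Point` class, the `to_json()` and `from_json()` helpers and the ".class" tag are all hypothetical names invented for this illustration; they are not the convention of any package mentioned in this chapter.

```python
import json


class Point:
    """A small example class standing in for a server-language object."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def to_json(obj):
    """Encode obj as JSON text, tagging non-basic objects with their class.

    The ".class" key is a hypothetical convention used only in this sketch.
    """
    def encode(o):
        # json calls this only for objects it cannot serialize itself.
        fields = dict(vars(o))
        fields[".class"] = type(o).__name__
        return fields

    return json.dumps(obj, default=encode, sort_keys=True)


def from_json(text, classes):
    """Decode JSON text, rebuilding tagged objects from the classes mapping."""
    def decode(d):
        # Dictionaries without the tag are returned unchanged.
        name = d.pop(".class", None)
        if name is None:
            return d
        return classes[name](**d)

    return json.loads(text, object_hook=decode)


# Round trip: the text in `message` is what would cross the interface.
message = to_json({"label": "origin", "where": Point(0.0, 0.0)})
restored = from_json(message, {"Point": Point})
```

The string in `message` is plain text, so the other side of the interface could in principle parse it with any JSON reader and use the tag to choose a suitable class of its own. The design choice being illustrated is only that class information travels with the data.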
Interfaces to most of these and to other languages have been implemented by many contributors. Table 12.1 lists some R packages providing interfaces.
Our perspective is of interfaces from R to another language. This book is about extending R, assuming that one starts from some programming in R, at least in this part of a project. But the principle is itself agnostic in this respect. Many approaches to bringing good software together have been valuable.
rpy2 [31] is an interface to R from Python that has been widely used. HaskellR [18] is an interesting interface in which R code snippets are inserted into a Haskell program.
Other approaches make multiple languages available from a specialized computing environment. The Jupyter project, https://jupyter.org, is a web-based document-creation environment allowing code from Julia, Python, R and other languages to be embedded in a document. The h2o system [25] integrates a range of statistical models with other data-science techniques supporting potentially very large applications with a combination of languages, notably R and Java.
The principle suggests that any such approach is worth investigating. With flexible attitudes and well-designed R programming, useful extensions can be adapted to the project at hand.
Table 12.1: R packages providing interfaces.

| Language | Packages | Comments |
| --- | --- | --- |
| C++ | | Chapter 16 |
| Java | rJava | Provides classes, methods |
| Python | rPython, rJython | Chapter 14 |
| JavaScript | V8 | Embedded JavaScript engine |
| Perl | RSPerl | On www.omegahat.net |
| Julia | | Chapter 15 |
| JSON | rjson, jsonlite, RJSON... | |