Partially Applied Functions in R, Scala, and JavaScript
Partial function application is the process of fixing some of a function’s arguments, which yields a new function that takes only the remaining ones. This is an interesting and useful technique in languages with first-class functions such as Scala, R, and JavaScript. A basic example starts with a function that takes two numbers and adds them together. Here is some R code for this. Conveniently, this code is also syntactically valid JavaScript, so if you aren’t familiar with R you can just paste it into your JavaScript console instead.
[code language=”javascript”] addNumbers = function(numberOne, numberTwo) { return (numberOne + numberTwo) }[/code]
> addNumbers(4,5)
[1] 9
R, Scala, and JavaScript can all use closures to partially apply a function. Here’s an example that works in both JavaScript and R.
[code language=”javascript”] y = 5
addFive = function(number) {
return (addNumbers(number, y))
}[/code]
> addFive(5)
[1] 10
The point here is that addFive “closes over” the scope it was defined in, so the call to addNumbers(number, y) inside it can still see y whenever addFive runs.
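One subtlety is worth seeing in action: the closure holds on to the variable y itself, not a snapshot of its value. Here’s a minimal JavaScript sketch of that (the same lookup behaviour applies to a global y in R). Note that makeAdder, reallyAddFive, and z are hypothetical names I’m introducing for illustration; they show one way to freeze the value by passing it in as a function argument.

```javascript
// addNumbers as defined earlier in the post
const addNumbers = function (numberOne, numberTwo) {
  return (numberOne + numberTwo);
};

// Closing over a variable: addFive looks y up each time it is called
let y = 5;
const addFive = function (number) {
  return (addNumbers(number, y));
};

const before = addFive(5); // 10
y = 100;
const after = addFive(5);  // 105, because y was reassigned in between

// makeAdder (hypothetical helper) fixes the value instead:
// every call creates a fresh scope with its own private `fixed`
const makeAdder = function (fixed) {
  return function (number) {
    return (addNumbers(number, fixed));
  };
};

let z = 5;
const reallyAddFive = makeAdder(z);
z = 100;
const stillTen = reallyAddFive(5); // 10, unaffected by reassigning z
```

So if you want the fixed argument to stay fixed, a small factory function like this is a safer bet than relying on a variable that other code may change.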
In Scala, we can also use a closure to partially apply a function. More often than not, though, we see the _ placeholder notation used as shorthand. Here’s an example in Scala.
[code language=”scala”] val addNumbers: ((Double, Double) => Double) =
(numberOne, numberTwo) => numberOne + numberTwo
val addFive = addNumbers(_: Double, 5)[/code]
which is shorthand for
[code language=”scala”] val addFive = {
(numberOne: Double) => addNumbers(numberOne, 5)
}[/code]
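JavaScript has no underscore placeholder, but the built-in Function.prototype.bind offers a comparable shorthand, with one restriction: it can only fix arguments starting from the left. Here’s a minimal sketch that redefines addNumbers so the snippet runs on its own; addFiveBound is just an illustrative name.

```javascript
const addNumbers = function (numberOne, numberTwo) {
  return (numberOne + numberTwo);
};

// bind(thisArg, ...args) returns a new function with the leading
// arguments already filled in; thisArg is unused here, so we pass null
const addFiveBound = addNumbers.bind(null, 5);

const result = addFiveBound(10); // same as addNumbers(5, 10)
console.log(result); // 15
```

Because addition doesn’t care about argument order, fixing the first argument instead of the second gives the same answer here. For a function where order matters you would fall back to writing the closure by hand.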
Let’s dive into a more practical example of partial application. Say you have a data frame with variables a, b, c, and d. For this example I’m using R code, and the values for a, b, c, and d are just random samples from a standard normal distribution.
[code language=”javascript”] data = as.data.frame(cbind(a = rnorm(100),
b = rnorm(100),
c = rnorm(100),
d = rnorm(100)))[/code]
Let’s say we’re interested in seeing how correlated variable “a” is with all the other variables. R provides a function called “cor” that takes two vectors and returns their correlation. The non-functional approach to this problem would be to write a for loop. Let’s take a look at what that looks like:
[code language=”javascript”] correlations = rep(NA,ncol(data[,-1]))
names(correlations) = c("b", "c", "d")
# we want to skip the first column since that is "a"
for (colIndex in 1:(ncol(data[,-1]))) {
correlations[colIndex] = cor(data[,colIndex + 1], data$a)
}[/code]
The above code is messy. There is a lot of different indexing going on that makes the code appear complicated, and we have to initialize a correlations vector with NAs which also isn’t ideal. Here’s where partial function application comes in to save the day.
[code language=”javascript”] corWithA = function(variable) { return(cor(variable, data$a)) } [/code]
We partially apply data$a as one of the arguments to the “cor” function to make another function that calculates the correlation of any vector with variable “a”. Now, instead of iterating through the data frame with a for loop, we are going to use the “apply” function. The apply function is similar to Scala’s or JavaScript’s map function. Here’s a quick example of JavaScript’s “map” function. For an array, map takes a function and applies it to each element, returning a new array of the results.
[code language=”javascript”] arr = [1,2,3]
arr.map(function(x) { return (x + 1) }) // Returns [2,3,4] [/code]
“apply” takes a data frame or matrix, a number, 1 (rows) or 2 (columns), to signify what to map over, and finally a function to apply to each row or column. Here are some simple examples on the data frame we are using:
[code language=”javascript”] apply(data, 1, sum) #Row Sums
apply(data, 2, sum) #Column Sums [/code]
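For readers coming from JavaScript, the row/column distinction can be sketched with plain arrays. This is only an analogy, not how R implements “apply”: the matrix is stored as an array of rows, and applyOver, sum, and the sample values are all made up for illustration.

```javascript
// A small 2 x 3 "matrix" stored as an array of rows (made-up values)
const matrix = [
  [1, 2, 3],
  [4, 5, 6],
];

const sum = function (values) {
  return values.reduce(function (total, value) { return total + value; }, 0);
};

// applyOver (hypothetical helper) mimics R's margin argument:
// 1 maps fn over rows, 2 maps fn over columns
const applyOver = function (rows, margin, fn) {
  if (margin === 1) {
    return rows.map(function (row) { return fn(row); });
  }
  // Build each column by picking the same index out of every row
  const columns = rows[0].map(function (_, colIndex) {
    return rows.map(function (row) { return row[colIndex]; });
  });
  return columns.map(function (column) { return fn(column); });
};

const rowSums = applyOver(matrix, 1, sum);    // [6, 15]
const columnSums = applyOver(matrix, 2, sum); // [5, 7, 9]
```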
Here is our use of “apply” and our partially applied function “corWithA” that shortens up the number of lines of code significantly:
[code language=”javascript”] correlations = apply(data[,-1], 2, corWithA)[/code]
We can even make it a one-liner if we pass corWithA in as an anonymous function:
[code language=”javascript”] correlations = apply(data[,-1],
2,
function(variable) { return(cor(variable, data$a)) })[/code]
And that, my friends, is the power of partially applied functions. We took what was 5 lines of messy code that used a for loop and shortened it down to a one-liner.
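If you’d rather follow along in a JavaScript console, the same pattern can be sketched there too. JavaScript has no built-in “cor”, so the correlation function below is a hand-written Pearson correlation, and the columns object holds small made-up values standing in for the data frame. Treat it as an illustration of the idea under those assumptions rather than a port of the R code.

```javascript
// Pearson correlation of two equal-length numeric arrays
const correlation = function (xs, ys) {
  const mean = function (values) {
    return values.reduce(function (total, v) { return total + v; }, 0) / values.length;
  };
  const meanX = mean(xs);
  const meanY = mean(ys);
  let sumXY = 0; // sum of products of deviations
  let sumXX = 0; // sum of squared deviations of xs
  let sumYY = 0; // sum of squared deviations of ys
  for (let i = 0; i < xs.length; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    sumXY += dx * dy;
    sumXX += dx * dx;
    sumYY += dy * dy;
  }
  return sumXY / Math.sqrt(sumXX * sumYY);
};

// Made-up columns standing in for the data frame
const columns = {
  a: [1, 2, 3, 4],
  b: [2, 4, 6, 8],   // rises in step with a
  c: [4, 3, 2, 1],   // falls as a rises
  d: [1, -1, -1, 1], // no linear relationship with a
};

// Partial application: fix the second argument to column a
const corWithA = function (variable) {
  return correlation(variable, columns.a);
};

// Map the partially applied function over the remaining columns
const correlations = ["b", "c", "d"].map(function (name) {
  return corWithA(columns[name]);
});
// correlations comes out as approximately [1, -1, 0] for this data
```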