Below is a short script that simulates some data and estimates a model. It also creates a function for preprocessing the data. Preprocessing isn’t really needed in this example, but it is common in practice (e.g. creating dummy variables, or rescaling features for estimators that are sensitive to the numerical range of the inputs).
Finally, the script stores the preprocessing function and the trained model as two “.rds” files. This is the R equivalent of pickle in Python.
library(dplyr)
library(magrittr)

# Simulate some data: y is (roughly) a linear function of x
df <-
  tibble(x = rnorm(100) * 100) %>%
  mutate(y = 5 + x + rnorm(100))

# Example preprocessing step (not actually used by the model below)
preprocessData <-
  function(df) {
    df$x_scaled <- df$x / 100
    return(df)
  }

# Fit the model on the preprocessed data
model <-
  df %>%
  preprocessData() %>%
  lm(y ~ x, data = .)

# Persist the trained model and the preprocessing function
saveRDS(model, "model.rds")
saveRDS(preprocessData, "preprocess_data.rds")
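As a quick, optional sanity check, we can read both files back into a fresh R session and score a single observation. The object names m, prep and newObs below are just illustrative.
m <- readRDS("model.rds")
prep <- readRDS("preprocess_data.rds")
newObs <- prep(data.frame(x = 200))
predict(m, newdata = newObs)  # should be close to 5 + 200 = 205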
We next want to use the model to generate predictions in an API. To do this we first create two local scripts. The first defines the contents of the API and is named “score.R”.
We need to be careful when adding comments to this script, since Plumber uses specially formatted comments (#' with @-annotations) to define the endpoints, but the gist of it is:
library(dplyr)
library(magrittr)

# Load the persisted artifacts
model <- readRDS("model.rds")
preprocessData <- readRDS("preprocess_data.rds")

#' @get /get_predicted_score
#' @param x
get_predicted_score <-
  function(x) {
    # Query parameters arrive as strings, so convert x before preprocessing
    df <-
      data.frame(x = as.numeric(x)) %>%
      preprocessData()

    # Return the scoring date and the model prediction
    data.frame(
      scoring_date = Sys.Date(),
      predY = predict(model, newdata = df)
    )
  }
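Before wiring this up to a web server, we can check that the scoring function behaves as expected by sourcing the file in an interactive session (this assumes the two .rds files sit in the working directory). Note that Plumber passes query parameters as strings, which is why the function calls as.numeric() and why the sketch below passes "200" rather than 200.
source("score.R")
get_predicted_score("200")  # a one-row data.frame with scoring_date and a predY close to 205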
The next file, called “main.R”, starts a local web service exposed on port 80.
library(plumber)
# Build the router from the annotated scoring script
r <- plumb("score.R")
# Listen on all interfaces so the service is reachable from outside the container
r$run(port = 80, host = "0.0.0.0")
The scripts above won’t work when run on a Windows machine, due to the encoding of the “ø” character. Luckily, we can use Docker to avoid this issue.
Next, we define a simple Dockerfile that starts from a Plumber base image, installs the required R packages, copies in the model artifacts and the two scripts, and runs the API. The file is simply called dockerfile, with no extension, and it needs to be located in the same folder as main.R, score.R, model.rds and preprocess_data.rds.
FROM trestletech/plumber
MAINTAINER Hong Ooi <hongooi@microsoft.com>
RUN R -e 'install.packages(c("dplyr", "magrittr"))'
RUN mkdir /data
COPY model.rds /data
COPY preprocess_data.rds /data
COPY score.R /data
COPY main.R /data
WORKDIR /data
EXPOSE 80
ENTRYPOINT ["Rscript", "main.R"]
From cmd in Windows, we go to the folder containing the dockerfile and issue the command below. This builds the Docker image defined above, named “modelimage”:
docker build -t modelimage .
Once the image is built (this takes a few minutes!), we can test the service locally. First, fire up a container from the image by issuing the command below in cmd:
docker run --rm -p 80:80 modelimage
I test the response with Python.
import requests
payload = {
"x": 200
}
r = requests.get("http://127.0.0.1/get_predicted_score", params = payload)
r.json()
This returns the desired response, with predY close to the expected 5 + 200 = 205 from the data-generating process:
[{'scoring_date': '2021-10-31', 'predY': 205.0965}]
Playing a bit with parameters to see that the scoring works as intended:
payload = {
"x": -2000
}
r = requests.get("http://127.0.0.1/get_predicted_score", params = payload)
r.json()
[{'scoring_date': '2021-10-31', 'predY': -1995.6117}]
For a serious deployment, I would consider further extensions. In principle, the Docker image can be deployed to Azure to set up a web service.
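As a rough sketch of that last step, the image could be pushed to an Azure Container Registry and run as an Azure Container Instance with the Azure CLI. The resource group, registry and container names below (modelresourcegroup, modelregistry, modelapi) are placeholders, and a private registry will typically also require credentials to be passed to az container create.
az login
az acr create --resource-group modelresourcegroup --name modelregistry --sku Basic
az acr login --name modelregistry
docker tag modelimage modelregistry.azurecr.io/modelimage
docker push modelregistry.azurecr.io/modelimage
az container create --resource-group modelresourcegroup --name modelapi --image modelregistry.azurecr.io/modelimage --ports 80 --dns-name-label modelapi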