DataRobot

Connect Zepl to DataRobot

DataRobot provides client libraries that make it easy to connect your Python and R code to your DataRobot (DR) account. To connect Zepl to DataRobot, make sure the client library you want is installed and bring your DataRobot API key to Zepl.

Install client libraries

What libraries do I need?
Place this code at the beginning of your notebook and ensure it runs every time your container starts up.
Python
%python
!pip install datarobot

R
%r
install.packages("datarobot")
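Because the install cell runs every time the container starts, it can be useful to check first whether the client library is already importable. The following is a minimal sketch using only the Python standard library; the helper name `is_installed` is a hypothetical choice for illustration and is not part of Zepl or DataRobot.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None


# Hypothetical usage: report a missing client library before importing it.
if not is_installed("datarobot"):
    print("datarobot is not installed; run `!pip install datarobot` first")
```

In a Zepl notebook this would sit in a `%python` paragraph ahead of the `import datarobot as dr` line.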

Authorize Zepl to access DataRobot

In your DataRobot account:

Follow these steps to access your API key in your DataRobot account: https://community.datarobot.com/t5/resources/where-can-i-find-my-api-key/ta-p/4648

In your Zepl Org:

There are two ways to bring your DataRobot API key to Zepl:
  1. Paste your DataRobot API key in your notebook code
  2. (Recommended) Create a secret store data source and attach it to your notebook
Your secret store will look something like this:

Getting Started Notebook


Connect to DataRobot

To use Zepl with DataRobot, first establish a connection between your notebook and your DataRobot instance. The fastest way is to paste your DataRobot API key as a string in your code. For a more secure approach, use Zepl's secret store data source.
If you use the secret store to hold your API token (the commented-out lines at the end of the code below), make sure the data source is attached to the notebook. See Attaching a Data Source.
Python
%python
import datarobot as dr

# Enter your API key here, or use the secure secret store method below
dr.Client(token='addyourAPIkey', endpoint='https://app.datarobot.com/api/v2')

# Uncomment to use the secret store to securely access your API key in Zepl. Follow the documentation here: https://new-docs.zepl.com/docs/connect-to-data/secret-store
# token = z.getDatasource("datarobot_api")['token']
# dr.Client(token=token, endpoint='https://app.datarobot.com/api/v2')

R
%r
# Import library
library(datarobot)

# Enter your API key here, or use the secure secret store method below
datarobot::ConnectToDataRobot(token = 'addyourAPIkey', endpoint = 'https://app.datarobot.com/api/v2')

# Uncomment to use the secret store to securely access your API key in Zepl. Follow the documentation here: https://new-docs.zepl.com/docs/connect-to-data/secret-store
# token <- z.getDatasource("datarobot_api")[["token"]]
# datarobot::ConnectToDataRobot(token = token, endpoint = 'https://app.datarobot.com/api/v2')
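A common slip is running the connection cell with the placeholder key still in place. The sketch below shows one way to guard against that before calling `dr.Client`. The helper `build_client_kwargs` and the constant names are hypothetical and introduced only for illustration; the endpoint and placeholder string are the ones used in the code above.

```python
PLACEHOLDER_TOKEN = "addyourAPIkey"
DEFAULT_ENDPOINT = "https://app.datarobot.com/api/v2"


def build_client_kwargs(token, endpoint=DEFAULT_ENDPOINT):
    """Validate the token and return keyword arguments for dr.Client."""
    if not isinstance(token, str) or not token.strip():
        raise ValueError("DataRobot API token is empty")
    if token.strip() == PLACEHOLDER_TOKEN:
        raise ValueError("Replace the placeholder with your real API key")
    # Strip stray whitespace that often sneaks in when a key is pasted.
    return {"token": token.strip(), "endpoint": endpoint}


# Hypothetical usage in a notebook, once datarobot is imported as dr:
#   token = z.getDatasource("datarobot_api")['token']
#   dr.Client(**build_client_kwargs(token))
```

Keeping the validation separate from the client call means the same check works whether the token comes from the secret store or from a pasted string.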

Creating a Project

For classification, regression, and multiclass classification, starting a project (and modeling) is straightforward: call the datarobot.Project.start method.
If you open a new window and log in to your DataRobot account, you can watch the project start up. This might take some time, depending on the number of workers available.
Python
%python
# I can link directly to my data (file, URL), or I can pass a pandas DataFrame to the sourcedata argument
url_to_data = "https://s3.amazonaws.com/datarobot_public_datasets/10k_diabetes.csv"

# Start project with all available workers (worker_count = -1)
project = dr.Project.start(sourcedata=url_to_data,
                           project_name='00_Zepl_Starter_NB_Python',
                           target='readmitted',
                           worker_count=-1)

# Force our Python kernel to wait until DataRobot has finished modeling before executing the next series of commands.
project.wait_for_autopilot()

R
%r
# I can link directly to my data (file, URL), or I can pass a data frame to the dataSource argument
url_to_data <- "https://s3.amazonaws.com/datarobot_public_datasets/10k_diabetes.csv"

# Start project with all available workers (workerCount = -1)
project <- StartProject(dataSource = url_to_data,
                        projectName = '00_Zepl_Starter_NB_R',
                        target = 'readmitted',
                        workerCount = -1)

# Force our R kernel to wait until DataRobot has finished modeling before executing the next series of commands.
WaitForAutopilot(project = project)

Model deployment

To deploy a model, use the Deployment.create_from_learning_model method. You also need a prediction server to host the deployment; available prediction servers can be retrieved with the PredictionServer.list method.
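Python
%python
# Assumption: most_accurate_model is not defined earlier on this page, so this example
# takes the first model returned by project.get_models(). Substitute any model you prefer.
most_accurate_model = project.get_models()[0]

# Get list of prediction servers and use the first one
prediction_server = dr.PredictionServer.list()[0]

# Create a deployment
deployment = dr.Deployment.create_from_learning_model(
    most_accurate_model.id, label='New Deployment', description='A new deployment',
    default_prediction_server_id=prediction_server.id)

# Verify deployment was created successfully
deployment

R
%r
# Assumption: most_accurate_model is not defined earlier on this page, so this example
# takes the first model returned by ListModels(project). Substitute any model you prefer.
most_accurate_model <- ListModels(project)[[1]]

# Get list of prediction servers and use the first one
prediction_server <- ListPredictionServers()[[1]]

# Create a deployment
deployment <- CreateDeployment(model = most_accurate_model,
                               label = 'New Deployment (R)',
                               description = 'A new deployment',
                               defaultPredictionServerId = prediction_server$id)

# Verify deployment was created successfully
deployment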
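Indexing the server list with `[0]` raises a bare `IndexError` if the list comes back empty. As a minimal sketch, a small wrapper can turn that into a clearer message. The function name `pick_prediction_server` is hypothetical and works on any list-like value, so it makes no assumptions about the DataRobot objects themselves.

```python
def pick_prediction_server(servers, index=0):
    """Return servers[index], or raise a descriptive error if it is unavailable."""
    servers = list(servers)
    if not servers:
        raise RuntimeError(
            "No prediction servers were returned; a deployment needs one to host it"
        )
    if not 0 <= index < len(servers):
        raise IndexError(
            f"Requested server {index}, but only {len(servers)} are available"
        )
    return servers[index]


# Hypothetical usage in a notebook, once datarobot is imported as dr:
#   prediction_server = pick_prediction_server(dr.PredictionServer.list())
```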

Model scoring

Now that we have deployed the model, let's score data using DataRobot’s Batch Prediction API. Note that there are multiple ways to score data; this is just one of them.
Python
%python
import pandas as pd

# Create dataframe
scoring = pd.read_csv('https://s3.amazonaws.com/datarobot_public_datasets/10k_diabetes_scoring.csv', nrows=100)

# Write dataframe to CSV on the Zepl container
scoring.to_csv('scoring.csv', index=False)

# Score the data and write the results to a new file named predicted.csv
dr.BatchPredictionJob.score_to_file(
    deployment.id,
    'scoring.csv',
    './predicted.csv')
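Once the scoring job has written its output file, a quick look at the header and first few rows confirms that predictions were produced. The sketch below uses only the standard library; `preview_csv` is a hypothetical helper, and the columns you see will depend on your deployment, so none are assumed here.

```python
import csv


def preview_csv(path, max_rows=5):
    """Return (header, rows) holding the first max_rows data rows of a CSV file."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        # zip stops once range is exhausted, so at most max_rows rows are read.
        rows = [row for _, row in zip(range(max_rows), reader)]
    return header, rows


# Hypothetical usage after score_to_file has finished:
#   header, rows = preview_csv('./predicted.csv')
#   print(header)
#   for row in rows:
#       print(row)
```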

