DataRobot
DataRobot supports several client libraries that make it easy to connect your Python and R code with your DataRobot account. To connect Zepl with DataRobot, you will need to check that the desired library is installed and bring your DataRobot API key to Zepl.
What libraries do I need?
Place this code at the beginning of your notebook and ensure it runs every time your container starts up.
Python
R
%python
!pip install datarobot
%r
install.packages("datarobot")
Follow these steps to access your API key in your DataRobot account: https://community.datarobot.com/t5/resources/where-can-i-find-my-api-key/ta-p/4648

There are two ways to bring your DataRobot API key to Zepl:
1. Paste your DataRobot API key directly in your notebook code
2. Store your API key in Zepl's secret store data source
To use Zepl with DataRobot, you first need to establish a connection between your machine and the DataRobot instance. The fastest way to do that is by pasting your DataRobot API key as a string in your code. If you want to do this in a more secure way, use Zepl's secret store data source.
If you are using the secret store to store your API token (the commented-out lines in the code below), make sure it is attached to the notebook: Attaching a Data Source
Python
R
%python
import datarobot as dr
# Enter your API Key here or use the secure secret store method below
dr.Client(token='addyourAPIkey', endpoint='https://app.datarobot.com/api/v2')
# Uncomment to use the secret store to securely access your API key in Zepl. Follow the documentation here: https://new-docs.zepl.com/docs/connect-to-data/secret-store
# token = z.getDatasource("datarobot_api")['token']
# dr.Client(token=token , endpoint='https://app.datarobot.com/api/v2')
%r
# import library
library(datarobot)
# Enter your API Key here or use the secure secret store method below
datarobot::ConnectToDataRobot(token = 'addyourAPIkey', endpoint = 'https://app.datarobot.com/api/v2')
# Uncomment to use the secret store to securely access your API key in Zepl. Follow the documentation here: https://new-docs.zepl.com/docs/connect-to-data/secret-store
# token <- z.getDatasource("datarobot_api")[["token"]]
# datarobot::ConnectToDataRobot(token = token, endpoint = 'https://app.datarobot.com/api/v2')
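To confirm that the connection is working before starting any projects, you can make a simple read-only API call. This quick check is not part of the original walkthrough, but listing your existing projects with the Python client is a convenient sanity test:
%python
# A quick sanity check: if authentication succeeded, this returns your projects
print(dr.Project.list())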
For Classification, Regression, and Multiclass Classification, starting a project (and modeling) is very straightforward: all you have to do is use the datarobot.Project.start method. If you open a new window and log in to your DataRobot account, you can watch the project start up. This might take some time, depending on the number of workers available.
Python
R
%python
# I can link directly to my data (file, url) or pass a pandas dataframe to the sourcedata variable
url_to_data = "https://s3.amazonaws.com/datarobot_public_datasets/10k_diabetes.csv"

# Start project with all available workers (worker_count = -1)
project = dr.Project.start(sourcedata=url_to_data,
                           project_name='00_Zepl_Starter_NB_Python',
                           target='readmitted',
                           worker_count=-1)

# Force our Python kernel to wait until DataRobot has finished modeling before executing the next series of commands.
project.wait_for_autopilot()
%r
# I can link directly to my data (file, url) or pass a dataframe to the dataSource variable
url_to_data <- "https://s3.amazonaws.com/datarobot_public_datasets/10k_diabetes.csv"

# Start project with all available workers (workerCount = -1)
project <- StartProject(dataSource = url_to_data,
                        projectName = '00_Zepl_Starter_NB_R',
                        target = 'readmitted',
                        workerCount = -1)

# Force our R kernel to wait until DataRobot has finished modeling before executing the next series of commands.
WaitForAutopilot(project = project)
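The deployment step below references a most_accurate_model object that the snippets above do not create. A minimal sketch of one way to obtain it, assuming Autopilot has finished and using the Python client's Project.get_models (which returns the leaderboard ordered by validation score):
%python
# Pull the leaderboard for the finished project; the first entry is the
# top-performing model by the project's validation metric
models = project.get_models()
most_accurate_model = models[0]
print(most_accurate_model)
In R, the ListModels(project) function provides a similar leaderboard listing.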
If you wish to deploy a model, all you have to do is use the Deployment.create_from_learning_model method. You also need a prediction server to host the deployment; available prediction servers can be retrieved using the PredictionServer.list method.
Python
R
%python
# Get the first available prediction server
prediction_server = dr.PredictionServer.list()[0]

# Create a deployment
deployment = dr.Deployment.create_from_learning_model(
    most_accurate_model.id,
    label='New Deployment',
    description='A new deployment',
    default_prediction_server_id=prediction_server.id)

# Verify the deployment was created successfully
deployment
%r
# Get the first available prediction server
prediction_server <- ListPredictionServers()[[1]]

# Create a deployment
deployment <- CreateDeployment(model = most_accurate_model,
                               label = 'New Deployment (R)',
                               description = 'A new deployment',
                               defaultPredictionServerId = prediction_server$id)

# Verify the deployment was created successfully
deployment
Now that we have deployed the model, let's score some data using DataRobot's Batch Prediction API. Note that there are multiple ways to score data, and this is just one of them.
Python
%python
import pandas as pd

# Create a dataframe with the data to score
scoring = pd.read_csv('https://s3.amazonaws.com/datarobot_public_datasets/10k_diabetes_scoring.csv', nrows=100)

# Write the dataframe to a CSV file on the Zepl container
scoring.to_csv('scoring.csv', index=False)

# Score the data and write the results to a new file named predicted.csv
dr.BatchPredictionJob.score_to_file(
    deployment.id,
    'scoring.csv',
    './predicted.csv')
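Once the scoring job completes, you can read the output file back into a dataframe to verify the results (a quick check using the predicted.csv path written above):
%python
# Preview the first few rows of the scored output written by score_to_file
predictions = pd.read_csv('./predicted.csv')
predictions.head()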