Version: Cloud

Creating an external deployment with IBM Watsonx

An IBM Watsonx deployment can be created using the External Deployment type. This article shows how to integrate the IBM Watsonx Granite 13 Billion chat V2 model in Deeploy.

Prerequisites

An IBM Watsonx account.

Create a deployment using watsonx

We will use the granite-13b-chat-v2 foundation model, which is developed by IBM Research and only available on Watsonx. To create an external deployment in Deeploy, we need the following:

The model endpoint
An access token to communicate with the model

The foundation model inference endpoint

In IBM watsonx, a single endpoint serves inference requests for multiple foundation models, so the endpoint is not model specific:

https://{cloud_url}/ml/v1/text/generation?version=2023-05-29

For example: https://eu-de.ml.cloud.ibm.com/ml/v1/text/generation?version=2023-05-29. For more information, see the IBM watsonx documentation.
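Because the endpoint is shared across models, the model has to be chosen somewhere else. The sketch below builds the endpoint from a region host and shows a request body that selects the model. The helper names are hypothetical, and the body fields (`model_id`, `input`, `project_id`, `parameters`) and the model identifier `ibm/granite-13b-chat-v2` are assumptions about the watsonx text generation API, not something this article confirms.

```python
from urllib.parse import urlencode

API_VERSION = "2023-05-29"  # version date used in this article


def build_generation_url(cloud_url: str, version: str = API_VERSION) -> str:
    """Return the shared text-generation endpoint for a watsonx region host."""
    query = urlencode({"version": version})
    return f"https://{cloud_url}/ml/v1/text/generation?{query}"


def build_generation_payload(model_id: str, prompt: str, project_id: str) -> dict:
    """Assumed request body: the model is chosen per request, not per endpoint."""
    return {
        "model_id": model_id,
        "input": prompt,
        "project_id": project_id,  # placeholder, taken from your watsonx project
        "parameters": {"max_new_tokens": 100},
    }


url = build_generation_url("eu-de.ml.cloud.ibm.com")
payload = build_generation_payload("ibm/granite-13b-chat-v2", "Hello", "YOUR_PROJECT_ID")
```

Only the URL is entered in Deeploy. The payload illustrates why one URL can serve several models.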

Retrieving an access token

IBM watsonx uses a two-step approach to retrieve tokens:

An API key can be generated in the IBM watsonx web client > Profile and settings > API key
A token to run inference on the model can be generated as follows (example in Python):

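import requests

# API key generated in the IBM watsonx web client (placeholder value)
apikey = "YOUR_IBM_CLOUD_API_KEY"

token_url = "https://iam.cloud.ibm.com/identity/token"
# Passing a dict lets requests URL-encode the form fields
token_body = {
    "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
    "apikey": apikey,
}
token_headers = {
    "Content-Type": "application/x-www-form-urlencoded",
}

token_response = requests.post(
    token_url,
    headers=token_headers,
    data=token_body,
)

if token_response.status_code != 200:
    raise Exception("Non-200 response: " + str(token_response.text))

# Bearer token used to authenticate inference requests
bearer_token = token_response.json()["access_token"]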

Note

Be aware that the tokens expire after 1 hour.
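Since a token is only valid for one hour, code that calls the model repeatedly needs to request a new one in time. The following is a minimal sketch of a cache that refreshes the token a few minutes before it expires. `TokenCache`, `fetch_token`, and the five-minute margin are hypothetical choices for illustration; `fetch_token` stands for any function that performs the token request and returns the access token string.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Caches a bearer token and refreshes it shortly before it expires."""

    def __init__(
        self,
        fetch_token: Callable[[], str],
        lifetime_s: float = 3600.0,  # tokens expire after 1 hour
        margin_s: float = 300.0,     # assumed safety margin of 5 minutes
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_token = fetch_token
        self._lifetime_s = lifetime_s
        self._margin_s = margin_s
        self._clock = clock
        self._token: Optional[str] = None
        self._fetched_at = 0.0

    def get(self) -> str:
        """Return the cached token, fetching a new one when it is near expiry."""
        now = self._clock()
        age = now - self._fetched_at
        if self._token is None or age >= self._lifetime_s - self._margin_s:
            self._token = self._fetch_token()
            self._fetched_at = now
        return self._token
```

The clock is injectable so the refresh logic can be exercised without waiting an hour.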

Connecting watsonx in Deeploy

Connecting watsonx to Deeploy can be done either in the UI or locally with the Python client.

UI

Complete the following steps in the Deeploy UI:

Log in to the Deeploy UI
Navigate to the workspace where you want to add watsonx as an external deployment
Click Create > External
Follow the steps. In step 3, you are expected to add:
- The retrieved inference endpoint (e.g., https://eu-de.ml.cloud.ibm.com/ml/v1/text/generation?version=2023-05-29)
- The access token as a Bearer token
Once the connection check is successful, the deployment can be created.
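Before entering the endpoint and token in step 3, it can help to see what a Bearer-authenticated request to that endpoint looks like. The sketch below only assembles the request with the standard library. `build_authenticated_request` is a hypothetical helper, and the JSON body you pass in is an assumption about what the watsonx endpoint expects.

```python
import json
import urllib.request


def build_authenticated_request(
    url: str, bearer_token: str, payload: dict
) -> urllib.request.Request:
    """Assemble a POST request that carries the token as a Bearer header."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )
```

Passing the result to `urllib.request.urlopen` sends the request. An expired or malformed token would typically surface there as an HTTP error.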

Python client

Make sure you have authenticated with Deeploy as described in the Deeploy Python client documentation. After that, the following code snippet provides an example of how to add watsonx as an external model in Deeploy:

from deeploy import CreateExternalDeployment

# `client` is the authenticated Deeploy client created beforehand
create_options: CreateExternalDeployment = {
    "name": "IBM Watsonx Granite 13b chat",
    "description": "Created with Python client",
    "url": "https://eu-de.ml.cloud.ibm.com/ml/v1/text/generation?version=2023-05-29",
    # Bearer token retrieved from the IBM token endpoint
    "password": "ExampleBearerToken",
}

deployment = client.create_external_deployment(create_options)
