Version: 1.46

Creating an external deployment with IBM Watsonx

Create an IBM watsonx Deployment using the External Deployment type. This article shows how to integrate the IBM watsonx Granite 13 Billion chat V2 model in Deeploy.

Prerequisites

Create a Deployment using watsonx

To illustrate the process, we use the granite-13b-chat-v2 foundation model, developed by IBM Research and available exclusively on watsonx. The following information is required to create an external Deployment:

  • The model endpoint
  • An access token to communicate with the model

The foundation model inference endpoint

In IBM Watsonx, a single endpoint is used to perform inference on multiple different foundation models, making the endpoint model-agnostic: https://{cloud_url}/ml/v1/text/generation?version=2023-05-29

For example, https://eu-de.ml.cloud.ibm.com/ml/v1/text/generation?version=2023-05-29. For more information, refer to the IBM watsonx API documentation.
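Because the endpoint is model-agnostic, the target model is selected in the request body rather than in the URL. A minimal sketch of such a body for granite-13b-chat-v2 (the project_id value is a placeholder and the generation parameters are illustrative, not required values):

```python
# Sketch of a text-generation request body for the model-agnostic endpoint.
# "model_id" selects the foundation model; "project_id" is a placeholder
# for your own watsonx project.
payload = {
    "model_id": "ibm/granite-13b-chat-v2",
    "input": "What is the capital of the Netherlands?",
    "parameters": {
        "decoding_method": "greedy",
        "max_new_tokens": 100,
    },
    "project_id": "<your-watsonx-project-id>",
}
```

The same endpoint accepts any supported foundation model; only the model_id value changes.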

Retrieving an access token

IBM watsonx uses the following approach to retrieve short-lived access tokens:

  1. Generate an API key in the IBM watsonx web client > Profile and settings > API key
  2. Generate an access token using the API key (example in Python):

import requests

apikey = "<your IBM Cloud API key>"  # generated in step 1

# Exchange the API key for a bearer token via the IBM Cloud IAM endpoint
token_url = "https://iam.cloud.ibm.com/identity/token"
token_body = f"grant_type=urn:ibm:params:oauth:grant-type:apikey&apikey={apikey}"
token_headers = {
    "Content-Type": "application/x-www-form-urlencoded",
}

token_response = requests.post(
    token_url,
    headers=token_headers,
    data=token_body,
)

if token_response.status_code != 200:
    raise Exception("Non-200 response: " + str(token_response.text))

bearer_token = token_response.json()["access_token"]
info: Tokens expire after 1 hour.

Connecting watsonx in Deeploy

Connecting watsonx to Deeploy can be done both in the UI and locally with the Python client.

Complete the following steps in the Deeploy UI

  1. Log in
  2. Navigate to the Workspace where you want to add watsonx as an external Deployment
  3. Click Create and select External
  4. Follow the steps; in step 3 you are expected to add:
    • The retrieved inference endpoint (e.g., https://eu-de.ml.cloud.ibm.com/ml/v1/text/generation?version=2023-05-29)
    • The access token as a Bearer token
  5. Once the connection check is successful, continue and complete the Deployment creation process
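The connection check boils down to an authenticated request against the inference endpoint, which you can also reproduce locally before creating the Deployment. A minimal sketch that assembles the pieces from the previous sections (the helper name is our own; the token and project ID are placeholders):

```python
def build_request(endpoint, bearer_token, payload):
    """Assembles the URL, headers, and body for an authenticated
    watsonx inference call, as configured in the Deeploy UI."""
    headers = {
        "Authorization": f"Bearer {bearer_token}",
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    return {"url": endpoint, "headers": headers, "json": payload}


request = build_request(
    "https://eu-de.ml.cloud.ibm.com/ml/v1/text/generation?version=2023-05-29",
    "<your-bearer-token>",
    {
        "model_id": "ibm/granite-13b-chat-v2",
        "input": "Hello",
        "project_id": "<your-watsonx-project-id>",
    },
)
```

Passing the result to requests.post(**request) should return HTTP 200 when the endpoint and token are valid, which is the same condition Deeploy's connection check verifies.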