Creating an external deployment with IBM Watsonx
Create an IBM watsonx Deployment using the External Deployment type. In this article, we highlight how to integrate the IBM watsonx Granite 13 Billion chat V2 model in Deeploy.
Prerequisites
Create a Deployment using watsonx
To illustrate the process, we will use the granite-13b-chat-v2 foundation model, developed by IBM Research and only available on watsonx. The following information is required to create an external Deployment:
- The model endpoint
- An access token to communicate with the model
The foundation model inference endpoint
In IBM Watsonx, a single endpoint is used to perform inference on multiple different foundation models, making the endpoint model-agnostic:
https://{cloud_url}/ml/v1/text/generation?version=2023-05-29
For example, https://eu-de.ml.cloud.ibm.com/ml/v1/text/generation?version=2023-05-29. For more information, refer to the IBM watsonx API documentation.
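Because the endpoint is model-agnostic, the target model is selected in the request body (via model_id) rather than in the URL. The sketch below builds such a request following the watsonx.ai text generation API; the project ID and prompt are placeholders you would replace with your own values:

```python
def build_generation_request(cloud_url, bearer_token, project_id, prompt):
    """Build a text generation request for the model-agnostic watsonx endpoint."""
    url = f"https://{cloud_url}/ml/v1/text/generation?version=2023-05-29"
    headers = {
        "Authorization": f"Bearer {bearer_token}",
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    # The model is chosen per request via model_id, not via the URL
    body = {
        "model_id": "ibm/granite-13b-chat-v2",
        "input": prompt,
        "parameters": {"max_new_tokens": 200},
        "project_id": project_id,
    }
    return url, headers, body

# Example; send with requests.post(url, headers=headers, json=body)
url, headers, body = build_generation_request(
    "eu-de.ml.cloud.ibm.com", "example-token", "example-project-id", "Hello"
)
```

Switching to another foundation model on the same endpoint only requires changing the model_id value.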
Retrieving an access token
IBM watsonx uses the following approach to retrieve short-lived access tokens:
- Generate an API key in the IBM watsonx web client > Profile and settings > API key
- Exchange the API key for a bearer token (example in Python):

import requests

apikey = "YOUR_API_KEY"  # the API key generated in the previous step

token_url = "https://iam.cloud.ibm.com/identity/token"
token_body = f"grant_type=urn:ibm:params:oauth:grant-type:apikey&apikey={apikey}"
token_headers = {
    "Content-Type": "application/x-www-form-urlencoded",
}

token_response = requests.post(
    token_url,
    headers=token_headers,
    data=token_body,
)
if token_response.status_code != 200:
    raise Exception("Non-200 response: " + str(token_response.text))

bearer_token = token_response.json()["access_token"]
Note that tokens expire after 1 hour.
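Because tokens expire after an hour, a long-running integration needs to refresh them periodically. One possible sketch is below; the fetch function is injected so the refresh logic stays testable, and it is assumed to return the parsed IAM response, which includes an expires_in field in seconds:

```python
import time

class TokenCache:
    """Cache an IAM bearer token and refresh it shortly before expiry."""

    def __init__(self, fetch_token, margin_seconds=60):
        # fetch_token returns a dict with 'access_token' and 'expires_in' keys
        self._fetch_token = fetch_token
        self._margin = margin_seconds
        self._token = None
        self._expires_at = 0.0

    def get(self):
        # Refresh when the token is missing or within the safety margin of expiry
        if self._token is None or time.time() >= self._expires_at - self._margin:
            response = self._fetch_token()
            self._token = response["access_token"]
            self._expires_at = time.time() + response["expires_in"]
        return self._token
```

Here, fetch_token would wrap the requests.post call shown above, so every caller transparently gets a valid token.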
Connecting watsonx in Deeploy
Connecting watsonx to Deeploy can be done either in the UI or locally with the Python client:
- UI
- Python client
Complete the following steps in the Deeploy UI:
- Log in
- Navigate to the Workspace where you want to add watsonx as an external Deployment
- Click Create and select External
- Follow the steps; in step 3 you are expected to add:
  - The retrieved inference endpoint (e.g., https://eu-de.ml.cloud.ibm.com/ml/v1/text/generation?version=2023-05-29)
  - The access token as a Bearer token
- Once the connection check is successful, continue and complete the Deployment creation process
Make sure you have authenticated with Deeploy as described in the Python Client documentation, and add watsonx as an external Deployment as illustrated in this example:
from deeploy import CreateExternalDeployment

create_options: CreateExternalDeployment = {
    "name": "IBM Watsonx Granite 13b chat",
    "description": "Created with Python client",
    "url": "https://eu-de.ml.cloud.ibm.com/ml/v1/text/generation?version=2023-05-29",
    "password": "ExampleBearerToken",
}
deployment = client.create_external_deployment(create_options)