Creating an external deployment with IBM Watsonx
An IBM Watsonx deployment can be created using the External Deployment type. In this article, we will highlight how to integrate IBM Watsonx Granite 13 Billion chat V2 in Deeploy.
Prerequisites
Create a deployment using watsonx
We will use the granite-13b-chat-v2 foundation model that is developed by IBM Research and only available on Watsonx. In order to create an external deployment in Deeploy we need the following:
- The model endpoint
- An access token to communicate with the model
The foundation model inference endpoint
In IBM watsonx it is possible to use one endpoint to run inference on multiple different foundation models, so the endpoint is not model specific:
- https://{cloud_url}/ml/v1/text/generation?version=2023-05-29, for example https://eu-de.ml.cloud.ibm.com/ml/v1/text/generation?version=2023-05-29. For more information, see the IBM watsonx documentation.
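Because the endpoint is shared, the model is selected per request rather than per URL. The following is a minimal sketch of how the endpoint and a request body could be assembled. The helper names are hypothetical, and the body fields (`model_id`, `input`, `project_id`) and the `ibm/granite-13b-chat-v2` identifier are assumptions that should be checked against the IBM watsonx API reference.

```python
from urllib.parse import urlencode

API_VERSION = "2023-05-29"  # version date used in the endpoint above


def generation_endpoint(cloud_url: str, version: str = API_VERSION) -> str:
    """Build the shared text-generation endpoint for a watsonx host."""
    query = urlencode({"version": version})
    return f"https://{cloud_url}/ml/v1/text/generation?{query}"


def generation_payload(model_id: str, prompt: str, project_id: str) -> dict:
    """Build a request body; the field names are assumptions, not confirmed here."""
    return {
        "model_id": model_id,      # selects the foundation model
        "input": prompt,           # the prompt text
        "project_id": project_id,  # placeholder watsonx project identifier
    }


endpoint = generation_endpoint("eu-de.ml.cloud.ibm.com")
payload = generation_payload("ibm/granite-13b-chat-v2", "Hello", "YOUR_PROJECT_ID")
```

Only `cloud_url` changes between regions, while the model choice travels in the request body.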
Retrieving an access token
IBM watsonx uses the following two-step approach to retrieve tokens:
- An API Key can be generated in the IBM watsonx web client > Profile and settings > API key
- A token to run inference on the model can be generated as follows (example in Python):
import requests

# API key generated in the IBM watsonx web client (placeholder value)
apikey = "YOUR_IBM_CLOUD_API_KEY"

token_url = "https://iam.cloud.ibm.com/identity/token"
# Form-encoded body that exchanges the API key for a token
token_body = f"grant_type=urn:ibm:params:oauth:grant-type:apikey&apikey={apikey}"
token_headers = {
    "Content-Type": "application/x-www-form-urlencoded",
}
token_response = requests.post(
    token_url,
    headers=token_headers,
    data=token_body,
)
if token_response.status_code != 200:
    raise Exception("Non-200 response: " + str(token_response.text))
bearer_token = token_response.json()["access_token"]
Be aware that tokens expire after 1 hour.
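Since tokens are short-lived, it can help to cache a token and request a new one shortly before it expires. The sketch below illustrates one way to do that. The `TokenCache` class and the `fetch` callable are hypothetical helpers, and reading an `expires_in` field from the token response is an assumption; if the field is absent, the code falls back to the 1-hour lifetime mentioned above. A fake fetch function and a fake clock are used so the example runs without network access.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Caches a bearer token and refreshes it shortly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], dict],
        margin_seconds: float = 60,
        clock: Callable[[], float] = time.time,
    ):
        self._fetch = fetch            # returns the parsed JSON token response
        self._margin = margin_seconds  # refresh this long before expiry
        self._clock = clock            # injectable for testing
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            response = self._fetch()
            self._token = response["access_token"]
            # Assumed field; fall back to 1 hour if it is missing
            self._expires_at = now + float(response.get("expires_in", 3600))
        return self._token


# Demonstration with a fake fetch function and a controllable clock
calls = []


def fake_fetch() -> dict:
    calls.append(1)
    return {"access_token": f"token-{len(calls)}", "expires_in": 3600}


fake_now = [0.0]
cache = TokenCache(fake_fetch, clock=lambda: fake_now[0])
first = cache.get()   # fetches a new token
second = cache.get()  # served from the cache
fake_now[0] = 3600.0  # one hour later
third = cache.get()   # fetches again because the token has expired
```

In real use, `fetch` would wrap the token request shown above and return `token_response.json()`.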
Connecting watsonx in Deeploy
Connecting watsonx to Deeploy can be done either in the UI or locally with the Python client:
- UI
- Python client
Complete the following steps in the Deeploy UI:
- Log in to the Deeploy UI
- Navigate to the workspace where you want to add watsonx as an external deployment
- Click Create > External
- Follow the steps. In step 3 you are expected to add:
  - The retrieved inference endpoint (e.g., https://eu-de.ml.cloud.ibm.com/ml/v1/text/generation?version=2023-05-29)
  - The access token as a Bearer token
- Once the connection check is successful, the deployment can be created.
To use the Python client instead, make sure you have authenticated with Deeploy as described in the Deeploy Python client documentation. After that, the following code snippet shows how to add watsonx as an external model in Deeploy:
from deeploy import CreateExternalDeployment

create_options: CreateExternalDeployment = {
    "name": "IBM Watsonx Granite 13b chat",
    "description": "Created with Python client",
    "url": "https://eu-de.ml.cloud.ibm.com/ml/v1/text/generation?version=2023-05-29",
    # The bearer token retrieved from IBM watsonx
    "password": "ExampleBearerToken",
}
# `client` is the authenticated Deeploy client created beforehand
deployment = client.create_external_deployment(create_options)