Model Inference
Inference methods
After deploying your model (see creating a Deployment), you can run inference against it in several ways. First verify that everything works on the test page of your Deployment. Then integrate it into your pipelines or applications using Deeploy's Python Client or the REST API.
Authentication
Inference requests to your Deployment must be authenticated. Authenticate using the Python Client, or authenticate to the REST API in one of two ways:
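- Basic auth: create a personal key pair on your account page. Base64 encode the access key and secret key, separated by a colon (the -n flag keeps echo from appending a newline, which would otherwise end up in the encoded string):
  echo -n "access_key:secret_key" | base64
  Use the Base64 encoded string as the Authorization header, e.g. Authorization: Basic <base64 encoded key>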
- Bearer token: create a Deployment token on your Deployment's Token page. Use the token as the Authorization header, e.g. Authorization: Bearer <token>
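The two header formats above can be sketched in Python using only the standard library. This is a minimal illustration, not part of Deeploy's Python Client: the function names and the placeholder credentials are assumptions made for this example.

```python
import base64


def basic_auth_header(access_key: str, secret_key: str) -> dict:
    """Build the Authorization header for a personal key pair (hypothetical helper)."""
    # Join the keys with a colon and Base64 encode the result, without a trailing newline.
    raw = f"{access_key}:{secret_key}".encode("utf-8")
    encoded = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}


def bearer_auth_header(token: str) -> dict:
    """Build the Authorization header for a Deployment token (hypothetical helper)."""
    return {"Authorization": f"Bearer {token}"}


# Placeholder credentials, for illustration only.
print(basic_auth_header("access_key", "secret_key"))
print(bearer_auth_header("my-deployment-token"))
```

Either dictionary can be passed as the headers of an HTTP request to the REST API.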
Accepted content types
Your models and explainers can accept various content types. Most standard model frameworks only accept JSON input (application/json). With custom Docker artifacts, however, you can accept more content types, e.g. an image or PDF. Deeploy accepts the following content types: 'application/json', 'application/pdf', 'application/octet-stream', 'image/*', 'text/*', 'binary/octet-stream'. Make sure to set the correct content type header when doing inference. If no header is set, it defaults to application/json. If your model or explainer returns a type other than application/json, the response to your request will be in binary format.
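The content type rules above can be checked on the client side before sending a request. The sketch below is an assumption about how such a check could look, not Deeploy's own validation: the helper names are hypothetical, and treating 'image/*' and 'text/*' as wildcard patterns is an interpretation of the list.

```python
import json
from fnmatch import fnmatchcase
from typing import Optional

# Content types listed in the text; "image/*" and "text/*" are treated as wildcards.
ACCEPTED_CONTENT_TYPES = [
    "application/json",
    "application/pdf",
    "application/octet-stream",
    "image/*",
    "text/*",
    "binary/octet-stream",
]
DEFAULT_CONTENT_TYPE = "application/json"


def media_type(header_value: Optional[str]) -> str:
    """Return the bare media type, falling back to the default when no header is set."""
    if not header_value:
        return DEFAULT_CONTENT_TYPE
    # Drop parameters such as "; charset=utf-8" and normalize case.
    return header_value.split(";", 1)[0].strip().lower()


def is_accepted(content_type: Optional[str] = None) -> bool:
    """Check a request Content-Type header against the accepted list."""
    value = media_type(content_type)
    return any(fnmatchcase(value, pattern) for pattern in ACCEPTED_CONTENT_TYPES)


def parse_response(body: bytes, content_type: str):
    """Decode JSON responses; hand back the raw bytes for any other type."""
    if media_type(content_type) == "application/json":
        return json.loads(body)
    return body


print(is_accepted("image/png"))
print(is_accepted("audio/wav"))
```

Checking the response content type before decoding avoids trying to parse a binary payload, such as a returned image, as JSON.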
Request and Prediction logs
For each inference request, Deeploy creates a single request log and one or more prediction logs. The request is split into multiple prediction logs if the request body contains multiple entries in the instances or inputs array, provided that the predictions and explanations in the response body have the same length.
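To make the splitting rule concrete, the sketch below estimates how many prediction logs a request would produce. It is an illustration of the rule as described, not Deeploy's implementation. The function name is hypothetical, and three points are assumptions: "the same length" is read as matching the number of request entries, a request that is not split yields exactly one prediction log, and a predictions or explanations array that is absent from the response is ignored.

```python
def count_prediction_logs(request_body: dict, response_body: dict) -> int:
    """Estimate the number of prediction logs for one request (hypothetical helper)."""
    # The entries live in either the "instances" or the "inputs" array.
    entries = request_body.get("instances")
    if entries is None:
        entries = request_body.get("inputs")
    if not isinstance(entries, list) or len(entries) < 2:
        return 1

    # Only consider the response arrays that are actually present (assumption).
    present = [
        response_body[key]
        for key in ("predictions", "explanations")
        if key in response_body
    ]
    if not present:
        return 1

    # Split only when every present array lines up with the request entries.
    if all(isinstance(v, list) and len(v) == len(entries) for v in present):
        return len(entries)
    return 1


request = {"instances": [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]}
response = {"predictions": [0, 1, 0]}
print(count_prediction_logs(request, response))
```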
The prediction logs contain the request and response bodies. These are stored only when the content type is recognized as application/json. For other content types, the request body is stored as a binary file in object storage.