Version: 1.42

Create a feedback loop with evaluations

Deeploy is built on the belief that clear explanations of how deployed models reach their predictions are crucial, both for understanding them now and for referring back to them later. For that reason, every Deployment in Deeploy has an endpoint for collecting feedback from experts or end users. Feedback is requested and recorded per prediction, so each prediction can be evaluated individually.

Evaluate a prediction

This guide assumes you have already created a Deployment and made inference calls through the UI, API, or Python client. Predictions cannot be evaluated in the Deeploy UI; use the Deeploy API or Python client to add evaluations instead. See Collecting evaluations example for an example interface for collecting evaluations in your own application.

tip

You need the request log ID and prediction log ID to connect an evaluation to a prediction. Both IDs can be found in the output of your inference call.
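As a minimal sketch of how the two IDs tie an evaluation to a prediction, the snippet below builds (but does not send) an HTTP request using only the Python standard library. The URL layout, the body fields `agree`, `desired_output`, and `comment`, and the bearer-token header are assumptions made for illustration, not the documented Deeploy API; check the Deeploy API or Python client reference for the actual path and schema.

```python
import json
from urllib import request


def build_evaluation_request(base_url, request_log_id, prediction_log_id,
                             token, agree, desired_output=None, comment=None):
    """Build (without sending) a request that attaches an evaluation to one prediction."""
    # Hypothetical URL layout: both IDs from the inference output identify the prediction.
    url = (
        f"{base_url.rstrip('/')}/requestLogs/{request_log_id}"
        f"/predictionLogs/{prediction_log_id}/evaluations"
    )
    # Hypothetical body fields: the verdict, plus the desired response and an
    # explanation when the evaluator disagrees with the prediction.
    body = {"agree": agree}
    if not agree and desired_output is not None:
        body["desired_output"] = desired_output
    if comment:
        body["comment"] = comment
    return request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed authentication scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; replace them with the IDs returned by your inference call.
req = build_evaluation_request(
    base_url="https://api.example.com/deployments/DEPLOYMENT_ID",
    request_log_id="REQUEST_LOG_ID",
    prediction_log_id="PREDICTION_LOG_ID",
    token="DEPLOYMENT_TOKEN",
    agree=False,
    desired_output={"predictions": [1]},
    comment="Applicant income was entered incorrectly.",
)
print(req.get_method(), req.full_url)
# To actually submit the evaluation: request.urlopen(req)
```

Building the request separately from sending it makes the payload easy to inspect or log before anything reaches the Deployment.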

Update an evaluation

You can change the evaluation of a prediction after it has been submitted. For instructions, consult the Deeploy API or Python client documentation mentioned above.

Monitor evaluations

Monitor evaluations for a Deployment on the Evaluation tab of your Deployment’s Monitoring page. Specifically, you can monitor:

  • Disagreement ratio: Assess the extent to which evaluators disagree with the predictions made by the model.
  • Disagreement per class: Identify the outcome classes where the most disagreements occur.
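To make the two metrics concrete, here is a small sketch that computes them from a list of evaluation records. It assumes the disagreement ratio is the number of disagreeing evaluations divided by the total number of evaluations, and that disagreements are grouped by the predicted class; the record fields `predicted_class` and `agree` are hypothetical, and Deeploy's own calculation may differ.

```python
from collections import Counter

# Hypothetical evaluation records: the class the model predicted and
# whether the evaluator agreed with that prediction.
evaluations = [
    {"predicted_class": "approve", "agree": True},
    {"predicted_class": "approve", "agree": False},
    {"predicted_class": "reject", "agree": True},
    {"predicted_class": "approve", "agree": False},
]


def disagreement_ratio(records):
    """Share of evaluations in which the evaluator disagreed with the model."""
    if not records:
        return 0.0  # no evaluations yet, so nothing to disagree with
    disagreements = sum(1 for record in records if not record["agree"])
    return disagreements / len(records)


def disagreement_per_class(records):
    """Count disagreements for each predicted class."""
    return Counter(
        record["predicted_class"] for record in records if not record["agree"]
    )


ratio = disagreement_ratio(evaluations)
per_class = disagreement_per_class(evaluations)
print(f"Disagreement ratio: {ratio:.2f}")
print(f"Disagreements per class: {dict(per_class)}")
```

In this sample, two of the four evaluations disagree with the model and both concern the "approve" class, which is the kind of concentration the per-class view is meant to surface.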

Read monitoring a Deployment for more details.

View evaluation explanations

View the evaluation of a specific prediction by clicking on a prediction on the Predictions page within a Deployment. Scroll down to view:

  • the token used for the evaluation
  • the desired response
  • the explanation

Collecting evaluations example

An example interface for collecting evaluations in your own application UI is available in the Credit Scoring Hugging Face space. To use this space, deploy the Scikit-Learn Census example. Get the base API link from the Details page in your Deployment, and add a token on the Tokens page. Make sure to allow evaluations with this token.

In the Hugging Face space, add the base API in the model URL section, and your token in the Deeploy API token section. Click Send load application to view a prediction, explanation, and submit an evaluation. Go back to Deeploy to view the evaluation on the Predictions page in your Deployment.