Version: Cloud

Creating Hugging Face Deployments

Deploying Hugging Face models generally follows the steps outlined in Create a Deployment. However, there are additional configuration steps specific to Hugging Face models. In this article, we will highlight only the parts unique to Hugging Face Deployments.

Beware

Hugging Face model deployments are still experimental.

Prerequisites

  • You added a Repository that adheres to the requirements.

  • You created a reference.json file for the model in one of the following two formats.

    • Include a Blob URL reference, as illustrated in this example:

      {
        "reference": {
          "blob": {
            "url": "s3://path-to-model"
          }
        }
      }

    • Include a Hugging Face model ID for an open-source model that does not require authorization:

      {
        "reference": {
          "huggingface": {
            "model": "bigscience/bloom-560m"
          }
        }
      }

Note

Currently, private models and models that require approval can be deployed only through a Blob URL reference.
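
The two reference.json formats described above can also be generated programmatically. The following is a minimal Python sketch, not part of the product: the helper names `build_reference` and `write_reference` are hypothetical, and it assumes the file needs only the keys shown in the examples above.

```python
import json
from pathlib import Path


def build_reference(blob_url=None, model_id=None):
    """Return a reference.json payload in one of the two documented formats.

    Hypothetical helper: exactly one of blob_url or model_id must be given.
    """
    if (blob_url is None) == (model_id is None):
        raise ValueError("Provide exactly one of blob_url or model_id")
    if blob_url is not None:
        # Blob format: needed for private models or models that require approval.
        return {"reference": {"blob": {"url": blob_url}}}
    # Hugging Face format: open-source models that need no authorization.
    return {"reference": {"huggingface": {"model": model_id}}}


def write_reference(path, blob_url=None, model_id=None):
    """Write the payload to `path` as strictly valid JSON (no trailing commas)."""
    payload = build_reference(blob_url=blob_url, model_id=model_id)
    Path(path).write_text(json.dumps(payload, indent=2) + "\n", encoding="utf-8")
    return payload


# Preview the Hugging Face variant without writing a file.
print(json.dumps(build_reference(model_id="bigscience/bloom-560m"), indent=2))
```

Calling `write_reference("reference.json", blob_url="s3://path-to-model")` would produce the Blob variant instead. Serializing with `json.dumps` rather than writing the file by hand avoids syntax slips such as trailing commas, which strict JSON parsers reject.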

The deployment steps follow the usual approach, with a few options specific to Hugging Face models. These are listed below.

Model

Select Hugging Face in the Model framework dropdown.

Explainer

For text-to-text generation and text generation models, we provide two standard explainers: saliency and attention. See saliency and attention in standard explainers. To use one of these explainers, select Standard explainer and choose an option in the Explainer framework dropdown.