Creating Hugging Face Deployments
Deploying Hugging Face models generally follows the steps outlined in Create a Deployment. However, there are additional configuration steps specific to Hugging Face models. In this article, we will highlight only the parts unique to Hugging Face Deployments.
Hugging Face model deployments are still experimental.
Prerequisites
You have added a Repository that adheres to the requirements.
Create a reference.json file for the model in one of the following two formats.
Include a Blob URL reference, as illustrated in this example:
{
  "reference": {
    "blob": {
      "url": "s3://path-to-model"
    }
  }
}
Include a Hugging Face model ID for an open-source model that does not require authorization, as illustrated in this example:
{
  "reference": {
    "huggingface": {
      "model": "bigscience/bloom-560m"
    }
  }
}
Currently, private models and models that require approval can only be deployed using a Blob reference.
The deployment steps follow the usual approach, with a few options specific to Hugging Face models. These are listed below.
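The two reference.json formats above can also be generated with a short script, which avoids hand-written JSON mistakes such as trailing commas. The following is a minimal sketch, not part of the platform: the helper names blob_reference, huggingface_reference and write_reference are hypothetical, and only the JSON keys are taken from the examples above.

```python
import json
from pathlib import Path


def blob_reference(url: str) -> dict:
    """Build a reference that points at a model stored in blob storage."""
    return {"reference": {"blob": {"url": url}}}


def huggingface_reference(model_id: str) -> dict:
    """Build a reference that points at a public Hugging Face model ID."""
    return {"reference": {"huggingface": {"model": model_id}}}


def write_reference(reference: dict, directory: str = ".") -> Path:
    """Write the reference as valid JSON to <directory>/reference.json."""
    path = Path(directory) / "reference.json"
    # json.dumps never emits trailing commas, so the file always parses.
    path.write_text(json.dumps(reference, indent=2) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    # Hypothetical usage: write a Hugging Face reference into the current directory.
    written = write_reference(huggingface_reference("bigscience/bloom-560m"))
    print(f"Wrote {written}")
```

For a private model or a model that requires approval, the same sketch would be called with blob_reference("s3://path-to-model") instead, keeping the placeholder URL from the example until the real location is known.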
Model
Select Hugging Face in the Model framework dropdown.
Explainer
For text-to-text generation and text generation models, we provide two standard explainers: saliency and attention. See saliency and attention in standard explainers. To use these explainers, select Standard explainer and choose an option in the Explainer framework dropdown.