Creating HuggingFace Deployments
To deploy Hugging Face models, follow the same create flow as for other deployments; however, a few settings are specific to Hugging Face models. This article highlights only the parts unique to Hugging Face deployments.
Hugging Face model deployments are still experimental.
Prerequisites
You have added a Repository that adheres to the requirements.
You can create a reference.json for the model in one of two formats.
Include a Blob URL reference, as illustrated in this example:
{
  "reference": {
    "blob": {
      "url": "s3://path-to-model"
    }
  }
}
Include a Hugging Face model_id for an open-source model that does not require authorization.
{
  "reference": {
    "huggingface": {
      "model": "bigscience/bloom-560m"
    }
  }
}
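As a convenience, the two reference.json formats above can be generated programmatically. The sketch below is only an illustration: the `make_reference` helper is hypothetical and not part of the product, but the key layout it emits matches the two formats shown above.

```python
import json

def make_reference(blob_url=None, model_id=None):
    """Build a reference.json payload from either a blob URL or a
    Hugging Face model id (exactly one must be given).

    Hypothetical helper for illustration only; the key layout follows
    the two documented formats."""
    if (blob_url is None) == (model_id is None):
        raise ValueError("provide exactly one of blob_url or model_id")
    if blob_url is not None:
        reference = {"blob": {"url": blob_url}}
    else:
        reference = {"huggingface": {"model": model_id}}
    return {"reference": reference}

# Write a reference.json for an open-source model.
with open("reference.json", "w") as f:
    json.dump(make_reference(model_id="bigscience/bloom-560m"), f, indent=2)
```

Passing `blob_url="s3://path-to-model"` instead would produce the blob variant shown in the first example.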
Currently, private models or models that require approval can only be deployed using a blob reference.
The deployment steps follow the usual approach, with a few options specific to Hugging Face models. These are listed below.
Model framework
In the model framework step, the huggingface option can be chosen from the dropdown.
Explainer Framework
For text-to-text generation and text generation models, we provide two standard explainers: saliency and attention. See saliency and attention under standard explainers.
These explainers can be chosen from the dropdown after selecting the standard explainers radio button.