Azure cloud resources
We advise running Deeploy on the following managed Azure services:
- Azure Kubernetes Service: Managed Kubernetes to run the Deeploy software and AI/ML deployments
- Database for PostgreSQL: Managed database to store application and AI/ML deployment data
- Blob storage: Object storage to store repository and model files and artifacts
- Key Vault: Managed key store used to encrypt sensitive data stored in the database
The following shows a high-level reference architecture for a basic Deeploy installation:
For more guidance on the networking setup, see the Networking section.
For production workloads, we advise managing the cloud resources as code. Azure providers are available to manage your infrastructure as code; see the Terraform documentation for the Terraform/OpenTofu providers.
If you are planning to use Deeploy in combination with the Azure marketplace, more guidance can be found here.
AKS
We suggest using a managed, stateless Kubernetes cluster: AKS.
To set up AKS for your Deeploy installation, follow the steps outlined in the AKS guide. However, keep in mind the following specific considerations:
- For normal usage, Deeploy requires approximately 3 medium nodes; minimal node requirements: 3 (v)CPUs and 6 GB RAM.
- Kubernetes version: we advise using only versions under standard support to prevent extra costs.
- We suggest using cluster autoscaling. This prevents running into resource limits, but take into account that it also results in variable costs.
- If you want to use managed identity authentication in AKS, make sure you have enabled the OIDC provider for your cluster, have a federated identity credential in place, and have the following role assignment in place for your managed identity:
  - scope: `/subscriptions/<subscription_id>/resourceGroups/<resource_group_name>`
  - role definition name: `Managed Identity Operator`
- If you want to allow Kubernetes to dynamically create a load balancer, make sure you have the following role assignment in place for your managed identity:
  - scope: `/subscriptions/<subscription_id>/resourceGroups/<resource_group_name>`
  - role definition name: `Network Contributor`
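The role assignments above can be sketched with the Azure CLI; the subscription id, resource group, and client id values are placeholders to fill in:

```shell
# Sketch: grant the managed identity the roles listed above (placeholder values)
SUBSCRIPTION_ID="<subscription_id>"
RESOURCE_GROUP="<resource_group_name>"
IDENTITY_CLIENT_ID="<managed_identity_client_id>"
SCOPE="/subscriptions/${SUBSCRIPTION_ID}/resourceGroups/${RESOURCE_GROUP}"

# Required for managed identity authentication in AKS
az role assignment create --assignee "${IDENTITY_CLIENT_ID}" \
  --role "Managed Identity Operator" --scope "${SCOPE}"

# Required if Kubernetes should dynamically create a load balancer
az role assignment create --assignee "${IDENTITY_CLIENT_ID}" \
  --role "Network Contributor" --scope "${SCOPE}"
```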
GPU support
For GPU support we recommend attaching to your AKS cluster an autoscaling node pool, with your preferred GPU node type, that can scale to 0. Alternatively, use Karpenter as a flexible scheduler. To make sure no other pods will be scheduled on GPU nodes, you can add the following taint to your node pool:
```yaml
taints:
  - key: nvidia.com/gpu
    value: present
    effect: NoSchedule
```
When specifying a GPU node for a Deeploy deployment, the `nvidia.com/gpu` label will be applied automatically.
To make sure you can select the nodes that are scaled to 0 in Deeploy, add a list of node types to the `values.yaml` file when you install the Deeploy Helm chart.
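As an illustration, such a node type list in `values.yaml` could look like the fragment below; the key name `gpuInstanceTypes` is hypothetical, and the actual key depends on your Deeploy chart version:

```yaml
# Hypothetical values.yaml fragment -- check the chart documentation for the real key
gpuInstanceTypes:
  - Standard_NC4as_T4_v3
  - Standard_NC6s_v3
```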
Read more about our NVIDIA integration here.
Database for PostgreSQL
To set up a PostgreSQL database for your Deeploy installation, follow the steps outlined in this guide. Take into account the following considerations:
- Consider enabling storage autogrow to prevent manual interventions and accommodate the amount of data increasing over time.
- Align the network configuration of the PostgreSQL database with the AKS cluster (same VNet). This will allow for data transfers over the internal Azure network.
- Implement best practices for backing up and restoring data at any point in time, as described in this article.
- Create a separate user with admin rights only on the required databases (`deeploy` and `deeploy_kratos`). Save the user credentials to use in the Helm `values.yaml` file that you use in the Deeploy Helm installation.
Database configuration
Make sure that the PostgreSQL database server has the following two databases: `deeploy` and `deeploy_kratos`.
- A single user should have administrative rights on both databases.
- The databases should have at least one (public) database schema.
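The database setup above can be sketched with `psql`; the server name, admin user, and the `deeploy_admin` role name below are placeholders/assumptions to adapt to your environment:

```shell
# Sketch: create the two required databases and a user with rights on both (placeholders)
psql "host=<server_name>.postgres.database.azure.com port=5432 dbname=postgres user=<admin_user> sslmode=require" <<'SQL'
CREATE DATABASE deeploy;
CREATE DATABASE deeploy_kratos;
CREATE ROLE deeploy_admin WITH LOGIN PASSWORD '<password>';
GRANT ALL PRIVILEGES ON DATABASE deeploy TO deeploy_admin;
GRANT ALL PRIVILEGES ON DATABASE deeploy_kratos TO deeploy_admin;
SQL
```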
Blob storage
To set up a storage blob, use the following guide. Keep in mind the following considerations:
- We suggest using a single user-assigned managed identity to access the blob storage container from the AKS cluster using OIDC federation, with the following role assignment:
  - scope: `/subscriptions/<subscription_id>/resourceGroups/<resource_group_name>`
  - role definition name: `Storage Blob Data Contributor`
- Allow AKS pods to assume your role. By providing the client_id of your managed identity as the YAML key `objectStorage.azure.workloadIdentityAnnotation.azure\.workload\.identity/client-id` in the Deeploy values during the installation, the relevant Kubernetes service accounts will be automatically annotated. Moreover, make sure `objectStorage.azure.useWorkloadIdentity` is set to `true`.
- Create a Blob storage private link for your VNet. This will allow for data transfers over the internal Azure network.
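Combining the settings above, a minimal `values.yaml` sketch for the blob storage configuration could look like this (the client id is a placeholder):

```yaml
objectStorage:
  azure:
    useWorkloadIdentity: true
    workloadIdentityAnnotation:
      azure.workload.identity/client-id: "<managed_identity_client_id>"
```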
Key Vault
To set up a Key Vault key, use the following guide. Keep in mind the following considerations:
- We suggest using a single user assigned managed identity to access the key from the AKS cluster using OIDC federation, with the following role assignments:
  - scope: `<key_vault_id>/keys/<key_name>`
  - role definition name: `Key Vault Crypto User`
- Allow AKS pods to assume your role. By providing the client_id of your managed identity for the YAML key `objectStorage.azure.workloadIdentityAnnotation.azure\.workload\.identity/client-id` in the Deeploy values during the installation, the relevant Kubernetes service accounts will be automatically annotated. Moreover, make sure `security.keyManagement.azure.useWorkloadIdentity` is set to `true`.
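A minimal `values.yaml` sketch for the Key Vault setting above:

```yaml
# Workload identity annotation is configured as described above;
# this flag enables OIDC federation for key management.
security:
  keyManagement:
    azure:
      useWorkloadIdentity: true
```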
Networking
We suggest becoming familiar with Azure-specific networking as described in this guide; this allows you to create a secure and cost-efficient environment.
Security considerations
We advise enabling Azure AD to handle Identity and Access Management functions. Therefore it is necessary to register Deeploy with an Azure AD tenant. The following Azure CLI command can be used to create a Service Principal with access to the Azure resources:

az ad sp create-for-rbac -n deeploy --skip-assignment

Once the creation of the Service Principal is complete, a set of attributes, such as the tenant id, client id (app id), and client secret (password), will be returned. These values are important and should be recorded, as they will be required later during the installation of the Deeploy software using the Helm chart.

For Basic tier databases, it is not possible to whitelist the subnet where the AKS nodes are located. In this case, a firewall rule needs to be created on the DB server that whitelists the IP range encompassing all IPs of the AKS cluster nodes. The following steps need to be taken:
- Obtain the IP range from the VNet of the AKS cluster nodes.
- Add this IP range as a firewall rule on the DB server.
For higher database tiers, it is recommended to create a VNet rule on the DB server that allows access from the subnet the AKS nodes are in. The following steps need to be taken:
- Create a `Microsoft.Sql` Service Endpoint in the VNet of the AKS cluster nodes.
- Add the Service Endpoint created above as a firewall rule on the DB server.
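These steps can be sketched with the Azure CLI; all resource names are placeholders, and `az postgres server vnet-rule` assumes a single server deployment:

```shell
# Sketch: enable a Microsoft.Sql service endpoint on the AKS subnet (placeholders)
az network vnet subnet update \
  --resource-group "<resource_group_name>" \
  --vnet-name "<vnet_name>" \
  --name "<aks_subnet_name>" \
  --service-endpoints Microsoft.Sql

# Allow that subnet on the PostgreSQL server via a VNet rule
az postgres server vnet-rule create \
  --resource-group "<resource_group_name>" \
  --server-name "<db_server_name>" \
  --name allow-aks-subnet \
  --vnet-name "<vnet_name>" \
  --subnet "<aks_subnet_name>"
```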
Setting up Defender for Cloud can help proactively prevent security incidents in the cloud-based applications that you manage.
We advise enabling TLS in Azure Database for PostgreSQL.
Resource health
We suggest using Azure Monitor and Azure alerts for monitoring and alerting related to resource health. If you suspect an issue with your Azure cloud resources, check out the Azure Service Health dashboard.
Estimation of costs
In addition to the Deeploy license costs, Azure will bill you for the cloud resources. Check the expected costs with the Azure pricing calculator.
Next steps
- Get SMTP credentials
- Configure DNS and TLS
- Helm install Deeploy