Configure a firewall for serverless compute access
This article describes how to configure an Azure storage firewall for serverless compute using the Azure Databricks account console UI. You can also use the Network Connectivity Configurations API.
To configure a private endpoint for serverless compute access, see Configure private connectivity from serverless compute.
Important
Starting December 4, 2024, Databricks will begin charging for networking costs on serverless workloads that connect to external resources. Billing will be implemented gradually, and you might not be charged until after December 4, 2024. You won’t be charged retroactively for usage before billing is enabled. After billing is enabled, you may be charged for:
- Private connectivity to your resources over Private Link. Data processing charges for private connectivity to your resources over Private Link are waived indefinitely. Per-hour charges will apply.
- Public connectivity to your resources over NAT gateway.
- Data transfer charges incurred, such as when serverless compute and the target resource are in different regions.
Overview of firewall enablement for serverless compute
Serverless network connectivity is managed with network connectivity configurations (NCCs). Account admins create NCCs in the account console, and an NCC can be attached to one or more workspaces.
An NCC contains a list of network identities for an Azure resource type as default rules. When an NCC is attached to a workspace, serverless compute in that workspace uses one of those networks to connect to the Azure resource. You can allowlist those networks on your Azure resource firewall. If you have non-storage Azure resource firewalls, contact your account team for information on how to use Azure Databricks stable NAT IPs.
NCC firewall enablement is supported from serverless SQL warehouses, jobs, notebooks, Delta Live Tables pipelines, and model serving CPU endpoints.
You can optionally configure network access to your workspace storage account from only authorized networks, including serverless compute. See Enable firewall support for your workspace storage account. When an NCC is attached to a workspace, its network rules are automatically added to the workspace storage account.
For more information on NCCs, see What is a network connectivity configuration (NCC)?.
Cost implications of cross-region storage access
The firewall applies only when the Azure resources are in the same region as the Azure Databricks workspace. For cross-region traffic from Azure Databricks serverless compute (for example, the workspace is in the East US region and the ADLS storage is in West Europe), Azure Databricks routes the traffic through an Azure NAT Gateway service.
Requirements
Your workspace must be on the Premium plan.
You must be an Azure Databricks account admin.
Each NCC can be attached to up to 50 workspaces.
Each Azure Databricks account can have up to 10 NCCs per region.
You must have WRITE access to your Azure storage account’s network rules.
Step 1: Create a network connectivity configuration and copy subnet IDs
Databricks recommends sharing NCCs among workspaces in the same business unit and among those that share the same region and connectivity properties. For example, if some workspaces use a storage firewall and other workspaces use the alternative approach of Private Link, use separate NCCs for those use cases.
- As an account admin, go to the account console.
- In the sidebar, click Cloud Resources.
- Click Network Connectivity Configuration.
- Click Add Network Connectivity Configurations.
- Type a name for the NCC.
- Choose the region. This must match your workspace region.
- Click Add.
- In the list of NCCs, click on your new NCC.
- In Default Rules under Network identities, click View all.
- In the dialog, click the Copy subnets button.
- Click Close.
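If you prefer to read the subnets programmatically rather than copy them from the dialog, the Network Connectivity Configurations API mentioned at the top of this article can return the same NCC. The following is a minimal sketch, not a definitive implementation. The account host, the endpoint path, and the environment variables DATABRICKS_ACCOUNT_HOST and DATABRICKS_TOKEN are assumptions that you should verify against the API reference before use.

```shell
#!/usr/bin/env bash
# Sketch only: the host, endpoint path, and variable names are assumptions.

# Fetch one NCC as JSON from the account-level API.
# Arguments: <account-id> <ncc-id>
get_ncc() {
  local account_id="$1" ncc_id="$2"
  # Hypothetical override; defaults to the Azure account console host.
  local host="${DATABRICKS_ACCOUNT_HOST:-https://accounts.azuredatabricks.net}"
  curl --silent --show-error --fail \
    --header "Authorization: Bearer ${DATABRICKS_TOKEN}" \
    "${host}/api/2.0/accounts/${account_id}/network-connectivity-configs/${ncc_id}"
}
```

Calling `get_ncc "<account-id>" "<ncc-id>"` prints the NCC as JSON. The subnet IDs are expected to appear under the default rules in that response; inspect the output to confirm the exact field path for your account.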
Step 2: Attach an NCC to workspaces
You can attach an NCC to up to 50 workspaces in the same region as the NCC.
To use the API to attach an NCC to a workspace, see the Account Workspaces API.
- In the account console sidebar, click Workspaces.
- Click your workspace’s name.
- Click Update workspace.
- In the Network Connectivity Config field, select your NCC. If it’s not visible, confirm that you’ve selected the same region for both the workspace and the NCC.
- Click Update.
- Wait 10 minutes for the change to take effect.
- Restart any running serverless compute resources in the workspace.
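The attachment in the steps above can also be scripted through the Account Workspaces API. The following sketch assumes that the workspace update endpoint accepts a `network_connectivity_config_id` field in a PATCH request. The host, path, field name, and the DATABRICKS_ACCOUNT_HOST and DATABRICKS_TOKEN variables are assumptions to check against the API reference.

```shell
#!/usr/bin/env bash
# Sketch only: endpoint path, field name, and variable names are assumptions.

# Attach an NCC to a workspace through the account-level API.
# Arguments: <account-id> <workspace-id> <ncc-id>
attach_ncc() {
  local account_id="$1" workspace_id="$2" ncc_id="$3"
  # Hypothetical override; defaults to the Azure account console host.
  local host="${DATABRICKS_ACCOUNT_HOST:-https://accounts.azuredatabricks.net}"
  curl --silent --show-error --fail \
    --request PATCH \
    --header "Authorization: Bearer ${DATABRICKS_TOKEN}" \
    --header "Content-Type: application/json" \
    --data "{\"network_connectivity_config_id\": \"${ncc_id}\"}" \
    "${host}/api/2.0/accounts/${account_id}/workspaces/${workspace_id}"
}
```

As with the UI flow, allow time for the change to take effect and restart running serverless compute resources afterward.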
If you are using this feature to connect to the workspace storage account, your configuration is complete. The network rules are automatically added to the workspace storage account. For additional storage accounts, continue to the next step.
Step 3: Lock down your storage account
If you haven’t already limited access to the Azure storage account to only allowlisted networks, do so now. You do not need to do this step for the workspace storage account.
Creating a storage firewall also affects connectivity from the classic compute plane to your resources. You must also add network rules to connect to your storage accounts from classic compute resources.
- Go to the Azure portal.
- Navigate to your storage account for the data source.
- In the left nav, click Networking.
- In the field Public network access, check the value. By default, the value is Enabled from all networks. Change this to Enabled from selected virtual networks and IP addresses.
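The same change can be made from the Azure CLI. The sketch below assumes that setting the storage account's default action to Deny corresponds to the portal option Enabled from selected virtual networks and IP addresses. The function name and its arguments are illustrative and not part of the original article.

```shell
#!/usr/bin/env bash
# Sketch only: restrict a storage account to explicitly allowed networks.
# Arguments: <resource-group> <storage-account-name>
restrict_storage_account() {
  local resource_group="$1" account="$2"
  # With a default action of Deny, only networks that have a matching
  # network rule can reach the storage account.
  az storage account update \
    --resource-group "$resource_group" \
    --name "$account" \
    --default-action Deny
}
```

Because this also blocks classic compute and any other client without a matching rule, add the rules those clients need before or immediately after running it.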
Step 4: Add Azure storage account network rules
You do not need to do this step for the workspace storage account.
Add one Azure storage account network rule for each subnet. You can do this using the Azure CLI, PowerShell, Terraform, or other automation tools. Note that this step cannot be done in the Azure portal user interface.
The following example uses the Azure CLI:
az storage account network-rule add --subscription "<sub>" \
  --resource-group "<res>" --account-name "<account>" --subnet "<subnet>"
- Replace <sub> with the name of your Azure subscription for the storage account.
- Replace <res> with the resource group of your storage account.
- Replace <account> with the name of your storage account.
- Replace <subnet> with the ARM resource ID (resourceId) of the serverless compute subnet.
After running all the commands, you can use the Azure portal to view your storage account and confirm that there is an entry in the Virtual Networks table that represents each new subnet. However, you cannot make the network rule changes in the Azure portal.
Tip
- When you add storage account network rules, use the Network Connectivity API to retrieve the latest subnets.
- Avoid storing NCC information locally.
- Ignore the mention of “Insufficient permissions” in the endpoint status column or the warning below the network list. These messages indicate only that you do not have permission to read the Azure Databricks subnets. They do not interfere with the ability of the Azure Databricks serverless subnets to contact your Azure storage.
Repeat the az storage account network-rule add command once for every subnet.
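Running the command once per subnet is easy to script. The following sketch loops over a text file with one subnet resource ID per line and adds a rule for each. The function name and the subnets file are hypothetical; the `az storage account network-rule add` flags are the ones used in this step.

```shell
#!/usr/bin/env bash
# Sketch only: add one storage account network rule per serverless subnet.
# Arguments: <subscription> <resource-group> <storage-account-name> <subnets-file>
# <subnets-file> is a hypothetical text file with one subnet resource ID per line.
add_ncc_subnet_rules() {
  local subscription="$1" resource_group="$2" account="$3" subnets_file="$4"
  local subnet
  # Read from file descriptor 3 so that az cannot consume the loop's input.
  while IFS= read -r subnet <&3 || [ -n "$subnet" ]; do
    # Skip blank lines.
    [ -z "$subnet" ] && continue
    az storage account network-rule add \
      --subscription "$subscription" \
      --resource-group "$resource_group" \
      --account-name "$account" \
      --subnet "$subnet" || return 1
  done 3< "$subnets_file"
}
```

For example, `add_ncc_subnet_rules "<sub>" "<res>" "<account>" subnets.txt` stops at the first failing rule and returns a non-zero status. In line with the tip above, regenerate the subnets file from the NCC each time rather than keeping a long-lived local copy.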
To confirm that your storage account uses these settings from the Azure portal, navigate to Networking in your storage account.
Confirm that the Public network access is set to Enabled from selected virtual networks and IP addresses and allowed networks are listed in the Virtual Networks section.
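The same confirmation can be done from the Azure CLI instead of the portal. This sketch prints the account's default action and the virtual network rules currently attached. The function name is illustrative, and the JMESPath query fields (`networkRuleSet.defaultAction` and `virtualNetworkRules[].virtualNetworkResourceId`) are assumptions about the CLI output shape that you should confirm against your own results.

```shell
#!/usr/bin/env bash
# Sketch only: print the firewall state of a storage account.
# Arguments: <resource-group> <storage-account-name>
show_firewall_state() {
  local resource_group="$1" account="$2"
  # Expect "Deny" when access is limited to selected networks.
  az storage account show \
    --resource-group "$resource_group" \
    --name "$account" \
    --query "networkRuleSet.defaultAction" \
    --output tsv
  # List the subnet resource IDs that currently have a network rule.
  az storage account network-rule list \
    --resource-group "$resource_group" \
    --account-name "$account" \
    --query "virtualNetworkRules[].virtualNetworkResourceId" \
    --output tsv
}
```

Compare the listed subnet IDs against the subnets from your NCC; every NCC subnet should appear in the output.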