I needed VPN access to Azure resources across multiple subscriptions. The requirement was simple: secure access without managing additional credentials, no PSKs floating around, and everything infrastructure-as-code. Here’s how I built it.
The Problem
The Azure environment spans many subscriptions with overlapping IP ranges - a legacy of growth. We had:
- SQL Managed Instances requiring private connectivity
- Public SQL databases needing IP whitelisting
- Development teams needing ad-hoc access
- Zero appetite for managing VPN credentials separately from Azure AD
Traditional VPN solutions would require:
- Separate credential management
- Manual user provisioning
- Shared secrets or certificate distribution
- Additional MFA implementation
We needed something better.
The Solution: Azure VPN Gateway + Terraform + OIDC
The architecture leverages three key components:
- Azure VPN Gateway with Azure AD authentication
- Terraform for complete infrastructure-as-code
- GitHub Actions OIDC for zero-credential deployments
Why Azure VPN Gateway Over Alternatives
We evaluated WireGuard on a VM ($10/month) versus Azure VPN Gateway ($140/month). The cost difference is real, but Azure VPN Gateway won because:
- Native Azure AD authentication (users authenticate with existing credentials)
- Built-in MFA enforcement via Conditional Access policies
- No certificate or PSK management
- Enterprise support and SLAs
- Automatic failover and redundancy
The “WireGuard is cheaper” argument falls apart when you factor in:
- Time spent managing user certificates
- Building Azure AD integration yourself
- Maintaining the VM and WireGuard updates
- On-call burden when it breaks at 2 AM
For a small engineering team, the labor cost of managing WireGuard exceeds the Azure service cost within the first month.
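To put rough numbers on that claim: the maintenance hours and hourly rate below are assumptions for illustration; only the $10 and $140 figures come from the comparison above.

```python
# Assumed: 5 hours/month of certificate, VM, and WireGuard upkeep at a
# $100/hour loaded engineer rate. Only the $10 VM and $140 gateway
# prices come from the actual comparison.
maintenance_hours = 5
hourly_rate = 100

wireguard_true_cost = 10 + maintenance_hours * hourly_rate
gateway_cost = 140

print(f"WireGuard: ${wireguard_true_cost}/month vs gateway: ${gateway_cost}/month")
assert wireguard_true_cost > gateway_cost
```

Even halving those assumptions, the managed gateway comes out ahead.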
Implementation: Terraform Configuration
Network Design
We created a transit VNet in non-overlapping space:
```hcl
locals {
  transit_vnet_cidr = "10.99.40.0/24"
  gateway_subnet    = "10.99.40.0/27" # Azure's recommended minimum for a VPN Gateway subnet
  vpn_client_pool   = "10.99.50.0/26" # 62 client IPs
  location          = "centralus"
}
```
Key decision: VPN client pool must be completely outside all Azure VNet ranges. Azure won’t let you use overlapping space for client addressing.
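Since an overlapping client pool is only rejected at gateway creation time (after a long deploy), it's worth sanity-checking the address plan up front. A minimal sketch with Python's ipaddress module, using the CIDRs from the locals above; the peered VNet ranges are illustrative:

```python
import ipaddress

# CIDRs from the Terraform locals; the last two VNet ranges are examples
vpn_client_pool = ipaddress.ip_network("10.99.50.0/26")
vnet_cidrs = [
    ipaddress.ip_network("10.99.40.0/24"),  # transit VNet
    ipaddress.ip_network("10.1.0.0/16"),    # example peered VNet
    ipaddress.ip_network("10.2.0.0/16"),    # example peered VNet
]

overlaps = [v for v in vnet_cidrs if vpn_client_pool.overlaps(v)]
assert not overlaps, f"client pool overlaps: {overlaps}"
print(f"{vpn_client_pool} is outside all {len(vnet_cidrs)} VNet ranges")
```

Running this in CI before terraform apply catches the mistake in seconds rather than 45 minutes.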
Azure AD Application Setup
The VPN Gateway authenticates users via an Azure AD application:
```hcl
resource "azuread_application" "vpn" {
  display_name    = "Azure VPN"
  identifier_uris = ["api://<Client_ID>"]

  api {
    oauth2_permission_scope {
      admin_consent_description  = "Allow the application to access Azure VPN on behalf of the signed-in user."
      admin_consent_display_name = "Access Azure VPN"
      id                         = "00000000-0000-0000-0000-000000000001"
      enabled                    = true
      type                       = "User"
      value                      = "user_impersonation"
    }
  }

  web {
    redirect_uris = ["https://portscan.atollo.net/"]
  }
}
```
The redirect URI https://portscan.atollo.net/ is Microsoft’s hardcoded callback for the Azure VPN Client. Yes, it looks weird. No, you can’t change it.
VPN Gateway Configuration
```hcl
resource "azurerm_virtual_network_gateway" "vpn" {
  name                = "eus-vng-clientvpn"
  location            = azurerm_resource_group.transit.location
  resource_group_name = azurerm_resource_group.transit.name
  type                = "Vpn"
  vpn_type            = "RouteBased"
  sku                 = "VpnGw1"

  ip_configuration {
    name                          = "vnetGatewayConfig"
    public_ip_address_id          = azurerm_public_ip.vpn_gateway.id
    private_ip_address_allocation = "Dynamic"
    subnet_id                     = azurerm_subnet.gateway.id
  }

  vpn_client_configuration {
    address_space        = [local.vpn_client_pool]
    vpn_client_protocols = ["OpenVPN"]
    aad_tenant           = "https://login.microsoftonline.com/${data.azurerm_client_config.current.tenant_id}/"
    aad_audience         = azuread_application.vpn.client_id
    aad_issuer           = "https://sts.windows.net/${data.azurerm_client_config.current.tenant_id}/"
  }
}
```
Provisioning takes 30-45 minutes. Plan accordingly.
Handling Multi-Subscription Complexity
Our infrastructure spans many Azure subscriptions. Terraform needs to peer VNets across all of them to the transit VNet.
Provider Aliases for Cross-Subscription Resources
```hcl
provider "azurerm" {
  features {}
}

provider "azurerm" {
  alias           = "data_prod"
  subscription_id = "<SUBSCRIPTION_ID_1>"
  features {}
}

provider "azurerm" {
  alias           = "infra_prod"
  subscription_id = "<SUBSCRIPTION_ID_2>"
  features {}
}
```
Each data source and peering resource targeting a different subscription needs the provider alias:
```hcl
data "azurerm_virtual_network" "prod2" {
  provider            = azurerm.infra_prod
  name                = "crds_prod2-vnet"
  resource_group_name = "crds_prod2"
}

resource "azurerm_virtual_network_peering" "prod2_to_transit" {
  provider                     = azurerm.infra_prod
  name                         = "peer-prod2-to-transit"
  resource_group_name          = data.azurerm_virtual_network.prod2.resource_group_name
  virtual_network_name         = data.azurerm_virtual_network.prod2.name
  remote_virtual_network_id    = azurerm_virtual_network.transit.id
  allow_forwarded_traffic      = true
  use_remote_gateways          = true
  allow_virtual_network_access = true

  depends_on = [azurerm_virtual_network_gateway.vpn]
}
```
GitHub Actions: Zero-Credential Deployment
OIDC Configuration
Traditional approaches store Azure service principal secrets in GitHub. We use OpenID Connect instead - GitHub generates short-lived tokens that Azure validates.
Azure Configuration:
```bash
APP_NAME="github-actions-oidc"
REPO="org/iac-infrastructure"
SUBSCRIPTION_IDS=("<SUBSCRIPTION_ID_1>" "<SUBSCRIPTION_ID_2>")

# Create app registration
az ad app create --display-name "$APP_NAME"
APP_ID=$(az ad app list --display-name "$APP_NAME" --query "[0].appId" -o tsv)
OBJECT_ID=$(az ad app list --display-name "$APP_NAME" --query "[0].id" -o tsv)

# Create service principal
az ad sp create --id "$APP_ID"

# Grant Owner on all subscriptions
for SUB in "${SUBSCRIPTION_IDS[@]}"; do
  az role assignment create \
    --assignee "$APP_ID" \
    --role Owner \
    --scope "/subscriptions/$SUB"
done

# Configure OIDC trust: one federated credential for main, one for PRs
az ad app federated-credential create \
  --id "$OBJECT_ID" \
  --parameters "{
    \"name\": \"github-main\",
    \"issuer\": \"https://token.actions.githubusercontent.com\",
    \"subject\": \"repo:$REPO:ref:refs/heads/main\",
    \"audiences\": [\"api://AzureADTokenExchange\"]
  }"

az ad app federated-credential create \
  --id "$OBJECT_ID" \
  --parameters "{
    \"name\": \"github-pr\",
    \"issuer\": \"https://token.actions.githubusercontent.com\",
    \"subject\": \"repo:$REPO:pull_request\",
    \"audiences\": [\"api://AzureADTokenExchange\"]
  }"
```
The subject claim enforces which repo and branch can authenticate. A stolen or forged token is useless because:
- GitHub signs the JWT with its private key
- Azure validates the signature against GitHub’s public key
- The subject claim must match exactly
- Tokens expire in 15 minutes
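The subject claim is just a field in the JWT payload. A minimal sketch decoding an unsigned, illustrative payload shows what Azure matches against the federated credential; a real token is signed, and the signature is verified first:

```python
import base64
import json

# Illustrative payload only: a real GitHub OIDC token is a signed JWT
# (RS256) and Azure verifies the signature before trusting any claim.
payload = {
    "iss": "https://token.actions.githubusercontent.com",
    "sub": "repo:org/iac-infrastructure:ref:refs/heads/main",
    "aud": "api://AzureADTokenExchange",
}
encoded = base64.urlsafe_b64encode(json.dumps(payload).encode()).rstrip(b"=")

# Re-pad and decode, as a verifier would after checking the signature
claims = json.loads(base64.urlsafe_b64decode(encoded + b"=" * (-len(encoded) % 4)))

# Azure exchanges the token only if these match the federated credential
assert claims["sub"] == "repo:org/iac-infrastructure:ref:refs/heads/main"
assert claims["iss"] == "https://token.actions.githubusercontent.com"
```

Note how the sub format mirrors the subject strings passed to az ad app federated-credential create above: a PR run carries repo:org/iac-infrastructure:pull_request instead, so each credential matches exactly one context.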
Workflow Configuration
```yaml
name: "Terraform CI/CD"

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

permissions:
  id-token: write
  contents: read
  pull-requests: write

jobs:
  terraform:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Azure Login (OIDC)
        uses: azure/login@v1
        with:
          client-id: ${{ vars.AZURE_CLIENT_ID }}
          tenant-id: ${{ vars.AZURE_TENANT_ID }}
          subscription-id: ${{ vars.AZURE_SUBSCRIPTION_ID }}

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v2

      - name: Terraform Init
        run: terraform init

      - name: Terraform Plan
        run: terraform plan -out=tfplan

      - name: Terraform Apply
        if: github.ref == 'refs/heads/main'
        run: terraform apply -auto-approve tfplan
```
Key points:
- The permissions block with id-token: write enables OIDC token generation
- No secrets in the workflow - everything is repository variables
- PRs get plans, main branch gets applies
- State stored in Azure Storage (separate configuration)
State File Management
We initially used separate state files for PRs (dev.tfstate) and main (prod.tfstate). This caused chaos - PRs couldn’t see what was actually deployed.
Fix: Single state file for all branches:
```yaml
- name: Create Backend Config
  run: |
    cat > backend.tf << EOF
    terraform {
      backend "azurerm" {
        resource_group_name  = "${{ vars.TERRAFORM_STATE_RG }}"
        storage_account_name = "${{ vars.TERRAFORM_STATE_STORAGE }}"
        container_name       = "tfstate"
        key                  = "${{ github.event.repository.name }}/${{ github.event.repository.name }}.tfstate"
      }
    }
    EOF
```
Now PRs plan against actual deployed infrastructure. Much better.
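For reference, the backend key template expands identically for every branch of a repo, which is exactly what keeps plans and applies pointed at one state blob:

```python
# Mirrors the backend.tf template: one state blob per repository, shared
# by all branches ("iac-infrastructure" stands in for
# github.event.repository.name).
repo = "iac-infrastructure"
key = f"{repo}/{repo}.tfstate"
assert key == "iac-infrastructure/iac-infrastructure.tfstate"
```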
User Experience
Client Setup
- Download Azure VPN Client from Mac App Store (free)
- Admin downloads VPN profile from Azure Portal
- Admin distributes single config file to all users
- Users import config, click Connect
- Azure AD authentication flow (with MFA)
- Connected
The config file contains zero secrets - just the gateway address and tenant info. Distribute it freely.
Access Control
Conditional Access policies control who can connect:
- Require MFA
- Require compliant device
- Restrict by group membership
- Restrict by location
- Require managed device
All enforced at authentication time. No VPN-specific access control needed.
What I Learned
1. Azure VPN Gateway Can’t Route Public IPs
I initially wanted to route all traffic through the VPN, including public internet. Azure VPN Gateway doesn’t support this - it only routes:
- The transit VNet CIDR
- Peered VNet CIDRs
For public endpoint access (like SQL databases), traffic goes directly from the user’s internet connection. The VPN gateway’s public IP gets whitelisted on those resources.
This is actually fine. Users don’t need VPN for browsing the web.
2. Split Tunneling Is Sufficient
We don’t force all traffic through the VPN. Only traffic destined for Azure private networks routes through the tunnel. Everything else uses the local internet connection.
Benefits:
- Better performance for general internet use
- Lower bandwidth costs
- Simpler troubleshooting
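The client-side effect of split tunneling boils down to a simple membership question: only destinations inside the advertised Azure ranges use the tunnel. The decision logic below is an illustration (the OS route table does the real work), with the transit CIDR from this post plus an assumed peered range:

```python
import ipaddress

# Routes the gateway pushes to clients: the transit VNet plus peered
# VNet CIDRs (the 10.1.0.0/16 range below is illustrative).
tunnel_routes = [ipaddress.ip_network(c)
                 for c in ("10.99.40.0/24", "10.1.0.0/16")]

def via_tunnel(dest: str) -> bool:
    """True if the destination would be routed through the VPN tunnel."""
    ip = ipaddress.ip_address(dest)
    return any(ip in net for net in tunnel_routes)

assert via_tunnel("10.99.40.5")         # transit VNet -> tunnel
assert not via_tunnel("142.250.80.46")  # public internet -> local connection
```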
3. Provider Aliases Are Required for Multi-Subscription
You cannot pass subscription_id directly to azurerm resources or data sources. Provider aliases are the only way to manage resources across multiple subscriptions in a single Terraform configuration.
This is verbose but explicit. The alternative is separate Terraform configurations per subscription, which is worse.
4. Plan for 45-Minute VPN Gateway Deploys
VPN Gateway provisioning is slow. Plan accordingly:
- Initial creation: 30-45 minutes
- Configuration changes: 15-30 minutes
- Don’t iterate rapidly on gateway config
Test everything in a separate subscription first.
Cost Analysis
Monthly costs:
- VPN Gateway (VpnGw1): ~$140
- Public IP: ~$3
- Bandwidth: ~$0.087/GB outbound (first 100GB free)
- Transit VNet: Free
Total: ~$145/month base + bandwidth
Compare to alternatives:
- WireGuard on VM: ~$10/month + labor cost + on-call burden
- Third-party VPN service: $6-18/user/month = $300-900/month for 50 users
- Azure Virtual WAN + Firewall: ~$1,500+/month
For this use case (a small number of engineers, enterprise support requirements), Azure VPN Gateway hits the sweet spot.
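The per-user comparison has a simple crossover point. A quick check using the figures above ($145/month flat versus $6-18/user/month):

```python
import math

gateway_flat = 145  # $/month, from the breakdown above

# Headcount at which the flat-rate gateway beats a per-user service
for per_user in (6, 18):
    breakeven = math.ceil(gateway_flat / per_user)
    print(f"At ${per_user}/user/month, the gateway is cheaper from {breakeven} users")
```

So even at the cheapest per-user pricing, the gateway wins once the team passes roughly 25 users; at the high end, past 9.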
Conclusion
I built a production-grade VPN solution with:
- Zero stored credentials (OIDC everywhere)
- Azure AD authentication with MFA
- Complete infrastructure-as-code
- Self-service access via Conditional Access policies
- Automatic failover and enterprise SLAs
The GitOps workflow means network changes go through PR review, get tested in CI, and deploy automatically on merge. Adding a new VNet peering is a 5-line code change.
Total implementation time: ~2 days (including all the mistakes documented here).
Would we do it differently? Maybe skip the overlapping IP peering attempts and go straight to the “some things stay on public IPs” design. Otherwise, this architecture is solid.
The code is production-ready. The workflow is reliable. The users are happy. Ship it.