Cert-Manager in Production: Automated Certificate Management

## Core Components

### ClusterIssuer Configuration

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com  # placeholder contact address
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      # Default solver for ordinary hostnames
      - http01:
          ingress:
            class: nginx
      # HTTP-01 cannot validate wildcard names, so route the zone
      # that carries the wildcard through DNS-01
      - dns01:
          cloudflare:
            email: admin@example.com  # placeholder contact address
            apiTokenSecretRef:
              name: cloudflare-api-token
              key: api-token
        selector:
          dnsZones:
            - example.com
```

## Certificate Management

### Wildcard Certificate

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wildcard-cert
  namespace: cert-manager
spec:
  secretName: wildcard-tls
  commonName: "*.example.com"
  dnsNames:
    - "*.example.com"
    - "example.com"
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  usages:
    - digital signature
    - key encipherment
    - server auth
```

## Production Implementation

```yaml
# Complete cert-manager setup
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: production-certs
  namespace: production
spec:
  secretName: production-tls
  duration: 2160h    # 90 days
  renewBefore: 360h  # 15 days
  subject:
    organizations:
      - Example Corp
  commonName: api.example.com
  dnsNames:
    - api.example.com
    - web.example.com
    - admin.example.com
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  keystores:
    jks:
      create: true
      passwordSecretRef:
        name: jks-password
        key: password
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: secured-ingress
  annotations:
    # This annotation makes cert-manager create its own Certificate for
    # production-tls. Drop it if the Certificate above should be the only
    # owner of that secret.
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  tls:
    - hosts:
        - api.example.com
        - web.example.com
      secretName: production-tls
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-service
                port:
                  number: 80
```

1 min · Me

Container and Infrastructure Security Scanning: A Comprehensive Guide

## Trivy Implementation

### Container Scanning Pipeline

```yaml
# GitLab CI pipeline configuration
container_scan:
  image:
    name: aquasec/trivy:latest
    entrypoint: [""]  # clear the image's trivy entrypoint so GitLab can run the script
  variables:
    TRIVY_NO_PROGRESS: "true"
    TRIVY_CACHE_DIR: ".trivycache/"
  script:
    # Fail the job on HIGH or CRITICAL findings and write a SARIF report
    - trivy image --exit-code 1 --severity HIGH,CRITICAL --no-progress --format sarif -o trivy-results.sarif "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"
  artifacts:
    when: always  # keep the report even when the scan fails the job
    paths:
      - trivy-results.sarif
```

### Kubernetes Manifest Scanning

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: trivy-cluster-scan
spec:
  schedule: "0 0 * * *"  # daily at midnight
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: trivy-scanner
          restartPolicy: OnFailure  # Job pods must use OnFailure or Never
          containers:
            - name: trivy
              image: aquasec/trivy:latest
              args:
                - k8s
                - --report=summary
                - --severity=HIGH,CRITICAL
                - all
              volumeMounts:
                - name: results
                  mountPath: /results
          volumes:
            - name: results
              persistentVolumeClaim:
                claimName: scan-results
```

## Snyk Integration

### GitHub Action Integration

```yaml
name: Snyk Security Scan
on: pull_request

jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run Snyk to check for vulnerabilities
        uses: snyk/actions/node@master
        env:
          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
        with:
          args: --severity-threshold=high

      - name: Container Scan
        uses: snyk/actions/docker@master
        env:
          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
        with:
          image: your-registry/app:latest
          args: --file=Dockerfile

      - name: IaC Scan
        uses: snyk/actions/iac@master
        env:
          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
        with:
          args: --severity-threshold=high
```

## Wiz Implementation

### Cloud Configuration Scanning

```hcl
# Terraform configuration for Wiz
resource "wiz_automation_rule" "critical_vuln" {
  name        = "Critical Vulnerability Alert"
  description = "Alert on critical vulnerabilities in production"
  enabled     = true

  trigger {
    type = "VULNERABILITY"
    vulnerabilities {
      severity = ["CRITICAL"]
      has_fix  = true
    }
  }

  actions {
    create_issue {
      provider = "JIRA"
      project  = "SEC"
      type     = "Bug"
    }
    send_notification {
      channels = ["SLACK"]
      template = "critical-vuln"
    }
  }
}
```

## Production Implementation

### Multi-Scanner Integration

```yaml
# ArgoCD Application for Security Scanning
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: security-scanners
spec:
  project: security
  source:
    repoURL: https://github.com/org/security-configs.git
    targetRevision: HEAD
    path: scanners/
  destination:
    server: https://kubernetes.default.svc
    namespace: security
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
---
# Scanner Configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: scanner-config
data:
  trivy.yaml: |
    severity: CRITICAL,HIGH
    ignore-unfixed: true
    timeout: 10m
  snyk.yaml: |
    severity-threshold: high
    fail-on: upgradable
  wiz.yaml: |
    scan-interval: 6h
    alert-threshold: CRITICAL
    compliance-frameworks:
      - SOC2
      - PCI
```

2 min · Me

Implementing Automated Azure Resource Locks with PowerShell Runbooks

## Why Resource Locks Matter

Azure resource locks are an important security feature that prevents accidental deletion or modification of important resources. However, applying them by hand is tedious, and locks can be removed for a change and never restored, or never added to new resources in the first place. This post explains how to automate the process using Azure Automation runbooks.

## Implementation Overview

### Creating the Automation Runbook

1. Create a new Azure Automation account or use an existing one
2. Create a PowerShell runbook that will:
   - Scan for resources without locks
   - Apply appropriate lock types (CanNotDelete or ReadOnly)
   - Skip dynamic resources like AKS nodes

### Sample PowerShell Script

```powershell
#Requires -Modules Az.Accounts, Az.Resources

# Azure Resource Lock Automation Script - Fixed Authentication
# This script processes locks in batches to avoid timeout limits

param(
    [string[]]$SubscriptionIds = @(),
    [string[]]$ExemptResourceGroups = @(),
    [string[]]$ExemptResources = @(),
    [string]$LockName = "DenyDelete",
    [string]$LockNotes = "Delete lock",
    [switch]$WhatIf = $false,
    [switch]$IncludeResources = $true,
    [int]$BatchSize = 10,
    [int]$MaxExecutionMinutes = 150,
    [string]$StateTableName = "LockAutomationState",
    [string]$StorageAccountName = "",
    [string]$StorageResourceGroup = "",
    [string[]]$TargetResourceTypes = @(
        "Microsoft.Compute/virtualMachines",
        "Microsoft.Compute/virtualMachineScaleSets",
        "Microsoft.Sql/servers",
        "Microsoft.Sql/managedInstances",
        "Microsoft.DBforPostgreSQL/servers",
        "Microsoft.DBforMySQL/servers",
        "Microsoft.DBforMariaDB/servers",
        "Microsoft.DocumentDB/databaseAccounts",
        "Microsoft.Storage/storageAccounts",
        "Microsoft.Network/virtualNetworks",
        "Microsoft.Network/networkSecurityGroups",
        "Microsoft.Network/routeTables",
        "Microsoft.Network/publicIPAddresses",
        "Microsoft.Network/loadBalancers",
        "Microsoft.Network/applicationGateways",
        "Microsoft.Network/dnszones",
        "Microsoft.Network/privateDnsZones",
        "Microsoft.KeyVault/vaults",
        "Microsoft.RecoveryServices/vaults",
        "Microsoft.ContainerRegistry/registries",
        "Microsoft.Kubernetes/connectedClusters",
        "Microsoft.ContainerService/managedClusters",
        "Microsoft.Web/sites",
        "Microsoft.Web/serverfarms",
        "Microsoft.Logic/workflows",
        "Microsoft.DataFactory/factories",
        "Microsoft.Synapse/workspaces",
        "Microsoft.Network/natGateways",
        "Microsoft.Network/vpnGateways",
        "Microsoft.Purview/accounts",
        "Microsoft.Security/pricings",
        "Microsoft.OperationsManagement/solutions"
    )
)

# Global variables for tracking
$script:StartTime = Get-Date
$script:ProcessedCount = 0
$script:SuccessCount = 0
$script:SkippedCount = 0
$script:ErrorCount = 0
$script:BatchNumber = 0
$script:TimeoutReached = $false

# Connection state lives in a script-scoped flag rather than a return value:
# every Write-Output inside a function becomes part of its return value, so a
# returned boolean would arrive wrapped in an array that always tests as true.
$script:IsConnected = $false

# Check execution time limit
function Test-ExecutionTimeLimit {
    $elapsed = (Get-Date) - $script:StartTime
    $remainingMinutes = $MaxExecutionMinutes - $elapsed.TotalMinutes

    if ($remainingMinutes -le 5) {
        # Stop with 5 minutes buffer
        $script:TimeoutReached = $true
        Write-Output "⚠️ TIMEOUT WARNING: Only $([math]::Round($remainingMinutes, 1)) minutes remaining. Stopping execution."
        return $false
    }
    return $true
}

function Write-BatchSummary {
    param(
        [int]$BatchNum,
        [string]$SubscriptionId,
        [int]$BatchProcessed,
        [int]$BatchSuccess,
        [int]$BatchSkipped,
        [int]$BatchErrors,
        [datetime]$BatchStartTime
    )

    $batchElapsed = (Get-Date) - $BatchStartTime
    $totalElapsed = (Get-Date) - $script:StartTime

    Write-Output "`n📊 --- Batch $BatchNum Summary ---"
    Write-Output "🎯 Subscription: $SubscriptionId"
    Write-Output "📋 Batch processed: $BatchProcessed"
    Write-Output "✅ Batch successful: $BatchSuccess"
    Write-Output "⏭️ Batch skipped: $BatchSkipped"
    Write-Output "❌ Batch errors: $BatchErrors"
    Write-Output "⏱️ Batch time: $([math]::Round($batchElapsed.TotalMinutes, 1)) minutes"
    Write-Output "🕐 Total elapsed: $([math]::Round($totalElapsed.TotalMinutes, 1)) minutes"
    Write-Output "⏳ Remaining time: $([math]::Round($MaxExecutionMinutes - $totalElapsed.TotalMinutes, 1)) minutes"
    Write-Output "📊 --- End Batch Summary ---`n"
}

# Simplified state management (in-memory for this version)
$script:ProcessedResourceGroups = @{}

function Connect-ToAzure {
    try {
        Write-Output "🔍 Checking Azure authentication status..."

        # Check if we already have a PowerShell Az context
        $context = Get-AzContext -ErrorAction SilentlyContinue
        if ($context) {
            Write-Output "✓ Already connected to Azure PowerShell as: $($context.Account.Id)"
            Write-Output "✓ Current subscription: $($context.Subscription.Name) ($($context.Subscription.Id))"
            $script:IsConnected = $true
            return
        }

        Write-Output "⚠️ No existing Azure PowerShell context found"

        # Check if we're running in Azure Automation (has specific environment variables)
        $isAzureAutomation = $env:AUTOMATION_ASSET_ACCOUNTID -or $env:AUTOMATION_RESOURCE_GROUP

        if ($isAzureAutomation) {
            # Try to connect using Managed Identity (for Azure Automation)
            try {
                Write-Output "🔄 Detected Azure Automation environment. Attempting to connect using Managed Identity..."
                Connect-AzAccount -Identity -ErrorAction Stop | Out-Null
                $context = Get-AzContext
                Write-Output "✓ Successfully connected using Managed Identity"
                Write-Output "✓ Connected as: $($context.Account.Id)"
                $script:IsConnected = $true
                return
            }
            catch {
                Write-Output "❌ Managed Identity connection failed: $($_.Exception.Message)"
            }
        }
        else {
            # Running locally: Az PowerShell cannot reuse an 'az login' session,
            # so fall back to an interactive sign-in
            Write-Output "🔄 Running locally. Attempting interactive sign-in..."
            try {
                Connect-AzAccount -ErrorAction Stop | Out-Null
                $context = Get-AzContext
                Write-Output "✓ Successfully connected using interactive sign-in"
                Write-Output "✓ Connected as: $($context.Account.Id)"
                $script:IsConnected = $true
                return
            }
            catch {
                Write-Output "❌ Failed to connect interactively: $($_.Exception.Message)"

                # Last resort - ask user to connect manually
                Write-Output "💡 Please run 'Connect-AzAccount' first to authenticate to Azure PowerShell"
                Write-Output "💡 Note: 'az login' is for Azure CLI, but this script requires Azure PowerShell authentication"
                Write-Error "Authentication required. Please run 'Connect-AzAccount' before running this script."
                return
            }
        }

        Write-Error "Authentication required. Please authenticate to Azure before running this script."
    }
    catch {
        Write-Error "Failed to connect to Azure: $($_.Exception.Message)"
    }
}

function Add-ResourceGroupLockOptimized {
    param(
        [string]$SubscriptionId,
        [string]$ResourceGroupName,
        [string]$LockName,
        [string]$Notes,
        [bool]$WhatIf
    )

    try {
        $script:ProcessedCount++

        # Quick check for existing lock. -AtScope limits the result to locks at or
        # above the resource group, so a lock on a single resource inside the group
        # is not mistaken for a group-level lock.
        $existingLock = Get-AzResourceLock -ResourceGroupName $ResourceGroupName -AtScope -ErrorAction SilentlyContinue |
            Where-Object { $_.Properties.level -eq "CanNotDelete" } |
            Select-Object -First 1

        if ($existingLock) {
            Write-Output "✓ RG '$ResourceGroupName' already locked"
            $script:SkippedCount++
            return
        }

        if ($WhatIf) {
            Write-Output "WHATIF: Would lock RG '$ResourceGroupName'"
            return
        }

        $lockParams = @{
            ResourceGroupName = $ResourceGroupName
            LockName          = $LockName
            LockLevel         = 'CanNotDelete'
            LockNotes         = $Notes
            Force             = $true
            ErrorAction       = 'Stop'
        }
        New-AzResourceLock @lockParams | Out-Null
        Write-Output "✓ Locked RG '$ResourceGroupName'"
        $script:SuccessCount++
    }
    catch {
        Write-Warning "Failed to lock RG '$ResourceGroupName': $($_.Exception.Message)"
        $script:ErrorCount++
    }
}

function Add-ResourceLockOptimized {
    param(
        [object]$Resource,
        [string]$LockName,
        [string]$Notes,
        [bool]$WhatIf,
        [string[]]$TargetResourceTypes
    )

    try {
        # Quick type check
        if ($Resource.ResourceType -notin $TargetResourceTypes) {
            return
        }

        $script:ProcessedCount++

        # Quick check for existing lock
        $existingLock = Get-AzResourceLock -ResourceName $Resource.Name -ResourceType $Resource.ResourceType -ResourceGroupName $Resource.ResourceGroupName -ErrorAction SilentlyContinue |
            Where-Object { $_.Properties.level -eq "CanNotDelete" } |
            Select-Object -First 1

        if ($existingLock) {
            $script:SkippedCount++
            return
        }

        if ($WhatIf) {
            Write-Output "WHATIF: Would lock resource '$($Resource.Name)' ($($Resource.ResourceType))"
            return
        }

        $lockParams = @{
            ResourceName      = $Resource.Name
            ResourceType      = $Resource.ResourceType
            ResourceGroupName = $Resource.ResourceGroupName
            LockName          = $LockName
            LockLevel         = 'CanNotDelete'
            LockNotes         = $Notes
            Force             = $true
            ErrorAction       = 'Stop'
        }
        New-AzResourceLock @lockParams | Out-Null
        Write-Output "✓ Locked resource '$($Resource.Name)' ($($Resource.ResourceType))"
        $script:SuccessCount++
    }
    catch {
        Write-Warning "Failed to lock resource '$($Resource.Name)': $($_.Exception.Message)"
        $script:ErrorCount++
    }
}

function Test-ResourceGroupExemption {
    param([string]$ResourceGroupName, [string[]]$ExemptList)

    # Exemptions are wildcard patterns, e.g. 'MC_*'
    foreach ($exemption in $ExemptList) {
        if ($ResourceGroupName -like $exemption) {
            return $true
        }
    }
    return $false
}

# Main execution
try {
    Write-Output "=== Azure Resource Lock Automation - Batch Mode ==="
    Write-Output "Started at: $(Get-Date)"
    Write-Output "Max execution time: $MaxExecutionMinutes minutes"
    Write-Output "Batch size: $BatchSize resource groups"

    # Connect to Azure (or verify existing connection)
    Connect-ToAzure
    if (-not $script:IsConnected) {
        Write-Error "Failed to establish Azure connection. Exiting."
        exit 1
    }

    # Get subscriptions
    if ($SubscriptionIds.Count -eq 0) {
        Write-Output "📋 No specific subscriptions provided. Getting all enabled subscriptions..."
        Write-Output "🔄 Querying Azure for enabled subscriptions..."
        $subscriptions = Get-AzSubscription | Where-Object { $_.State -eq "Enabled" }
        $SubscriptionIds = $subscriptions.Id
        Write-Output "✓ Found $($SubscriptionIds.Count) enabled subscriptions"

        # List the subscriptions we found
        foreach ($sub in $subscriptions) {
            Write-Output " - $($sub.Name) ($($sub.Id))"
        }
    }
    else {
        Write-Output "📋 Using provided subscription IDs: $($SubscriptionIds.Count) subscription(s)"
    }

    Write-Output "`n🚀 Starting processing of $($SubscriptionIds.Count) subscription(s)..."

    foreach ($subscriptionId in $SubscriptionIds) {
        # Check time limit before each subscription
        if (-not (Test-ExecutionTimeLimit)) {
            Write-Output "Time limit reached. Stopping subscription processing."
            break
        }

        Write-Output "`n🎯 === Processing Subscription: $subscriptionId ==="

        try {
            # Set context to the subscription
            Write-Output "🔄 Setting Azure context to subscription..."
            $context = Set-AzContext -SubscriptionId $subscriptionId -ErrorAction Stop
            Write-Output "✓ Set context to subscription: $($context.Subscription.Name)"

            # Get resource groups
            Write-Output "📦 Getting resource groups from subscription..."
            $resourceGroups = Get-AzResourceGroup -ErrorAction Stop
            Write-Output "✓ Found $($resourceGroups.Count) resource groups in subscription"

            if ($resourceGroups.Count -eq 0) {
                Write-Output "⚠️ No resource groups found in this subscription. Skipping..."
                continue
            }

            # Show exemption info if any
            if ($ExemptResourceGroups.Count -gt 0) {
                Write-Output "🚫 Exempted resource group patterns: $($ExemptResourceGroups -join ', ')"
            }

            Write-Output "⏳ Processing resource groups in batches of $BatchSize..."

            # Process in batches
            for ($i = 0; $i -lt $resourceGroups.Count; $i += $BatchSize) {
                # Check time limit before each batch
                if (-not (Test-ExecutionTimeLimit)) {
                    break
                }

                $script:BatchNumber++
                $batchStartTime = Get-Date
                $batchProcessed = 0
                $batchSuccess = 0
                $batchSkipped = 0
                $batchErrors = 0

                $batch = $resourceGroups | Select-Object -Skip $i -First $BatchSize
                Write-Output "📋 Processing batch $($script:BatchNumber): RGs $($i + 1) to $($i + $batch.Count) of $($resourceGroups.Count)"
                Write-Output "⏱️ Batch started at: $(Get-Date -Format 'HH:mm:ss')"

                foreach ($rg in $batch) {
                    Write-Output "🔍 Processing RG: $($rg.ResourceGroupName)"

                    # Check time limit during batch processing
                    if (-not (Test-ExecutionTimeLimit)) {
                        Write-Output "⏰ Time limit reached during batch processing. Stopping current batch."
                        break
                    }

                    # Check exemptions
                    if (Test-ResourceGroupExemption -ResourceGroupName $rg.ResourceGroupName -ExemptList $ExemptResourceGroups) {
                        Write-Output "🚫 Skipping exempted RG: $($rg.ResourceGroupName)"
                        $batchSkipped++
                        continue
                    }

                    # Track counts before processing
                    $beforeProcessed = $script:ProcessedCount
                    $beforeSuccess = $script:SuccessCount
                    $beforeSkipped = $script:SkippedCount
                    $beforeErrors = $script:ErrorCount

                    # Lock resource group
                    Write-Output "🔒 Checking/adding lock for RG: $($rg.ResourceGroupName)"
                    Add-ResourceGroupLockOptimized -SubscriptionId $subscriptionId -ResourceGroupName $rg.ResourceGroupName -LockName $LockName -Notes $LockNotes -WhatIf $WhatIf

                    # Lock individual resources if enabled
                    if ($IncludeResources) {
                        try {
                            Write-Output "📦 Getting resources from RG: $($rg.ResourceGroupName)"
                            $resources = Get-AzResource -ResourceGroupName $rg.ResourceGroupName -ErrorAction Stop

                            if ($resources.Count -gt 0) {
                                Write-Output " Found $($resources.Count) resources to evaluate"

                                foreach ($resource in $resources) {
                                    # Check time limit during resource processing
                                    if (-not (Test-ExecutionTimeLimit)) {
                                        Write-Output "⏰ Time limit reached during resource processing. Stopping current RG."
                                        break
                                    }

                                    # Only show output for resources we're actually processing
                                    if ($resource.ResourceType -in $TargetResourceTypes) {
                                        Write-Output " 🔍 Evaluating: $($resource.Name) ($($resource.ResourceType))"
                                    }

                                    Add-ResourceLockOptimized -Resource $resource -LockName $LockName -Notes $LockNotes -WhatIf $WhatIf -TargetResourceTypes $TargetResourceTypes
                                }
                            }
                            else {
                                Write-Output " No resources found in RG"
                            }
                        }
                        catch {
                            Write-Warning "❌ Failed to get resources from RG $($rg.ResourceGroupName): $($_.Exception.Message)"
                            $batchErrors++
                        }
                    }

                    # Calculate batch-specific counts
                    $batchProcessed += ($script:ProcessedCount - $beforeProcessed)
                    $batchSuccess += ($script:SuccessCount - $beforeSuccess)
                    $batchSkipped += ($script:SkippedCount - $beforeSkipped)
                    $batchErrors += ($script:ErrorCount - $beforeErrors)

                    # Break if time limit reached
                    if ($script:TimeoutReached) {
                        break
                    }
                }

                # Write batch summary
                Write-BatchSummary -BatchNum $script:BatchNumber -SubscriptionId $subscriptionId -BatchProcessed $batchProcessed -BatchSuccess $batchSuccess -BatchSkipped $batchSkipped -BatchErrors $batchErrors -BatchStartTime $batchStartTime

                # Break if time limit reached
                if ($script:TimeoutReached) {
                    break
                }
            }
        }
        catch {
            Write-Error "Failed to process subscription $subscriptionId : $($_.Exception.Message)"
            continue
        }

        # Break if time limit reached
        if ($script:TimeoutReached) {
            Write-Output "Time limit reached. Stopping all processing."
            break
        }
    }

    Write-Output "`n🏁 === FINAL EXECUTION SUMMARY ==="
    Write-Output "🏁 Completed at: $(Get-Date)"
    $totalElapsed = (Get-Date) - $script:StartTime
    Write-Output "⏱️ Total execution time: $([math]::Round($totalElapsed.TotalMinutes, 1)) minutes"
    Write-Output "📊 Batches processed: $($script:BatchNumber)"
    Write-Output "📋 Total items processed: $($script:ProcessedCount)"
    Write-Output "✅ Successful locks: $($script:SuccessCount)"
    Write-Output "⏭️ Skipped (already locked): $($script:SkippedCount)"
    Write-Output "❌ Errors: $($script:ErrorCount)"

    if ($script:TimeoutReached) {
        Write-Output "⚠️ EXECUTION STOPPED DUE TO TIME LIMIT"
        Write-Output "💡 Consider running again to continue processing remaining items."
    }

    if ($WhatIf) {
        Write-Output "🔍 *** This was a PREVIEW run. No actual changes were made. ***"
    }

    Write-Output "🏁 === END SUMMARY ==="
}
catch {
    Write-Error "Script execution failed: $($_.Exception.Message)"
    exit 1
}
```

## Scheduling the Automation

### Configure Recurring Schedule

1. Set up a schedule in Azure Automation
2. Configure the runbook to execute every 6 hours
3. Ensure proper permissions are assigned to the Automation Account

```powershell
# -ResourceGroupName is mandatory; the value here is a placeholder for the
# resource group that contains the Automation account.
# The start time must lie in the future, so it is computed rather than hard-coded.
$schedule = New-AzAutomationSchedule `
    -ResourceGroupName "<automation-resource-group>" `
    -AutomationAccountName "automatic-resource-locks" `
    -Name "ResourceLockSchedule" `
    -StartTime (Get-Date).AddMinutes(10) `
    -HourInterval 6
```

## Best Practices

- Maintain an exclusion list for resources that shouldn’t be locked
- Implement logging to track lock changes
- Set up alerts for lock removal events
- Regularly review locked resources to ensure proper protection

## Monitoring and Maintenance

Regularly check the Automation account’s job history to ensure the runbook executes successfully. Monitor for any failures and adjust the script as needed based on your infrastructure changes. ...
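The exclusion list mentioned under Best Practices maps onto the script's `-ExemptResourceGroups` parameter, which `Test-ResourceGroupExemption` matches with the `-like` wildcard operator. The following is a minimal sketch of how that matching behaves, run against made-up resource group names so it needs no Azure connection. The patterns and group names are assumptions for illustration; `MC_*` reflects the default naming of AKS node resource groups, and `Test-Exempt` is a hypothetical stand-in with the same logic as the script's function.

```powershell
# Hypothetical exemption patterns; adjust to your own naming conventions.
$exemptPatterns = @('MC_*', 'NetworkWatcherRG', 'databricks-rg-*')

# Stand-in for Test-ResourceGroupExemption: true if any wildcard pattern matches.
function Test-Exempt {
    param([string]$Name, [string[]]$Patterns)
    foreach ($pattern in $Patterns) {
        if ($Name -like $pattern) { return $true }
    }
    return $false
}

# Made-up resource group names standing in for Get-AzResourceGroup output.
$sampleGroups = @('MC_prod-rg_prod-aks_eastus', 'NetworkWatcherRG', 'app-prod-rg')

# Keep only the groups that no pattern exempts.
$toLock = $sampleGroups | Where-Object { -not (Test-Exempt -Name $_ -Patterns $exemptPatterns) }
$toLock   # → app-prod-rg
```

Because `-like` is case-insensitive, `mc_*` and `MC_*` behave the same. When trying patterns against a real subscription, combining them with the script's `-WhatIf` switch previews which groups would be locked without changing anything.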

9 min · Me

Implementing Azure Conditional Access Policies for Geographic Security

## Understanding Geographic-Based Access Controls

Geographic-based access controls are crucial for organizations that need to comply with international regulations or want to improve security by closing off easy attack paths. One specific use case is blocking access from OFAC-sanctioned countries while allowing access from trusted locations.

## Implementation Steps

### 1. Create a Report-Only Policy

First, create a policy in report-only mode to assess impact:

1. Navigate to Azure Portal > Azure AD > Security > Conditional Access
2. Create a new policy
3. Configure the following settings:
   - Users and groups: All users
   - Cloud apps or actions: All cloud apps
   - Conditions: Locations > Configure > Selected locations
   - Access controls: Block access
   - Enable policy: Report-only

### 2. Configure Location Conditions

Create a list of blocked locations: ...
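The report-only policy from step 1 can also be described as data instead of portal clicks. The sketch below builds the request bodies that the Microsoft Graph PowerShell cmdlets `New-MgIdentityConditionalAccessNamedLocation` and `New-MgIdentityConditionalAccessPolicy` accept, under some assumptions: the display names and the `New-GeoBlockPolicyBody` helper are hypothetical, and the country codes are illustrative only and should be checked against the current OFAC list. The calls that touch a tenant are left commented out, so the snippet runs without any Graph connection.

```powershell
# Illustrative ISO 3166-1 alpha-2 codes only; confirm against the current OFAC list.
$blockedCountries = @('CU', 'IR', 'KP', 'SY')

# Body for a country-based named location (Graph type countryNamedLocation).
$locationBody = @{
    '@odata.type'                     = '#microsoft.graph.countryNamedLocation'
    displayName                       = 'Blocked countries'   # hypothetical name
    countriesAndRegions               = $blockedCountries
    includeUnknownCountriesAndRegions = $false
}

# Hypothetical helper: builds the policy body for a given named location ID.
function New-GeoBlockPolicyBody {
    param([Parameter(Mandatory)][string]$NamedLocationId)

    @{
        displayName   = 'Block sanctioned countries (report-only)'   # hypothetical name
        # Report-only: sign-ins are evaluated and logged, but nothing is blocked yet.
        state         = 'enabledForReportingButNotEnforced'
        conditions    = @{
            users        = @{ includeUsers = @('All') }
            applications = @{ includeApplications = @('All') }
            locations    = @{ includeLocations = @($NamedLocationId) }
        }
        grantControls = @{
            operator        = 'OR'
            builtInControls = @('block')
        }
    }
}

# Against a real tenant (needs the Microsoft.Graph module and suitable consent):
# Connect-MgGraph -Scopes 'Policy.ReadWrite.ConditionalAccess', 'Policy.Read.All'
# $location = New-MgIdentityConditionalAccessNamedLocation -BodyParameter $locationBody
# New-MgIdentityConditionalAccessPolicy -BodyParameter (New-GeoBlockPolicyBody -NamedLocationId $location.Id)
```

Keeping `state` at `enabledForReportingButNotEnforced` mirrors the Report-only setting in the portal. Switching it to `enabled` would start enforcing the block, so that change is best made only after reviewing the report-only sign-in results.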

2 min · Me

Implementing Azure Security Best Practices: Break Glass Accounts, MFA, and Legacy Auth

This week, I implemented several critical security measures in Azure Active Directory (now Microsoft Entra ID) that every organization should consider. Let’s walk through the key implementations:

## 1. Break Glass Account Setup

Break glass accounts are emergency access accounts that help maintain access during identity system failures. Here’s how to set one up:

- Create a dedicated emergency access account
- Store credentials securely (I used a password manager)
- Configure exemptions from Conditional Access policies
- Share access with the minimum number of administrators who need it
- Document the process and access procedures

## 2. Conditional Access for Admin Roles

I implemented stronger MFA controls for administrative roles: ...

1 min · Me