Prometheus Monitoring: SRE Best Practices and Implementation

Effective Metric Collection

Key Metric Types

Counter Metrics

    # Example counter metric
    http_requests_total{status="200", handler="/api/v1"}

Gauge Metrics

    # Memory usage example
    process_resident_memory_bytes

PromQL Best Practices

Rate Calculations

    # Request rate over 5 minutes
    rate(http_requests_total[5m])

    # Error rate percentage
    sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) * 100

Alert Configuration

Alert Rules Example

    groups:
      - name: example
        rules:
          - alert: HighErrorRate
            expr: |
              sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) * 100 > 5
            for: 5m
            labels:
              severity: critical
            annotations:
              summary: High HTTP error rate
              description: "Error rate is {{ $value }}%"

Recording Rules

    groups:
      - name: example
        rules:
          - record: job:http_inprogress_requests:sum
            expr: sum by (job) (http_inprogress_requests)

Retention and Storage

Storage Configuration

    global:
      scrape_interval: 15s
      evaluation_interval: 15s
    storage:
      tsdb:
        retention.time: 15d
        retention.size: 512GB

Production Example

    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: api-monitor
    spec:
      selector:
        matchLabels:
          app: api
      endpoints:
        - port: metrics
          interval: 30s
          path: /metrics
        - port: metrics
          interval: 10s
          path: /metrics/critical
          metricRelabelings:
            - sourceLabels: [__name__]
              regex: 'http_requests_total'
              action: keep
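To make the error-rate expression from the PromQL section more concrete, here is a minimal Python sketch of the arithmetic behind it. It is a simplification, not how Prometheus implements `rate()`: it ignores counter resets and the window-edge extrapolation that the real function applies. The names `simple_rate` and `error_rate_percent` and the sample values are hypothetical, chosen only for illustration.

```python
from typing import Sequence, Tuple

# One scraped sample: (unix timestamp in seconds, counter value).
Sample = Tuple[float, float]


def simple_rate(samples: Sequence[Sample]) -> float:
    """Per-second increase between the first and last sample in a window.

    Simplified stand-in for PromQL's rate(): no counter-reset handling
    and no extrapolation to the window boundaries.
    """
    if len(samples) < 2:
        return 0.0
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        return 0.0
    return (v1 - v0) / (t1 - t0)


def error_rate_percent(error_samples: Sequence[Sample],
                       total_samples: Sequence[Sample]) -> float:
    """Share of requests that were errors over the window, in percent.

    Returns 0.0 when there was no traffic. This is a choice made for the
    sketch; the PromQL division would produce NaN in that case.
    """
    total = simple_rate(total_samples)
    if total == 0:
        return 0.0
    return 100.0 * simple_rate(error_samples) / total


if __name__ == "__main__":
    # Hypothetical counter values at the start and end of a 5-minute window.
    errors = [(0.0, 10.0), (300.0, 190.0)]      # 180 errors in 300 s
    total = [(0.0, 1000.0), (300.0, 4000.0)]    # 3000 requests in 300 s
    pct = error_rate_percent(errors, total)
    # The alert rule's condition is "> 5"; it fires only once the
    # condition has held for the full "for: 5m" period.
    print(f"error rate: {pct:.2f}%, above threshold: {pct > 5}")
```

The sketch separates the per-series rate from the ratio, which mirrors how the PromQL expression sums the rates first and divides afterwards.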

1 min · Me

Recovering Files from SQL VM Backups with Azure

Introduction

In cloud environments, accidentally overwriting production files can be a nerve-wracking experience. This post walks through the process of recovering individual files from Azure VM backups.

Prerequisites

- Azure subscription with VM backup enabled
- Access to Azure Portal
- Backup retention period covering the desired recovery point

File-Level Recovery Process

1. Identify the Recovery Point

First, locate the specific backup point before the file was modified:

- Navigate to the Azure Portal
- Select the VM in question
- Go to “Backup & restore”
- Choose “File Recovery”
- Select a recovery point prior to the file modification

2. Mount the Recovery Point

- Mount the recovery drive
- Azure will provide specific mounting instructions

One of the nice things about Azure is that it builds a binary that does all the mounting work. Other cloud providers often require you to do this work manually and to remember to clean up the cloned disks afterward; Azure makes all of that easy. ...
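The selection rule in step 1 (pick a recovery point from before the file was modified) can be sketched in a few lines of Python. This is only an illustration of the rule, not a call into any Azure API: the function name `latest_point_before` and the timestamps are hypothetical, and in practice the list of recovery points would come from the portal.

```python
from datetime import datetime
from typing import Iterable, Optional


def latest_point_before(points: Iterable[datetime],
                        modified_at: datetime) -> Optional[datetime]:
    """Return the most recent recovery point strictly before modified_at.

    Returns None when no recovery point predates the modification,
    for example when the retention period does not reach back far enough.
    """
    candidates = [p for p in points if p < modified_at]
    return max(candidates, default=None)


if __name__ == "__main__":
    # Hypothetical nightly backups and a file overwritten on the afternoon
    # of the second day.
    recovery_points = [
        datetime(2024, 1, 1, 2, 0),
        datetime(2024, 1, 2, 2, 0),
        datetime(2024, 1, 3, 2, 0),
    ]
    overwritten_at = datetime(2024, 1, 2, 15, 30)
    print(latest_point_before(recovery_points, overwritten_at))
```

Choosing the latest point before the change keeps the restored file as current as possible while still predating the overwrite.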

2 min · Me

Securing a Private n8n Instance in Azure with Let’s Encrypt and Managed Identity

This week, I deployed a private n8n automation instance in Azure with a focus on security, auditability, and zero public exposure. Here’s how I solved the HTTPS challenge without storing credentials or opening ports unnecessarily.

Problem Statement

I needed to:

- Run n8n privately for internal automations
- Enable HTTPS for browser access and webhook security
- Use Let’s Encrypt for free TLS certs
- Avoid storing Azure credentials on the VM
- Keep the VM locked down with minimal exposure

Azure VM and NSG Setup

- Deployed Ubuntu VM with n8n running via systemd
- Configured Azure Network Security Group (NSG) to allow:
  - Port 22 (SSH) and 443 (HTTPS) only
  - Scoped to my static IP
- Temporarily opened port 80 for Let’s Encrypt HTTP challenge

SSL Issue: Nginx Serving Self-Signed Cert

Despite running Certbot successfully, openssl s_client revealed: ...

2 min · Me

Securing Azure Entra ID: Essential Security Measures for Enterprise

Introduction

Securing Azure Entra ID (formerly Azure AD) is crucial for maintaining a robust security posture. This post covers essential security measures and how to implement them effectively.

Cleaning Up App Registrations

Identifying Unused Applications

First, identify app registrations with expired credentials:

    # Get app registrations with at least one expired secret or certificate
    $now = Get-Date
    Get-AzureADApplication -All $true | Where-Object {
        ($_.PasswordCredentials | Where-Object { $_.EndDate -lt $now }) -or
        ($_.KeyCredentials | Where-Object { $_.EndDate -lt $now })
    }

Verification Process

- Check service principal sign-in logs for the last 30 days
- Disable service principals showing no activity
- Delete the corresponding app registration

Implementing MFA Requirements

Assessing MFA Status

Navigate to Authentication Methods > User Registration Details to identify users without MFA: ...

2 min · Me

Securing Azure Infrastructure: Implementing Essential Security Policies

Introduction

Securing Azure infrastructure requires implementing multiple layers of security controls. This post walks through implementing essential security policies to protect your Azure environment.

Preventing Public Blob Storage Access

One common security risk is accidentally exposing blob storage containers publicly. Azure Policy can prevent this:

- Navigate to Azure Policy
- Search for the built-in policy “Configure your Storage account public access to be disallowed”
- Assign the policy at your desired scope (subscription or management group)
- Set the effect to “Deny” to prevent creation of public containers

    {
      "properties": {
        "displayName": "Prevent Public Blob Access",
        "policyType": "BuiltIn",
        "mode": "All",
        "parameters": {},
        "policyRule": {
          "if": {
            "allOf": [
              {
                "field": "type",
                "equals": "Microsoft.Storage/storageAccounts"
              },
              {
                "field": "Microsoft.Storage/storageAccounts/allowBlobPublicAccess",
                "equals": "true"
              }
            ]
          },
          "then": {
            "effect": "deny"
          }
        }
      }
    }

Implementing Conditional Access Policies

Admin Role Protection

Secure privileged accounts with dedicated conditional access policies: ...

2 min · Me