Microsoft Well-Architected Framework Assessment Guide
A practical guide for conducting Well-Architected Framework assessments on your Azure workloads.
What is the Well-Architected Framework?
The Microsoft Azure Well-Architected Framework (WAF) is a set of guiding tenets for improving the quality of workloads. It provides a consistent approach to evaluating architectures across five pillars:
- Reliability: Ability to recover from failures and continue functioning
- Security: Protecting applications and data from threats
- Cost Optimization: Managing costs to maximize value delivered
- Operational Excellence: Processes that keep systems running in production
- Performance Efficiency: Ability to scale to meet demands efficiently
Before You Begin
Prerequisites
- Access to the Azure portal with at least the Reader role
- Architecture documentation for the workload
- Access to key stakeholders (architects, engineers, operations)
- 2-4 hours blocked for assessment workshop
Scope Definition
Define the assessment boundary:
- Workload: Which application or system?
- Environment: Production, staging, or both?
- Components: All services or specific focus areas?
- Depth: High-level review or deep technical assessment?
Step 1: Use the Official Assessment Tool
Microsoft provides an interactive assessment, the Azure Well-Architected Review.
Process:
- Navigate to the assessment tool
- Select your workload type (general, SAP, IoT, etc.)
- Answer questions for each pillar
- Review generated recommendations
Tips:
- Be honest in responses; optimistic answers hide the gaps the assessment is meant to find
- Include multiple stakeholders to capture different perspectives
- Document assumptions made during the assessment
Step 2: Supplement with Technical Review
The assessment tool provides strategic guidance. Supplement it with a hands-on technical review of each pillar:
Reliability Review
Check These:
- Multi-region deployment or DR strategy documented
- Health probes configured for all services
- Auto-scaling rules defined and tested
- Backup and restore procedures tested recently
- Chaos engineering or failure injection practiced
Commands to Run:
# List availability sets
az vm availability-set list --output table
# Check Load Balancer health probes
az network lb probe list --lb-name <name> --resource-group <rg>
# Review Auto-scale settings
az monitor autoscale list --resource-group <rg>
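The probe command above needs a load balancer name, so checking a whole resource group means repeating it per load balancer. A minimal sketch of that loop follows. `lb_probe_counts` is a hypothetical helper name, and the sketch assumes you are already signed in with `az login`.

```shell
# Sketch: print "<load balancer name><TAB><probe count>" for every
# load balancer in a resource group. lb_probe_counts is a hypothetical helper.
lb_probe_counts() {
  rg="$1"
  az network lb list --resource-group "$rg" --query "[].name" --output tsv |
    while IFS= read -r lb; do
      # </dev/null keeps the inner call from consuming the piped name list
      count=$(az network lb probe list --lb-name "$lb" --resource-group "$rg" \
        --query "length(@)" --output tsv </dev/null)
      printf '%s\t%s\n' "$lb" "$count"
    done
}
```

Call it as `lb_probe_counts <rg>`. A count of 0 marks a load balancer with no health probes, which is worth raising as a reliability finding.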
Security Review
Check These:
- Network security groups follow least privilege
- Private endpoints used for PaaS services
- Managed identities used instead of service principals with stored credentials where possible
- Key Vault used for secrets and certificates
- Microsoft Defender for Cloud (formerly Azure Defender) enabled
Commands to Run:
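# List network security groups
az network nsg list --output table
# Check the rules of a specific NSG
az network nsg rule list --nsg-name <nsg> --resource-group <rg> --output table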
# List Private Endpoints
az network private-endpoint list --output table
# Check Defender for Cloud status
az security pricing list --output table
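One way to test the least-privilege item is to look for inbound rules that allow traffic from any source. The sketch below walks every NSG in the current subscription and prints matching rules. `open_inbound_rules` is a hypothetical helper name. The filter only matches a source of `*`, so rules that use `Internet`, `0.0.0.0/0`, or a multi-prefix source list would need extra conditions.

```shell
# Sketch: list inbound "Allow" rules whose source is any address ('*')
# for every NSG in the current subscription.
# open_inbound_rules is a hypothetical helper name.
open_inbound_rules() {
  az network nsg list --query "[].[name, resourceGroup]" --output tsv |
    while IFS="$(printf '\t')" read -r nsg rg; do
      az network nsg rule list --nsg-name "$nsg" --resource-group "$rg" \
        --query "[?direction=='Inbound' && access=='Allow' && sourceAddressPrefix=='*'].name" \
        --output tsv </dev/null |
        while IFS= read -r rule; do
          # One line per finding: NSG, resource group, rule name
          printf '%s\t%s\t%s\n' "$nsg" "$rg" "$rule"
        done
    done
}
```

Running `open_inbound_rules` with no output means no custom rule matched the filter. It does not prove the NSGs follow least privilege, so treat it as a starting point for manual review.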
Cost Optimization Review
Check These:
- Reserved Instances evaluated for stable workloads
- Auto-shutdown configured for non-production environments
- Orphaned resources identified and removed
- Right-sizing recommendations reviewed
- Cost alerts and budgets configured
Commands to Run:
# List unattached disks
az disk list --query '[?managedBy==`null`]' --output table
# Check Advisor cost recommendations
az advisor recommendation list --category Cost --output table
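Advisor can return a long list, so a count per impact level helps decide where to start. This is a small sketch built on the same Advisor command. `advisor_cost_by_impact` is a hypothetical helper name, and it assumes each recommendation carries an `impact` field (High, Medium, or Low).

```shell
# Sketch: count Advisor cost recommendations per impact level.
# advisor_cost_by_impact is a hypothetical helper name.
advisor_cost_by_impact() {
  az advisor recommendation list --category Cost --query "[].impact" --output tsv |
    awk 'NF { n[$1]++ } END { for (k in n) print k, n[k] }' |
    sort
}
```

The output is one line per impact level with its count, sorted alphabetically by level.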
Operational Excellence Review
Check These:
- Infrastructure as Code used for all deployments
- CI/CD pipelines with automated testing
- Monitoring and alerting cover all critical components
- Runbooks documented and tested
- Incident response procedures defined
Check Azure Monitor:
- Alert rules coverage
- Log Analytics query coverage
- Workbook dashboards for operational visibility
Performance Efficiency Review
Check These:
- SKUs appropriate for workload demands
- Caching implemented where beneficial
- Database indexing and query optimization done
- CDN used for static content
- Load testing performed regularly
Commands to Run:
# Review VM SKUs
az vm list --query "[].{Name:name, Size:hardwareProfile.vmSize}" --output table
# Check Azure Cache for Redis
az redis list --output table
Step 3: Prioritize Recommendations
Scoring Framework
For each finding, score:
Impact (1-5)
- 5: Critical risk or major improvement opportunity
- 3: Moderate impact on quality or efficiency
- 1: Minor improvement
Effort (1-5)
- 5: Months of work, high complexity
- 3: Weeks of work
- 1: Days or hours
Priority = Impact / Effort
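As a worked example of the formula, the sketch below computes the score with `awk`, since plain shell arithmetic is integer-only. `priority` is a hypothetical helper name, and the sample scores are illustrative rather than taken from a real assessment.

```shell
# priority IMPACT EFFORT -> prints Impact / Effort with one decimal place.
# Hypothetical helper; both arguments are expected to be scores from 1 to 5.
priority() {
  awk -v i="$1" -v e="$2" 'BEGIN { printf "%.1f\n", i / e }'
}

priority 5 1   # critical gap fixed in hours  -> 5.0
priority 3 3   # moderate impact, weeks       -> 1.0
priority 4 5   # high impact, months of work  -> 0.8
```

With both scores on a 1-5 scale, the result ranges from 0.2 (minor improvement, months of work) to 5.0 (critical issue, hours of work).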
Categorization
Group recommendations:
| Priority | Score | Action |
|---|---|---|
| Critical | > 3.0 | Address immediately |
| High | 2.0 - 3.0 | Plan for next quarter |
| Medium | 1.0 - < 2.0 | Include in backlog |
| Low | < 1.0 | Consider in future |
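The scoring and grouping can be applied to a whole findings list in one pass. The sketch below reads tab-separated `impact`, `effort`, and `title` columns and prints each finding with its score and category, highest priority first. `rank_findings` is a hypothetical helper name and the sample findings are invented for illustration. It assumes a score of exactly 2.0 counts as High and exactly 3.0 stays High, because Critical is strictly above 3.0.

```shell
# rank_findings: read "impact<TAB>effort<TAB>title" lines on stdin and print
# "score<TAB>category<TAB>title", highest priority first.
# Hypothetical helper; band edges follow the assumption stated above.
rank_findings() {
  awk -F '\t' '
    function category(s) {
      if (s > 3.0)  return "Critical"
      if (s >= 2.0) return "High"
      if (s >= 1.0) return "Medium"
      return "Low"
    }
    # Skip malformed lines and avoid dividing by a zero effort score
    NF >= 3 && $2 > 0 {
      score = $1 / $2
      printf "%.1f\t%s\t%s\n", score, category(score), $3
    }
  ' | sort -t "$(printf '\t')" -k1,1 -rn
}

# Hypothetical findings, for illustration only
printf '3\t3\tAdd CDN for static content\n5\t1\tEnable Defender for Cloud\n2\t5\tMulti-region failover\n' |
  rank_findings
```

For the sample input, "Enable Defender for Cloud" comes first (5.0, Critical), followed by the CDN item (1.0, Medium) and multi-region failover (0.4, Low).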
Step 4: Create Action Plan
Recommendation Template
For each recommendation:
## [Recommendation Title]
**Pillar:** [Reliability/Security/Cost/Operations/Performance]
**Priority:** [Critical/High/Medium/Low]
**Effort:** [Hours/Days/Weeks/Months]
### Current State
[Describe current situation]
### Recommendation
[Describe recommended change]
### Benefits
- [Benefit 1]
- [Benefit 2]
### Implementation Steps
1. [Step 1]
2. [Step 2]
3. [Step 3]
### Success Criteria
- [How will we know this is complete?]
Roadmap Structure
| Phase | Timeframe | Focus Areas |
|---|---|---|
| Immediate | 0-30 days | Critical security and reliability gaps |
| Short-term | 1-3 months | High-priority items across pillars |
| Medium-term | 3-6 months | Foundation improvements |
| Long-term | 6-12 months | Strategic enhancements |
Step 5: Establish Continuous Assessment
Regular Cadence
| Activity | Frequency |
|---|---|
| Automated compliance checks | Daily |
| Cost optimization review | Weekly |
| Security posture review | Monthly |
| Full WAF assessment | Quarterly |
| Deep architecture review | Annually |
Automation Tools
- Azure Policy: Enforce and audit compliance
- Azure Advisor: Continuous recommendations
- Microsoft Defender for Cloud: Security posture management
- Azure Cost Management: Cost tracking and optimization
- Azure Monitor: Operational and performance monitoring
Common Findings and Quick Wins
Reliability
- Enable Azure Monitor alerts for critical services
- Configure auto-scaling for variable workloads
- Test backup restore procedures
Security
- Enable Microsoft Defender for Cloud
- Implement Azure Private Link for PaaS services
- Rotate secrets and certificates
Cost
- Delete orphaned resources
- Right-size underutilized VMs
- Schedule auto-shutdown for dev/test
Operations
- Enable diagnostic settings for all resources
- Create Log Analytics queries for common investigations
- Document runbooks in Azure Automation
Performance
- Enable Azure CDN for static content
- Implement caching strategy
- Review and optimize database queries
Resources
- Well-Architected Framework Documentation
- Azure Architecture Center
- Cloud Adoption Framework
- Azure Advisor