AZ-305: Azure Solutions Architect Expert Study Guide
AZ-305 (Azure Solutions Architect Expert) validates your ability to design cloud and hybrid solutions across identity, governance, monitoring, data storage, business continuity, and infrastructure. It is a scenario-heavy, design-focused exam aimed at architects who translate business and technical requirements into Azure designs. It assumes AZ-104-level administrator skills, and the Expert certification itself is awarded only to candidates who also hold the Azure Administrator Associate certification.
Domain 1: Design Identity, Governance, and Monitoring Solutions
- Azure Policy assignments inherit downward through the management group hierarchy, so a policy assigned at a management group automatically applies to every current and future subscription beneath it.
- The Deny effect blocks non-compliant deployments at creation time (preventive); Audit only flags non-compliance in the compliance dashboard; Modify and DeployIfNotExists remediate resources, and both need a managed identity on the assignment plus a remediation task to fix pre-existing resources.
- Use a custom RBAC role when no built-in role fits; define Actions/NotActions (control-plane) and DataActions/NotDataActions (data-plane), and scope it to subscriptions, resource groups, or management groups via AssignableScopes.
- Microsoft Entra Privileged Identity Management (PIM) provides just-in-time, time-bound, approval-and-MFA-gated activation of privileged Entra and Azure roles, with access reviews and audit history; it requires Entra ID P2.
- Conditional Access policies are evaluated per sign-in and enforce controls like require MFA or require compliant device; always exclude break-glass (emergency access) accounts from CA and PIM enforcement to avoid lockout.
- Pass-through Authentication (PTA) validates passwords directly against on-premises AD without storing password hashes in the cloud; Password Hash Sync (PHS) syncs hashes and enables leaked-credential detection; Seamless SSO uses Kerberos for transparent sign-in on domain-joined devices.
- Azure Lighthouse delegates resource management so a partner or MSP can manage your resources from their own Entra tenant without creating guest accounts in your directory.
- Entra B2B guest accounts combined with Entitlement Management access packages and recurring access reviews govern external collaboration and automatically revoke stale access.
- Diagnostic settings route platform logs and metrics to a Log Analytics workspace (query/alert/retain), a storage account (cheap long-term archive, supports immutable blobs), or Event Hubs (stream to SIEM/third party).
- Microsoft Sentinel is Azure's cloud-native SIEM/SOAR and is built on top of a Log Analytics workspace; Azure Monitor + Log Analytics provides metrics, logs, and KQL-based alerting.
- Azure Resource Graph queries resource configuration and compliance inventory at scale across subscriptions with KQL; Azure Cost Management provides cost analysis, budgets, and exports.
- Azure Cost Management budgets trigger action groups (email, SMS, webhook, Logic App) at configured thresholds such as 80% and 100% of forecasted or actual spend.
- Resource locks prevent changes: CanNotDelete allows reads/modifies but blocks deletion; ReadOnly blocks all modifications and deletions; locks are inherited by child resources.
- Azure Landing Zone (enterprise-scale) provides a prescriptive management group hierarchy with policy-driven guardrails and subscription vending for scalable, governed onboarding of new workloads.
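The downward inheritance of policy assignments described in this domain can be modeled in a few lines. The following is a minimal sketch, not Azure's evaluation engine: the `Scope` class, the `effective_policies` helper, and every scope and policy name are hypothetical, chosen only to show why a subscription added later under a management group picks up the same guardrails.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Scope:
    """A node in a simplified scope hierarchy (management group or subscription)."""
    name: str
    parent: Optional["Scope"] = None
    policies: List[str] = field(default_factory=list)  # assigned directly at this scope


def effective_policies(scope: Scope) -> List[str]:
    """Collect assignments from the root down to `scope`, ancestors first."""
    chain = []
    node: Optional[Scope] = scope
    while node is not None:
        chain.append(node)
        node = node.parent
    result: List[str] = []
    for ancestor in reversed(chain):
        for policy in ancestor.policies:
            if policy not in result:
                result.append(policy)
    return result


# Hypothetical hierarchy: root -> landing-zones -> two subscriptions.
root = Scope("root", policies=["allowed-locations"])
landing_zones = Scope("landing-zones", parent=root, policies=["deny-public-ip"])
sub_existing = Scope("sub-existing", parent=landing_zones,
                     policies=["require-cost-center-tag"])
# A subscription created later under the same management group.
sub_new = Scope("sub-new", parent=landing_zones)

print(effective_policies(sub_existing))
print(effective_policies(sub_new))
```

The new subscription has no direct assignments, yet it still reports the two policies assigned above it, which is the behavior exam scenarios rely on when they ask for the scope with the least administrative effort.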
Domain 2: Design Data Storage Solutions
- Blob Storage access tiers (Hot, Cool, Cold, Archive) trade storage cost for access cost; lifecycle management policies automatically transition or delete blobs based on age (for example Hot to Cool after 30 days, then Cool to Archive after 180-365 days). Archive is offline and requires rehydration before reads.
- Azure Cosmos DB offers five consistency levels: Strong, Bounded Staleness, Session (default), Consistent Prefix, and Eventual; Session keeps reads consistent within a client session while allowing low cross-region latency.
- Cosmos DB multi-region writes (multi-master) enable active-active writes in every region; conflict resolution is Last-Writer-Wins by default or via a custom stored procedure.
- Choosing a good Cosmos DB partition key with high cardinality and even access avoids hot partitions; denormalizing data (for example a feed partitioned by user ID with embedded posts) optimizes read-heavy NoSQL workloads.
- Azure SQL Database Hyperscale supports databases up to 128 TB with fast backups/restores and rapid read-scale replicas; Business Critical adds a local SSD, an Always On replica, and the lowest latency.
- Azure SQL Database elastic pools share DTU/vCore capacity across many databases of varying load - ideal for SaaS multi-tenant designs with one database per tenant.
- Azure Data Lake Storage Gen2 adds a hierarchical namespace and POSIX-style ACLs on top of Blob Storage and is the standard store for analytics lakehouse architectures.
- Azure Synapse Analytics offers serverless SQL pools (pay-per-query exploration over the lake), dedicated SQL pools (provisioned MPP data warehouse), and Apache Spark pools (big-data processing).
- Azure Synapse Link provides near-real-time HTAP analytics over Cosmos DB and Azure SQL without building ETL pipelines, querying an analytical store that mirrors operational data.
- Azure Cache for Redis offloads read-heavy database traffic with in-memory caching; the Premium tier adds clustering, persistence, and passive geo-replication, and the Enterprise tiers add active geo-replication for resilience.
- Immutable (WORM) blob storage with time-based retention or legal hold satisfies regulatory write-once requirements; pair with customer-managed keys in Key Vault for encryption control.
- Azure Key Vault Managed HSM provides single-tenant, FIPS 140-3 Level 3 validated hardware key protection for the most stringent key-sovereignty requirements.
- Azure Data Factory orchestrates ETL/ELT; a self-hosted integration runtime is required to move data from on-premises or private-network sources securely.
- Event Hubs Capture automatically writes streaming events to Blob/ADLS in Avro format; pair Event Hubs with Azure Stream Analytics for real-time stream processing.
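The age-based tiering described for Blob Storage lifecycle management can be sketched as a small decision function. This is an illustration only: the thresholds below are assumptions taken from the example ages in the text (plus a hypothetical deletion age), and `target_tier` is a made-up helper, not part of any Azure SDK. Real lifecycle rules express the same idea with `daysAfterModificationGreaterThan` conditions, which is why the comparisons are strict.

```python
from typing import Optional

# Illustrative thresholds in days since last modification (assumptions, not Azure defaults).
COOL_AFTER_DAYS = 30
ARCHIVE_AFTER_DAYS = 180
DELETE_AFTER_DAYS = 2555  # roughly seven years, a hypothetical retention limit


def target_tier(days_since_modified: int) -> Optional[str]:
    """Return the tier a blob should be in, or None if the rule would delete it."""
    if days_since_modified < 0:
        raise ValueError("age cannot be negative")
    if days_since_modified > DELETE_AFTER_DAYS:
        return None
    if days_since_modified > ARCHIVE_AFTER_DAYS:
        return "Archive"
    if days_since_modified > COOL_AFTER_DAYS:
        return "Cool"
    return "Hot"


for age in (10, 45, 400, 3000):
    print(age, target_tier(age))
```

When a scenario mentions data that is rarely read after a fixed period and must be kept cheaply, this kind of rule, rather than manual tier changes, is usually the intended answer.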
Domain 3: Design Business Continuity Solutions
- RTO is the maximum acceptable downtime (how fast you recover); RPO is the maximum acceptable data loss (how much data you can lose) - design choices map directly to these two targets.
- Availability Zones and Zone-Redundant Storage (ZRS) protect only against a datacenter/zone failure within a single region; they do NOT protect against a full regional outage - that requires multi-region design or GRS.
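- A single region with a zone-redundant deployment can reach a 99.99% SLA for some services (for example VMs spread across two or more availability zones), but it cannot survive a regional outage; for higher targets or regional resilience, deploy to two or more regions behind Azure Front Door or Traffic Manager.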
- Azure Site Recovery (ASR) provides continuous replication of Azure VMs, on-premises VMs, and physical servers to a secondary region, with orchestrated and testable failover - good for low-RTO/low-RPO DR without a fully hot standby.
- ASR test failover runs into an isolated virtual network so you can validate the DR plan without impacting production or breaking ongoing replication.
- Azure SQL Database point-in-time restore (PITR) recovers from accidental changes within the configured retention window (1-35 days, default 7); long-term retention (LTR) keeps weekly/monthly/yearly backups for up to 10 years.
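- Azure SQL Database active geo-replication provides a readable secondary in another region with manually initiated failover; failover groups build on it to add automatic failover and stable listener endpoints. Geo-replication is asynchronous, so its RPO is low but not zero; within a region, the zone-redundant Business Critical tier replicates synchronously and can deliver RPO = 0 with automatic failover.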
- Azure Backup uses Recovery Services vaults; enable soft delete, immutability, and multi-user authorization (MUA) to protect backups from accidental or malicious deletion (ransomware resilience).
- Azure Backup Center provides centralized, at-scale monitoring and management of backups across subscriptions and vaults.
- Azure Front Door is a global Layer-7 load balancer with health probes, fast failover, and integrated WAF; Azure Traffic Manager is DNS-based (priority, weighted, performance, geographic routing), so it never sits in the data path and its failover time depends on DNS TTL.
- Service Bus Premium and Event Hubs support geo-disaster recovery pairing: an alias fails over namespace metadata (entities and configuration) to a secondary region, but this pairing does not replicate the messages or events themselves.
- Azure Files protection uses Azure Backup with scheduled snapshots and configurable retention to recover from accidental file deletion or corruption.
- Tiered BCDR strategies are common: critical (Tier 1) workloads use ASR continuous replication for low RTO/RPO, while less critical (Tier 3) workloads rely on Azure Backup with geo-redundant storage.
- Composite SLAs multiply across dependent components in series, so each added required component lowers the overall availability figure - account for this when validating a design against a target SLA.
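The composite SLA arithmetic mentioned in this domain is easy to check with a few lines of code. The sketch below assumes independent failures, and the input figures (99.95% and 99.99%) are illustrative placeholders rather than quoted SLAs for any particular service; the function names are hypothetical.

```python
from math import prod
from typing import Iterable


def serial_sla(slas: Iterable[float]) -> float:
    """Availability of components that must ALL be up (dependencies in series)."""
    return prod(slas)


def redundant_sla(sla: float, copies: int) -> float:
    """Availability when any one of `copies` independent deployments is enough."""
    return 1 - (1 - sla) ** copies


def monthly_downtime_minutes(sla: float, days: int = 30) -> float:
    """Expected downtime budget per month implied by an availability figure."""
    return (1 - sla) * days * 24 * 60


# Illustrative inputs: a web tier at 99.95% that depends on a database at 99.99%.
single_region = serial_sla([0.9995, 0.9999])
# The same stack deployed to two regions, either of which can serve traffic.
two_regions = redundant_sla(single_region, copies=2)

print(f"single region: {single_region:.6f}")
print(f"two regions:   {two_regions:.8f}")
print(f"single-region downtime budget: {monthly_downtime_minutes(single_region):.1f} min/month")
```

Note that the multi-region figure still has to be multiplied by the availability of whatever routes traffic between regions, since that router is itself a component in series.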
Domain 4: Design Infrastructure Solutions
- Azure Functions on the Consumption plan scales to zero (no idle cost) and bills per execution, which makes it the lowest-cost option for intermittent, event-driven serverless compute; the Premium plan adds pre-warmed instances (avoiding cold starts) and VNet integration at a higher baseline cost.
- Azure Kubernetes Service (AKS) is the managed orchestrator for containerized microservices; the cluster autoscaler scales nodes, and KEDA adds event-driven (scale-to-zero) autoscaling based on queue length or other metrics.
- Azure Container Instances (ACI) runs short-lived containers serverlessly with per-second billing and no cluster to manage - ideal for brief batch jobs that terminate after running.
- Azure App Service deployment slots enable zero-downtime deploys and blue-green/canary releases; route a percentage of traffic to a staging slot for testing, then swap.
- App Service networking: regional VNet integration handles outbound traffic into a private network, while a private endpoint provides secure inbound access; disable public network access once private endpoints are configured.
- Private endpoints give a service a private IP inside your VNet (Azure SQL, Storage, etc.); after configuring them, disable public network access to eliminate internet exposure.
- Application Gateway is a regional Layer-7 load balancer with WAF, SSL offload, and URL-based routing; Azure Front Door is the global equivalent with WAF; choose Front Door for multi-region/global entry points.
- Azure Load Balancer (Standard) is a regional Layer-4 (TCP/UDP) load balancer supporting availability zones and the backend health-probe model for VMs and VMSS.
- Hub-and-spoke topology centralizes shared services (Azure Firewall or an NVA for inspection, and an ExpressRoute or VPN gateway) in the hub VNet, with spokes peered to it; Azure Virtual WAN provides a managed, scalable hub with integrated Firewall and site-to-site VPN.
- ExpressRoute private peering connects on-premises networks to Azure VNets (IaaS plus VNet-integrated or private-endpoint PaaS) over a private circuit; Microsoft peering reaches Microsoft 365 and public PaaS endpoints; ExpressRoute offers higher bandwidth and more consistent latency than site-to-site VPN.
- Azure API Management (Standard/Premium) centralizes API publishing with rate limiting via products, request/response transformation policies, OAuth 2.0/JWT validation, and a developer portal; Premium adds multi-region and VNet support.
- Azure Service Bus queues provide reliable at-least-once delivery (PeekLock mode), with duplicate detection available to suppress repeated sends; enable sessions for FIFO ordering grouped by a session ID (for example processing one customer's orders in sequence).
- Autoscaling supports scheduled rules (scale out before business hours, scale in after) combined with metric-based rules (CPU, queue length) to handle predictable patterns plus unexpected spikes cost-effectively.
- Azure Migrate is the central hub for discovery, dependency mapping, assessment (right-sizing and cost), and migration of servers, databases, and web apps to Azure.
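The combination of scheduled and metric-based autoscale rules described in this domain can be sketched as a single decision function. This is a simplified model, not the Azure autoscale engine: the `AutoscaleProfile` fields, the thresholds, and the one-instance step size are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class AutoscaleProfile:
    """Hypothetical settings combining a schedule with CPU-based rules."""
    min_instances: int = 2
    max_instances: int = 10
    business_hours_floor: int = 4      # scheduled baseline during business hours
    business_start: time = time(8, 0)
    business_end: time = time(18, 0)
    scale_out_cpu: float = 70.0        # add an instance above this average CPU %
    scale_in_cpu: float = 30.0         # remove an instance below this average CPU %


def desired_instances(profile: AutoscaleProfile, current: int,
                      now: time, avg_cpu: float) -> int:
    """Apply the metric rule, then clamp to the floor set by the schedule."""
    in_hours = profile.business_start <= now < profile.business_end
    floor = profile.business_hours_floor if in_hours else profile.min_instances
    target = current
    if avg_cpu > profile.scale_out_cpu:
        target = current + 1
    elif avg_cpu < profile.scale_in_cpu:
        target = current - 1
    return max(floor, min(profile.max_instances, target))


profile = AutoscaleProfile()
print(desired_instances(profile, current=2, now=time(9, 0), avg_cpu=50.0))
print(desired_instances(profile, current=4, now=time(9, 0), avg_cpu=85.0))
print(desired_instances(profile, current=3, now=time(22, 0), avg_cpu=10.0))
```

The schedule sets the floor for the predictable pattern, while the metric rule handles spikes above it; the maximum caps cost if load keeps climbing.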
AZ-305 exam tips
- Read every requirement keyword: words like 'minimize cost', 'least administrative effort', 'highest availability', or 'RPO = 0' single out the one correct design among several technically working options.
- Watch for the management-group inheritance pattern: assigning policy or RBAC at the right scope (management group vs subscription vs resource group) is a recurring deciding factor.
- Translate stated RTO/RPO numbers into mechanisms: an RPO of zero points to synchronous replication (for example zone-redundant tiers); an RPO of seconds to minutes points to asynchronous geo-replication or ASR continuous replication; a tolerant RPO/RTO of hours points to Azure Backup.
- Distinguish regional vs global services - availability zones and Application Gateway are regional, while Front Door, Traffic Manager, and Cosmos DB multi-region span regions; a full regional outage requires a global/multi-region answer.
- Many questions are case-study or drag-and-drop matching one Azure service per requirement; eliminate options by ruling out the wrong tier or the wrong networking mode rather than memorizing every feature.
Study guide FAQ
Is AZ-305 harder than AZ-104, and do I need AZ-104 first?
You can sit AZ-305 without having passed AZ-104, but the Azure Solutions Architect Expert certification is awarded only once you also hold the Azure Administrator Associate certification (AZ-104), so most candidates take AZ-104 first. AZ-305 is design- and scenario-focused rather than hands-on configuration: it tests judgment about choosing the right service and tier for given business and technical constraints, not how to click through the portal.
What kinds of questions appear on the exam?
Expect case studies with multiple linked questions, traditional multiple choice, drag-and-drop matching of services to requirements, and 'yes/no for each statement' design-review items. There are no hands-on labs; almost everything is requirement-to-design reasoning.
What is the passing score and exam format?
The passing score is 700 on a scale of 1000 (scaled, not a raw percentage). You have about 120 minutes, and the exam typically has 40-60 questions including one or more case studies. Case study sections may lock once you move past them, so answer carefully before advancing.
How do I tell similar storage or compute options apart?
Anchor on the differentiators the question stresses: consistency level and partition key for Cosmos DB, access tier and lifecycle policy for Blob Storage, elastic pool vs Hyperscale vs Business Critical for Azure SQL, Consumption vs Premium vs Dedicated plans for Functions, and Functions vs ACI vs AKS for compute. The requirement keywords (cost, latency, scale, RPO) tell you which differentiator decides the answer.