AWS SOA-C02: SysOps Administrator Associate Study Guide
The AWS Certified SysOps Administrator - Associate (SOA-C02) validates your ability to deploy, manage, and operate workloads on AWS, with emphasis on monitoring, reliability, automation, security, networking, and cost/performance optimization. It is aimed at operations-focused professionals with at least one year of hands-on experience operating AWS workloads. The 180-minute exam consists of multiple-choice and multiple-response questions, and the passing score is a scaled 720 out of 1000.
Domain 1: Monitoring, Logging, and Remediation
- The CloudWatch unified agent must be installed on EC2 to collect OS-level metrics like memory utilization and disk space; these are NOT available from the default hypervisor-level metrics.
- CloudWatch standard metric resolution is 60 seconds; publish with StorageResolution=1 (via PutMetricData) to enable 1-second high-resolution custom metrics.
- New CloudWatch Logs log groups default to 'Never expire'; use aws logs put-retention-policy --retention-in-days 30 to set retention explicitly.
- A CloudWatch Logs metric filter scans log streams for a pattern (e.g. 'ERROR') and converts matches into a custom metric that can drive an alarm.
- CloudWatch Logs Insights queries large log volumes interactively; the two CLI calls are aws logs start-query (returns a queryId) and aws logs get-query-results.
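- aws cloudwatch put-metric-alarm always requires --alarm-name, --comparison-operator, and --evaluation-periods (e.g. 2); a static-threshold metric alarm also sets --namespace, --metric-name, --statistic, --period (e.g. 300), and --threshold. aws cloudwatch put-metric-data --namespace --metric-name --value publishes a single custom data point.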
- Configure how an alarm treats missing data (notBreaching, breaching, ignore, missing) to stop it from flipping to INSUFFICIENT_DATA unexpectedly.
- A CloudWatch anomaly-detection alarm builds a trained band from historical data and handles seasonality without a fixed static threshold.
- A composite alarm combines multiple child alarm states with AND/OR logic and notifies only on the composite condition, reducing alarm noise.
- AWS Config rules can trigger Systems Manager Automation documents to auto-remediate non-compliant resources; EventBridge rules can target Lambda or SSM Automation for event-driven remediation.
- An EventBridge rule matching a specific CloudTrail API event (e.g. AuthorizeSecurityGroupIngress) with a Lambda target gives near-real-time remediation of risky changes.
- Common reasons an EventBridge rule never invokes its target: the source is not emitting events (e.g. EventBridge notifications are not enabled on the S3 bucket), or the Lambda function lacks a resource-based policy allowing events.amazonaws.com to invoke it.
- For cost-effective long-term log retention and analysis, move CloudWatch Logs to S3 (stream them with a subscription filter to Amazon Data Firehose, or run aws logs create-export-task for a one-off export) and query with Amazon Athena rather than keeping everything in CloudWatch Logs.
- Enable CloudTrail log file integrity validation and deliver logs to a separate restricted logging account for tamper-evident auditing; use AWS X-Ray distributed tracing to find latency bottlenecks across microservices.
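The metric-filter and alarm bullets above can be pictured with a small local simulation. This is a simplified sketch, not CloudWatch's actual evaluation logic: the function names, the sample log lines, and the rule "ALARM only when every evaluated period breaches" are assumptions made for illustration, and only the breaching/notBreaching missing-data treatments are modeled.

```python
from collections import Counter


def metric_filter(log_events, pattern, period=60):
    """Count log events containing `pattern`, bucketed by period start.

    log_events: iterable of (epoch_seconds, message) pairs.
    Returns {period_start: match_count}; periods without a match are absent,
    which is what the alarm below sees as "missing data".
    """
    counts = Counter()
    for timestamp, message in log_events:
        if pattern in message:
            counts[timestamp - timestamp % period] += 1
    return dict(counts)


def alarm_state(counts, window_start, period, evaluation_periods, threshold,
                treat_missing="notBreaching"):
    """Simplified alarm: ALARM only if every evaluated period is breaching."""
    if treat_missing not in ("breaching", "notBreaching"):
        raise ValueError("this sketch models only breaching/notBreaching")
    for i in range(evaluation_periods):
        value = counts.get(window_start + i * period)
        if value is None:
            breaching = treat_missing == "breaching"
        else:
            breaching = value > threshold  # like GreaterThanThreshold
        if not breaching:
            return "OK"
    return "ALARM"


# Hypothetical log events: (seconds since start, message)
events = [
    (0, "INFO boot"), (10, "ERROR db timeout"), (20, "ERROR db timeout"),
    (70, "ERROR db timeout"), (75, "ERROR retry failed"), (130, "INFO ok"),
]
error_counts = metric_filter(events, "ERROR")
print(error_counts)
print(alarm_state(error_counts, 0, 60, 2, threshold=1))
print(alarm_state(error_counts, 60, 60, 2, threshold=1))
print(alarm_state(error_counts, 60, 60, 2, threshold=1, treat_missing="breaching"))
```

The last two calls differ only in how the empty third minute is treated, which is the kind of behavior change the missing-data setting controls.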
Domain 2: Reliability and Business Continuity
- RTO (Recovery Time Objective) is the maximum acceptable time to restore service; RPO (Recovery Point Objective) is the maximum acceptable data loss measured in time.
- An EC2 Auto Scaling Group monitors instance health via EC2 status checks or ELB health checks, terminates unhealthy instances, and maintains desired capacity across AZs.
- Spanning an ASG across at least two AZs with an ALB targeting subnets in each AZ provides high availability against an AZ failure.
- Increase the ASG health check grace period so it covers full application startup time; otherwise instances are killed before they finish booting.
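- On scale-in, the default termination policy first selects the AZ with the most instances, then within that AZ terminates the instance using the oldest launch template/configuration; AZ rebalancing, by contrast, launches replacement instances before terminating old ones so capacity does not dip.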
- RDS Multi-AZ provisions a synchronous standby in another AZ with automatic failover and no data loss; it is for availability, not read scaling, and is enabled with aws rds create-db-instance --multi-az.
- RDS point-in-time recovery uses automated backups plus transaction logs; a cross-Region read replica can be promoted to a standalone primary, but promotion is irreversible and stops replication permanently.
- DynamoDB global tables provide multi-active, eventually consistent replication across Regions and resolve conflicts with last-writer-wins; back up a table with aws dynamodb create-backup.
- Create an EBS snapshot with aws ec2 create-snapshot --volume-id; copying an encrypted snapshot to another Region requires re-encryption with a KMS key available in the destination Region.
- S3 Cross-Region Replication only replicates objects created or updated AFTER the rule is enabled; replicate pre-existing objects with S3 Batch Replication.
- For SQS reliability, set a dead-letter queue with a maxReceiveCount and a visibility timeout longer than the maximum message processing time to avoid duplicate processing.
- Route 53 failover routing serves the Primary record while its health check passes and automatically returns the Secondary record when that health check fails, enabling active/passive HA.
- AWS Backup centrally schedules and monitors backups across EBS, RDS, DynamoDB, EFS, and more, with retention rules and cross-Region/cross-account copies.
- A warm-standby DR strategy continuously replicates the database to the DR Region and pre-creates core infrastructure with minimal compute, scaling up only at failover to balance cost and RTO.
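The SQS dead-letter queue bullet above can be sketched as an in-memory simulation. This is an illustrative model only: drain_queue, handler, and the "poison" message are hypothetical names, and no real SQS API is called. One simplification is stated in the code: real SQS moves a message to the DLQ on the receive after maxReceiveCount is exceeded, whereas this sketch moves it right after the last failed attempt.

```python
from collections import deque


def drain_queue(messages, handler, max_receive_count):
    """Process messages; move any that keep failing to a dead-letter list.

    A message whose handler raises goes back on the queue (as if its
    visibility timeout had expired without a delete) until it has been
    received max_receive_count times. Simplification: it is dead-lettered
    immediately after that final failed attempt.
    """
    queue = deque((body, 0) for body in messages)
    processed, dead_letter = [], []
    while queue:
        body, receive_count = queue.popleft()
        receive_count += 1
        try:
            handler(body)
        except Exception:
            if receive_count >= max_receive_count:
                dead_letter.append(body)
            else:
                queue.append((body, receive_count))
        else:
            processed.append(body)  # success: message is deleted
    return processed, dead_letter


attempts = []


def handler(body):
    """Hypothetical consumer that can never process the 'poison' message."""
    attempts.append(body)
    if body == "poison":
        raise ValueError("cannot parse message")


processed, dead_letter = drain_queue(["a", "poison", "b"], handler,
                                     max_receive_count=3)
print(processed, dead_letter, attempts.count("poison"))
```

Without the receive limit, the poison message would cycle forever and block nothing but consume worker time; the DLQ isolates it for inspection.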
Domain 3: Deployment, Provisioning, and Automation
- AWS CloudFormation is the native IaC service: JSON/YAML templates managed as stacks with dependency ordering, automatic rollback on failure, and drift detection.
- aws cloudformation deploy creates or updates a stack idempotently in one command; use create-change-set + describe-change-set to preview exactly which resources will be modified, replaced, or deleted before applying.
- Recover a stack stuck in UPDATE_ROLLBACK_FAILED with ContinueUpdateRollback, optionally skipping problematic resources via --resources-to-skip.
- DeletionPolicy: Retain preserves (orphans) a resource in your account after the stack is deleted - useful for databases and S3 buckets you must not lose.
- CloudFormation StackSets deploy resources consistently across multiple accounts and Regions; with AWS Organizations auto-deployment, new accounts automatically receive the baseline stack.
- An ASG uses a launch template (the modern, versioned successor to the deprecated, immutable launch configuration) specifying AMI, instance type, security groups, IAM instance profile, and user data; create-launch-template-version registers a new version.
- Systems Manager Session Manager gives browser/CLI shell access to instances with no SSH keys, open inbound ports, or bastion hosts, with full session logging.
- SSM managed-instance prerequisites: the AmazonSSMManagedInstanceCore policy on the instance role and a running SSM Agent are the two most common missing pieces.
- Run shell commands on managed instances with aws ssm send-command --document-name AWS-RunShellScript; State Manager and Automation handle ongoing configuration and runbook automation.
- Store config and secrets in SSM Parameter Store; use --type SecureString --key-id for KMS-encrypted parameters via aws ssm put-parameter.
- AWS Secrets Manager supplies secrets at runtime and supports automatic rotation (e.g. RDS credentials) without code changes - prefer it over Parameter Store when rotation is required.
- Blue/green deployment with CodeDeploy shifting the ALB target group enables zero-downtime cutover and instant rollback by routing back to the old target group.
- Immutable infrastructure means replacing instances and baking golden AMIs rather than patching in place; CodeDeploy auto-deploys the current target revision to new ASG instances during scale-out.
- CloudFormation and CDK cover declarative provisioning, while Systems Manager (Run Command, State Manager, Automation, Patch Manager) covers ongoing operations automation.
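As a small illustration of the DeletionPolicy bullet above, the sketch below builds a CloudFormation template as a Python dict and serializes it to JSON. The logical IDs DataBucket and AppQueue and the helper retained_resources are hypothetical; only standard template keys (AWSTemplateFormatVersion, Resources, Type, DeletionPolicy) are used.

```python
import json

# Minimal template: the bucket is retained on stack deletion, the queue is not.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Sketch: a bucket that survives stack deletion",
    "Resources": {
        "DataBucket": {  # hypothetical logical ID
            "Type": "AWS::S3::Bucket",
            "DeletionPolicy": "Retain",
        },
        "AppQueue": {  # hypothetical logical ID, deleted with the stack
            "Type": "AWS::SQS::Queue",
        },
    },
}


def retained_resources(template):
    """Return logical IDs of resources that stay behind after stack deletion."""
    return sorted(
        name
        for name, resource in template["Resources"].items()
        if resource.get("DeletionPolicy") == "Retain"
    )


template_body = json.dumps(template, indent=2)
print(template_body)
print(retained_resources(template))
```

Saving the printed JSON to a file and passing it to aws cloudformation deploy with --template-file and --stack-name would be one way to try it; previewing with a change set first follows the workflow described in this domain.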
Domain 4: Security and Compliance
- Attach an IAM role via an instance profile to grant EC2 AWS permissions through temporary, auto-rotated IMDS credentials - never embed long-term access keys in code or config.
- Security Groups are stateful and operate at the instance/ENI level (return traffic auto-allowed); Network ACLs are stateless at the subnet level and require explicit rules for BOTH directions.
- In IAM policy evaluation an explicit Deny always overrides any Allow, regardless of which policy granted the Allow.
- AWS Config continuously records resource configurations and evaluates them against managed/custom rules for compliance and drift, and can trigger SSM Automation remediation.
- Amazon GuardDuty uses ML, anomaly detection, and threat intelligence over CloudTrail events, VPC Flow Logs, and DNS query logs to detect compromised credentials and malicious activity.
- AWS CloudTrail records account API activity (caller identity, timestamp, source IP, parameters): management events by default, with data events such as S3 object-level calls as an opt-in; aws cloudtrail create-trail --is-organization-trail plus aws cloudtrail start-logging enable and activate an org-wide trail.
- Secure an S3 bucket by enabling all four S3 Block Public Access settings (BlockPublicAcls, IgnorePublicAcls, BlockPublicPolicy, RestrictPublicBuckets), enforcing default encryption, and applying least-privilege policies.
- Enforce KMS encryption on a bucket with aws s3api put-bucket-encryption using SSEAlgorithm aws:kms; SSE-KMS with a customer managed key gives you key rotation control and detailed CloudTrail access logging.
- Retrieve temporary cross-account credentials with aws sts assume-role --role-arn --role-session-name; for cross-account S3, attach an IAM role to the instance and scope the target bucket policy to that role.
- Add an inbound SSH rule with aws ec2 authorize-security-group-ingress --group-id --protocol tcp --port 22 --cidr; attach a managed policy with aws iam attach-role-policy --role-name --policy-arn.
- Enforce EBS encryption account-wide by enabling EBS encryption by default in EC2 settings and using an AWS Config rule (encrypted-volumes) to detect non-compliant volumes.
- AWS WAF with AWS-managed rule groups protects web apps from common exploits (SQL injection, XSS) at the ALB, API Gateway, or CloudFront layer.
- Use AWS IAM Identity Center or SAML federation to provide workforce users temporary role-based credentials instead of long-lived IAM users.
- IAM roles eliminate static secrets; combine with permission boundaries and least privilege so each principal can only perform required actions.
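The "explicit Deny always overrides any Allow" rule above can be shown with a toy evaluator. This is a deliberately reduced sketch and not the full IAM evaluation logic: it ignores principals, conditions, permission boundaries, SCPs, and NotAction/NotResource, and it uses shell-style wildcards from fnmatch as a stand-in for IAM's * and ? matching. The bucket name and statements are hypothetical.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    return [value] if isinstance(value, str) else list(value)


def _action_matches(patterns, action):
    # Action names are compared case-insensitively in this sketch.
    return any(fnmatchcase(action.lower(), p.lower()) for p in _as_list(patterns))


def _resource_matches(patterns, resource):
    return any(fnmatchcase(resource, p) for p in _as_list(patterns))


def evaluate(statements, action, resource):
    """Return 'Deny', 'Allow', or 'ImplicitDeny' for one request."""
    allowed = False
    for statement in statements:
        if not (_action_matches(statement["Action"], action)
                and _resource_matches(statement["Resource"], resource)):
            continue
        if statement["Effect"] == "Deny":
            return "Deny"  # explicit Deny wins, whatever else allows it
        allowed = True
    return "Allow" if allowed else "ImplicitDeny"


# Hypothetical statements gathered from all policies that apply to a principal.
statements = [
    {"Effect": "Allow", "Action": "s3:*",
     "Resource": "arn:aws:s3:::example-bucket/*"},
    {"Effect": "Deny", "Action": "s3:DeleteObject",
     "Resource": "arn:aws:s3:::example-bucket/*"},
]
obj = "arn:aws:s3:::example-bucket/report.csv"
print(evaluate(statements, "s3:GetObject", obj))
print(evaluate(statements, "s3:DeleteObject", obj))
print(evaluate(statements, "ec2:StartInstances", "arn:aws:ec2:::instance/x"))
```

Note that the result does not depend on statement order: the Deny is returned as soon as it matches, and an Allow is only reported after every statement has been checked.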
Domain 5: Networking and Content Delivery
- A subnet is public if its route table has a 0.0.0.0/0 (or ::/0) route to an Internet Gateway; a private subnet has no such IGW route.
- Private-subnet instances reach the internet via a NAT gateway placed in a public subnet PLUS a 0.0.0.0/0 route in the private subnet's route table pointing to that NAT gateway.
- Provisioning a NAT gateway requires two steps: aws ec2 allocate-address for an Elastic IP, then aws ec2 create-nat-gateway referencing that EIP and a public subnet.
- Create core VPC plumbing with aws ec2 create-vpc --cidr-block 10.0.0.0/16 and aws ec2 create-route --route-table-id --destination-cidr-block 0.0.0.0/0 --gateway-id.
- Gateway VPC endpoints (S3 and DynamoDB only) route traffic privately at no extra cost; Interface endpoints (PrivateLink) use ENIs and are created with aws ec2 create-vpc-endpoint --vpc-endpoint-type Interface.
- If a PrivateLink interface endpoint's hostname resolves to public IPs, Private DNS is not enabled on the endpoint - enable it so the service's default hostname resolves to private addresses.
- VPC peering requires route table entries in BOTH VPCs pointing the peer's CIDR to the peering connection; missing routes are the most common cause of peering connectivity failures.
- AWS Transit Gateway is a regional hub connecting many VPCs, VPN, and Direct Connect attachments, avoiding the O(n^2) sprawl of full-mesh VPC peering.
- AWS Direct Connect provides dedicated, private, high-bandwidth connectivity with predictable latency, versus a Site-to-Site VPN over the public internet.
- Application Load Balancer (Layer 7) supports content-based routing on host header, URL path, query string, and HTTP method; Network Load Balancer (Layer 4) handles TCP/UDP at ultra-low latency and preserves the client source IP.
- To see the real client IP behind an ALB, read the X-Forwarded-For header; an NLB preserves source IP natively. Create a target group with health checks via aws elbv2 create-target-group --health-check-path /health.
- Amazon CloudFront caches content at 600+ points of presence (edge locations) worldwide to reduce latency; restrict origin access so the origin only accepts requests from CloudFront (OAC) to protect it.
- A Route 53 private hosted zone associated with your VPCs provides private internal DNS resolution for resources not exposed publicly.
- Security Groups vs NACLs is a frequent exam point: SG = stateful, instance-level, allow rules only; NACL = stateless, subnet-level, supports explicit deny and ordered numbered rules.
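The public/private subnet rule at the top of this domain can be expressed as a check over a route table's routes. The sketch below assumes routes shaped like the Routes entries returned by describe-route-tables (keys DestinationCidrBlock, DestinationIpv6CidrBlock, GatewayId, NatGatewayId); the function names and resource IDs are hypothetical, and no AWS call is made.

```python
DEFAULT_ROUTES = {"0.0.0.0/0", "::/0"}


def classify_subnet(routes):
    """Return 'public' if a default route targets an Internet Gateway."""
    for route in routes:
        destination = (route.get("DestinationCidrBlock")
                       or route.get("DestinationIpv6CidrBlock"))
        if (destination in DEFAULT_ROUTES
                and route.get("GatewayId", "").startswith("igw-")):
            return "public"
    return "private"


def has_nat_egress(routes):
    """Return True if the IPv4 default route points at a NAT gateway."""
    return any(
        route.get("DestinationCidrBlock") == "0.0.0.0/0"
        and route.get("NatGatewayId", "").startswith("nat-")
        for route in routes
    )


# Hypothetical route tables; IDs are placeholders.
local = {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"}
public_routes = [local, {"DestinationCidrBlock": "0.0.0.0/0",
                         "GatewayId": "igw-0123456789abcdef0"}]
private_routes = [local, {"DestinationCidrBlock": "0.0.0.0/0",
                          "NatGatewayId": "nat-0123456789abcdef0"}]
isolated_routes = [local]

for name, routes in [("public", public_routes), ("private", private_routes),
                     ("isolated", isolated_routes)]:
    print(name, classify_subnet(routes), has_nat_egress(routes))
```

A private subnet with no NAT route (the "isolated" case) is where gateway or interface endpoints become the only path to AWS services.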
Domain 6: Cost and Performance Optimization
- Right-size over-provisioned instances (consistently low CPU/memory) to a smaller, cheaper instance type to cut hourly cost while keeping adequate performance.
- AWS Compute Optimizer analyzes CloudWatch CPU, memory, network, and disk metrics with ML to recommend right-sizing; fetch results with aws compute-optimizer get-ec2-instance-recommendations.
- Savings Plans and Reserved Instances discount steady usage by up to 72% off On-Demand in exchange for a 1- or 3-year commitment; Compute Savings Plans are the most flexible, applying across instance family, size, and Region, at a somewhat lower maximum discount (up to 66%).
- A zonal Reserved Instance reserves capacity in one specific AZ and only discounts usage in that AZ, unlike a regional RI which is more flexible but provides no capacity reservation.
- Spot Instances offer the deepest discount for fault-tolerant, interruptible batch workloads; request one with aws ec2 run-instances --instance-market-options MarketType=spot.
- AWS Cost Explorer visualizes and forecasts spend across services, accounts, and tags; AWS Budgets sets cost/usage thresholds and alerts via SNS - create one with aws budgets create-budget --account-id with a JSON definition.
- S3 Lifecycle policies transition objects to cheaper classes (Standard-IA, Glacier Instant Retrieval, Glacier Deep Archive) on a schedule; apply with aws s3api put-bucket-lifecycle-configuration.
- S3 Intelligent-Tiering automatically moves objects between access tiers based on usage patterns, avoiding retrieval-fee surprises when access is unpredictable.
- Put Amazon ElastiCache (Redis/Memcached) in front of RDS to cache hot reads, reducing database load and read latency for read-heavy workloads.
- CloudFront caching plus S3 lifecycle rules together cut both data-transfer and storage costs for static assets.
- EBS performance: gp3 lets you provision IOPS and throughput independently of size; on gp2 you must increase volume size to raise baseline IOPS. io1/io2 Provisioned IOPS SSD gives consistently high, low-latency IOPS for critical databases.
- Configure dynamic scaling with a target-tracking policy via aws autoscaling put-scaling-policy --policy-type TargetTrackingScaling and inspect with describe-policies.
- T3/T2 burstable instances earn CPU credits; once credits are exhausted, sustained high CPU is throttled to the baseline in standard mode and incurs surplus-credit charges in unlimited mode - choose a non-burstable family for steady high load.
- Match storage and compute to the workload: use right-sizing and Compute Optimizer for compute, lifecycle/tiering for storage, and caching/CDN to offload origins, then verify savings in Cost Explorer.
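The gp2 versus gp3 point above comes down to simple arithmetic. The sketch below assumes the commonly published figures of 3 baseline IOPS per GiB for gp2 with a 100 IOPS floor and a 16,000 IOPS ceiling, and a 3,000 IOPS baseline for gp3 at any size; these numbers are not in the guide itself and should be checked against current EBS documentation. Function names are hypothetical.

```python
import math

# Assumed figures (verify against current EBS documentation).
GP2_IOPS_PER_GIB = 3
GP2_MIN_IOPS = 100
GP2_MAX_IOPS = 16_000
GP3_BASELINE_IOPS = 3_000


def gp2_baseline_iops(size_gib):
    """Baseline IOPS of a gp2 volume, which scales with its size."""
    return min(max(GP2_IOPS_PER_GIB * size_gib, GP2_MIN_IOPS), GP2_MAX_IOPS)


def gp2_size_for_iops(target_iops):
    """Smallest gp2 size (GiB) whose baseline reaches target_iops."""
    if target_iops > GP2_MAX_IOPS:
        raise ValueError("target exceeds the gp2 ceiling assumed here")
    return max(1, math.ceil(target_iops / GP2_IOPS_PER_GIB))


for size in (20, 500, 6000):
    print(f"gp2 {size} GiB -> {gp2_baseline_iops(size)} baseline IOPS")

# To match the assumed gp3 baseline, gp2 must be grown to this size:
print(gp2_size_for_iops(GP3_BASELINE_IOPS), "GiB")
```

Under these assumptions a small gp2 volume has to be over-provisioned in size just to buy IOPS, which is the cost argument for gp3 on small, busy volumes.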
AWS SOA-C02 exam tips
- The older SOA-C02 format included three exam-lab style scenarios, but the exam is now multiple-choice/multiple-response only - still practice the CLI commands (put-metric-alarm, send-command, create-change-set, assume-role) because exact flags appear in answer options.
- When a question asks for OS-level metrics (memory, disk) on EC2, the answer almost always involves installing the CloudWatch unified agent - default EC2 metrics never include these.
- Distinguish RTO vs RPO and map DR strategies (backup/restore, pilot light, warm standby, multi-site) to cost-versus-recovery-time tradeoffs; the cheapest option that meets the stated RTO/RPO is usually correct.
- For security questions, default to IAM roles over access keys, explicit Deny wins, S3 Block Public Access for exposure, and SSE-KMS with a customer managed key when key control or audit is required.
- Read carefully for 'most cost-effective' versus 'highest performance' - the same scenario has different correct answers (e.g. Spot, Savings Plans, or lifecycle rules for cost; Provisioned IOPS for predictable performance).
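The tip about picking the cheapest DR strategy that still meets the stated RTO/RPO can be written as a small selection function. Everything numeric below is hypothetical: the relative cost ranks and the RTO/RPO minutes are placeholder estimates for illustration only, not AWS figures, and in a real question they come from the scenario text.

```python
from typing import NamedTuple, Optional


class Strategy(NamedTuple):
    name: str
    relative_cost: int   # lower is cheaper (hypothetical ranking)
    rto_minutes: float   # estimated time to restore service
    rpo_minutes: float   # estimated worst-case data loss window


def cheapest_meeting(strategies, rto_minutes, rpo_minutes) -> Optional[Strategy]:
    """Return the cheapest strategy whose RTO and RPO fit the requirement."""
    candidates = [
        s for s in strategies
        if s.rto_minutes <= rto_minutes and s.rpo_minutes <= rpo_minutes
    ]
    return min(candidates, key=lambda s: s.relative_cost, default=None)


# Placeholder estimates, ordered from cheapest to most expensive.
options = [
    Strategy("backup/restore", 1, 24 * 60, 24 * 60),
    Strategy("pilot light", 2, 60, 15),
    Strategy("warm standby", 3, 15, 5),
    Strategy("multi-site", 4, 1, 1),
]

print(cheapest_meeting(options, rto_minutes=240, rpo_minutes=60))
print(cheapest_meeting(options, rto_minutes=30, rpo_minutes=10))
print(cheapest_meeting(options, rto_minutes=0.5, rpo_minutes=0.5))
```

The third call returns None, which mirrors a scenario whose requirements no listed option satisfies; on the exam that usually means re-reading the stated RTO/RPO.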
Study guide FAQ
How many questions are on the SOA-C02 exam and what is the passing score?
The exam contains 65 questions, some of which are unscored, answered in 180 minutes, and you need a scaled score of 720 out of 1000 to pass. CertGrid's bank for this exam is much larger so you can practice broadly.
Which domain carries the most weight, and where should I focus?
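By the official exam guide weighting, Monitoring, Logging, and Remediation is the heaviest domain (20%), followed by Deployment, Provisioning, and Automation and Networking and Content Delivery (18% each), Reliability and Business Continuity and Security and Compliance (16% each), and Cost and Performance Optimization (12%). Make sure you are fluent in CloudWatch monitoring, IAM roles, VPC routing (NAT/IGW/endpoints), and EBS/instance right-sizing.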
Do I really need to memorize CLI commands and flags?
Yes. Many questions present CLI snippets and ask which is correct or what flag is missing. Know the key calls and their required flags: put-metric-alarm, put-metric-data, ssm send-command/put-parameter, cloudformation deploy/create-change-set, ec2 create-snapshot, and s3api put-public-access-block.
How much hands-on experience is recommended before taking it?
AWS recommends at least one year operating, managing, and deploying workloads on AWS. The exam rewards practical operational knowledge - alarms, automation, failover, patching, and troubleshooting - far more than memorized definitions.