GCP Associate Cloud Engineer Study Guide
The Google Cloud Associate Cloud Engineer (ACE) exam validates the hands-on ability to deploy applications, monitor operations, and manage enterprise solutions on Google Cloud, with heavy emphasis on the gcloud CLI, the Console, IAM, and core compute, storage, and networking services. It is a 2-hour, multiple-choice and multiple-select exam (passing score 700 on a scaled basis) aimed at engineers and administrators with at least 6 months of hands-on Google Cloud experience. Expect scenario-based questions that ask you to pick the most cost-effective, secure, and operationally sound option rather than to recall definitions.
Domain 1: Setting Up a Cloud Solution Environment
- The resource hierarchy is Organization > Folders > Projects > Resources; IAM policies set at a higher level are inherited by everything below, and child-level policies are additive (you cannot remove an inherited grant lower down).
- An Organization node requires a Cloud Identity or Google Workspace account tied to a verified domain; it is created automatically the first time such a domain user signs in to Google Cloud.
- Use Organization Policy constraints (e.g. constraints/gcp.resourceLocations) for centralized governance that applies regardless of who holds which IAM role; the Resource Locations constraint can restrict resource creation to specific regions such as us-east1 and us-central1.
- constraints/compute.trustedImageProjects restricts which projects' images can be used to create boot disks; constraints/storage.uniformBucketLevelAccess and the public-access-prevention constraint help lock down Cloud Storage.
- Each project has a globally unique, immutable Project ID, a mutable Project Name, and a system-assigned Project Number; billing must be linked to a project before most resources can be created.
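- Creating a billing account requires the Billing Account Creator role and managing one requires Billing Account Administrator; to associate a project with an existing billing account you need the Billing Account User role on the billing account plus project-level billing permissions (e.g. Project Billing Manager).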
- Set budgets and alerts under Billing > Budgets & alerts; budgets do NOT cap spend by default. To actually stop spend, publish budget notifications to a Pub/Sub topic that triggers a Cloud Function which disables billing on the project.
- Enable billing export to BigQuery in the billing account settings, choose a destination dataset, and grant analysts the BigQuery Data Viewer role on that dataset for detailed cost analysis.
- Use the Google Cloud Pricing Calculator to estimate costs before deployment; it is the recommended pre-purchase estimation tool.
- APIs must be enabled per project before use: gcloud services enable compute.googleapis.com (or container.googleapis.com, etc.); to enable an API across many projects, loop over the project IDs in a script and run gcloud services enable with --project for each one.
- gcloud named configurations let you store separate project/account/region settings: gcloud config configurations create <name>, then gcloud config configurations activate <name> to switch contexts.
- Cloud Shell is a free, ephemeral Debian VM with the SDK preinstalled and 5 GB of persistent $HOME storage; the VM itself is reclaimed after about 20 minutes of inactivity (non-home data is lost).
- If gcloud is 'command not found' after installing the Cloud SDK, the SDK's bin directory has not been added to the system PATH environment variable.
- Enforce account security org-wide by turning on 2-Step Verification in the Google Workspace / Cloud Identity Admin console; use the Cloud Foundation Toolkit (Terraform modules) to bootstrap a best-practice organization.
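The budget-notification pattern described in this domain (budget alert, then Pub/Sub, then a function that disables billing) can be sketched as follows. This is a minimal illustration, not a production function: it assumes the notification payload is base64-encoded JSON carrying costAmount and budgetAmount fields, and disable_billing is a hypothetical callable standing in for the real Cloud Billing API call that detaches the billing account from the project.

```python
import base64
import json


def parse_budget_notification(event: dict) -> dict:
    """Decode the base64-encoded JSON payload of a Pub/Sub-style event."""
    return json.loads(base64.b64decode(event["data"]).decode("utf-8"))


def should_disable_billing(event: dict) -> bool:
    """Return True once the reported cost has reached the budget amount."""
    payload = parse_budget_notification(event)
    # Assumed field names: costAmount and budgetAmount.
    return payload["costAmount"] >= payload["budgetAmount"]


def handle_budget_event(event: dict, disable_billing) -> str:
    """Entry-point sketch: call the injected disable_billing() only when over budget.

    disable_billing is a hypothetical callable; a real function would call
    the Cloud Billing API here instead.
    """
    if should_disable_billing(event):
        disable_billing()
        return "billing disabled"
    return "within budget"


def make_event(cost: float, budget: float) -> dict:
    """Build a fake event for local experiments (not part of any Google API)."""
    body = json.dumps({"costAmount": cost, "budgetAmount": budget})
    return {"data": base64.b64encode(body.encode("utf-8")).decode("ascii")}
```

Injecting disable_billing as a parameter keeps the decision logic testable locally without touching a real billing account.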
Domain 2: Planning and Configuring a Cloud Solution
- Committed Use Discounts (CUDs) give 1- or 3-year discounts for steady baseline capacity; combine CUDs for baseline with on-demand or autoscaling for burst. Sustained Use Discounts apply automatically to long-running Compute Engine usage.
- Spot VMs (the successor to Preemptible VMs) are deeply discounted but can be reclaimed at any time with a 30-second warning; use them for fault-tolerant, checkpoint-able batch workloads, not stateful services.
- Cloud Storage classes by access frequency: Standard (hot), Nearline (>=30-day min, monthly access), Coldline (>=90-day min, quarterly), Archive (>=365-day min); Object Lifecycle Management automates transitions and deletions.
- Pick managed databases by shape: Cloud SQL (managed MySQL/PostgreSQL/SQL Server, regional, ~64 TB) for relational OLTP; Spanner for global, horizontally scalable, strongly consistent relational; Bigtable for high-throughput wide-column NoSQL; Firestore (Native mode) for document data with real-time listeners and offline support.
- BigQuery is the serverless, fully managed data warehouse for large-scale SQL analytics; you pay for storage plus query bytes scanned (or via slot/capacity pricing).
- Streaming analytics reference pattern: ingest with Pub/Sub, process with Dataflow (Apache Beam), and land results in BigQuery for dashboards; use Dataproc (managed Spark/Hadoop) with autoscaling for existing Spark jobs.
- Compute choices: Compute Engine (full VM control), GKE (containers/orchestration), Cloud Run (serverless stateless containers, scales to zero), Cloud Functions 2nd gen (event-driven, pay-per-invocation), App Engine (managed app platform).
- GKE Autopilot fully manages nodes, scaling, and patching and bills for the CPU, memory, and storage that running pods request; GKE Standard gives node-pool control (needed for custom machine types, GPUs, or special node config).
- Cloud SQL high availability uses a synchronously replicated standby in a second zone within the same region for automatic failover; read replicas serve read scaling, not HA.
- For high availability on Compute Engine, use a regional Managed Instance Group spanning multiple zones behind a load balancer with health checks and autoscaling; MIGs provide autohealing and rolling updates.
- Host a static website on a Cloud Storage bucket and put Cloud CDN (via an external HTTP(S) Load Balancer) in front for global edge caching; use multi-region or dual-region buckets for geo-redundancy.
- Dual-region buckets with turbo replication target a 15-minute RPO for cross-region replication of critical data (replication is still asynchronous).
- For ML/HPC, use accelerator-optimized A2/A3 machine types with attached NVIDIA GPUs (NVLink for high-bandwidth GPU-to-GPU); choose appropriate machine families (E2 cost-optimized, N2/N2D general, C2 compute-optimized, M-series memory-optimized).
- Cloud Run supports gradual rollout via traffic splitting across revisions (e.g. 90/10) for canary and blue/green deployments.
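The storage-class rule of thumb from this domain can be written as a small lookup. The sketch below is a simplification that maps an expected access interval onto the minimum storage durations listed above (30, 90, and 365 days); the function name and the idea of using the access interval as the only input are assumptions for illustration, and a real decision would also weigh retrieval fees and data size.

```python
# (storage class, minimum storage duration in days), coldest class first.
STORAGE_CLASSES = [
    ("ARCHIVE", 365),
    ("COLDLINE", 90),
    ("NEARLINE", 30),
    ("STANDARD", 0),
]


def pick_storage_class(days_between_accesses: int) -> str:
    """Pick the coldest class whose minimum duration fits the access interval."""
    if days_between_accesses < 0:
        raise ValueError("days_between_accesses must be non-negative")
    for name, min_days in STORAGE_CLASSES:
        if days_between_accesses >= min_days:
            return name
    # Unreachable for non-negative input because STANDARD has a 0-day minimum.
    raise AssertionError("no storage class matched")
```

For example, data read roughly every 45 days maps to NEARLINE, while data read weekly stays in STANDARD.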
Domain 3: Deploying and Implementing a Cloud Solution
- Create Compute Engine VMs with gcloud compute instances create; build reusable instance templates and back MIGs with them, enabling autohealing via a health check plus an autoscaler.
- Deploy serverless containers with gcloud run deploy; set the ingress flag to 'internal' to restrict access, and use environment variables (Console, gcloud, or YAML) for configuration.
- Cloud Run and Cloud Functions reach private VPC resources (such as a Cloud SQL private IP or internal services) through a Serverless VPC Access connector.
- Store and serve container images from Artifact Registry (the successor to Container Registry); create a Docker-format repository and push images there. App Engine applications deploy with gcloud app deploy.
- Cloud Build automates CI/CD: define steps in cloudbuild.yaml and create a trigger (e.g. on push to the main branch) to build, test, and deploy automatically.
- Event triggers: a Cloud Storage trigger on object finalize (a new or overwritten object) uses the event type google.cloud.storage.object.v1.finalized; a Pub/Sub trigger invokes a function when a message is published to a named topic.
- GKE Service types: ClusterIP (internal only), NodePort (exposes on each node's port, often behind another LB), LoadBalancer (provisions an external/internal LB); Ingress provides L7 HTTP(S) routing.
- Use readiness probes so traffic only reaches healthy pods and liveness probes to restart hung containers; set resource requests/limits to guide scheduling and the cluster autoscaler.
- Networking: a custom-mode VPC lets you define subnets per region (each subnet has a regional CIDR); VPC is global while subnets are regional. Use Cloud NAT to give private (no external IP) instances outbound internet access.
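- Control east-west traffic with network tags or service accounts referenced in firewall rules; every VPC network has implied rules that deny all ingress and allow all egress (the auto-created default network additionally ships with allow rules for SSH, RDP, ICMP, and internal traffic), and rules are evaluated by priority (lower number wins).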
- Connect App Engine Standard to Cloud SQL via the built-in Unix-socket Cloud SQL proxy by specifying the instance connection name in app.yaml; from GKE/Compute use the Cloud SQL Auth Proxy or private IP.
- Store Terraform state remotely in a versioned Cloud Storage bucket using the GCS backend (which provides state locking); use workspaces or separate var files to manage per-environment or per-developer stacks.
- Hybrid connectivity: Cloud VPN for encrypted tunnels over the internet; Dedicated Interconnect (10/100 Gbps) or Partner Interconnect for private, high-bandwidth on-prem links.
- Load data into BigQuery with bq load (supports wildcard GCS URIs to match many files); define datasets/tables via Terraform (google_bigquery_dataset) including location and default table expiration.
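The firewall evaluation order described in this domain (lower priority number wins, with an implied deny for ingress) can be modeled in a few lines. This is a deliberately simplified sketch: the IngressRule class and evaluate function are hypothetical names, each rule matches a single port and source CIDR only, and real VPC rules also cover protocols, port ranges, service accounts, and firewall policies.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class IngressRule:
    name: str
    priority: int                      # 0-65535; lower number = higher precedence
    action: str                        # "allow" or "deny"
    source_range: str                  # CIDR block, e.g. "0.0.0.0/0"
    port: int
    target_tag: Optional[str] = None   # None means the rule applies to all instances


def evaluate(rules: Iterable[IngressRule], src_ip: str, port: int,
             instance_tags: Set[str]) -> str:
    """Return "allow" or "deny" for an inbound connection attempt."""
    matching = [
        r for r in rules
        if r.port == port
        and ip_address(src_ip) in ip_network(r.source_range)
        and (r.target_tag is None or r.target_tag in instance_tags)
    ]
    if not matching:
        return "deny"  # implied deny-all-ingress rule
    # Lowest priority number wins; at equal priority a deny beats an allow.
    winner = min(matching, key=lambda r: (r.priority, 0 if r.action == "deny" else 1))
    return winner.action
```

With an allow rule for a tagged bastion at priority 1000 and a broad deny at priority 2000, only traffic matching the allow rule's source range and tag gets through; everything else falls to the deny rule or the implied deny.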
Domain 4: Ensuring Successful Operation of a Cloud Solution
- Cloud Monitoring shows resource dashboards (CPU, memory, disk I/O) and hosts alerting policies; install the Ops Agent on VMs to collect memory and detailed disk metrics that are not available by default.
- Cloud Logging centralizes logs; filter by resource type and labels (e.g. resource type dataflow_step plus a Dataflow job ID) and route logs with sinks to Cloud Storage, BigQuery, or Pub/Sub.
- Create log-based metrics from log patterns (e.g. error counts) and attach alerting policies to them; deliver alerts through notification channels such as email, SMS, Slack, or PagerDuty.
- Cloud Monitoring uptime checks probe public HTTP/HTTPS/TCP endpoints from multiple global locations and can trigger alerts when the endpoint is unreachable or returns errors.
- The _Default and _Required log sinks/buckets exist by default; reduce cost and noise by adding exclusion filters (e.g. drop debug logs from non-prod) on the _Default sink. _Required logs cannot be excluded.
- Debug boot/connectivity problems by viewing serial port output: gcloud compute instances get-serial-port-output, or enable interactive serial console access.
- Compute Engine snapshots are incremental and crash-consistent and can be taken while the instance is running; schedule them with snapshot schedules for automated backups.
- Cloud SQL data protection: enable automated daily backups plus point-in-time recovery (PITR, which requires binary/transaction logging); to upgrade a major version safely, clone the instance, upgrade and test the clone, then upgrade production.
- Cluster autoscaling: enable the GKE cluster autoscaler on a node pool with min/max node counts; pair it with the Horizontal Pod Autoscaler for pod-level scaling based on metrics.
- Use Compute Engine instance schedules (start/stop policies) to power VMs on at 8 AM and off at 6 PM to cut cost for non-24x7 workloads.
- Cloud SQL Query Insights surfaces top resource-consuming queries and query latency for relational performance tuning; GKE usage metering exports namespace-level CPU/memory consumption to BigQuery for chargeback.
- Tune Cloud Run scaling and timeouts: raise max instances, adjust per-instance concurrency, increase the request timeout (up to 60 minutes for Cloud Run), and add CPU/memory for heavy requests.
- Automate object cleanup with Object Lifecycle Management Delete actions (e.g. delete objects older than 365 days) or class transitions, instead of manual deletion.
- Move very large datasets with the Transfer Appliance (ship physical hardware) for petabyte/offline migrations, or Storage Transfer Service for online transfers.
- Detect and fix Terraform drift with terraform plan, then terraform apply.
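The Object Lifecycle Management rules mentioned in this domain are expressed as a JSON document attached to the bucket. The sketch below generates one such configuration with a class transition followed by a delete; the helper name and its parameters are hypothetical, and the specific ages are only examples.

```python
import json


def lifecycle_config(nearline_after_days: int, delete_after_days: int) -> dict:
    """Build a lifecycle configuration: move to NEARLINE, then delete."""
    if nearline_after_days >= delete_after_days:
        raise ValueError("the transition must happen before the delete")
    return {
        "rule": [
            {
                # Transition objects to a colder class once they reach this age.
                "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
                "condition": {"age": nearline_after_days},
            },
            {
                # Delete objects once they reach this age (in days).
                "action": {"type": "Delete"},
                "condition": {"age": delete_after_days},
            },
        ]
    }


if __name__ == "__main__":
    # Print the JSON so it can be saved to a file.
    print(json.dumps(lifecycle_config(30, 365), indent=2))
```

The printed JSON could be saved to a file and applied with gcloud storage buckets update gs://BUCKET_NAME --lifecycle-file=FILE, where BUCKET_NAME and FILE are placeholders.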
Domain 5: Configuring Access and Security
- An IAM policy binds members (users, groups, service accounts, domains) to roles on a resource; prefer granting roles to Google Groups, then manage membership in the group, and apply at folder/project level for inheritance.
- Role types: basic (Owner/Editor/Viewer - too broad, avoid in production), predefined (service-scoped, recommended), and custom (least-privilege tailored sets). Follow least privilege and use predefined/custom over basic.
- Service accounts are both an identity and a resource: attach a custom service account with only the needed roles to a VM and use the metadata server for credentials instead of downloading and storing key files.
- IAM Conditions add context to a grant - e.g. an expiry condition gives a contractor access that automatically lapses after 30 days; conditions can also restrict by resource name or request attributes.
- Use IAM Recommender to find and remove excess/unused permissions, and disable or delete service accounts unused for 90+ days after confirming they are not needed.
- Workload Identity is the recommended way for GKE pods to call Google APIs: bind a Kubernetes service account to a Google service account so pods get credentials without node-stored keys.
- Store secrets in Secret Manager (versioned, IAM-controlled) instead of in code or env files; mount them into GKE via the Secret Manager CSI driver; runtime service accounts read secrets they are authorized for.
- Encryption at rest is automatic with Google-managed keys; for control over key rotation/lifecycle use Customer-Managed Encryption Keys (CMEK) in Cloud KMS, settable as the default key for BigQuery datasets, Cloud Storage buckets, and disks.
- VPC Service Controls create a service perimeter around managed services (e.g. BigQuery, Cloud Storage) to prevent data exfiltration - even an identity with valid IAM cannot move data across the perimeter boundary.
- Shared VPC lets a host project share subnets with service projects; grant a service-project user the Compute Network User role (roles/compute.networkUser) on the specific subnet to let them use it.
- Audit logs: Admin Activity logs (always on, no charge) record config/metadata changes; Data Access logs (mostly off by default, can be enabled) record reads/writes of data; both go to Cloud Logging.
- Secure service-to-service calls: make a Cloud Run service require authentication and grant the caller's service account the Cloud Run Invoker role (roles/run.invoker) so it presents an identity token; use Anthos Service Mesh for automatic mTLS between services.
- Create firewall rules with gcloud compute firewall-rules create, scoping with --target-tags and --source-ranges; opening SSH to 0.0.0.0/0 is risky - prefer Identity-Aware Proxy (IAP) TCP forwarding for SSH without external IPs.
- Container security: enable Artifact Registry/Container Analysis vulnerability scanning so images are scanned on push, and use Binary Authorization to allow only attested images to deploy.
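IAM inheritance, as described in this domain and in the resource hierarchy notes, amounts to taking the union of the bindings at every level above a resource. The sketch below models that with plain dictionaries shaped like an IAM policy's bindings list; the effective_roles helper is hypothetical, and it deliberately ignores IAM Conditions and deny policies.

```python
from typing import Iterable, List, Set


def effective_roles(principal: str, groups: Iterable[str],
                    policies: List[dict]) -> Set[str]:
    """Union the roles granted to a principal across a chain of policies.

    principal: a member string such as "user:alice@example.com".
    groups:    email addresses of the groups the principal belongs to.
    policies:  policy dicts ordered organization -> folder -> project.
    """
    identities = {principal} | {f"group:{g}" for g in groups}
    roles: Set[str] = set()
    for policy in policies:
        for binding in policy.get("bindings", []):
            # A binding applies if it names the principal or one of its groups.
            if identities & set(binding["members"]):
                roles.add(binding["role"])
    return roles
```

Because grants are additive, a role bound to a group at the organization level still appears in the result for a project further down, which is why group-based grants at the folder or project level are easier to reason about than scattered per-user bindings.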
GCP Associate Cloud Engineer exam tips
- Read for the qualifier words - 'most cost-effective', 'least privilege', 'minimal operational overhead', 'fastest', 'highest availability' - these usually decide between two technically valid answers.
- Know the gcloud command structure cold (gcloud <group> <resource> <verb> --flags) and recognize the most common ones for compute instances, IAM, projects/config, run, container, and storage (gsutil/gcloud storage).
- Default to managed and serverless options (Cloud Run, Cloud Functions, GKE Autopilot, Cloud SQL, BigQuery) when the scenario stresses low ops burden, and reserve self-managed Compute Engine/GKE Standard for custom-control requirements.
- Memorize the resource hierarchy and IAM inheritance, the difference between organization policy constraints and IAM, and the storage-class minimum durations (Nearline 30d, Coldline 90d, Archive 365d).
- Manage your 120 minutes: flag and skip scenario-heavy questions on the first pass, answer the quick recall items, then return - and never leave a question blank since there is no penalty for guessing.
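As a memory aid for the gcloud <group> <resource> <verb> --flags shape, the sketch below assembles command strings from their parts. The gcloud_command helper is hypothetical and only formats text; it does not run gcloud or validate that a flag exists for a given command.

```python
import shlex
from typing import Sequence


def gcloud_command(group_path: Sequence[str], verb: str,
                   *positionals: str, **flags) -> str:
    """Format a gcloud command line from its group path, verb, args, and flags."""
    argv = ["gcloud", *group_path, verb, *positionals]
    for key, value in flags.items():
        flag = "--" + key.replace("_", "-")  # machine_type -> --machine-type
        if value is True:
            argv.append(flag)                # boolean switch with no value
        else:
            argv.append(f"{flag}={value}")
    return shlex.join(argv)                  # quotes arguments when needed
```

For instance, gcloud_command(["compute", "instances"], "create", "web-1", zone="us-central1-a", machine_type="e2-medium") yields a compute instances create command with --zone and --machine-type flags; the instance name and values here are examples only.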
Study guide FAQ
How is the Associate Cloud Engineer exam scored and what is passing?
It is reported on a scaled basis with a passing score of 700; you get a simple pass/fail result, not a percentage breakdown. The exam is about 50-60 multiple-choice and multiple-select questions in 120 minutes, and there is no penalty for wrong answers, so answer every question.
How much hands-on experience do I need before attempting it?
Google recommends at least 6 months of hands-on experience building and managing solutions on Google Cloud. The exam is practical, so the best preparation is actually using the gcloud CLI, Cloud Console, and core services (Compute Engine, GKE, Cloud Run, Cloud Storage, IAM, VPC) rather than memorizing facts alone.
Should I focus on the gcloud CLI or the Cloud Console?
Both, but the gcloud CLI is heavily tested - you must recognize correct command syntax, flags, and config/configuration switching. Console-based steps (billing, budgets, IAM grants, monitoring setup) also appear, so practice each task in both interfaces to know where settings live and what the equivalent command is.
What topics carry the most weight on the exam?
Deploying and implementing solutions (Domain 3) is the largest area, followed by planning/configuring and ensuring successful operation. Across all domains, IAM and security, the resource hierarchy, networking (VPC, firewall rules, load balancing), and choosing the right compute/storage/database service for a scenario are the recurring high-value themes.