CertGrid

Google Cloud Professional Data Engineer Study Guide

The Google Cloud Professional Data Engineer exam validates your ability to design data processing systems, build and operationalize batch and streaming pipelines, store and prepare data for analysis, and enable machine learning on Google Cloud. It targets practitioners who design, build, secure, and maintain data systems. The exam runs 120 minutes and is reported as pass or fail; Google does not publish a passing score, though roughly 70% correct is a commonly cited unofficial target. Expect heavily scenario-based questions that ask you to pick the most cost-effective, scalable, and operationally sound service for a given access pattern, latency, and consistency requirement.

Domain 1: Designing Data Processing Systems

Key concepts you must know · 149 practice questions

Domain 2: Ingesting and Processing the Data

Key concepts you must know · 189 practice questions

Domain 3: Storing the Data

Key concepts you must know · 151 practice questions

Domain 4: Preparing and Using Data for Analysis

Key concepts you must know · 143 practice questions

Domain 5: Maintaining and Automating Data Workloads

Key concepts you must know · 173 practice questions

Google Cloud Professional Data Engineer exam tips

Study guide FAQ

How long is the exam and what score do I need to pass?

You have 120 minutes to answer roughly 50-60 multiple-choice and multiple-select questions. Google reports the result as pass or fail and does not publish an official passing score; roughly 70% correct is a commonly cited unofficial target. There is no penalty for wrong answers, so answer every question.
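For pacing, the 120-minute limit and the 50-60 question range translate into a per-question time budget. The short Python sketch below only does that arithmetic; the function name is illustrative and not part of any official tool.

```python
def minutes_per_question(total_minutes: float, question_count: int) -> float:
    """Return the average time available per question, in minutes."""
    if question_count <= 0:
        raise ValueError("question_count must be positive")
    return total_minutes / question_count


# Budget at both ends of the 50-60 question range quoted above.
shortest_exam = minutes_per_question(120, 50)
longest_exam = minutes_per_question(120, 60)
print(f"{longest_exam:.1f} to {shortest_exam:.1f} minutes per question")
```

In practice, plan on about two minutes per question, which leaves little slack for long scenario prompts, so flag slow questions and return to them at the end.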

How much hands-on Google Cloud experience does Google recommend before taking it?

Google recommends 3 or more years of industry experience, including at least 1 year designing and managing solutions on Google Cloud. The exam is deeply scenario-based, so practical familiarity with BigQuery, Dataflow, Pub/Sub, Bigtable, and Dataproc matters far more than rote memorization.

Do I need to know machine learning and BigQuery ML in depth?

You need a working understanding rather than deep ML expertise. Know when to use BigQuery ML (SQL-based models like linear/logistic regression and forecasting), Vertex AI for custom and managed ML, and pre-trained APIs. Expect questions on choosing the right tool and on preventing issues like overfitting and data leakage rather than deriving algorithms.
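To make "SQL-based models" concrete, here is a minimal Python sketch that assembles a BigQuery ML `CREATE MODEL` statement as text. The helper name and the `demo.churn_model`, `demo.customers`, and `churned` identifiers are hypothetical, and the sketch only builds the SQL string; running it requires submitting the statement to BigQuery yourself.

```python
# Model types covered by this sketch; BigQuery ML supports others as well.
SUPPORTED_MODEL_TYPES = {"LINEAR_REG", "LOGISTIC_REG"}


def create_model_sql(model: str, model_type: str, label_col: str, source_table: str) -> str:
    """Build a BigQuery ML CREATE MODEL statement (text only, nothing is executed)."""
    if model_type not in SUPPORTED_MODEL_TYPES:
        raise ValueError(f"unsupported model_type for this sketch: {model_type}")
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = '{model_type}', input_label_cols = ['{label_col}']) AS\n"
        f"SELECT * FROM `{source_table}`"
    )


# Hypothetical dataset, table, and column names, for illustration only.
print(create_model_sql("demo.churn_model", "LOGISTIC_REG", "churned", "demo.customers"))
```

The point for the exam is that training happens in the warehouse with a single SQL statement, which is why BigQuery ML is usually the right answer when the data already lives in BigQuery and a standard model type is enough.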

How current is the exam, and does it still cover legacy services?

The exam tracks current Google Cloud services and naming, so expect Dataplex, BigLake, Datastream, Dataform, the BigQuery Storage Write and Read APIs, and BigQuery Editions slot reservations. Older names and interfaces have been replaced: Data Studio is now Looker Studio, and legacy streaming inserts are superseded by the Storage Write API. Favor the modern service in answers unless a question explicitly constrains otherwise.