DP-100: Azure Data Scientist Associate Practice Exam Questions

What the DP-100 exam covers

Design and Prepare a Machine Learning Solution: 167 questions
Explore Data and Train Models: 165 questions
Prepare a Model for Deployment: 161 questions
Deploy and Retrain a Model: 164 questions

Free DP-100 sample questions

A sample of 10 questions with answers and explanations. Sign up free to practice all 657.

Question 1: Design and Prepare a Machine Learning Solution

Which Azure service is the primary platform for end-to-end machine learning (training, deployment, MLOps)?
- A. Azure Machine Learning (Correct)
- B. Azure Databricks
- C. Azure AI Foundry
- D. Azure Synapse Analytics
✓ Correct answer: A

Azure Machine Learning is the comprehensive, first-party platform from Microsoft designed to support the entire end-to-end machine learning lifecycle. It provides integrated tools for data preparation, model training (with managed compute clusters), experiment tracking, model registration and versioning, and production deployment through endpoints and MLOps capabilities. The workspace serves as the central hub that connects all these components together, enabling teams to collaborate effectively on ML projects at scale.
Why the other options are wrong
- B. Azure Databricks is a Spark-based analytics and data engineering platform; it supports ML but is not Microsoft's dedicated first-party end-to-end ML platform with native model registry, endpoints, and MLOps.
- C. Azure AI Foundry focuses on building generative AI apps and agents over foundation models, not the classic end-to-end training/deployment/MLOps lifecycle.
- D. Azure Synapse Analytics is an enterprise data warehousing and big-data analytics service, not a purpose-built ML training and deployment platform.

Question 2: Explore Data and Train Models

You trained a classification model and need to evaluate it on imbalanced data where the positive class represents only 5% of samples. Solution: You evaluate using the F1 score as the primary metric. Does this meet the goal?
- A. No
- B. Yes (Correct)
✓ Correct answer: B

Yes, this solution meets the goal. When the positive class is only 5% of samples, a model that always predicts the majority class reaches 95% accuracy while finding no positives, so accuracy is a misleading primary metric. F1 is the harmonic mean of precision and recall for the positive class: it penalizes false positives and false negatives alike and is high only when both precision and recall are high. Using F1 as the primary metric therefore keeps the evaluation focused on minority-class performance rather than on overall accuracy, which is dominated by the majority class. This is a common practice for evaluating imbalanced classification.
Why the other options are wrong
- A. No is incorrect. F1 balances precision and recall on the minority class, so it is a suitable primary metric when positives make up only 5% of samples.
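
To make the contrast between accuracy and F1 concrete, here is a minimal sketch in plain Python. The label lists and the two "models" (always_negative and useful_model) are hypothetical illustrations, not data from the exam or from Azure ML.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, f1) for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical data: 100 samples, 5 of them positive (5%).
y_true = [1] * 5 + [0] * 95

# Hypothetical model 1: always predicts the majority (negative) class.
always_negative = [0] * 100

# Hypothetical model 2: finds 4 of the 5 positives and raises 2 false alarms.
useful_model = [1, 1, 1, 1, 0] + [1, 1] + [0] * 93

for name, preds in [("always_negative", always_negative),
                    ("useful_model", useful_model)]:
    _, _, f1 = precision_recall_f1(y_true, preds)
    print(f"{name}: accuracy={accuracy(y_true, preds):.2f}, f1={f1:.3f}")
```

In this sketch the majority-class predictor scores 0.95 accuracy but an F1 of 0, while the second model scores 0.97 accuracy and an F1 of about 0.727, so F1 separates the two far more clearly than accuracy does.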

Question 3: Deploy and Retrain a Model

Which command deletes an online endpoint named 'my-endpoint' (and its deployments) without an interactive prompt?
- A. az ml online-endpoint delete --name my-endpoint --yes (Correct)
- B. az ml endpoint delete my-endpoint --confirm
- C. az ml online-endpoint remove --name my-endpoint --force
- D. az ml online-endpoint destroy -n my-endpoint --no-prompt
✓ Correct answer: A

The --yes flag suppresses the interactive confirmation prompt that would normally pause execution. The Azure CLI requires either --yes or explicit user confirmation before destructive operations on online endpoints and their associated deployments, so a deletion never proceeds without one or the other and automated scripts must opt in explicitly.
Why the other options are wrong
- B. The endpoint command variant shown does not exist; the correct command group is online-endpoint, and --confirm is not a recognized flag for deletion operations in the Azure ML CLI.
- C. remove is not the Azure ML CLI command for deleting online endpoints; delete is the correct command, and --force is not the flag that suppresses the confirmation prompt.
- D. destroy is not a valid Azure ML CLI command for endpoints, and --no-prompt is not a recognized flag; the standard flag to bypass confirmation is --yes.
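
For scripted cleanup, the non-interactive delete can be assembled as an argument list before it is handed to a process runner. The following is a minimal sketch, assuming the Azure CLI with the ml extension is installed. The helper name build_endpoint_delete_command and the values my-rg and my-workspace are hypothetical, and the sketch only builds and prints the command rather than running it.

```python
import shlex


def build_endpoint_delete_command(endpoint_name, resource_group, workspace_name):
    """Build the argv list for a non-interactive online endpoint delete.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    return [
        "az", "ml", "online-endpoint", "delete",
        "--name", endpoint_name,
        "--resource-group", resource_group,
        "--workspace-name", workspace_name,
        "--yes",  # skip the interactive confirmation prompt
    ]


# Placeholder resource group and workspace names (assumptions).
cmd = build_endpoint_delete_command("my-endpoint", "my-rg", "my-workspace")
print(shlex.join(cmd))

# To actually execute it, something like the following could be used:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a single shell string avoids quoting problems when an endpoint or workspace name contains unusual characters.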

Question 4: Deploy and Retrain a Model

Where are the platform metrics (request count, latency, CPU/GPU/memory utilization) of an Azure ML online endpoint surfaced for charting and alerting?
- A. Application Insights traces only
- B. Azure Monitor metrics (Correct)
- C. A Log Analytics saved KQL query
- D. The workspace activity log
✓ Correct answer: B

Platform metrics for an Azure ML online endpoint, such as request count, latency, and CPU/GPU/memory utilization, are surfaced through Azure Monitor metrics, where you can chart them and configure metric alerts. Azure Monitor is the central metrics platform for Azure resources, including online endpoints.
Why the other options are wrong
- A. Application Insights traces capture request-level telemetry and exceptions, but the platform metrics like request count, latency, and utilization are surfaced through Azure Monitor metrics for charting and alerting.
- C. A saved KQL query runs over logs in Log Analytics; the numeric platform metrics themselves are exposed through Azure Monitor metrics, not stored as a query.
- D. The workspace activity log records management operations, not the endpoint's request-count, latency, and CPU/GPU/memory metrics.

Question 5: Design and Prepare a Machine Learning Solution

A training script fails with 'AuthorizationFailure' when reading from a storage account via a datastore, even though the data exists. The compute uses a managed identity. What is the most likely fix?
- A. Recreate the failing data asset again under a brand-new name and then simply submit the job again
- B. Increase the backing compute cluster's per-node VM size setting for the training job
- C. Grant the compute's managed identity the Storage Blob Data Reader role on the account/container (Correct)
- D. Disable MLflow run tracking entirely for the failing training job on the cluster
✓ Correct answer: C

An 'AuthorizationFailure' when reading existing data via an identity-based datastore means the compute's managed identity lacks data-plane permission on the storage. Granting it the Storage Blob Data Reader role (data-plane RBAC) on the account or container lets it read the blobs. The data exists; the fix is the missing role assignment.
Why the other options are wrong
- A. Recreating the asset does not grant storage permissions.
- B. A bigger VM does not fix an authorization failure.
- D. Disabling MLflow is unrelated to storage authorization.
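
The missing role assignment can be sketched as an Azure CLI call built from Python. This is an illustration under stated assumptions rather than a definitive procedure: the helper names storage_scope and build_role_assignment_command are hypothetical, and the subscription ID, resource group, storage account, container, and principal object ID are all placeholders. The sketch only builds the command and does not run it.

```python
def storage_scope(subscription_id, resource_group, account, container=None):
    """Build the Azure resource ID used as the role assignment scope.

    With no container, the scope is the whole storage account.
    """
    scope = (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account}"
    )
    if container:
        scope += f"/blobServices/default/containers/{container}"
    return scope


def build_role_assignment_command(principal_object_id, scope):
    """Build the argv list that grants Storage Blob Data Reader at the scope."""
    return [
        "az", "role", "assignment", "create",
        "--assignee-object-id", principal_object_id,
        # Managed identities are assigned roles as service principals.
        "--assignee-principal-type", "ServicePrincipal",
        "--role", "Storage Blob Data Reader",
        "--scope", scope,
    ]


# All identifiers below are placeholders (assumptions), not real resources.
scope = storage_scope(
    "00000000-0000-0000-0000-000000000000", "my-rg", "mystorageacct", "training-data"
)
cmd = build_role_assignment_command("11111111-1111-1111-1111-111111111111", scope)
print(" ".join(cmd))
```

Scoping the assignment to a single container instead of the whole account follows least privilege when the job only needs one container.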

Question 6: Design and Prepare a Machine Learning Solution (Select all that apply)

An administrator at Relecloud is planning to use fairness assessment. Which two of the following are requirements or features of this solution? (Choose two.)
- A. Assessment fairness
- B. Compute targets (Correct)
- C. Compute instances (Correct)
- D. Compute clusters
- E. Experiment management
✓ Correct answer: B, C

Running a fairness assessment, like any responsible-AI computation, needs somewhere to execute. Compute targets are the named compute resources Azure ML can run jobs on, and compute instances are the managed single-user machines often used to author and run such analyses interactively. Both provide the execution capacity the assessment requires. The reworded label and the general experiment-management option are not the specific compute features needed.
Why the other options are wrong
- A. Assessment fairness is a reworded label, not a distinct feature or requirement.
- D. Compute clusters are also compute, but the pairing here specifies compute targets and instances; clusters are scale-out training capacity rather than the interactive compute typically used for an assessment.
- E. Experiment management organizes runs and metrics and does not provide the compute on which the assessment executes.

Question 7: Explore Data and Train Models (Select all that apply)

A consultant is reviewing the feature engineering configuration at Proseware Inc. Which two actions should be performed to optimize the implementation? (Choose two.)
- A. Hyperparameter tuning (Correct)
- B. Confusion matrix
- C. Classification algorithms
- D. Disable feature engineering monitoring
- E. Distributed training (Correct)
✓ Correct answer: A, E

Optimizing a feature-engineering configuration here pairs with techniques that improve and accelerate the resulting models. Hyperparameter tuning searches for the configuration that best exploits the engineered features, and distributed training speeds those experiments across multiple nodes or GPUs. Both are concrete optimization levers. The confusion-matrix, classification, and disable options are not the requested pairing.
Why the other options are wrong
- B. Confusion matrix is an evaluation artifact and is not the tuning-and-distributed-training pairing specified.
- C. Classification algorithms define a task type and are not the optimization pairing requested in this item.
- D. Disable feature engineering monitoring removes oversight rather than optimizing the work.

Question 8: Prepare a Model for Deployment

When configuring model packaging at Fabrikam Inc, administrators must also consider model validation as part of the overall architecture.
- True (Correct)
- False
✓ Correct answer: True

Model packaging bundles the trained model artifacts, serialized weights, and metadata into a deployment-ready format. Model validation is a prerequisite quality assurance step that verifies the packaged model can deserialize correctly, produces expected predictions on test data, and meets performance thresholds before deployment. Without validation, packaging may succeed technically while the model is corrupted or incompatible with the target inference framework. Validation checks ensure that packaged models can actually be loaded by the inference engine and function as intended in production, making it an integral part of the packaging workflow.
Why the other options are wrong
- False is incorrect. Packaging can succeed technically while the model fails to deserialize, predicts incorrectly, or misses performance thresholds, so validation has to be considered as part of the packaging workflow.

Question 9: Design and Prepare a Machine Learning Solution

A data science team needs an Azure Machine Learning data asset that always references the newest files dropped into a folder, so that each training run reads whatever data currently exists at that path without re-registering a new version. Which data asset type should you use?
- A. A folder (uri_folder) data asset pointing to the directory path (Correct)
- B. An MLTable data asset with a fixed snapshot of file paths
- C. A file (uri_file) data asset pointing to a single file
- D. A registered table data asset versioned at creation time
✓ Correct answer: A

A uri_folder asset stores a path to a folder rather than a materialized snapshot, so each job that mounts or downloads the folder picks up the current contents. This satisfies the need to always read the latest files without registering a new version for every change.
Why the other options are wrong
- B. An MLTable that captures a fixed snapshot of paths would not automatically reflect newly added files.
- C. A uri_file points to one specific file, so it cannot dynamically include new files in a folder.
- D. A versioned table asset is pinned at creation, so it would not change to reflect newly arrived files.

Question 10: Prepare a Model for Deployment

You convert a PyTorch model to ONNX and serve it with ONNX Runtime on a managed online endpoint. What is the primary benefit of this conversion for the deployment?
- A. Optimized, framework-agnostic inference that can speed up scoring and reduce dependencies (Correct)
- B. Automatic retraining of the model whenever new data arrives
- C. Built-in role-based access control for the endpoint
- D. Elimination of the need to register the model in the workspace
✓ Correct answer: A

ONNX is an open, framework-neutral format, and ONNX Runtime applies graph optimizations and hardware acceleration to improve inference speed and lower latency. It also lets you avoid shipping the full training framework in the serving container, reducing image size and dependency surface. It does not change retraining, security, or registration requirements.
Why the other options are wrong
- B. ONNX conversion is about inference optimization; it does not trigger or perform retraining.
- C. Endpoint access control is configured via auth modes and RBAC, independent of the model format.
- D. You still register the model regardless of whether it is in ONNX or native format.

DP-100 practice exam FAQ

How many questions are in the DP-100 practice exam on CertGrid?

CertGrid has 657 practice questions for DP-100: Azure Data Scientist Associate, covering 4 exam domains. The real DP-100 exam has 40 to 60 questions and a 100-minute time limit. CertGrid's timed mock exam is a fixed 50 questions.

What is the passing score for DP-100?

The DP-100 passing score is 700 on Microsoft's scaled range of 1 to 1,000, often approximated as 70%, and you have about 100 minutes to complete the exam. CertGrid scores your practice attempts the same way so you know when you are ready.

Are these official DP-100 exam questions?

No. CertGrid is an independent practice platform. We do not provide real or leaked exam questions. Our questions are original and designed to help you practice the concepts, scenarios, and difficulty style of the DP-100: Azure Data Scientist Associate exam.

Can I practice DP-100 for free?

Yes. You can start practicing DP-100: Azure Data Scientist Associate for free with daily practice and sample questions. Paid plans unlock full timed exams, complete explanations, and domain analytics.

What CertGrid is (and is not)

CertGrid is an independent IT certification practice platform for Azure, AWS, Google, Cisco, Security, Linux, Kubernetes, Terraform, and other certification tracks. It provides objective-mapped practice questions, readiness scoring, weak-domain drills, and explanations to help learners understand what to study next.

Independent & original. CertGrid is an independent practice platform and is not affiliated with or endorsed by Microsoft. Questions are original practice items designed to mirror certification concepts and exam style. CertGrid does not provide official exam questions or braindumps.