CertGrid
Microsoft Certification

AI-900: Azure AI Fundamentals Practice Exam

Validates foundational knowledge of machine learning and AI concepts and related Azure services.

Practice 875 exam-style AI-900 questions with full answer explanations, then take timed mock exams that score like the real thing.

Practice questions: 875
Questions on the real exam: 40
Passing score: 700
Exam length: 85 min

What the AI-900 exam covers

Free AI-900 sample questions

A sample of 10 questions with answers and explanations. Sign up free to practice all 875.

  1. Question 1: Describe AI Workloads and Considerations

    Which of the following BEST describes artificial intelligence (AI)?

    • A. Software that exclusively replaces human workers in manufacturing
    • B. Hardware that processes data faster than a human brain
    • C. Software that mimics human behaviors and capabilities such as visual perception, speech recognition, and decision-making (Correct)
    • D. A database that stores large amounts of information
    ✓ Correct answer: C

    Artificial intelligence is fundamentally defined as software systems designed to replicate human cognitive capabilities. AI systems learn from data and make decisions by identifying patterns, much like humans do through experience. This includes visual perception (computer vision), understanding spoken language (speech recognition), and autonomous decision-making across various domains. The key distinction is that AI mimics these human-like behaviors through computational algorithms and machine learning models, not through direct human intervention for each task.

    Why the other options are wrong
    • A. "Software that exclusively replaces human workers in manufacturing" is incorrect because AI does not necessarily replace humans, nor is it limited to manufacturing; it has applications across healthcare, finance, education, and many other sectors.
    • B. "Hardware that processes data faster than a human brain" is incorrect because AI is fundamentally a software concept, not a hardware characteristic, and speed of computation is not what defines artificial intelligence.
    • D. "A database that stores large amounts of information" is incorrect because databases are storage systems for data; they do not process information to make intelligent decisions or mimic human cognitive capabilities as AI systems do.
  2. Question 2: Describe AI Workloads and Considerations

    A human resources department wants to use AI to screen resumes for job openings. The training data consists of resumes from previously hired candidates who were predominantly from one demographic group. What is the MOST likely risk?

    • A. The system will violate the privacy principle by exposing candidate data
    • B. The system will violate the fairness principle by favoring candidates similar to the historical hires (Correct)
    • C. The system will violate the transparency principle by not explaining its decisions
    • D. The system will violate the inclusiveness principle by not supporting multiple languages
    ✓ Correct answer: B

    When training data consists mostly of resumes from one demographic group, the model learns to favor candidates resembling those historical hires, producing biased screening that violates the fairness principle. This is a well-known risk in AI hiring tools: the model perpetuates and amplifies the bias present in the historical data. Fairness requires equitable treatment across groups, so a model that systematically disadvantages other demographics is unfair. The other options describe privacy, transparency, or language issues that are not the primary risk created by skewed training data.

    Why the other options are wrong
    • A. A privacy violation from exposing candidate data is incorrect because the scenario describes biased outcomes from skewed data, not the exposure of personal information.
    • C. A transparency violation from unexplained decisions is incorrect because the core risk here is biased selection, not a lack of explainability.
    • D. An inclusiveness violation from missing multilingual support is incorrect because language support is unrelated to the demographic bias introduced by the training data.
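
    The fairness risk described above can be checked with simple arithmetic. The following is a minimal sketch, assuming made-up screening outcomes and hypothetical group names (`group_a`, `group_b`); it compares how often each group is shortlisted and is not a complete fairness audit.

```python
from collections import defaultdict

# Hypothetical screening outcomes: (demographic group, was the resume shortlisted?)
outcomes = [
    ("group_a", True), ("group_a", True), ("group_a", True), ("group_a", False),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]

def selection_rates(records):
    """Return the fraction of shortlisted candidates for each group."""
    totals = defaultdict(int)
    selected = defaultdict(int)
    for group, shortlisted in records:
        totals[group] += 1
        if shortlisted:
            selected[group] += 1
    return {group: selected[group] / totals[group] for group in totals}

rates = selection_rates(outcomes)
# Ratio of the lowest to the highest selection rate; a value well below 1.0
# points to a disparity that deserves a closer look.
disparity_ratio = min(rates.values()) / max(rates.values())
print(rates)
print(round(disparity_ratio, 2))
```

    A large gap between groups does not prove the model is unfair on its own, but it is the kind of signal that skewed training data tends to produce.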
  3. Question 3: Describe Fundamental Principles of Machine Learning

    A healthcare provider trains a classification model to predict whether patients are at high risk for diabetes. The model achieves 95% accuracy, but the team notices that the model almost never identifies high-risk patients. The dataset contains 95% low-risk and 5% high-risk patients. What is the primary issue?

    • A. The model is overfitting to the training data
    • B. The model is underfitting and needs more complex features
    • C. The model needs a higher learning rate
    • D. The dataset is imbalanced, and accuracy alone is a misleading metric (Correct)
    ✓ Correct answer: D

    This scenario shows the main limitation of accuracy as an evaluation metric for imbalanced classification. With a 95-5 class distribution, a trivial classifier that predicts "low-risk" for every patient would achieve 95% accuracy despite never identifying a single high-risk patient. The model's high accuracy hides its failure on the minority class, which is precisely the class of clinical importance. In imbalanced datasets, accuracy is dominated by majority-class performance and says little about minority-class detection. More appropriate metrics include recall and precision for each class, the F1-score, which balances precision and recall, and the area under the ROC curve (ROC-AUC), which summarizes performance across classification thresholds. On heavily imbalanced data, a precision-recall curve is often more informative than ROC-AUC. These metrics reveal that although the model achieves high accuracy, it has extremely poor recall for high-risk patients, making it unsuitable for clinical deployment.

    Why the other options are wrong
    • A. "The model is overfitting to the training data" is incorrect because overfitting would show up as good performance on training data but poor performance on test data. The scenario does not say whether the high accuracy applies to training or test data, and, more importantly, overfitting does not explain why the model systematically fails to identify the minority class.
    • B. "The model is underfitting and needs more complex features" is incorrect because underfitting would result in poor performance on both majority and minority classes. The model achieves high accuracy on the majority class, indicating it is not underfitted; the issue is class imbalance, not model complexity.
    • C. "The model needs a higher learning rate" is incorrect because learning rate affects the speed and stability of the training optimization process, not the underlying class imbalance. Even a well-trained model on imbalanced data can appear to perform well overall while failing on the minority class.
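
    The 95% figure in the explanation above can be reproduced by hand. This is a small sketch, assuming a hypothetical dataset of 1,000 patients with the same 95-5 split and a "model" that always predicts low-risk; the helper names are illustrative.

```python
# Hypothetical dataset matching the scenario: 950 low-risk (0) and 50 high-risk (1) patients.
y_true = [0] * 950 + [1] * 50
# A trivial "model" that predicts low-risk for every patient.
y_pred = [0] * len(y_true)

def accuracy(actual, predicted):
    """Fraction of predictions that match the true label."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

def recall(actual, predicted, positive=1):
    """Fraction of truly positive cases that the model identified."""
    true_positives = sum(a == positive and p == positive for a, p in zip(actual, predicted))
    actual_positives = sum(a == positive for a in actual)
    return true_positives / actual_positives if actual_positives else 0.0

print(accuracy(y_true, y_pred))  # high, driven entirely by the majority class
print(recall(y_true, y_pred))    # zero: no high-risk patient is ever flagged
```

    Reporting recall for the high-risk class next to accuracy makes the failure visible immediately.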
  4. Question 4: Describe Features of Computer Vision Workloads

    A parking management company wants to use cameras to monitor parking lots and detect available spaces in real-time. The system needs to identify individual cars and their positions within the parking lot. Which computer vision task is best suited?

    • A. OCR
    • B. Face detection
    • C. Image classification
    • D. Object detection (Correct)
    ✓ Correct answer: D

    Identifying individual cars and their positions within a parking lot to detect available spaces requires object detection, which locates each object with a bounding box and class label. Because the system needs both to recognize cars and to know where they are, the spatial localization that detection provides is essential. Counting and positioning multiple instances is a defining strength of object detection. The other options read text, recognize faces, or label the whole image without locating each car.

    Why the other options are wrong
    • A. OCR is incorrect because it extracts text from images, which does not help locate cars and determine free parking spaces.
    • B. Face detection is incorrect because it locates human faces, not vehicles and their positions in a parking lot.
    • C. Image classification is incorrect because it assigns a single label to the whole image and cannot identify or locate individual cars within the lot.
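
    To see why bounding boxes matter for this scenario, here is a rough sketch of how detector output could be turned into a list of free spaces. The space layout, the car boxes, and the "box center falls inside the space" rule are all assumptions made for illustration, not part of any Azure service.

```python
from dataclasses import dataclass

@dataclass
class Box:
    """Axis-aligned rectangle in pixel coordinates."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom

    def center(self):
        return ((self.left + self.right) / 2, (self.top + self.bottom) / 2)

# Hypothetical layout of three marked parking spaces.
spaces = {
    "P1": Box(0, 0, 100, 200),
    "P2": Box(100, 0, 200, 200),
    "P3": Box(200, 0, 300, 200),
}
# Hypothetical detector output: one bounding box per detected car.
car_boxes = [Box(10, 20, 90, 180), Box(210, 30, 290, 170)]

def free_spaces(spaces, car_boxes):
    """A space counts as occupied if the center of any car box falls inside it."""
    occupied = set()
    for car in car_boxes:
        cx, cy = car.center()
        for name, space in spaces.items():
            if space.contains(cx, cy):
                occupied.add(name)
    return sorted(set(spaces) - occupied)

print(free_spaces(spaces, car_boxes))
```

    An image classifier could at best say "this lot contains cars"; without per-car positions there is nothing to compare against the space layout.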
  5. Question 5: Describe Features of NLP Workloads

    A healthcare company wants to automatically extract patient names, medication names, and dosage amounts from clinical notes. Which NLP workload is most appropriate?

    • A. Key phrase extraction from clinical notes
    • B. Named entity recognition with healthcare-specific entity types (Correct)
    • C. Sentiment analysis of clinical notes
    • D. Language translation of clinical notes
    ✓ Correct answer: B

    This task requires extraction of domain-specific entities: patient names (persons), medication names (drugs/substances), and dosage amounts (measurements). While standard NER can identify generic person and organization entities, healthcare-specific NER models are trained with additional entity categories that include medical terminology, pharmaceutical names, dosage units, and clinical concepts. These specialized models understand the unique context of clinical documents and can accurately extract medical information that general-purpose NER models might misclassify or miss entirely.

    Why the other options are wrong
    • A. "Key phrase extraction from clinical notes" is incorrect because it identifies the main themes and important topics in text rather than specific named entities; dosages and medication names would not be extracted as discrete, structured data.
    • C. "Sentiment analysis of clinical notes" is incorrect because sentiment analysis evaluates the emotional tone or attitude in text, which is irrelevant to the extraction of factual clinical information like medication names and dosages.
    • D. "Language translation of clinical notes" is incorrect because translation converts text from one language to another without performing entity extraction; the clinical data would still need to be processed for entity recognition.
  6. Question 6: Describe Features of Generative AI Workloads

    A company uses Azure OpenAI to generate customer email responses. They want the responses to be factual and consistent rather than creative and varied. Which parameter should they adjust?

    • A. Increase the temperature to a high value
    • B. Change the model to DALL-E
    • C. Decrease the temperature to a low value (close to 0) (Correct)
    • D. Increase the max_tokens parameter
    ✓ Correct answer: C

    Temperature is a sampling parameter that controls the randomness of model output. Lower values concentrate probability on the most likely next tokens, so responses become more consistent and focused. As temperature approaches 0, the model behaves almost deterministically and tends to select the highest-probability next token, trading variety for repeatability. For customer email responses, a low temperature keeps wording close to established patterns and reduces the chance of creative departures that contradict company policy. It does not by itself guarantee factual accuracy, which still depends on the model and on the information supplied in the prompt.

    Why the other options are wrong
    • A. "Increase the temperature to a high value" is incorrect because higher temperatures increase randomness and creativity, producing more diverse and varied responses; this is the opposite of the desired behavior for applications requiring factual consistency.
    • B. "Change the model to DALL-E" is incorrect because DALL-E is an image generation model designed for visual content creation, not for generating text-based email responses.
    • D. "Increase the max_tokens parameter" is incorrect because max_tokens caps response length and does not affect the consistency or creativity of generated text; it is independent of the consistency-versus-creativity trade-off.
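
    The effect of temperature on token selection can be illustrated with a temperature-scaled softmax. This is a conceptual sketch with made-up scores for three candidate tokens; it assumes temperature is strictly greater than 0 and does not reproduce how any particular Azure OpenAI model samples internally.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; temperature must be greater than 0."""
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens

low = softmax_with_temperature(logits, 0.1)   # sharply peaked on the top token
high = softmax_with_temperature(logits, 2.0)  # probability spread across tokens
print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

    With the low setting, almost all probability lands on the top-scoring token, which is why repeated requests tend to produce near-identical text.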
  7. Question 7: Describe Features of NLP Workloads

    Which approach is recommended for managing team access to the resources that support NLP workloads?

    • A. Disable access controls for faster day-to-day workflows
    • B. Use a single shared service account for the entire team
    • C. Grant full administrator access to all team members
    • D. Implement role-based access control with least privilege (Correct)
    ✓ Correct answer: D

    Role-based access control with least privilege ensures that each team member possesses only the minimum access necessary for their specific role, reducing security risks and maintaining accountability. RBAC enables fine-grained permission management, making it easy to audit who accessed what resources and when. This approach prevents accidental misconfigurations caused by excessive permissions and reduces the attack surface by limiting what compromised accounts can access.

    Why the other options are wrong
    • A. "Disable access controls for faster day-to-day workflows" is incorrect because removing access controls creates critical security vulnerabilities, exposes sensitive data and systems to unauthorized access, and violates fundamental security principles.
    • B. "Use a single shared service account for the entire team" is incorrect because shared accounts eliminate individual accountability, prevent effective audit trails, and make it impossible to determine who performed specific actions.
    • C. "Grant full administrator access to all team members" is incorrect because universal administrative access violates least privilege principles, enables users to perform actions beyond their role requirements, and creates unnecessary security risks.
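
    The least-privilege idea above can be shown with a toy permission check. The role names, permission strings, and users below are hypothetical and do not correspond to actual Azure built-in roles; the point is only the deny-by-default lookup.

```python
# Hypothetical roles mapped to the minimum permissions each one needs.
ROLE_PERMISSIONS = {
    "reader": {"model:query"},
    "developer": {"model:query", "model:deploy"},
    "admin": {"model:query", "model:deploy", "keys:rotate"},
}
# Hypothetical user-to-role assignments.
USER_ROLES = {"dana": "reader", "lee": "developer"}

def is_allowed(user, permission):
    """Deny by default: unknown users or roles receive no permissions."""
    role = USER_ROLES.get(user)
    return permission in ROLE_PERMISSIONS.get(role, set())

print(is_allowed("dana", "model:query"))   # reader may query
print(is_allowed("dana", "model:deploy"))  # reader may not deploy
```

    Because each user maps to exactly one named role, an audit only needs to review role definitions and assignments rather than per-user exceptions.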
  8. Question 8: Describe Fundamental Principles of Machine Learning

    A marketing team wants to group their customers into segments based on purchasing behavior, demographics, and browsing patterns, without having predefined category labels. Which machine learning task should they use?

    • A. Binary classification
    • B. Regression
    • C. Clustering (Correct)
    • D. Multiclass classification
    ✓ Correct answer: C

    Grouping customers into segments without predefined category labels is an unsupervised clustering task. The model must discover natural groupings in the customer data based on similarities in purchasing behavior, demographics, and browsing patterns. Clustering algorithms like K-means, hierarchical clustering, or DBSCAN identify distinct customer segments that can inform targeted marketing strategies without requiring pre-labeled training data.

    Why the other options are wrong
    • A. Binary classification is incorrect because the task does not have two predefined outcomes and does not use labeled training data for supervised learning.
    • B. Regression is incorrect because the goal is not predicting continuous numerical values, but identifying discrete customer segments.
    • D. Multiclass classification is incorrect because the task is unsupervised without predefined category labels; supervised classification requires training data with known segment assignments.
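
    As a rough illustration of clustering without labels, the sketch below implements a bare-bones K-means in plain Python. The customer data is made up, the centroids are initialized naively from the first k points, and the features are left unscaled for brevity; a real project would more likely use a library implementation and scale the features first.

```python
import math

def kmeans(points, k, iterations=10):
    """Minimal K-means: returns a cluster index for each point."""
    centroids = [points[i] for i in range(k)]  # naive init: first k points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments

# Hypothetical customers: (monthly spend, site visits per month).
customers = [(20, 2), (500, 30), (25, 3), (480, 28), (22, 1), (510, 35)]
labels = kmeans(customers, k=2)
print(labels)
```

    No segment names are supplied anywhere; the two groups emerge only from how close the customers are to one another.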
  9. Question 9: Describe Features of Computer Vision Workloads

    A factory uses Custom Vision to detect defective products on an assembly line. Each image may contain multiple products, and the model needs to locate each defective item. Which Custom Vision project type should be used?

    • A. Multilabel classification
    • B. Multiclass classification
    • C. Object detection (Correct)
    • D. Image segmentation
    ✓ Correct answer: C

    This manufacturing scenario requires not only identifying defective products but also locating each one within images that may contain multiple items, which is exactly what object detection does. Object detection differs from image classification by providing spatial information for each detected item: it returns bounding box coordinates specifying the location of each object along with a class label and confidence score. For the assembly line application, the model must be trained to recognize visual defect patterns (cracks, misalignment, discoloration, deformation, etc.) and to identify the location of each defective product in the image, so that automated flagging systems or robotic interventions can target the specific defective items. Custom Vision's object detection project type provides this capability.

    Why the other options are wrong
    • A. Multilabel classification is incorrect because while multilabel models can assign multiple categories to a single image, they do not provide spatial location information; you would know defects exist but not where in the image to find them.
    • B. Multiclass classification is incorrect because this approach assigns only a single class label to the entire image, so you could not distinguish between multiple products within a single image or locate individual defects.
    • D. Image segmentation is incorrect because while segmentation provides pixel-level detail about object boundaries, Custom Vision does not offer image segmentation as a project type; object detection is the appropriate Custom Vision offering for locating multiple items.
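
    To make the "label plus location plus confidence" output concrete, the sketch below filters a list of detections and converts their boxes to pixel coordinates. The dictionary layout, tag names, and 0.5 threshold are assumptions for illustration; the sketch assumes the detector reports each box as fractions of the image size, and it is not a call to the Custom Vision SDK.

```python
# Hypothetical predictions: each has a tag, a probability, and a bounding box
# expressed as fractions of the image width and height (0 to 1).
predictions = [
    {"tag": "defective", "probability": 0.92,
     "box": {"left": 0.10, "top": 0.20, "width": 0.25, "height": 0.30}},
    {"tag": "ok", "probability": 0.88,
     "box": {"left": 0.50, "top": 0.20, "width": 0.25, "height": 0.30}},
    {"tag": "defective", "probability": 0.35,
     "box": {"left": 0.70, "top": 0.60, "width": 0.20, "height": 0.20}},
]

def defective_regions(predictions, image_width, image_height, threshold=0.5):
    """Return pixel-space (left, top, right, bottom) boxes for confident defect detections."""
    regions = []
    for p in predictions:
        if p["tag"] == "defective" and p["probability"] >= threshold:
            b = p["box"]
            left = round(b["left"] * image_width)
            top = round(b["top"] * image_height)
            right = round((b["left"] + b["width"]) * image_width)
            bottom = round((b["top"] + b["height"]) * image_height)
            regions.append((left, top, right, bottom))
    return regions

print(defective_regions(predictions, image_width=1000, image_height=800))
```

    The low-confidence detection is dropped by the threshold, and the remaining box tells a downstream system exactly which part of the image to act on.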
  10. Question 10: Describe Features of NLP Workloads

    What type of sentiment score does Azure AI Language return at the document level?

    • A. A single number between -1 and 1
    • B. Confidence scores for positive, negative, and neutral categories that sum to approximately 1 (Correct)
    • C. A letter grade from A to F
    • D. A star rating from 1 to 5
    ✓ Correct answer: B

    Azure AI Language returns document-level sentiment as confidence scores for three categories (positive, negative, and neutral), each between 0 and 1. The three scores sum to approximately 1.0, so together they describe how the service's confidence is distributed across the categories. This gives richer information than a binary label or a single continuous score. For instance, a customer review about a hotel might receive scores like {positive: 0.65, negative: 0.20, neutral: 0.15}, indicating the text is predominantly positive with some reservations. The service also returns the same kind of three-way scores for each sentence, so document-level and sentence-level results can be compared.

    Why the other options are wrong
    • A. "A single number between -1 and 1" is incorrect because Azure AI Language sentiment analysis does not use a unidimensional scale. A single continuous value cannot represent the nuanced classification across three distinct sentiment categories.
    • C. "A letter grade from A to F" is incorrect because sentiment analysis results are not expressed as letter grades. Azure AI Language uses numerical confidence scores in a probabilistic framework, not letter-grade systems.
    • D. "A star rating from 1 to 5" is incorrect because document-level sentiment is not expressed as star ratings. The service uses a three-category probability distribution model, not a five-point numerical scale.
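
    The shape of the scores described above can be checked in a few lines. This sketch reuses the hotel-review numbers from the explanation; the helper names are hypothetical, and picking the highest score is only an illustration, not how the service itself assigns its overall label.

```python
# Hypothetical document-level confidence scores, as in the hotel-review example.
scores = {"positive": 0.65, "negative": 0.20, "neutral": 0.15}

def top_category(scores):
    """Return the category with the highest confidence score."""
    return max(scores, key=scores.get)

def sums_to_one(scores, tolerance=0.01):
    """Check that the scores form an approximate probability distribution."""
    return abs(sum(scores.values()) - 1.0) <= tolerance

print(top_category(scores))
print(sums_to_one(scores))
```

    A single number between -1 and 1 could not carry this information: two documents with the same net score might differ greatly in how much neutral content they contain.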

AI-900 practice exam FAQ

How many questions are in the AI-900 practice exam on CertGrid?

CertGrid has 875 practice questions for AI-900: Azure AI Fundamentals, covering 5 exam domains. The real AI-900 exam has about 40 questions.

What is the passing score for AI-900?

The AI-900 exam passing score is 700 on a scale of 1 to 1,000, and you have about 85 minutes to complete it. CertGrid scores your practice attempts the same way so you know when you are ready.

Are these official AI-900 exam questions?

No. CertGrid is an independent practice platform. Questions are written to mirror the style and concepts of AI-900: Azure AI Fundamentals, with full explanations, but they are not official or copied vendor exam items. They are original practice questions designed to help you genuinely learn the material.

Can I practice AI-900 for free?

Yes. You can start practicing AI-900: Azure AI Fundamentals for free with daily practice and sample questions. Paid plans unlock full timed exams, complete explanations, and domain analytics.