Offering Free access to ISTQB AI Testing CT-AI Exam Questions Pool Bank

Certified Tester AI Testing Exam Questions and Answers


Question 1

A system was developed for screening the X-rays of patients for potential malignancy detection (skin cancer). A workflow system has been developed to screen multiple cancers by using several individually trained ML models chained together in the workflow.

Testing the pipeline could involve multiple kinds of tests (I - III):

I. Pairwise testing of combinations

II. Testing each individual model for accuracy

III. A/B testing of different sequences of models

Which ONE of the following options contains the kinds of tests that would be MOST APPROPRIATE to include in the strategy for optimal detection?

SELECT ONE OPTION

Options:

A. Only III

B. I and II

C. I and III

D. Only II

Question 2

A motorcycle engine repair shop owner wants to detect a leaking exhaust valve and fix it before it fails and causes catastrophic damage to the engine. The shop developed and trained a predictive model with historical data files from known healthy engines and from engines that experienced catastrophic failures due to exhaust valve failure. The shop evaluated 200 engines using this model and then disassembled the engines to assess the true state of the valves, recording the results in the confusion matrix below.

What is the precision of this predictive model?

Options:

A. 90.0%

B. 94.5%

C. 98.9%

D. 94.2%
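The confusion matrix referred to in the question is not reproduced on this page, so the calculation cannot be repeated with the exam's figures. As a minimal sketch of the formula only, precision is the share of predicted positives that are truly positive, TP / (TP + FP). The counts below are hypothetical placeholders and are not the values from the exam.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Precision = TP / (TP + FP): share of flagged engines that really had a fault."""
    predicted_positives = true_positives + false_positives
    if predicted_positives == 0:
        raise ValueError("precision is undefined when nothing is predicted positive")
    return true_positives / predicted_positives


# Hypothetical counts, NOT the values from the exam's confusion matrix.
example_tp = 8  # engines flagged as leaking that really were leaking
example_fp = 2  # engines flagged as leaking that were actually healthy
example_precision = precision(example_tp, example_fp)
print(f"precision = {example_precision:.1%}")
```

Substituting the true-positive and false-positive counts from the actual matrix into the same function would give the value the question asks for.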

Question 3

A bank wants to use an algorithm to determine which applicants should be given a loan. The bank hires a data scientist to construct a logistic regression model to predict whether the applicant will repay the loan or not. The bank has enough data on past customers to randomly split the data into a training data set and a test/validation data set. A logistic regression model is constructed on the training data set using the following independent variables:

Gender

Marital status

Number of dependents

Education

Income

Loan amount

Loan term

Credit score

The model reveals that those with higher credit scores and larger total incomes are more likely to repay their loans. The data scientist has suggested that there might be bias present in the model based on previous models created for other banks.

Given this information, what is the best test approach to check for potential bias in the model?

Options:

A. Experience-based testing should be used to confirm that the training data set is operationally relevant. This can include applying exploratory data analysis (EDA) to check for bias within the training data set.

B. Back-to-back testing should be used to compare the model created using the training data set to another model created using the test data set. If the two models significantly differ, it will indicate there is bias in the original model.

C. Acceptance testing should be used to make sure the algorithm is suitable for the customer. The team can re-work the acceptance criteria such that the algorithm is sure to correctly predict the remaining applicants that have been set aside for the validation data set, ensuring no bias is present.

D. A/B testing should be used to verify that the test data set does not detect any bias that might have been introduced by the original training data. If the two models significantly differ, it will indicate there is bias in the original model.

Answer: A

Explanation:

Bias in an AI system can occur when the training data contains inherent prejudices that cause the model to make unfair predictions. Experience-based testing, particularly exploratory data analysis (EDA), helps uncover these biases by analyzing patterns, distributions, and potential discriminatory factors in the training data.

Analysis of the Answer Options:

Option A: “Experience-based testing should be used to confirm that the training data set is operationally relevant. This can include applying exploratory data analysis (EDA) to check for bias within the training data set.”

This is the correct answer. EDA involves examining the dataset for bias, inconsistencies, or missing values, which supports fairness in the ML model's predictions.

Option B: “Back-to-back testing should be used to compare the model created using the training data set to another model created using the test data set. If the two models significantly differ, it will indicate there is bias in the original model.”

Back-to-back testing is used for regression testing and to compare versions of an AI system but is not primarily used to detect bias.

Option C: “Acceptance testing should be used to make sure the algorithm is suitable for the customer. The team can re-work the acceptance criteria such that the algorithm is sure to correctly predict the remaining applicants that have been set aside for the validation data set ensuring no bias is present.”

Acceptance testing focuses on meeting predefined business requirements rather than detecting and mitigating bias.

Option D: “A/B testing should be used to verify that the test data set does not detect any bias that might have been introduced by the original training data. If the two models significantly differ, it will indicate there is bias in the original model.”

A/B testing is used for evaluating variations of a model rather than for explicitly identifying bias.

ISTQB CT-AI Syllabus References:

Bias Testing Methods: "AI-based systems should be tested for algorithmic bias, sample bias, and inappropriate bias. Experience-based testing and EDA are useful for detecting bias."

Exploratory Data Analysis (EDA): "EDA helps uncover potential bias in training data through visualization and statistical analysis."

Thus, Option A is the best choice for detecting bias in the loan applicant model.
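One simple EDA check of the kind described is to compare the outcome rate across the values of a sensitive attribute in the training data. The sketch below is illustrative only: the rows, the `repaid` label name, and the helper `repayment_rate_by_group` are assumptions made for the example and do not come from the question or the syllabus.

```python
from collections import defaultdict


def repayment_rate_by_group(rows, group_key, label_key="repaid"):
    """Return {group value: share of rows whose label is 1} for one attribute."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in rows:
        group = row[group_key]
        totals[group] += 1
        positives[group] += row[label_key]
    return {group: positives[group] / totals[group] for group in totals}


# Hypothetical training rows; real data would come from the bank's records.
training_rows = [
    {"gender": "F", "repaid": 1},
    {"gender": "F", "repaid": 0},
    {"gender": "F", "repaid": 1},
    {"gender": "F", "repaid": 1},
    {"gender": "M", "repaid": 1},
    {"gender": "M", "repaid": 1},
    {"gender": "M", "repaid": 0},
    {"gender": "M", "repaid": 0},
]

rates = repayment_rate_by_group(training_rows, "gender")
gap = max(rates.values()) - min(rates.values())
print(rates, f"gap = {gap:.2f}")
```

A large gap between groups does not prove inappropriate bias on its own, but it is the kind of pattern a tester would investigate further, for example by repeating the check for marital status or number of dependents.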

Question 4

A tourist calls an airline to book a ticket and is connected with an automated system which is able to recognize speech, understand requests related to purchasing a ticket, and provide relevant travel options. When the tourist asks about the expected weather at the destination or potential impacts on operations because of the tight labor market, the only response from the automated system is: "I don't understand your question."

This AI system should be categorized as?

Options:

A. General AI

B. Narrow AI

C. Super AI

D. Conventional AI

Question 5

A mobile app start-up company is implementing an AI-based chat assistant for e-commerce customers. In the process of planning the testing, the team realizes that the specifications are insufficient.

Which testing approach should be used to test this system?

Options:

A. Exploratory testing

B. Static analysis

C. Equivalence partitioning

D. State transition testing

Question 6

Max. Score: 2

AI-enabled medical devices are now used to automate certain parts of medical diagnostic processes. Since these are life-critical processes, the relevant authorities are considering introducing suitable certifications for these AI-enabled medical devices. This certification may involve several facets of AI testing (I - V).

I. Autonomy

II. Maintainability

III. Safety

IV. Transparency

V. Side Effects

Which ONE of the following options contains the three MOST required aspects to be satisfied for the above scenario of certification of AI-enabled medical devices?

SELECT ONE OPTION

Options:

A. Aspects II, III, and IV

B. Aspects I, II, and III

C. Aspects III, IV, and V

D. Aspects I, IV, and V

Question 7

You are testing an autonomous vehicle which uses AI to determine proper driving actions and responses. You have evaluated the parameters and combinations to be tested and have determined that there are too many to test in the time allowed. It has been suggested that you use pairwise testing to limit the parameters. Given the complexity of the software under test, what is the likely outcome of using pairwise testing?

Options:

A. The number of parameters to test can be reduced to less than a dozen.

B. All high priority defects will be identified using this method.

C. While the number of tests needed can be reduced, there may still be a large enough set of tests that automation will be required to execute all of them.

D. Pairwise cannot be applied to this problem because there is AI involved and the evolving values may result in unexpected results that cannot be verified.

Answer: C

Explanation:

Pairwise testing is a combinatorial testing technique that reduces the number of test cases by focusing on testing interactions between pairs of parameters rather than all possible combinations. It is widely used in AI-based systems, including autonomous vehicles, where the number of possible input parameter combinations can be extremely high.

Analysis of the Answer Options:

Option A: “The number of parameters to test can be reduced to less than a dozen.”

This is incorrect. While pairwise testing significantly reduces the number of test cases, it does not necessarily limit them to a fixed number like a dozen. The final number of tests depends on the number of parameters and their possible values.

Option B: “All high priority defects will be identified using this method.”

This is incorrect. While pairwise testing is effective in detecting defects caused by interactions between two parameters, it may not uncover defects resulting from more complex interactions involving three or more parameters.

Option C: “While the number of tests needed can be reduced, there may still be a large enough set of tests that automation will be required to execute all of them.”

This is the correct answer. Even though pairwise testing reduces the number of test cases, AI-based systems such as autonomous vehicles still have a large number of test scenarios. Therefore, automation is often necessary to execute all test cases within the available time.

Option D: “Pairwise cannot be applied to this problem because there is AI involved, and the evolving values may result in unexpected results that cannot be verified.”

This is incorrect. Pairwise testing can still be applied to AI-based systems, including those that evolve over time. However, additional testing techniques may be required to verify evolving behavior.

ISTQB CT-AI Syllabus References:

Pairwise Testing for AI Systems: "Pairwise testing is widely used because it effectively reduces the number of test cases while maintaining defect detection capability."

Automation Requirement: "In practice, even with pairwise testing, extensive test suites may still require automation."
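To make the reduction concrete, the following is a minimal sketch of a greedy pairwise generator. It is one simple way to build a pairwise suite, not the method prescribed by the syllabus, and the driving parameters and their values are hypothetical. It repeatedly picks the full combination that covers the most value pairs not yet covered.

```python
from itertools import combinations, product


def pairwise_suite(parameters):
    """Greedily pick full combinations until every value pair is covered."""
    names = list(parameters)

    def pairs_of(test):
        # All (parameter, value) pairs that a single test case exercises.
        return {((a, test[a]), (b, test[b])) for a, b in combinations(names, 2)}

    # Every value pair that must appear in at least one test case.
    uncovered = set()
    for a, b in combinations(names, 2):
        for value_a, value_b in product(parameters[a], parameters[b]):
            uncovered.add(((a, value_a), (b, value_b)))

    candidates = [dict(zip(names, values)) for values in product(*parameters.values())]
    suite = []
    while uncovered:
        best = max(candidates, key=lambda test: len(pairs_of(test) & uncovered))
        suite.append(best)
        uncovered -= pairs_of(best)
    return suite


# Hypothetical driving parameters, chosen only for illustration.
parameters = {
    "weather": ["clear", "rain", "snow"],
    "road": ["highway", "urban", "rural"],
    "traffic": ["light", "medium", "heavy"],
    "time_of_day": ["day", "night"],
}
suite = pairwise_suite(parameters)
full_count = 3 * 3 * 3 * 2  # exhaustive combinations of the values above
print(f"{len(suite)} pairwise tests instead of {full_count} exhaustive combinations")
```

With only four small parameters the saving is modest. A real autonomous-vehicle system has far more parameters and values, so even the reduced suite can remain large, which is the point made by option C.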

Question 8

Which ONE of the following characteristics is the least likely to cause safety-related issues for an AI system?

SELECT ONE OPTION

Options:

A. Non-determinism

B. Robustness

C. High complexity

D. Self-learning

Question 9

Which of the following problems would best be solved using the supervised learning category of regression?

Options:

A. Determining the optimal age for a chicken's egg laying production using input data of the chicken's age and average daily egg production for one million chickens.

B. Recognizing a knife in carry on luggage at a security checkpoint in an airport scanner.

C. Determining if an animal is a pig or a cow based on image recognition.

D. Predicting shopper purchasing behavior based on the category of shopper and the positioning of promotional displays within a store.

Answer: A

Explanation:

Understanding Supervised Learning - Regression

Supervised learning is a category of machine learning where the model is trained on labeled data. Within this category, regression is used when the goal is to predict a continuous numeric value.

Regression deals with problems where the output variable is continuous in nature, meaning it can take any numerical value within a range.

Common examples include predicting prices, estimating demand, and analyzing production trends.

Analysis of Answer Choices

(A) Determining the optimal age for a chicken's egg-laying production using input data of the chicken's age and average daily egg production for one million chickens. ✅ (Correct)

This is a classic regression problem because it involves predicting a continuous variable, daily egg production, based on the input variable, the chicken's age.

The goal is to find a numerical relationship between age and egg production, which makes regression the appropriate supervised learning method.

(B) Recognizing a knife in carry-on luggage at a security checkpoint in an airport scanner. ❌ (Incorrect)

This is an image recognition task, which falls under classification, not regression.

Classification problems involve assigning inputs to discrete categories (e.g., "knife detected" or "no knife detected").

(C) Determining if an animal is a pig or a cow based on image recognition. ❌ (Incorrect)

This is another classification problem where the goal is to categorize an image into one of two labels (pig or cow).

(D) Predicting shopper purchasing behavior based on the category of shopper and the positioning of promotional displays within a store. ❌ (Incorrect)

This problem could involve a mix of classification and association rule learning, but it does not explicitly predict a continuous variable in the way regression does.

References from ISTQB Certified Tester AI Testing Study Guide

Regression is used when predicting a numeric output. "Predicting the age of a person based on input data about their habits or predicting the future prices of stocks are examples of problems that use regression."

Supervised learning problems are divided into classification and regression. "If the output is numeric and continuous in nature, it may be regression."

Regression is commonly used for predicting numerical trends over time. "Regression models result in a numerical or continuous output value for a given input."

Thus, option A is the correct answer, as it aligns with the principles of regression-based supervised learning.
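The distinguishing feature of regression, a continuous numeric output, can be shown with a minimal ordinary least-squares fit. The ages and egg counts below are hypothetical and deliberately linear. Real production data would not follow a straight line, and finding an optimal age would need a curved model, so this is only a sketch of the idea.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical labeled data: age in weeks and average eggs laid per day.
ages_weeks = [20, 30, 40, 50]
eggs_per_day = [0.5, 0.7, 0.9, 1.1]

slope, intercept = fit_line(ages_weeks, eggs_per_day)
predicted = slope * 35 + intercept  # a continuous value, not a class label
print(f"predicted eggs per day at 35 weeks: {predicted:.2f}")
```

A classifier for options B or C would instead return one of a fixed set of labels, such as "knife" or "no knife".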

Question 10

Which ONE of the following options is an example that BEST describes a system with AI-based autonomous functions?

SELECT ONE OPTION

Options:

A. A system that utilizes human beings for all important decisions.

B. A fully automated manufacturing plant that uses no software.

C. A system that utilizes a tool like Selenium.

D. A system that is fully able to respond to its environment.

Question 11

A transportation company operates three types of delivery vehicles in its fleet. The vehicles operate at different speeds (slow, medium, and fast). The transportation company is attempting to optimize scheduling and has created an AI-based program to plan routes for its vehicles using records from the medium-speed vehicle traveling to selected destinations. The test team uses this data in metamorphic testing to test the accuracy of the estimated travel times created by the AI route planner with the actual routes and times.

Which of the following describes the next phase of metamorphic testing?

Options:

A. The team tests the time required for the fast and slow vehicles to travel the same route as the medium vehicle. Then, by calculating the speed difference, they predict how much faster or slower the vehicles will travel. That information is then used to verify that the arrival time of the vehicles meets the expected result.

B. The team decomposes each route into the relevant components that affect the travel time such as traffic density and vehicle power. The team then uses statistical analysis to characterize the influence of each component to calculate the fast and slow vehicle route times.

C. The team uses an AI system to select the most dissimilar routes. With this information, any of the AI routes can be metaphorically transformed into a fast or slow route.

D. The team uses the same AI route planner to create routes that are longer and shorter but follow the same track. Finally, by driving the fast vehicles on the long routes and slow vehicles on the short routes and vice versa, the AI system will have enough information to infer travel times for all vehicles on all routes.

Answer: A

Explanation:

Metamorphic Testing (MT) is a testing technique that verifies AI-based systems by generating follow-up test cases based on existing test cases. These follow-up test cases adhere to a Metamorphic Relation (MR), ensuring that if the system is functioning correctly, changes in input should result in predictable changes in output.

Why Option A is Correct?

Metamorphic testing works by transforming source test cases into follow-up test cases

Here, the source test case involves testing the medium-speed vehicle’s travel time.

The follow-up test cases are derived by extrapolating travel times for fast and slow vehicles using predictable relationships based on speed differences.

MR states that modifying input should result in a predictable change in output

Since the speed of the vehicle is a known factor, it is possible to predict the new arrival times and verify whether they follow expected trends.

This is a direct application of metamorphic testing principles

In route optimization systems, metamorphic testing often applies transformations to speed, distance, or conditions to verify expected outcomes.

Why Other Options are Incorrect?

(B) Decomposing each route into traffic density and vehicle power ❌

While useful for statistical analysis, this approach does not generate follow-up test cases based on a defined metamorphic relation (MR).

This does not follow metamorphic testing principles, which require predictable transformations.

(D) Running fast vehicles on long routes and slow vehicles on short routes ❌

This method does not maintain a controlled MR and introduces too many uncontrolled variables.

References from ISTQB Certified Tester AI Testing Study Guide

Metamorphic testing generates follow-up test cases based on a source test case. "MT is a technique aimed at generating test cases which are based on a source test case that has passed. One or more follow-up test cases are generated by changing (metamorphizing) the source test case based on a metamorphic relation (MR)."

MT has been used for testing route optimization AI systems. "In the area of AI, MT has been used for testing image recognition, search engines, route optimization and voice recognition, among others."

Thus, option A is the correct answer, as it aligns with the principles of metamorphic testing by modifying input speeds and verifying expected results.
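The speed-based metamorphic relation can be sketched as a small check. Everything here is an assumption for illustration: `planned_travel_time` is a hypothetical stand-in for the AI route planner, the speeds are invented, and the relation assumed is that on the same route, travel time scales inversely with vehicle speed.

```python
def planned_travel_time(distance_km: float, speed_kmh: float) -> float:
    """Hypothetical stand-in for the AI route planner's estimate, in hours."""
    return distance_km / speed_kmh


def speed_relation_holds(planner, distance_km, source_speed, follow_up_speed,
                         tolerance=0.05):
    """MR: same route, speed scaled by k, so travel time should scale by about 1/k."""
    source_time = planner(distance_km, source_speed)        # source test case
    follow_up_time = planner(distance_km, follow_up_speed)  # follow-up test case
    expected = source_time * source_speed / follow_up_speed
    return abs(follow_up_time - expected) <= tolerance * expected


def faulty_planner(distance_km, speed_kmh):
    # A deliberately broken planner that ignores the vehicle's speed.
    return distance_km / 60.0


# Hypothetical speeds in km/h for the medium (source), fast and slow vehicles.
MEDIUM, FAST, SLOW = 60.0, 90.0, 30.0

print(speed_relation_holds(planned_travel_time, 120.0, MEDIUM, FAST))
print(speed_relation_holds(planned_travel_time, 120.0, MEDIUM, SLOW))
print(speed_relation_holds(faulty_planner, 120.0, MEDIUM, FAST))
```

The check needs no independently known correct travel time for the fast or slow vehicle. It only needs the source result and the relation, which is what makes metamorphic testing useful when a test oracle is hard to obtain.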

Question 12

A company is using a spam filter to attempt to identify which emails should be marked as spam. The filter creates detection rules that cause a message to be classified as spam. An attacker wishes to have all messages internal to the company be classified as spam. So, the attacker sends messages with obvious red flags in the body of the email and modifies the "From" field of the email to make it appear that the emails have been sent by company members. The testers plan to use exploratory data analysis (EDA) to detect the attack and use this information to prevent future adversarial attacks.

How could EDA be used to detect this attack?

Options:

A. EDA can help detect the outlier emails from the real emails.

B. EDA can detect and remove the false emails.

C. EDA can restrict how many inputs can be provided by unique users.

D. EDA cannot be used to detect the attack.

Answer: A

Explanation:

Exploratory Data Analysis (EDA) is an essential technique for examining datasets to uncover patterns, trends, and anomalies, including outliers. In this case, the attacker manipulates the spam filter by injecting emails with red flags and masking them as internal company emails. The primary goal of EDA here is to detect these adversarial modifications.

Detecting Outliers:

EDA techniques such as statistical analysis, clustering, and visualization can reveal patterns in email metadata (e.g., sender details, email content, frequency).

Outlier detection methods like Z-score, IQR (Interquartile Range), or machine learning-based anomaly detection can identify emails that significantly deviate from typical internal communications.

Identifying Distribution Shifts:

By analyzing the frequency and characteristics of emails flagged as spam, testers can detect if the attack has introduced unusual patterns.

If a surge of internal emails is suddenly classified as spam, EDA can help verify whether these classifications are consistent with historical data.

Feature Analysis for Adversarial Patterns:

EDA enables visualization techniques such as scatter plots or histograms to distinguish normal emails from manipulated ones.

Examining email metadata (e.g., changes in headers, unusual wording in email bodies) can reveal adversarial tactics.

Counteracting Adversarial Attacks:

Once anomalies are identified, the spam filter’s detection rules can be improved by retraining the model on corrected datasets.

The adversarial examples can be added to the training data to enhance the robustness of the filter against future attacks.

References from ISTQB Certified Tester AI Testing Study Guide

Exploratory Data Analysis (EDA) is used to detect outliers and adversarial attacks. "EDA is where data are examined for patterns, relationships, trends, and outliers. It involves the interactive, hypothesis-driven exploration of data."

EDA can identify poisoned or manipulated data by detecting anomalies and distribution shifts. "Testing to detect data poisoning is possible using EDA, as poisoned data may show up as outliers."

EDA helps validate ML models and detect potential vulnerabilities. "The use of exploratory techniques, primarily driven by data visualization, can help validate the ML algorithm being used, identify changes that result in efficient models, and leverage domain expertise."

Thus, option A is the correct answer, as EDA is specifically useful for detecting outliers, which can help identify manipulated spam emails.
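The IQR rule mentioned above can be sketched in a few lines. The feature used here, a count of red-flag keywords per email that claims an internal sender, and the counts themselves are hypothetical. The conventional 1.5 x IQR fences are an assumption rather than something specified by the question.

```python
from statistics import quantiles


def iqr_outliers(values, fence_factor=1.5):
    """Return the values lying outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - fence_factor * iqr
    upper_fence = q3 + fence_factor * iqr
    return [value for value in values if value < lower_fence or value > upper_fence]


# Hypothetical red-flag keyword counts for emails that claim an internal sender.
red_flag_counts = [0, 1, 0, 2, 1, 0, 9, 1, 1, 0, 2, 11]

suspicious = iqr_outliers(red_flag_counts)
print(suspicious)
```

The emails behind the flagged counts would then be inspected by hand, for example by comparing their headers with genuine internal mail, before any retraining of the filter.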

Question 13

“BioSearch” is creating an AI model used for predicting cancer occurrence by examining X-ray images. The accuracy of the model in isolation has been found to be good. However, when the model was put into practice in the diagnosis lab, its users started complaining about the poor quality of the results, especially its inability to detect real cancer cases, and the model was withdrawn from use.

A testing expert was called in to find the deficiencies in the test planning which led to the above scenario.

Which ONE of the following options would you expect to MOST likely be the reason to be discovered by the test expert?

SELECT ONE OPTION

Options:

A. A lack of similarity between the training and testing data.

B. The input data has not been tested for quality prior to use for testing.

C. A lack of focus on choosing the right functional-performance metrics.

D. A lack of focus on non-functional requirements testing.

Question 14

An ML engineer is trying to determine the correctness of a new open-source implementation, "X", of a supervised regression algorithm. R-Square is one of the functional performance metrics used to determine the quality of the model.

Which ONE of the following would be an APPROPRIATE strategy to achieve this goal?

SELECT ONE OPTION

Options:

A. Add 10% of the rows randomly and create another model and compare the R-Square scores of both models.

B. Train various models by changing the order of input features and verify that the R-Square score of these models varies significantly.

C. Compare the R-Square score of the model obtained using two different implementations that utilize two different programming languages while using the same algorithm and the same training and testing data.

D. Drop 10% of the rows randomly and create another model and compare the R-Square scores of both models.
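Several of the options rely on comparing R-Square scores. As a minimal sketch of that comparison, the function below computes R-Square as 1 - SS_res / SS_tot and applies it to the predictions of two implementations on the same test data. The target values, both sets of predictions, and the agreement tolerance are hypothetical.

```python
def r_squared(actual, predicted):
    """R-Square = 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical test targets and the predictions of two implementations.
test_targets = [3.0, 5.0, 7.0, 9.0]
predictions_x = [2.9, 5.1, 7.2, 8.8]          # implementation "X" under test
predictions_reference = [3.0, 5.0, 7.1, 8.9]  # an independent implementation

r2_x = r_squared(test_targets, predictions_x)
r2_reference = r_squared(test_targets, predictions_reference)
scores_agree = abs(r2_x - r2_reference) <= 0.01  # tolerance is an assumption
print(f"X: {r2_x:.3f}, reference: {r2_reference:.3f}, agree: {scores_agree}")
```

What counts as close enough depends on the algorithm and the data, so the tolerance would need to be agreed for the project rather than taken from this sketch.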

Question 15

Which ONE of the following combinations of Training, Validation, Testing data is used during the process of learning/creating the model?

SELECT ONE OPTION

Options:

A. Training data - validation data - test data

B. Training data - validation data

C. Training data - test data

D. Validation data - test data

Question 16

The stakeholders of a machine learning model have confirmed that they understand the objective and purpose of the model, and ensured that the proposed model aligns with their business priorities. They have also selected a framework and a machine learning model that they will be using.

What should be the next step to progress along the machine learning workflow?

Options:

A. Tune the machine learning algorithm based on objectives and business priorities

B. Prepare and pre-process the data that will be used to train and test the model

C. Agree on defined acceptance criteria for the machine learning model

D. Evaluate the selection of the framework and the model

Question 17

Consider a natural language processing (NLP) algorithm that attempts to predict the next word that you would like to type in a text message. An update to the algorithm has been created that should increase the accuracy of the predictions based on user typing patterns. The old algorithm was rated for accuracy by the users. Then, after the new update was released, the users rated the updated algorithm. A statistical test was used to compare between the two versions of the algorithm to see whether or not the update should remain in place.

This is an example of what type of testing?

Options:

A. Metamorphic testing

B. A/B testing

C. Exploratory testing

D. Pairwise testing
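The question does not say which statistical test was used to compare the two versions. As one possible illustration, the sketch below applies a one-sided permutation test to the difference in mean user rating. The ratings, the number of permutations, and the choice of a permutation test are all assumptions made for the example.

```python
import random
from statistics import mean


def permutation_p_value(old, new, n_permutations=5000, seed=0):
    """One-sided permutation test for mean(new) - mean(old) > 0."""
    rng = random.Random(seed)
    observed = mean(new) - mean(old)
    pooled = list(old) + list(new)
    at_least_as_large = 0
    for _ in range(n_permutations):
        # Reassign the ratings to the two versions at random.
        rng.shuffle(pooled)
        shuffled_old = pooled[:len(old)]
        shuffled_new = pooled[len(old):]
        if mean(shuffled_new) - mean(shuffled_old) >= observed - 1e-12:
            at_least_as_large += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return observed, (at_least_as_large + 1) / (n_permutations + 1)


# Hypothetical user ratings on a 1 to 5 scale for each version.
old_ratings = [3, 3, 4, 2, 3, 4, 3, 2, 3, 3]
new_ratings = [4, 5, 4, 4, 5, 3, 4, 5, 4, 4]

observed_gain, p_value = permutation_p_value(old_ratings, new_ratings)
print(f"mean rating gain = {observed_gain:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests the higher ratings are unlikely to be due to chance alone, which would support keeping the update in place.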

Question 18

Which ONE of the following options describes the LEAST LIKELY usage of AI for detection of GUI changes due to changes in test objects?

SELECT ONE OPTION

Options:

A. Using a pixel comparison of the GUI before and after the change to check the differences.

B. Using computer vision to compare the GUI before and after the test object changes.

C. Using vision-based detection of the GUI layout changes before and after test object changes.

D. Using an ML-based classifier to flag if changes in the GUI are to be flagged for humans.

Question 19

Data used for an object detection ML system was found to have been labelled incorrectly in many cases.

Which ONE of the following options is most likely the reason for this problem?

SELECT ONE OPTION

Options:

A. Security issues

B. Accuracy issues

C. Privacy issues

D. Bias issues

Question 20

Which of the following is correct regarding the layers of a deep neural network?

Options:

A. There is only an input and output layer

B. There is at least one internal hidden layer

C. There must be a minimum of five total layers to be considered deep

D. The output layer is not connected with the other layers to maintain integrity

Question 21

"AllerEgo" is a product that uses self-learning to predict the behavior of a pilot in combat situations for a variety of terrains and enemy aircraft formations. After training, the model was exposed to real-world data and was found to be behaving poorly. A lot of data quality tests had been performed on the data to bring it into a shape fit for training and testing.

Which ONE of the following options is LEAST likely to describe the possible reason for the fall in performance, especially when considering the self-learning nature of the AI system?

SELECT ONE OPTION

Options:

A. The difficulty of defining criteria for improvement before the model can be accepted.

B. The fast pace of change did not allow sufficient time for testing.

C. The unknown nature and insufficient specification of the operating environment might have caused the poor performance.

D. There was an algorithmic bias in the AI system.

Question 22

Upon testing a model used to detect rotten tomatoes, the following data was observed by the test engineer, based on a certain number of tomato images.

For this confusion matrix, which combination of values of accuracy, recall, and specificity, respectively, is CORRECT?

SELECT ONE OPTION

Options:

A. 0.87, 0.9, 0.84

B. 1, 0.87, 0.84

C. 1, 0.9, 0.8

D. 0.84, 1, 0.9
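The confusion matrix for this question is not reproduced on this page, so the three values cannot be recomputed from the exam's data. As a minimal sketch of the definitions only: accuracy = (TP + TN) / total, recall = TP / (TP + FN), and specificity = TN / (TN + FP). The counts used below are hypothetical and are not the values from the exam.

```python
def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Share of all predictions that were correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def recall(tp: int, fn: int) -> float:
    """Share of actual positives (rotten tomatoes) that were detected."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of actual negatives (good tomatoes) that were correctly passed."""
    return tn / (tn + fp)


# Hypothetical counts, NOT the values from the exam's confusion matrix.
tp, fn = 35, 15  # rotten tomatoes: detected and missed
fp, tn = 20, 30  # good tomatoes: wrongly flagged and correctly passed

metrics = (accuracy(tp, fp, fn, tn), recall(tp, fn), specificity(tn, fp))
print("accuracy, recall, specificity =", metrics)
```

Reading TP, FP, FN, and TN off the actual matrix and passing them to the same functions gives the three values in the order the question asks for.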

Question 23

In a certain coffee-producing region of Colombia, there have been some severe storms, resulting in massive losses in production. This caused a massive drop in the stock price of coffee.

Which ONE of the following types of testing SHOULD be performed for a machine learning model for stock-price prediction to detect the influence of such a phenomenon on the price of coffee stock?

SELECT ONE OPTION

Options:

A. Testing for accuracy

B. Testing for bias

C. Testing for concept drift

D. Testing for security
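One of the listed test types, testing for concept drift, is about noticing that the relationship a model learned no longer holds in operation. A very simple form of such a check compares the prediction error on recent data with the error on a baseline period. The prices, the window sizes, and the ratio threshold below are hypothetical.

```python
from statistics import mean


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def drift_suspected(baseline_error, recent_error, ratio_threshold=2.0):
    """Flag possible drift when recent error exceeds the baseline by the ratio."""
    return recent_error > ratio_threshold * baseline_error


# Hypothetical prices: predictions track the market before the storms, not after.
baseline_actual = [100.0, 101.0, 99.0, 100.0]
baseline_predicted = [99.0, 102.0, 100.0, 101.0]
recent_actual = [80.0, 78.0, 75.0, 74.0]
recent_predicted = [98.0, 99.0, 100.0, 99.0]

baseline_mae = mean_absolute_error(baseline_actual, baseline_predicted)
recent_mae = mean_absolute_error(recent_actual, recent_predicted)
print(baseline_mae, recent_mae, drift_suspected(baseline_mae, recent_mae))
```

In practice such a check would run regularly on operational data, and a flag would trigger investigation and possibly retraining rather than an automatic change to the model.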

Question 24

There is a growing backlog of unresolved defects for your project. You know the developers have an ML model that they have created which has learned which developers work on which type of software and the speed with which they resolve issues. How could you use this model to help reduce the backlog and implement more efficient defect resolution?

Options:

A. Use it to prioritize defects automatically based on the time expected for the fix to be made, the speed of the fix, and the likelihood of regressions.

B. Use it to assign defects to the best developer to resolve the problem and to load balance the defect assignments among the developers.

C. Use it to determine the root cause of each defect and develop a process improvement plan that can be implemented to remove the most common root causes.

D. Use it to review the code and determine where more defects are likely to occur so that testing can be targeted to those areas.
