CertNexus AIP-210 CertNexus Certified Artificial Intelligence Practitioner (CAIP) Exam Practice Test

Page: 1 / 9
Total 90 questions

CertNexus Certified Artificial Intelligence Practitioner (CAIP) Questions and Answers

Question 1

In which of the following scenarios is lasso regression preferable over ridge regression?

Options:

A.

The number of features is much larger than the sample size.

B.

There are many features with no association with the dependent variable.

C.

There is high collinearity among some of the features associated with the dependent variable.

D.

The sample size is much larger than the number of features.
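As a hedged illustration of how the two penalties behave, the sketch below assumes an orthonormal design, where both estimators reduce to a per-coefficient transformation of the ordinary least squares estimate. The function names, the penalty value `lam`, and the sample coefficients are hypothetical and chosen only for illustration.

```python
import math

def lasso_shrink(b_ols, lam):
    # Soft-thresholding: minimizer of 0.5*(b - b_ols)^2 + lam*|b|.
    # Coefficients with |b_ols| <= lam are set exactly to zero.
    return math.copysign(max(abs(b_ols) - lam, 0.0), b_ols)

def ridge_shrink(b_ols, lam):
    # Proportional shrinkage: minimizer of 0.5*(b - b_ols)^2 + 0.5*lam*b^2.
    # Coefficients get smaller but never reach exactly zero.
    return b_ols / (1.0 + lam)

# Hypothetical OLS estimates: two strong features, two near-irrelevant ones.
ols = [3.0, -2.0, 0.4, -0.1]
lam = 0.5

lasso = [lasso_shrink(b, lam) for b in ols]
ridge = [ridge_shrink(b, lam) for b in ols]

print("lasso:", lasso)
print("ridge:", ridge)
```

Under these assumptions, the lasso penalty removes the two weak coefficients entirely, while the ridge penalty keeps all four at reduced magnitude.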

Question 2

When working with textual data and trying to classify text into different languages, which approach to representing features makes the most sense?

Options:

A.

Bag of words model with TF-IDF

B.

Bag of bigrams (2 letter pairs)

C.

Word2Vec algorithm

D.

Clustering similar words and representing words by group membership
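For readers unfamiliar with letter-pair features, here is a minimal sketch of how character bigram counts can be extracted from text. The helper name `char_bigrams` and the two sample phrases are assumptions made for illustration; a real language classifier would be trained on far more text.

```python
from collections import Counter

def char_bigrams(text):
    # Count every pair of adjacent characters (spaces included), case-folded.
    s = text.lower()
    return Counter(s[i:i + 2] for i in range(len(s) - 1))

# Hypothetical sample phrases.
english = char_bigrams("the weather is nice")
spanish = char_bigrams("el tiempo es agradable")

print(english.most_common(3))
print(spanish.most_common(3))
```

Letter pairs such as "th" appear in the English sample but not in the Spanish one, which is the kind of signal a bigram representation exposes without requiring a vocabulary for each language.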

Question 3

Which of the following is the primary purpose of hyperparameter optimization?

Options:

A.

Controls the learning process of a given algorithm

B.

Makes models easier to explain to business stakeholders

C.

Improves model interpretability

D.

Increases recall over precision

Question 4

Workflow design patterns for machine learning pipelines:

Options:

A.

Aim to explain how the machine learning model works.

B.

Represent a pipeline with directed acyclic graph (DAG).

C.

Seek to simplify the management of machine learning features.

D.

Separate inputs from features.
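To make the idea of a pipeline expressed as a directed acyclic graph concrete, the sketch below uses Python's standard `graphlib` module (Python 3.9+). The stage names are hypothetical; each key maps a stage to the set of stages it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each stage lists the stages that must run before it.
pipeline = {
    "ingest": set(),
    "validate": {"ingest"},
    "featurize": {"validate"},
    "train": {"featurize"},
    "evaluate": {"train", "validate"},
}

# static_order() yields a valid execution order and raises
# graphlib.CycleError if the dependencies contain a cycle.
order = list(TopologicalSorter(pipeline).static_order())
print(order)
```

Because the graph has no cycles, every stage can be scheduled after all of its dependencies, which is what lets an orchestrator run or rerun stages safely.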

Question 5

Which of the following describes a benefit of machine learning for solving business problems?

Options:

A.

Increasing the quantity of original data

B.

Increasing the speed of analysis

C.

Improving the constraint of the problem

D.

Improving the quality of original data

Question 6

For each of the last 10 years, your team has been collecting data from a group of subjects, including their age and numerous biomarkers collected from blood samples. You are tasked with creating a prediction model of age using the biomarkers as input. You start by performing a linear regression using all of the data over the 10-year period, with age as the dependent variable and the biomarkers as predictors.

Which assumption of linear regression is being violated?

Options:

A.

Equality of variance (Homoscedasticity)

B.

Independence

C.

Linearity

D.

Normality

Question 7

You have a dataset with thousands of features, all of which are categorical. Using these features as predictors, you are tasked with creating a prediction model to accurately predict the value of a continuous dependent variable. Which of the following would be appropriate algorithms to use? (Select two.)

Options:

A.

K-means

B.

K-nearest neighbors

C.

Lasso regression

D.

Logistic regression

E.

Ridge regression

Question 8

Which of the following text vectorization methods is appropriate and correctly defined for an English-to-Spanish translation machine?

Options:

A.

Using TF-IDF because in translation machines, we do not care about the order of the words.

B.

Using TF-IDF because in translation machines, we need to consider the order of the words.

C.

Using Word2vec because in translation machines, we do not care about the order of the words.

D.

Using Word2vec because in translation machines, we need to consider the order of the words.

Question 9

Which of the following is TRUE about SVM models?

Options:

A.

They can be used only for classification.

B.

They can be used only for regression.

C.

They can take the feature space into higher dimensions to solve the problem.

D.

They use the sigmoid function to classify the data points.
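The notion of moving data into a higher-dimensional feature space can be sketched with a toy example. Note the assumptions: this uses an explicit feature map `lift` and a hand-picked threshold, whereas SVM kernels perform such mappings implicitly and learn the boundary from data. The points, labels, and names are hypothetical.

```python
# One-dimensional points: class 1 lies on both sides of class 0,
# so no single cut point on the x-axis separates the classes.
points = [-3.0, -2.0, -0.5, 0.0, 0.5, 2.0, 3.0]
labels = [1, 1, 0, 0, 0, 1, 1]

def lift(x):
    # Map x into two dimensions: (x, x^2).
    return (x, x * x)

def predict(x, threshold=2.0):
    # In the lifted space, a horizontal line on the x^2 axis separates the classes.
    return 1 if lift(x)[1] > threshold else 0

predictions = [predict(x) for x in points]
print(predictions == labels)
```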

Question 10

Your dependent variable Y is a count, ranging from 0 to infinity. Because Y is approximately log-normally distributed, you decide to log-transform the data prior to performing a linear regression.

What should you do before log-transforming Y?

Options:

A.

Add 1 to all of the Y values.

B.

Divide all the Y values by the standard deviation of Y.

C.

Explore the data for outliers.

D.

Subtract the mean of Y from all the Y values.
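A short sketch of why zeros matter when taking logarithms of count data. The sample `counts` are hypothetical; `math.log1p` computes log(1 + y) and `math.expm1` inverts it.

```python
import math

# Hypothetical count data that includes a zero.
counts = [0, 1, 4, 9, 99]

# The plain logarithm is undefined at zero.
try:
    math.log(0)
    log_zero_fails = False
except ValueError:
    log_zero_fails = True

# log1p(y) = log(1 + y) is defined for every non-negative count.
logged = [math.log1p(y) for y in counts]

# expm1 maps the transformed values back to the original scale.
restored = [math.expm1(v) for v in logged]

print(log_zero_fails, logged)
```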

Question 11

A product manager is designing an Artificial Intelligence (AI) solution and wants to do so responsibly, evaluating both positive and negative outcomes.

The team creates a shared taxonomy of potential negative impacts and conducts an assessment along vectors such as severity, impact, frequency, and likelihood.

Which modeling technique does this team use?

Options:

A.

Business

B.

Harms

C.

Process

D.

Threat

Question 12

The graph is an elbow plot showing the inertia (within-cluster sum of squares) on the y-axis and the number of clusters (K) on the x-axis, denoting how inertia changes as the number of clusters changes under the k-means algorithm.

What would be an optimal value of K to ensure a good number of clusters?

Options:

A.

2

B.

3

C.

5

D.

9

Question 13

Which of the following methods can be used to rebalance a dataset using the rebalance design pattern?

Options:

A.

Bagging

B.

Boosting

C.

Stacking

D.

Weighted class
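As an illustration of class weighting, the sketch below computes inverse-frequency weights using the common heuristic n_samples / (n_classes * class_count). The function name and the 90/10 label split are assumptions for the example, not part of the question.

```python
from collections import Counter

def balanced_class_weights(labels):
    # Weight each class inversely to its frequency so that every class
    # contributes the same total weight to the loss.
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}

# Hypothetical imbalanced dataset: 90 negatives, 10 positives.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)
print(weights)
```

With these weights, each minority example counts nine times as much as a majority example, so both classes carry equal total weight during training.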

Question 14

Which two of the following statements about the beta value in an A/B test are accurate? (Select two.)

Options:

A.

The Beta value is the rate of type II errors for the test.

B.

The Beta value is the rate of type I errors for the test.

C.

The statistical power of a test is the inverse of the Beta value, or 1 - Beta.

D.

The Beta in an Alpha/Beta test represents one of the two variants of the A/B test.
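The relationship between the type II error rate and statistical power can be sketched numerically. This assumes a one-sided z-test with a known standard error; the function name and the effect size, standard error, and alpha values are hypothetical.

```python
from statistics import NormalDist

def type_ii_error_rate(effect, std_err, alpha=0.05):
    # Probability of failing to reject the null hypothesis when the true
    # effect equals `effect`, for a one-sided z-test at level `alpha`.
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)
    return z.cdf(critical - effect / std_err)

beta = type_ii_error_rate(effect=0.5, std_err=0.2)
power = 1 - beta  # probability of detecting the effect when it exists
print(round(beta, 3), round(power, 3))
```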

Question 15

Which two of the following decrease technical debt in ML systems? (Select two.)

Options:

A.

Boundary erosion

B.

Design anti-patterns

C.

Documentation readability

D.

Model complexity

E.

Refactoring

Question 16

Why do data skews happen in the ML pipeline?

Options:

A.

Test and evaluation data are designed incorrectly.

B.

There is a mismatch between live input data and offline data.

C.

There is a mismatch between live output data and offline data.

D.

There is insufficient training data for evaluation.

Question 17

A company is developing a merchandise sales application. The product team uses training data to teach the AI model to predict sales, and discovers emergent bias. What caused the biased results?

Options:

A.

The AI model was trained in winter and applied in summer.

B.

The application was migrated from on-premise to a public cloud.

C.

The team set flawed expectations when training the model.

D.

The training data used was inaccurate.

Question 18

A big data architect needs to be cautious about personally identifiable information (PII) that may be captured with their new IoT system. What is the final stage of the Data Management Life Cycle, which the architect must complete in order to implement data privacy and security appropriately?

Options:

A.

De-Duplicate

B.

Destroy

C.

Detain

D.

Duplicate

Question 19

In general, models that perform their tasks:

Options:

A.

Less accurately are less robust against adversarial attacks.

B.

Less accurately are neither more nor less robust against adversarial attacks.

C.

More accurately are less robust against adversarial attacks.

D.

More accurately are neither more nor less robust against adversarial attacks.

Question 20

Which of the following regressions will help when there is the existence of near-linear relationships among the independent variables (collinearity)?

Options:

A.

Clustering

B.

Linear regression

C.

Polynomial regression

D.

Ridge regression
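To show numerically what collinearity does to a least squares problem, the sketch below builds the 2x2 Gram matrix X^T X for two nearly identical features and compares its determinant with and without an added penalty term `lam` on the diagonal. The data, function names, and penalty value are hypothetical.

```python
def gram(x1, x2):
    # Entries of the symmetric 2x2 matrix X^T X for two feature columns.
    a = sum(u * u for u in x1)
    b = sum(u * v for u, v in zip(x1, x2))
    d = sum(v * v for v in x2)
    return a, b, d

def det_with_penalty(x1, x2, lam):
    # Determinant of X^T X + lam * I.
    a, b, d = gram(x1, x2)
    return (a + lam) * (d + lam) - b * b

# Two almost perfectly collinear features.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 2.0, 3.0, 4.001]

print(det_with_penalty(x1, x2, 0.0))  # near zero: nearly singular
print(det_with_penalty(x1, x2, 1.0))  # far from zero: stable to invert
```

A determinant close to zero means the unpenalized normal equations are nearly singular, so coefficient estimates become unstable; adding a positive value to the diagonal keeps the matrix comfortably invertible.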

Question 21

You are developing a prediction model. Your team indicates they need an algorithm that is fast and requires low memory and low processing power. Assuming the following algorithms have similar accuracy on your data, which is most likely to be an ideal choice for the job?

Options:

A.

Deep learning neural network

B.

Random forest

C.

Ridge regression

D.

Support-vector machine

Question 22

Which three security measures could be applied in different ML workflow stages to defend them against malicious activities? (Select three.)

Options:

A.

Disable logging for model access.

B.

Launch ML instances in a virtual private cloud (VPC).

C.

Monitor model degradation.

D.

Use data encryption.

E.

Use max privilege to control access to ML artifacts.

F.

Use Secrets Manager to protect credentials.

Question 23

Which of the following sentences is true about model evaluation and model validation in ML pipelines?

Options:

A.

Model evaluation and validation are the same.

B.

Model evaluation is defined as an external component.

C.

Model validation is defined as a set of tasks to confirm the model performs as expected.

D.

Model validation occurs before model evaluation.

Question 24

A change in the relationship between the target variable and input features is

Options:

A.

concept drift.

B.

covariate shift.

C.

data drift.

D.

model decay.

Question 25

Which of the following items should be included in a handover to the end user to enable them to use and run a trained model on their own system? (Select three.)

Options:

A.

Information on the folder structure in your local machine

B.

Intermediate data files

C.

Link to a GitHub repository of the codebase

D.

README document

E.

Sample input and output data files

Question 26

When should you use semi-supervised learning? (Select two.)

Options:

A.

A small set of labeled data is available but not representative of the entire distribution.

B.

A small set of labeled data is biased toward one class.

C.

Labeling data is challenging and expensive.

D.

There is a large amount of labeled data to be used for predictions.

E.

There is a large amount of unlabeled data to be used for predictions.

Question 27

Which of the following sentences is TRUE about the definition of cloud models for machine learning pipelines?

Options:

A.

Data as a Service (DaaS) can host the databases providing backups, clustering, and high availability.

B.

Infrastructure as a Service (IaaS) can provide CPU, memory, disk, network and GPU.

C.

Platform as a Service (PaaS) can provide some services within an application such as payment applications to create efficient results.

D.

Software as a Service (SaaS) can provide AI practitioner data science services such as Jupyter notebooks.
