
Databricks Certified Machine Learning Professional Exam Practice Test (Databricks-Machine-Learning-Professional)

Databricks Certified Machine Learning Professional Questions and Answers

Question 1

A data scientist has developed a scikit-learn model sklearn_model and they want to log the model using MLflow.

They write the following incomplete code block:

Which of the following lines of code can be used to fill in the blank so the code block can successfully complete the task?

Options:

A.

mlflow.spark.track_model(sklearn_model, "model")

B.

mlflow.sklearn.log_model(sklearn_model, "model")

C.

mlflow.spark.log_model(sklearn_model, "model")

D.

mlflow.sklearn.load_model("model")

E.

mlflow.sklearn.track_model(sklearn_model, "model")

Question 2

A machine learning engineering team has written predictions computed in a batch job to a Delta table for querying. However, the team has noticed that the querying is running slowly. The team has already tuned the size of the data files. Upon investigating, the team has concluded that the rows meeting the query condition are sparsely located throughout each of the data files.

Based on the scenario, which of the following optimization techniques could speed up the query by colocating similar records while considering values in multiple columns?

Options:

A.

Z-Ordering

B.

Bin-packing

C.

Write as a Parquet file

D.

Data skipping

E.

Tuning the file size
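
For background on one of the listed options: Z-Ordering takes its name from the Z-order (Morton) space-filling curve, which maps values from several columns onto a single sort key by interleaving their bits. The toy sketch below illustrates only that idea for two integer columns. It is not Delta Lake's implementation, and the function name, the sample rows, and the 16-bit width are assumptions made for illustration.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Return the Morton (Z-order) code of two non-negative integers."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # bits of x go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # bits of y go to odd positions
    return code


# Hypothetical (col_a, col_b) pairs. Sorting by the Morton code tends to place
# rows that are close in BOTH columns near each other in the sorted order.
rows = [(0, 0), (3, 3), (0, 1), (1, 0), (1, 1), (2, 2)]
z_sorted = sorted(rows, key=lambda r: interleave_bits(r[0], r[1]))
```

When rows are written to files in this order, each file covers a narrow range of both columns at once, which is why a filter on either column can skip more files.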

Question 3

A machine learning engineer is in the process of implementing a concept drift monitoring solution. They are planning to use the following steps:

1. Deploy a model to production and compute predicted values

2. Obtain the observed (actual) label values

3. _____

4. Run a statistical test to determine if there are changes over time

Which of the following should be completed as Step #3?

Options:

A.

Obtain the observed (actual) feature values

B.

Measure the latency of the prediction time

C.

Retrain the model

D.

None of these should be completed as Step #3

E.

Compute the evaluation metric using the observed and predicted values
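
Step 4 of the procedure above calls for a statistical test over time. As a minimal sketch of what such a test could look like, the snippet below computes Welch's t statistic between a reference window and a recent window of some monitored quantity. It assumes the quantity has already been summarized as one number per time period and leaves open what that quantity is. The function name and the sample values are hypothetical, and a real monitoring pipeline may well use a different test.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(reference, recent):
    """Welch's t statistic for the difference in means (recent - reference)."""
    diff = mean(recent) - mean(reference)
    # Standard error built from the unbiased sample variances of each window.
    se = sqrt(variance(reference) / len(reference) + variance(recent) / len(recent))
    if se == 0:
        # Both windows are constant: no spread to compare against.
        return 0.0 if diff == 0 else float("inf")
    return diff / se


# Hypothetical per-period values of the monitored quantity.
reference_window = [0.1, 0.2, 0.3]
recent_window = [0.5, 0.6, 0.7]
t_stat = welch_t(reference_window, recent_window)
```

A large absolute value of the statistic suggests the two windows differ. Turning it into a decision requires a threshold or a p-value, which this sketch deliberately leaves out.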

Question 4

A machine learning engineer and data scientist are working together to convert a batch deployment to an always-on streaming deployment. The machine learning engineer has expressed that rigorous data tests must be put in place as a part of their conversion to account for potential changes in data formats.

Which of the following describes why these types of data type tests and checks are particularly important for streaming deployments?

Options:

A.

Because the streaming deployment is always on, all types of data must be handled without producing an error

B.

All of these statements

C.

Because the streaming deployment is always on, there is no practitioner to debug poor model performance

D.

Because the streaming deployment is always on, there is a need to confirm that the deployment can autoscale

E.

None of these statements

Question 5

A machine learning engineering team wants to build a continuous pipeline for data preparation of a machine learning application. The team would like the data to be fully processed and made ready for inference in a series of equal-sized batches.

Which of the following tools can be used to provide this type of continuous processing?

Options:

A.

Spark UDFs

B.

Structured Streaming

C.

MLflow

D.

Delta Lake

E.

AutoML

Question 6

A machine learning engineer has created a webhook with the following code block:

Which of the following code blocks will trigger this webhook to run the associated job?

A)

B)

C)

D)

E)

Options:

A.

Option A

B.

Option B

C.

Option C

D.

Option D

E.

Option E

Question 7

A data scientist has developed and logged a scikit-learn random forest model model, and then they ended their Spark session and terminated their cluster. After starting a new cluster, they want to review the feature_importances_ of the original model object.

Which of the following lines of code can be used to restore the model object so that feature_importances_ is available?

Options:

A.

mlflow.load_model(model_uri)

B.

client.list_artifacts(run_id)["feature-importances.csv"]

C.

mlflow.sklearn.load_model(model_uri)

D.

This can only be viewed in the MLflow Experiments UI

E.

client.pyfunc.load_model(model_uri)

Question 8

A data scientist has written a function to track the runs of their random forest model. The data scientist is changing the number of trees in the forest across each run.

Which of the following MLflow operations is designed to log single values like the number of trees in a random forest?

Options:

A.

mlflow.log_artifact

B.

mlflow.log_model

C.

mlflow.log_metric

D.

mlflow.log_param

E.

There is no way to store values like this.

Question 9

A machine learning engineer is manually refreshing a model in an existing machine learning pipeline. The pipeline uses the MLflow Model Registry model "project". The machine learning engineer would like to add a new version of the model to "project".

Which of the following MLflow operations can the machine learning engineer use to accomplish this task?

Options:

A.

mlflow.register_model

B.

MlflowClient.update_registered_model

C.

mlflow.add_model_version

D.

MlflowClient.get_model_version

E.

The machine learning engineer needs to create an entirely new MLflow Model Registry model

Question 10

A machine learning engineer wants to view all of the active MLflow Model Registry Webhooks for a specific model.

They are using the following code block:

Which of the following changes does the machine learning engineer need to make to this code block so it will successfully accomplish the task?

Options:

A.

There are no necessary changes

B.

Replace list with view in the endpoint URL

C.

Replace POST with GET in the call to http request

D.

Replace list with webhooks in the endpoint URL

E.

Replace POST with PUT in the call to http request

Question 11

A machine learning engineer wants to programmatically create a new Databricks Job whose schedule depends on the result of some automated tests in a machine learning pipeline.

Which of the following Databricks tools can be used to programmatically create the Job?

Options:

A.

MLflow APIs

B.

AutoML APIs

C.

MLflow Client

D.

Jobs cannot be created programmatically

E.

Databricks REST APIs
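
For context on one of the listed options: the Databricks REST APIs include a Jobs endpoint (`POST /api/2.1/jobs/create`) that accepts a job definition with an optional Quartz cron schedule. The sketch below only builds such a request using the standard library and does not send it. The workspace URL, token, notebook path, cluster ID, job name, and the rule that maps the test result to a schedule are all hypothetical placeholders.

```python
import json
from urllib.request import Request


def build_create_job_request(host: str, token: str, tests_passed: bool) -> Request:
    """Build (but do not send) a Jobs API 2.1 create request."""
    # Hypothetical rule: run hourly if the tests passed, otherwise daily at midnight.
    cron = "0 0 * * * ?" if tests_passed else "0 0 0 * * ?"
    payload = {
        "name": "example-ml-pipeline-job",  # hypothetical job name
        "tasks": [
            {
                "task_key": "score",
                "notebook_task": {"notebook_path": "/Repos/example/score"},  # placeholder
                "existing_cluster_id": "0000-000000-example",  # placeholder
            }
        ],
        "schedule": {"quartz_cron_expression": cron, "timezone_id": "UTC"},
    }
    return Request(
        url=f"{host}/api/2.1/jobs/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="POST",
    )


# Placeholder workspace URL and token; nothing is sent over the network here.
request = build_create_job_request("https://example.cloud.databricks.com", "TOKEN", True)
```

Passing the returned object to `urllib.request.urlopen` would submit it. That step is omitted so the sketch stays free of side effects.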

Question 12

Which of the following tools can assist in real-time deployments by packaging software with its own application, tools, and libraries?

Options:

A.

Cloud-based compute

B.

None of these tools

C.

REST APIs

D.

Containers

E.

Autoscaling clusters

Question 13

Which of the following deployment paradigms can centrally compute predictions for a single record with exceedingly fast results?

Options:

A.

Streaming

B.

Batch

C.

Edge/on-device

D.

None of these strategies will accomplish the task.

E.

Real-time

Question 14

A machine learning engineer is attempting to create a webhook that will trigger a Databricks Job job_id when a model version for model model transitions into any MLflow Model Registry stage.

They have the following incomplete code block:

Which of the following lines of code can be used to fill in the blank so that the code block accomplishes the task?

Options:

A.

"MODEL_VERSION_CREATED"

B.

"MODEL_VERSION_TRANSITIONED_TO_PRODUCTION"

C.

"MODEL_VERSION_TRANSITIONED_TO_STAGING"

D.

"MODEL_VERSION_TRANSITIONED_STAGE"

E.

"MODEL_VERSION_TRANSITIONED_TO_STAGING", "MODEL_VERSION_TRANSITIONED_TO_PRODUCTION"

Question 15

Which of the following MLflow operations can be used to delete a model from the MLflow Model Registry?

Options:

A.

client.transition_model_version_stage

B.

client.delete_model_version

C.

client.update_registered_model

D.

client.delete_model

E.

client.delete_registered_model

Question 16

A machine learning engineer wants to deploy a model for real-time serving using MLflow Model Serving. For the model, the machine learning engineer currently has one model version in each of the stages in the MLflow Model Registry. The engineer wants to know which model versions can be queried once Model Serving is enabled for the model.

Which of the following lists all of the MLflow Model Registry stages whose model versions are automatically deployed with Model Serving?

Options:

A.

Staging, Production, Archived

B.

Production

C.

None, Staging, Production, Archived

D.

Staging, Production

E.

None, Staging, Production

Question 17

Which of the following describes the concept of MLflow Model flavors?

Options:

A.

A convention that deployment tools can use to wrap preprocessing logic into a Model

B.

A convention that MLflow Model Registry can use to version models

C.

A convention that MLflow Experiments can use to organize their Runs by project

D.

A convention that deployment tools can use to understand the model

E.

A convention that MLflow Model Registry can use to organize its Models by project

Question 18

A machine learning engineer has developed a model and registered it using the FeatureStoreClient fs. The model has model URI model_uri. The engineer now needs to perform batch inference on customer-level Spark DataFrame spark_df, but it is missing a few of the static features that were used when training the model. The customer_id column is the primary key of spark_df and the training set used when training and logging the model.

Which of the following code blocks can be used to compute predictions for spark_df when the missing feature values can be found in the Feature Store by searching for features by customer_id?

Options:

A.

df = fs.get_missing_features(spark_df, model_uri)

fs.score_model(model_uri, df)

B.

fs.score_model(model_uri, spark_df)

C.

df = fs.get_missing_features(spark_df, model_uri)

fs.score_batch(model_uri, df)

D.

df = fs.get_missing_features(spark_df)

fs.score_batch(model_uri, df)

E.

fs.score_batch(model_uri, spark_df)