Offering Free access to Databricks Certification Databricks-Certified-Professional-Data-Scientist Exam Questions Pool Bank

Databricks Certified Professional Data Scientist Exam Questions and Answers

Question 1

Which of the following are advantages of support vector machines (SVMs)?

Options:

Effective in high dimensional spaces.

It is memory efficient

Possible to specify custom kernels

Effective in cases where number of dimensions is greater than the number of samples

When the number of features is much greater than the number of samples, the method still gives good performance

SVMs directly provide probability estimates

Question 2

In which of the following scenarios should you apply Bayes' theorem?

Options:

The sample space is partitioned into a set of mutually exclusive events {A1, A2, . .., An }.

Within the sample space, there exists an event B, for which P(B) > 0.

The analytical goal is to compute a conditional probability of the form: P(Ak | B ).

In all of the above cases
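The three conditions listed above are the ingredients of Bayes' theorem over a partition. As a minimal sketch, assuming made-up priors P(Ai) and likelihoods P(B | Ai) (the numbers and the function name `posterior` are illustrative, not from the source), the conditional probability P(Ak | B) can be computed as:

```python
def posterior(priors, likelihoods, k):
    """Return P(A_k | B) for a partition A_1..A_n.

    priors[i] is P(A_i); likelihoods[i] is P(B | A_i).
    """
    # P(B) by the law of total probability over the partition
    evidence = sum(p * l for p, l in zip(priors, likelihoods))
    if evidence <= 0:
        raise ValueError("P(B) must be greater than 0")
    return priors[k] * likelihoods[k] / evidence

# Hypothetical values chosen only for illustration
priors = [0.5, 0.3, 0.2]        # mutually exclusive events, sum to 1
likelihoods = [0.1, 0.4, 0.7]   # P(B | A_i)

post = [posterior(priors, likelihoods, k) for k in range(len(priors))]
print(post)
```

The posteriors sum to 1 because the events Ai partition the sample space.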

Question 3

You are building a model to recommend books at Amazon.com. Which of the following recommender systems would you use so that you do not have a cold-start problem?

Options:

Naive Bayes classifier

Item-based collaborative filtering

User-based collaborative filtering

Content-based filtering

Question 4

In which of the following scenarios can you use regression to predict values?

Options:

Samsung can use it for mobile sales forecast

Mobile companies can use it to forecast manufacturing defects

Predicting the probability of a celebrity divorce

Only 1 and 2

All of 1, 2 and 3

Question 5

Which of the following is not a correct application of classification?

Options:

credit scoring

tumor detection

image recognition

drug discovery

Question 6

Select the correct statement which applies to K-Nearest Neighbors

Options:

No Assumption about the data

Computationally expensive

Requires less memory

Works with Numeric Values

Question 7

You have data on 1000 patients, with age (in years) and height (in meters) recorded for each. You want to cluster the patients using these two attributes, and you want age and height to have a nearly equal effect on the clustering. What can you do?

Options:

Add the numeric value 100 to each height

Convert each height value to centimeters

Divide both age and height by their respective standard deviations

Take the square root of height
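One of the options above involves dividing each attribute by its standard deviation. As a minimal sketch of what that operation does, using a handful of hypothetical patient values (not from the source) and only the Python standard library:

```python
import statistics

def scale_by_std(values):
    """Divide each value by the sample standard deviation of its column."""
    sd = statistics.stdev(values)
    return [v / sd for v in values]

# Hypothetical patient data, for illustration only
ages_years = [25, 40, 55, 70, 35]
heights_m = [1.60, 1.75, 1.82, 1.68, 1.90]

scaled_ages = scale_by_std(ages_years)
scaled_heights = scale_by_std(heights_m)

# Before scaling, age spreads over tens of units and height over tenths,
# so age would dominate a Euclidean distance. After scaling, both columns
# have a standard deviation of 1.
print(statistics.stdev(ages_years), statistics.stdev(heights_m))
print(statistics.stdev(scaled_ages), statistics.stdev(scaled_heights))
```

Changing the unit of one attribute (for example meters to centimeters) only changes which attribute dominates the distance; scaling each attribute by its own spread puts them on a comparable footing.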

Question 8

Clustering is a type of unsupervised learning with the following goals

Options:

Maximize a utility function

Find similarities in the training data

Not to maximize a utility function

1 and 2

2 and 3

Question 9

A fruit may be considered to be an apple if it is red, round, and about 3" in diameter. A naive Bayes classifier considers each of these features to contribute independently to the probability that this fruit is an apple, regardless of the

Options:

Presence of the other features.

Absence of the other features.

Presence or absence of the other features

None of the above

Question 10

A problem statement is given below:

Hospital records show that of patients suffering from a certain disease, 75% die of it. What is the probability that of 6 randomly selected patients, 4 will recover?

Which of the following models would you use to solve it?

Options:

Binomial

Poisson

Normal

Any of the above
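The problem statement above describes a fixed number of trials with two outcomes each. As a sketch, assuming each patient's outcome is independent and has the same recovery probability of 1 - 0.75 = 0.25 (an assumption the question implies but does not state), the probability of exactly 4 recoveries out of 6 can be worked out with the binomial formula:

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_recover = 1 - 0.75          # 75% die, so 25% recover
prob = binomial_pmf(4, 6, p_recover)

# C(6,4) * 0.25^4 * 0.75^2 = 15 * 0.00390625 * 0.5625
print(round(prob, 4))  # 0.033
```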

Question 11

If E1 and E2 are two events, how do you represent the conditional probability that E2 occurs given that E1 has occurred?

Options:

P(E1)/P(E2)

P(E1+E2)/P(E1)

P(E2)/P(E1)

P(E2)/P(E1+E2)

Question 12

You are studying the behavior of a population, and you are provided with multidimensional data at the individual level. You have identified four specific individuals who are valuable to your study, and would like to find all users who are most similar to each individual. Which algorithm is the most appropriate for this study?

Options:

Association rules

Decision trees

Linear regression

K-means clustering

Question 13

What is the best way to ensure that the k-means algorithm will find a good clustering of a collection of vectors?

Options:

Only consider values of k larger than log(N), where N is the number of observations in the data set

Run at least log(N) iterations of Lloyd's algorithm, where N is the number of observations in the data set

Choose the initial centroids so that they all lie along different axes

Choose the initial centroids so that they are far away from each other

Question 14

Consider the following two cases for movie ratings:

1. You recommend a four-star movie to a user, but he really doesn't like it and would rate it two stars.

2. You recommend a three-star movie, but the user loves it (he would rate it five stars).

Which statement correctly applies?

Options:

In both cases, the contribution to the RMSE is the same

The contribution to the RMSE is different in the two cases

In both cases, the contribution to the RMSE could vary

None of the above
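RMSE is built from squared differences between predicted and actual ratings, so each case's contribution can be checked directly. A minimal sketch, treating the recommended star value as the prediction and the user's rating as the actual value (that mapping is an assumption about how the question intends the numbers to be read):

```python
def squared_error(predicted, actual):
    """Squared difference that one rating contributes to the RMSE sum."""
    return (predicted - actual) ** 2

case_1 = squared_error(predicted=4, actual=2)  # over-prediction by 2 stars
case_2 = squared_error(predicted=3, actual=5)  # under-prediction by 2 stars

print(case_1, case_2)  # 4 4
```

Squaring discards the sign of the error, so over-predicting and under-predicting by the same amount contribute equally.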

Question 15

A data scientist is asked to implement an article recommendation feature for an on-line magazine.

The magazine does not want to use client tracking technologies such as cookies or reading history. Therefore, only the style and subject matter of the current article is available for making recommendations. All of the magazine's articles are stored in a database in a format suitable for analytics.

Which method should the data scientist try first?

Options:

K Means Clustering

Naive Bayesian

Logistic Regression

Association Rules

Question 16

Which of the following questions falls under the data science category?

Options:

What happened in the last six months?

How many products were sold in the last month?

Where is the problem for sales?

Which is the optimal scenario for selling this product?

What happens if this scenario continues?

Question 17

In which of the following scenarios can you use a linear regression model?

Options:

Predicting home price based on location and house area

Predicting demand for goods and services based on the weather

Predicting tumor size reduction based on the number of radiation treatments

Predicting textbook sales based on the number of students in the state

Question 18

As a data science consultant at ABC Corp, you are working on a recommendation engine that suggests learning resources to end users. Which recommender system technique benefits most from additional user preference data?

Options:

Naive Bayes classifier

Item-based collaborative filtering

Logistic Regression

Content-based filtering

Question 19

You are doing advanced analytics for a medical application using regression. You have two input variables, weight and height, which are very important and cannot be ignored, but they are also highly correlated. What is the best solution?

Options:

Take the cube root of height

Take the square root of weight

Take the square of height

Consider using BMI (Body Mass Index)
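For reference, the last option above combines the two correlated inputs into a single derived feature. A minimal sketch of the standard BMI formula (weight in kilograms divided by the square of height in meters), with hypothetical input values:

```python
def body_mass_index(weight_kg, height_m):
    """BMI = weight (kg) divided by height (m) squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2

# Hypothetical patient, for illustration only
bmi = body_mass_index(70.0, 1.75)
print(round(bmi, 2))  # 22.86
```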

Question 20

You are creating a regression model with a customer's income, education and current debt as inputs. What could be a possible output of this model?

Options:

Customer rated as a good fit

Customer rated in the acceptable or average category

The probability, expressed as a percentage, that the customer will default on a loan

1 and 3 are correct

2 and 3 are correct
