Special Summer Sale Limited Time 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: 70special

CompTIA DA0-001 CompTIA Data+ Certification Exam Exam Practice Test

Page: 1 / 36
Total 363 questions

CompTIA Data+ Certification Exam Questions and Answers

Testing Engine

  • Product Type: Testing Engine
$37.5  $124.99

PDF Study Guide

  • Product Type: PDF Study Guide
$33  $109.99
Question 1

Which of the following is a best practice when updating a legacy data source?

Options:

A.

Placing old data in new fields

B.

Keeping only the most recent data

C.

Creating a codebook to document field changes

D.

Removing the data source from production

Question 2

Which of following is a non-relational database?

Options:

A.

Neo4j

B.

SQLite

C.

MySQL

D.

PostgreSQL

Question 3

Which of the following is an example of a discrete variable?

Options:

A.

The temperature of a hot tub

B.

The height of a horse

C.

The time to complete a task

D.

The number of people in an office

Question 4

Which of the following is the best reason for removing data outliers?

Options:

A.

Data varies significantly from others.

B.

Data is redundant in the table.

C.

Data is duplicated in the whole range.

D.

Data is missing from the table.

Question 5

A company’s marketing department wants to do a promotional campaign next month. A data analyst on the team has been asked to perform customer segmentation, looking at how recently a customer bought the product, at what frequency, and at what value. Which of the following types of analysis would this practice be considered?

Options:

A.

Prescriptive

B.

Trend

C.

Gap

D.

Custer

Question 6

Mario works with a group of R programmers tasked with copying data from an accounting system into a data warehouse.

In what phase are the group's R skills most relevant?

Options:

A.

Extract.

B.

Load.

C.

Transform.

D.

Purge.

Question 7

Which of the following is an example of a at flat file?

Options:

A.

CSV file

B.

PDF file

C.

JSON file

D.

JPEG file

Question 8

A data analyst is asked to create a sales report for the second-quarter 2020 board meeting, which will include a review of the business’s performance through the second quarter. The board meeting will be held on July 15, 2020, after the numbers are finalized. Which of the following report types should the data analyst create?

Options:

A.

Static

B.

Real-time

C.

Self-service

D.

Dynamic

Question 9

A data analyst is designing a dashboard that will provide a story of sales and determine which site is providing the highest sales volume per customer. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:

Which of the following types of charts should be considered?

Options:

A.

Include a line chart using the site and average sales per customer.

B.

Include a pie chart using the site and sales to average sales per customer.

C.

Include a scatter chart using sales volume and average sales per customer.

D.

Include a column chart using the site and sales to average sales per customer.

Question 10

Kelly wants to get feedback on the final draft of a strategic report that has taken her six months to develop.

What can she do to get prevent confusion as see seeks feedback before publishing the report?

Choose the best answer.

Options:

A.

Distribute the report to the appropriate stakeholders via email.

B.

Use a watermark to identify the report as a draft.

C.

Show the report to her immediate supervisor.

D.

Publish the report on an internally facing website.

Question 11

After completing web scraping, which of the following file formats needs to be parsed?

Options:

A.

.html

B.

.txt

C.

.csv

D.

.tsv

Question 12

An organization wants to evaluate whether project activities are within the set projections and in line to meet the desired project targets. Which of the following types of analysis is best suited for this situation?

Options:

A.

Trend analysis

B.

Performance analysis

C.

Descriptive analysis

D.

Exploratory analysis

Question 13

Which of the following query statements would be used when filtering data in a relational database management system? (Select two).

Options:

A.

ORDER BY

B.

HAVING

C.

WHERE

D.

SELECT

E.

INSERT

F.

GROUP BY

Question 14

A data analyst is setting up a data dashboard to monitor several ETL data streams to ensure that data is complete for later analysis. Which of the following audiences should the analyst target for this dashboard?

Options:

A.

Executives

B.

The management team

C.

Technical experts

D.

External vendors

Question 15

Which of the following data types should an analyst use to provide the most flexibility when recording emails on a form?

Options:

A.

Alphanumeric

B.

Text

C.

Discrete

D.

Continuous

Question 16

A county in Illinois is conducting a survey to determine the mean annual income per household. The county is 427sq mi (2.65q km). Which of the following sampling methods would MOST likely result in a representative sample?

Options:

A.

A stratified phone survey of 100 people that is conducted between 2:00 p.m. and 3:00 p.m.

B.

A systematic survey that is sent to 100 single-family homes in the county

C.

Surveys sent to ten randomly selected homes within 5mi (8km) of the county’s office

D.

Surveys sent to 100 randomly selected homes that are reflective of the population

Question 17

A company notifies its employees that emails will be automatically moved to a cloud-based server in 180 days. Which of the following describes this concept?

Options:

A.

Data deletion

B.

Data processing

C.

Data retention

D.

Data constraints

Question 18

A site reliability team wants to monitor the stability of their website. so they can proactively diagnose issues when they occur Which of the following deliverables would best suit their needs?

Options:

A.

A self-serve dashboard of website performance that updates in real time

B.

A weekly log report of site visits and user actions

C.

A portal that is refreshed daily and reports errors classified by type

D.

A daily summary email indicating website outages for the previous day

Question 19

Which of the following is the best technique for transferring data from one database to another with some data manipulation?

Options:

A.

Application programming interfaces

B.

Delta load

C.

Extract, transform, load

D.

Export/import

Question 20

A web developer wants to ensure that malicious users can't type SQL statements when they asked for input, like their username/userid.

Which of the following query optimization techniques would effectively prevent SQL Injection attacks?

Options:

A.

Indexing.

B.

Subset of records.

C.

Temporary table in the query set.

D.

Parametrization.

Question 21

A data analyst needs to create a dashboard to help identify trends in the data sets. Which of the following is an appropriate consideration for dashboard development?

Options:

A.

Data sources and attributes

B.

Frequently asked questions

C.

A report from the data source

D.

A comparison of data sets

Question 22

A JSON file is an example of:

Options:

A.

structured data.

B.

web data.

C.

machine data.

D.

processed data.

Question 23

An analyst is designing a dashboard that will provide a story of the sales and sales customer ratio. The following data is available:

Which of the following charts should the analyst consider including in the dashboard?

Options:

A.

A column chart with site and sales

B.

A line chart with site and sales

C.

A pie chart with site and sales

D.

A scatter chart with site and sales

Question 24

An analyst collected data that includes primary account numbers, expiration dates, and service codes. Which of the following data governance classifications is used to describe this data?

Options:

A.

PI I

B.

PCI

C.

PBI

D.

PHI

Question 25

Which of the following defines the policies and procedures for managing the master data?

Options:

A.

Data administration

B.

Data stewardship

C.

Data ownership

D.

Data governance

Question 26

Which of the following is an example of structured data?

Options:

A.

A credit card number

B.

An email

C.

A photo

D.

Social media correspondence

Question 27

A business intelligence engineer needs to reduce the size of a data model for reporting purposes. The data set contains more than one million rows, and the table has a date-time column named Date. Which of the following should the analyst do to complete this task?

Options:

A.

Change the data type of the Date column to text.

B.

Trim the date.

C.

Round the hour of the Date column to the start of the hour.

D.

Split the Date column into two columns—time and date.

Question 28

Which of the following contains alphanumeric values?

Options:

A.

10.1Ε²

B.

13.6

C.

1347

D.

A3J7

Question 29

An analyst wants to extract data from a variety of sources and store the data in a cloud-based environment prior to cleaning. Which of the following integration techniques should the analyst use?

Options:

A.

ETL

B.

API

C.

SQL

D.

ELT

Question 30

An analyst is working with a data set that lists individuals' first and last names in separate columns. Which of the following processes should the analyst use to combine the first and last names into a single spreadsheet cell?

Options:

A.

Transpose

B.

Blend

C.

Concatenate

D.

Merges

Question 31

A data analyst needs to collect a similar proportion of data from every state. Which of the following sampling methods would be the most appropriate?

Options:

A.

Systematic sampling

B.

Convenience sampling

C.

Stratified sampling

D.

Random sampling

Question 32

Which of the following is a difference between a primary key and a unique key?

Options:

A.

A unique key cannot take null values, whereas a primary key can take null values.

B.

There can be only one primary key in a data set, whereas there can be multiple unique keys.

C.

A primary key can take a value more than once, whereas a unique key cannot take a value more than once.

D.

A primary key cannot be a date variable, whereas a unique key can be.

Question 33

A data analyst is performing a data merge within a spreadsheet using the tables below:

The analyst is attempting to pull the addresses from Table 2 into Table 1 using the last names and is receiving an error message. Which of the following steps can the analyst perform to fix the error?

Options:

A.

Use concatenate to combine the tables.

B.

Ensure the formula is pulling from right to left.

C.

Sort the data by the last name field.

D.

Review the spelling and data type.

Question 34

Which of the following is an example of a strategy to reduce statistical errors?

Options:

A.

Removing outliers

B.

Adding more data

C.

Transformation

D.

Recoding data

Question 35

A data scientist wants to see which products make the most money and which products attract the most customer purchasing interest in their company.

Which of the following data manipulation techniques would he use to obtain this information?

Options:

A.

Data append

B.

Data blending

C.

Normalize data

D.

Data merge

Question 36

An e-commerce company recently tested a new website layout. The website was tested by a test group of customers, and an old website was presented to a control group. The table below shows the percentage of users in each group who made purchases on the websites:

Which of the following conclusions is accurate at a 95% confidence interval?

Options:

A.

In Germany, the increase in conversion from the new layout was not significant.

B.

In France, the increase in conversion from the new layout was not significant.

C.

In general, users who visit the new website are more likely to make a purchase.

D.

The new layout has the lowest conversion rates in the United Kingdom.

Question 37

A data architect is designing a data solution for a retail clothing store chain. Each store has a database that tracks sales transactions. The data architect needs to create a summary table that will be used for a senior executive dashboard. The summary table should not contain duplicate store information. Which of the following should the data architect create?

Options:

A.

A check constraint

B.

A primary key

C.

A foreign key

D.

A unique constraint

Question 38

A client has requested an analysis of all pet care items purchased by current customers and their social media connections in the past 12 months. Which of the following data analysis techniques would be the best choice given these requirements?

Options:

A.

Trend analysis

B.

Performance analysis

C.

Link analysis

D.

Exploratory data analysis

Question 39

Randy scored 76 on a math test, Katie scored 86 on a science test, Ralph scored 80 on a history test, and Jean scored 80 on an English test. The table below contains the mean and standard deviation of the scores for each of the courses:

Using this information, which of the following students had the BEST score?

Options:

A.

Randy

B.

Katie

C.

Ralph

D.

Jean

Question 40

Which of the following should an analyst do to best summarize the data on a data set?

Options:

A.

Filtering

B.

Aggregation

C.

Sorting

D.

Concatenation

Question 41

Which one the following is not considered an aggregate function?

Options:

A.

SUM

B.

MIN

C.

SELECT

D.

MAX

Question 42

Consider the following dataset which contains information about houses that are for sale:

Which of the following string manipulation commands will combine the address and region namecolumns to create a full address?

full_address------------------------- 85 Turner St, Northern Metropolitan 25 Bloomburg St, Northern Metropolitan 5 Charles St, Northern Metropolitan 40 Federation La, Northern Metropolitan 55a Park St, Northern Metropolitan

Options:

A.

SELECT CONCAT(address, ' , ' , regionname) AS full_address FROM melb LIMIT 5;

B.

SELECT CONCAT(address, '-' , regionname) AS full_address FROM melb LIMIT 5;

C.

SELECT CONCAT(regionname, ' , ' , address) AS full_address FROM melb LIMIT 5

D.

SELECT CONCAT(regionname, '-' , address) AS full_address FROM melb LIMIT 5;

Question 43

Which of the following best describes a difference between JSON and XML?

Options:

A.

JSON is quicker to read and write.

B.

JSON has to use an end tag.

C.

JSON strings are longer

D.

JSON is much more difficult to parse.

Question 44

Exhibit.

Which of the following logical statements results in Table B?

A)

B)

C)

D)

Options:

A.

Option A

B.

Option B

C.

Option C

D.

Option D

Question 45

Which of the following is the best description of discrete data types?

Options:

A.

Non-numeric data used to describe attributes of a population sample

B.

The frequency of the number of times each value occurs by using whole numbers

C.

Numeric values that can be measured on a continuous scale

D.

Non-numeric data used to describe attributes of a population sample ranked in a specific order

Question 46

Given the diagram below:

Which of the following data schemas shown?

Options:

A.

Key-value pairs

B.

Online transactional processing

C.

Data Lake

D.

Relational database

Question 47

Given the diagram below:

Which of the following steps is missing?

Options:

A.

Remove redundant data.

B.

Validate the data types.

C.

Connect to the data API.

D.

Normalize the data.

Question 48

Which of the following data manipulation techniques is an example of a logical function?

Options:

A.

WHERE

B.

AGGREGATE

C.

BOOLEAN

D.

IF

Question 49

Which of the following differentiates a flat text file from other data types?

Options:

A.

Data is separated by a delimiter.

B.

Data is stored in defined rows.

C.

Data is defined with key-value pairs.

D.

Data is housed in a markup language.

Question 50

Given the following graph:

Which of the following summary statements upholds integrity in data reporting?

Options:

A.

Sales are approximately equal for Product A and Product B across all strategies.

B.

Strategy 4 provides the best sales in comparison to other strategies.

C.

While Strategy 2 does not result in the highest sales of Product D, over all products it appears to be the most effective.

D.

Product D should be promoted more than the other products in all strategies.

Question 51

You would like to measure how well an organization is achieving its goals.

What type of analysis should you perform?

Options:

A.

Performance analysis.

B.

Outlier analysis.

C.

Predictive analysis.

D.

Trend analysis.

Question 52

Which one of the following programming languages is specifically designed for use in analytics applications?

Options:

A.

Python.

B.

R

C.

C++

D.

Java.

Question 53

Given the following table:

Which of the following methods is the best way to describe the changes in the values in the table?

Options:

A.

Average

B.

Range

C.

Standard deviation

D.

Median

Question 54

A data analyst has been asked to derive a new variable labeled “Promotion_flag” based on the total quantity sold by each salesperson. Given the table below:

Which of the following functions would the analyst consider appropriate to flag “Yes” for every salesperson who has a number above 1,000,000 in the Quantity_sold column?

Options:

A.

Date

B.

Mathematical

C.

Logical

D.

Aggregate

Question 55

A data analyst must separate the column shown below into multiple columns for each component of the name:

Which of the following data manipulation techniques should the analyst perform?

Options:

A.

Imputing

B.

Transposing

C.

Parsing

D.

Concatenating

Question 56

A gambler thinks that a coin is fair and is equally likely to turn up heads or tails when the coin is flipped. Which of the following tests should the gambler use to fest this hypothesis?

Options:

A.

t-test

B.

Chi-squared test

C.

Rank sum test

D.

Ratio test

Question 57

Which of the following is an example of PII?

Options:

A.

Age

B.

Name

C.

Ethnicity

D.

Gender

Question 58

Which of the ing is the correct ion for a tab-delimited spre file?

Options:

A.

tap

B.

tar

C.

sv

D.

az

Question 59

The number of phone calls that the call center receives in a day is an example of:

Options:

A.

continuous data.

B.

categorical data.

C.

ordinal data.

D.

discrete data.

Question 60

A database consists of one fact table that is composed of multiple dimensions. Depending on the dimension, each one can be represented by a denormalized table or multiple normalized tables. This structure is an example of a:

Options:

A.

transactional schema.

B.

star schema.

C.

non-relational schema.

D.

snowflake schema.

Question 61

Which of the following descriptive statistical methods are measures of central tendency? (Choose two.)

Options:

A.

Mean

B.

Minimum

C.

Mode

D.

Variance

E.

Correlation

F.

Maximum

Question 62

Which of the following is the first step an analyst should perform upon receiving a business request for analysis?

Options:

A.

Determine the data needs and sources for analysis.

B.

Initiate the analysis for exploratory data analysis.

C.

Review the business questions to understand the scope.

D.

Finalize the methodology to solve the problem.

Question 63

Which of the following techniques is used to quantify data?

Options:

A.

Decoding

B.

Enumeration

C.

Coding

D.

Structure

Question 64

Which one of the following would not normally be considered a summary statistic?

Options:

A.

z-score.

B.

Mean.

C.

Variance.

D.

Standard deviation.

Question 65

Which of the following is a relational database?

Options:

A.

SQL

B.

Excel

C.

JSON

D.

NoSQL

Question 66

A customer survey reveals 90% positive feedback. Which of the following statistical methods would be best to utilize to determine the reliability of a data set and predict how a larger sample of customers over the same time period might respond?

Options:

A.

Calculate a high variance on survey responses.

B.

Calculate the maximum range of the survey responses.

C.

Calculate a low standard deviation on survey responses.

D.

Remove any data more than 4 standard deviation from the mean.

Question 67

A data analyst wants to create "Income Categories" that would be calculated based on the existing variable "Income". The "Income Categories" would be as follows:

Income category 1: less than $1.

Income category 2: more than $1 and less than $20,000.

Income category 3: more than $20,001 and less than $40,000.

Income category 4: more than $40,001.

Which of the following data manipulation techniques should the data analyst use to create "Income Categories"?

Options:

A.

Data merge

B.

Derived variables

C.

Data blending

D.

Data append

Question 68

Samantha needs to share a list of her organization's top 50 customers with the VP of sales.

She would like to include the name of the customer, the business they represent, their contact information, and their total sales over the past year.

The VP does not have any specialized analytics skills or software but would like to make some personal notes on the dataset.

What would be the best tool for Samantha to use to share this information?

Options:

A.

Power BI.

B.

Microsoft Excel.

C.

Minitab.

D.

SAS.

Question 69

Given the following report:

Which of the following components need to be added to ensure the report is point-in-time and static? (Select two).

Options:

A.

A control group for the phrases

B.

A summary of the KPIs

C.

Filter buttons for the status

D.

The date when the report was last accessed

E.

The time period lhe report covers

F.

The date on which the report was run

Question 70

Which of the following describes the use of a representative amount of data from a main repository?

Options:

A.

Observation

B.

Delta load

C.

Web scraping

D.

Sampling

Question 71

Given the following data tables:

Which of the following MDM processes needs to take place FIRST?

Options:

A.

Creation of a data dictionary

B.

Compliance with regulations

C.

Standardization of data field names

D.

Consolidation of multiple data fields

Question 72

Jenny wants to study the academic performance of undergraduate sophomores and wants to determine the average grade point average at different points during an academic year.

What best describes the data set she needs?

Options:

A.

Sample.

B.

Observation.

C.

Variable.

D.

Population.

Question 73

A healthcare data analyst notices that one data set in the column for BloodPressure contains several outliers that need to be replaced with meaningful values. Which of the following data manipulation techniques should the analyst use?

Options:

A.

Recode

B.

Impute

C.

Append

D.

Reduction

Question 74

An analyst conducted a preliminary analysis for a data set and identified several patterns and anomalies. Which of the following analysis techniques did the analyst use?

Options:

A.

Performance analysis

B.

Exploratory analysis

C.

Link analysis

D.

Trend analysis

Question 75

An e-commerce company recently tested a new website layout. The website was tested by a test group of customers, and an old website was presented to a control group. The table below shows the percentage of users in each group who made purchases on the websites:

Which of the following conclusions is accurate at a 95% confidence interval?

Options:

A.

In Germany, the increase in conversion from the new layout was not significant.

B.

In France, the increase in conversion from the new layout was not significant.

C.

In general, users who visit the new website are more likely to make a purchase.

D.

The new layout has the lowest conversion rates in the United Kingdom.

Question 76

Which one of the following in NOT a common data integration tool?

Options:

A.

XSS

B.

ELT

C.

ETL

D.

APIs

Question 77

An analyst reviews the following table:

Which of the following data types is represented in the values in the RefNo column?

Options:

A.

Numeric

B.

Real Number

C.

Currency

D.

Alphanumeric

Question 78

A data analyst needs to create a weekly recurring report on sales performance and distribute it to all sales managers. Which of the following would be the BEST method to automate and ensure successful delivery for this task?

Options:

A.

Use scheduled report delivery.

B.

Implement subscription access delivery.

C.

Print out a copy.

D.

Upload the report to the server.

Question 79

Standardized tests are given to students in the middle of each month, and the results are ready by the end of the month. The superintendent needs a quick view of test performance. Which of the following would be the best recommendation to meet the superintendent's requirements?

Options:

A.

A dashboard with a continuous data stream and saved searches

B.

A report of test scores by classroom, emailed to the superintendent at the end of the month

C.

A report of test scores with pie charts showing student performance

D.

A dashboard with a scheduled delivery, the ability to filter scores by school, and bar charts for comparison

Question 80

An analyst for a small business with multiple locations is using each location’s quarterly sales reports from last year to create a single revenue report for the year. Which of the following data mining techniques should the analyst use to complete this task?

Options:

A.

Data merge

B.

Data append

C.

Data blending

D.

Data imputation

Question 81

Which of the following data manipulation techniques should an analyst use to hide unnecessary data during analysis?

Options:

A.

Filtering

B.

Parametrization

C.

Sorting

D.

Indexing

Question 82

An analyst modified a data set that had a number of issues. Given the original and modified versions:

Which of the following data manipulation techniques did the analyst use?

Options:

A.

Imputation

B.

Recoding

C.

Parsing

D.

Deriving

Question 83

A sales team wants visibility of current sales numbers, pipeline, and team performance. The team would also like to see calculations of individuals’ earned commissions and projected commissions based on sales, but they want that information to be kept confidential. Which of the following would be the BEST way to provide this visibility?

Options:

A.

Create a dashboard displaying a data refresh date so users know the current sales numbers and configure permissions to control access.

B.

Create a dashboard for sales numbers, pipeline, and team and individual performance for the management team.

C.

Create a dashboard with filters for the overall team, individuals, and management. Users can filter to see the data they want.

D.

Create a dashboard with views for team, individuals, and management. Configure permissions to control access.

Question 84

A data engineer is creating a database field to capture whether a customer likes vanilla ice cream. Which of the following data types is the best to capture this information?

Options:

A.

Integer

B.

Boolean

C.

Categorical

D.

Numeric

Question 85

Given the table below:

Which of the following variable types BEST describes the “Year” column?

Options:

A.

Numeric

B.

Date

C.

Alphanumeric

D.

Text

Question 86

An analyst computed a new variable of income per day in the household by multiplying the number of days worked by the number of people working in the household and the income earned per day. Which of the following is the correct name for this new variable?

Options:

A.

Derived

B.

Categorical

C.

Continuous

D.

Control

Question 87

Which of the following data types must be used when working with variables that require classification into two or more groups before analysis?

Options:

A.

Discrete

B.

Numerical

C.

Alphanumeric

D.

Categorical

Question 88

Which of the following is the most likely reason for a data analyst to optimize a query using parameterization?

Options:

A.

To return a subset of records

B.

To insert a temporary table

C.

To prevent SQL injections

D.

To increase the query speed

Question 89

Given the following:

Which of the following is the most important thing for an analyst to do when transforming the table for a trend analysis?

Options:

A.

Fill in the missing cost where it is null.

B.

Separate the table into two tables and create a primary key

C.

Replace the extended cost field with a calculated field.

D.

Correct the dates so they have the same format.

Question 90

Which of the following should be accomplished NEXT after understanding a business requirement for a data analysis report?

Options:

A.

Rephrase the business requirement.

B.

Determine the data necessary for the analysis

C.

Build a mock dashboard/presentation layout.

D.

Perform exploratory data analysis.

Question 91

A data analyst needs to create a data visualization that aids in un the cumulative impact of sequentially introduced values that are positive or negative. Which of the following

data visualization methods should the analyst use?

Options:

A.

A bubble chart

B.

A waterfall chart

C.

A scatter plot

D.

A line chart

Question 92

A data analyst has been asked to merge the tables below, first performing an INNER JOIN and then a LEFT JOIN:

Customer Table -

In-store Transactions –

Which of the following describes the number of rows of data that can be expected after performing both joins in the order stated, considering the customer table as the main table?

Options:

A.

INNER: 6 rows; LEFT: 9 rows

B.

INNER: 9 rows; LEFT: 6 rows

C.

INNER: 9 rows; LEFT: 15 rows

D.

INNER: 15 rows; LEFT: 9 rows

Question 93

A data analyst must fulfill a request for information that is needed weekly and should be automatically emailed to a specific set of users. Which of the following types of reports should theanalyst recommend?

Options:

A.

A self-service report

B.

A research report

C.

An ad hoc report

D.

An operational report

Question 94

You have two databases tables that you would like to join together using a foreign key relationship.

What term best describes this action?

Options:

A.

Blending.

B.

Appending.

C.

Mixing.

D.

Merging.

Question 95

What role in a data governance is typically responsible for day-to-day oversight of data use?

Options:

A.

Data processors.

B.

Data custodians

C.

Data owners.

D.

Data stewards.

Question 96

Which of the following BEST describes the issue in which character values are mixed with integer values in a data set column?

Options:

A.

Duplicate data

B.

Missing data

C.

Data outliers

D.

Invalid data type

Question 97

Which of the following actions should be taken when transmitting data to mitigate the chance of a data leak occurring? (Choose two.)

Options:

A.

Data identification

B.

Data processing

C.

Data Reporting

D.

Data encryption

E.

Data masking

F.

Fata removal

Question 98

A data analyst has a set of data that shows the number of gallons of oil produced each day. The company would like to know the standard deviation for the data set. The variance for the data is 36 gallons. Which of the following is the standard deviation for gallons produced?

Options:

A.

1.16

B.

6

C.

36

D.

72

Question 99

Five dogs have the following heights in millimeters:

300, 430, 170, 470, 600

Which of the following is the mean height for the five dogs?

Options:

A.

394mm

B.

405mm

C.

493mm

D.

504mm

Question 100

Which one of the following is a measure of dispersion?

Options:

A.

Variance.

B.

Mode.

C.

Median.

D.

Mean.

Question 101

Which of the following technologies would be best suited for creating a multiple linear regression model?

Options:

A.

Microsoft Power Bl

B.

R

C.

SQL

D.

Tableau

Question 102

A sales manager wants quarterly sales reports broken down by unit and week. Which of the following data output lists includes the most necessary information?

Options:

A.

Order number. salesperson. date shipped, recipient address, and price

B.

Item name, salesperson. recipient address, shipping cost. and date shipped

C.

Item number, item name, salesperson. date sold. and price

D.

Item name. salesperson. price. shipping cost. and date shipped

Question 103

Which of the following would be used to store unstructured data from different sources?

Options:

A.

A data lake

B.

A database management system

C.

A database

D.

A data warehouse

Question 104

An analyst is explaining the company’s financial systems and reporting tools to a new coworker. Which of the following data quality dimensions are the most important? (Select three).

Options:

A.

Data formatting

B.

Data accuracy

C.

Data maturity

D.

Data field

E.

Data completeness

F.

Data consistency

G.

Data diversity

Question 105

Which of the following statistical methods requires two or more categorical variables?

Options:

A.

Simple linear regression

B.

Chi-squared test

C.

Z-test

D.

Two-sample t-test

Question 106

A data analyst has been asked to create a sales report that calculates the rolling 12-month average for sales. If the report will be published on November 1, 2020, which of the following months shouts the report cover?

Options:

A.

October 1, 2019 to October 31, 2020

B.

October 31, 2020 to November 1, 2021

C.

November 1, 2019 to October 31, 2020

D.

October 31, 2019 to October 31, 2020

Question 107

Which of the following query optimization techniques involves examining only the data that is needed for a particular task?

Options:

A.

Making a temporary table

B.

Creating a flat file

C.

Indexing documents

D.

Creating an execution plan

Question 108

Angela is aggregating data from CRM system with data from an employee system.

While performing an initial quality check, she realizes that her employee ID is not associated with her identifier in the CRM system.

What kind of issues is Angela facing?

Choose the best answer.

Options:

A.

ETL process.

B.

Record linkage.

C.

ELT process.

D.

System integration.

Page: 1 / 36
Total 363 questions