Which of the following is a best practice when updating a legacy data source?
Which of the following is a non-relational database?
Which of the following is an example of a discrete variable?
Which of the following is the best reason for removing data outliers?
A company’s marketing department wants to do a promotional campaign next month. A data analyst on the team has been asked to perform customer segmentation, looking at how recently a customer bought the product, at what frequency, and at what value. Which of the following types of analysis would this practice be considered?
Mario works with a group of R programmers tasked with copying data from an accounting system into a data warehouse.
In what phase are the group's R skills most relevant?
Which of the following is an example of a flat file?
A data analyst is asked to create a sales report for the second-quarter 2020 board meeting, which will include a review of the business’s performance through the second quarter. The board meeting will be held on July 15, 2020, after the numbers are finalized. Which of the following report types should the data analyst create?
A data analyst is designing a dashboard that will provide a story of sales and determine which site is providing the highest sales volume per customer. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:
Which of the following types of charts should be considered?
Kelly wants to get feedback on the final draft of a strategic report that has taken her six months to develop.
What can she do to prevent confusion as she seeks feedback before publishing the report?
Choose the best answer.
After completing web scraping, which of the following file formats needs to be parsed?
An organization wants to evaluate whether project activities are within the set projections and in line to meet the desired project targets. Which of the following types of analysis is best suited for this situation?
Which of the following query statements would be used when filtering data in a relational database management system? (Select two).
A data analyst is setting up a data dashboard to monitor several ETL data streams to ensure that data is complete for later analysis. Which of the following audiences should the analyst target for this dashboard?
Which of the following data types should an analyst use to provide the most flexibility when recording emails on a form?
A county in Illinois is conducting a survey to determine the mean annual income per household. The county is 427 sq mi (about 1,106 sq km). Which of the following sampling methods would MOST likely result in a representative sample?
A company notifies its employees that emails will be automatically moved to a cloud-based server in 180 days. Which of the following describes this concept?
A site reliability team wants to monitor the stability of their website so they can proactively diagnose issues when they occur. Which of the following deliverables would best suit their needs?
Which of the following is the best technique for transferring data from one database to another with some data manipulation?
A web developer wants to ensure that malicious users can't type SQL statements when they are asked for input, such as their username or user ID.
Which of the following query optimization techniques would effectively prevent SQL Injection attacks?
A data analyst needs to create a dashboard to help identify trends in the data sets. Which of the following is an appropriate consideration for dashboard development?
A JSON file is an example of:
An analyst is designing a dashboard that will provide a story of the sales and sales customer ratio. The following data is available:
Which of the following charts should the analyst consider including in the dashboard?
An analyst collected data that includes primary account numbers, expiration dates, and service codes. Which of the following data governance classifications is used to describe this data?
Which of the following defines the policies and procedures for managing the master data?
Which of the following is an example of structured data?
A business intelligence engineer needs to reduce the size of a data model for reporting purposes. The data set contains more than one million rows, and the table has a date-time column named Date. Which of the following should the analyst do to complete this task?
Which of the following contains alphanumeric values?
An analyst wants to extract data from a variety of sources and store the data in a cloud-based environment prior to cleaning. Which of the following integration techniques should the analyst use?
An analyst is working with a data set that lists individuals' first and last names in separate columns. Which of the following processes should the analyst use to combine the first and last names into a single spreadsheet cell?
A data analyst needs to collect a similar proportion of data from every state. Which of the following sampling methods would be the most appropriate?
Which of the following is a difference between a primary key and a unique key?
A data analyst is performing a data merge within a spreadsheet using the tables below:
The analyst is attempting to pull the addresses from Table 2 into Table 1 using the last names and is receiving an error message. Which of the following steps can the analyst perform to fix the error?
Which of the following is an example of a strategy to reduce statistical errors?
A data scientist wants to see which products make the most money and which products attract the most customer purchasing interest in their company.
Which of the following data manipulation techniques would the data scientist use to obtain this information?
An e-commerce company recently tested a new website layout. The website was tested by a test group of customers, and an old website was presented to a control group. The table below shows the percentage of users in each group who made purchases on the websites:
Which of the following conclusions is accurate at a 95% confidence interval?
A data architect is designing a data solution for a retail clothing store chain. Each store has a database that tracks sales transactions. The data architect needs to create a summary table that will be used for a senior executive dashboard. The summary table should not contain duplicate store information. Which of the following should the data architect create?
A client has requested an analysis of all pet care items purchased by current customers and their social media connections in the past 12 months. Which of the following data analysis techniques would be the best choice given these requirements?
Randy scored 76 on a math test, Katie scored 86 on a science test, Ralph scored 80 on a history test, and Jean scored 80 on an English test. The table below contains the mean and standard deviation of the scores for each of the courses:
Using this information, which of the following students had the BEST score?
Which of the following should an analyst do to best summarize the data on a data set?
Which one of the following is not considered an aggregate function?
Consider the following dataset which contains information about houses that are for sale:
Which of the following string manipulation commands will combine the address and region name columns to create a full address?
full_address
-------------------------
85 Turner St, Northern Metropolitan
25 Bloomburg St, Northern Metropolitan
5 Charles St, Northern Metropolitan
40 Federation La, Northern Metropolitan
55a Park St, Northern Metropolitan
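The expected output above amounts to joining two text columns with a comma and a space. A minimal Python sketch, assuming hypothetical field names `address` and `region_name` (the question's actual dataset and command options are not shown here):

```python
# Rows taken from the expected output above; the field names are assumed
# for illustration and may differ from the real dataset's column names.
houses = [
    {"address": "85 Turner St", "region_name": "Northern Metropolitan"},
    {"address": "25 Bloomburg St", "region_name": "Northern Metropolitan"},
    {"address": "5 Charles St", "region_name": "Northern Metropolitan"},
]

# Concatenate the two columns, using ", " as the separator.
full_addresses = [f"{h['address']}, {h['region_name']}" for h in houses]

for line in full_addresses:
    print(line)
```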
Which of the following best describes a difference between JSON and XML?
Exhibit.
Which of the following logical statements results in Table B?
A)
B)
C)
D)
Which of the following is the best description of discrete data types?
Given the diagram below:
Which of the following data schemas is shown?
Given the diagram below:
Which of the following steps is missing?
Which of the following data manipulation techniques is an example of a logical function?
Which of the following differentiates a flat text file from other data types?
Given the following graph:
Which of the following summary statements upholds integrity in data reporting?
You would like to measure how well an organization is achieving its goals.
What type of analysis should you perform?
Which one of the following programming languages is specifically designed for use in analytics applications?
Given the following table:
Which of the following methods is the best way to describe the changes in the values in the table?
A data analyst has been asked to derive a new variable labeled “Promotion_flag” based on the total quantity sold by each salesperson. Given the table below:
Which of the following functions would the analyst consider appropriate to flag “Yes” for every salesperson who has a number above 1,000,000 in the Quantity_sold column?
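The flagging rule in this question can be sketched as a simple conditional. The rows below are hypothetical because the question's table is not reproduced here, and writing "No" for everyone at or below the threshold is an assumption (the question only specifies when to flag "Yes"):

```python
THRESHOLD = 1_000_000

# Hypothetical rows for illustration only.
sales = [
    {"salesperson": "A", "quantity_sold": 1_250_000},
    {"salesperson": "B", "quantity_sold": 1_000_000},
    {"salesperson": "C", "quantity_sold": 640_000},
]

for row in sales:
    # "Yes" only when strictly above the threshold; the "No" value is assumed.
    row["promotion_flag"] = "Yes" if row["quantity_sold"] > THRESHOLD else "No"

for row in sales:
    print(row["salesperson"], row["promotion_flag"])
```

Note that a value of exactly 1,000,000 is not flagged, since the rule says "above".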
A data analyst must separate the column shown below into multiple columns for each component of the name:
Which of the following data manipulation techniques should the analyst perform?
A gambler thinks that a coin is fair and is equally likely to turn up heads or tails when the coin is flipped. Which of the following tests should the gambler use to test this hypothesis?
Which of the following is an example of PII?
Which of the following is the correct extension for a tab-delimited spreadsheet file?
The number of phone calls that the call center receives in a day is an example of:
A database consists of one fact table that is composed of multiple dimensions. Depending on the dimension, each one can be represented by a denormalized table or multiple normalized tables. This structure is an example of a:
Which of the following descriptive statistical methods are measures of central tendency? (Choose two.)
Which of the following is the first step an analyst should perform upon receiving a business request for analysis?
Which of the following techniques is used to quantify data?
Which one of the following would not normally be considered a summary statistic?
Which of the following is a relational database?
A customer survey reveals 90% positive feedback. Which of the following statistical methods would be best to utilize to determine the reliability of a data set and predict how a larger sample of customers over the same time period might respond?
A data analyst wants to create "Income Categories" that would be calculated based on the existing variable "Income". The "Income Categories" would be as follows:
Income category 1: less than $1.
Income category 2: more than $1 and less than $20,000.
Income category 3: more than $20,001 and less than $40,000.
Income category 4: more than $40,001.
Which of the following data manipulation techniques should the data analyst use to create "Income Categories"?
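One way to read the category rules above is as a mapping from a numeric variable to an ordered set of labels. A minimal Python sketch follows; the function name `income_category` is hypothetical, and because the list leaves some values unassigned (exactly $1, $20,000 to $20,001, and $40,000 to $40,001), the thresholds chosen for those values are assumptions:

```python
def income_category(income: float) -> int:
    """Return the income category (1-4) for a given income.

    Assumption: values the category list leaves unassigned are folded
    into the neighbouring category by the thresholds below.
    """
    if income < 1:
        return 1
    if income <= 20_000:   # assumed: exactly $1 and exactly $20,000 fall here
        return 2
    if income <= 40_000:   # assumed: $20,000.01 through $40,000 fall here
        return 3
    return 4               # assumed: anything above $40,000 falls here


incomes = [0, 15_000, 30_000, 50_000]
categories = [income_category(x) for x in incomes]
print(categories)
```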
Samantha needs to share a list of her organization's top 50 customers with the VP of sales.
She would like to include the name of the customer, the business they represent, their contact information, and their total sales over the past year.
The VP does not have any specialized analytics skills or software but would like to make some personal notes on the dataset.
What would be the best tool for Samantha to use to share this information?
Given the following report:
Which of the following components need to be added to ensure the report is point-in-time and static? (Select two).
Which of the following describes the use of a representative amount of data from a main repository?
Given the following data tables:
Which of the following MDM processes needs to take place FIRST?
Jenny wants to study the academic performance of undergraduate sophomores and wants to determine the average grade point average at different points during an academic year.
What best describes the data set she needs?
A healthcare data analyst notices that one data set in the column for BloodPressure contains several outliers that need to be replaced with meaningful values. Which of the following data manipulation techniques should the analyst use?
An analyst conducted a preliminary analysis for a data set and identified several patterns and anomalies. Which of the following analysis techniques did the analyst use?
An e-commerce company recently tested a new website layout. The website was tested by a test group of customers, and an old website was presented to a control group. The table below shows the percentage of users in each group who made purchases on the websites:
Which of the following conclusions is accurate at a 95% confidence interval?
Which one of the following is NOT a common data integration tool?
An analyst reviews the following table:
Which of the following data types is represented in the values in the RefNo column?
A data analyst needs to create a weekly recurring report on sales performance and distribute it to all sales managers. Which of the following would be the BEST method to automate and ensure successful delivery for this task?
Standardized tests are given to students in the middle of each month, and the results are ready by the end of the month. The superintendent needs a quick view of test performance. Which of the following would be the best recommendation to meet the superintendent's requirements?
An analyst for a small business with multiple locations is using each location’s quarterly sales reports from last year to create a single revenue report for the year. Which of the following data mining techniques should the analyst use to complete this task?
Which of the following data manipulation techniques should an analyst use to hide unnecessary data during analysis?
An analyst modified a data set that had a number of issues. Given the original and modified versions:
Which of the following data manipulation techniques did the analyst use?
A sales team wants visibility of current sales numbers, pipeline, and team performance. The team would also like to see calculations of individuals’ earned commissions and projected commissions based on sales, but they want that information to be kept confidential. Which of the following would be the BEST way to provide this visibility?
A data engineer is creating a database field to capture whether a customer likes vanilla ice cream. Which of the following data types is the best to capture this information?
Given the table below:
Which of the following variable types BEST describes the “Year” column?
An analyst computed a new variable of income per day in the household by multiplying the number of days worked by the number of people working in the household and the income earned per day. Which of the following is the correct name for this new variable?
Which of the following data types must be used when working with variables that require classification into two or more groups before analysis?
Which of the following is the most likely reason for a data analyst to optimize a query using parameterization?
Given the following:
Which of the following is the most important thing for an analyst to do when transforming the table for a trend analysis?
Which of the following should be accomplished NEXT after understanding a business requirement for a data analysis report?
A data analyst needs to create a data visualization that aids in understanding the cumulative impact of sequentially introduced values that are positive or negative. Which of the following data visualization methods should the analyst use?
A data analyst has been asked to merge the tables below, first performing an INNER JOIN and then a LEFT JOIN:
Customer Table -
In-store Transactions –
Which of the following describes the number of rows of data that can be expected after performing both joins in the order stated, considering the customer table as the main table?
A data analyst must fulfill a request for information that is needed weekly and should be automatically emailed to a specific set of users. Which of the following types of reports should the analyst recommend?
You have two database tables that you would like to join together using a foreign key relationship.
What term best describes this action?
What role in data governance is typically responsible for day-to-day oversight of data use?
Which of the following BEST describes the issue in which character values are mixed with integer values in a data set column?
Which of the following actions should be taken when transmitting data to mitigate the chance of a data leak occurring? (Choose two.)
A data analyst has a set of data that shows the number of gallons of oil produced each day. The company would like to know the standard deviation for the data set. The variance for the data is 36 gallons. Which of the following is the standard deviation for gallons produced?
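The relationship this question relies on is that the standard deviation is the square root of the variance. A short Python check using the value given in the question:

```python
import math

variance = 36.0                # variance stated in the question
std_dev = math.sqrt(variance)  # standard deviation = square root of variance

print(std_dev)
```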
Five dogs have the following heights in millimeters:
300, 430, 170, 470, 600
Which of the following is the mean height for the five dogs?
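The mean is the sum of the values divided by how many there are. A short Python sketch using the five heights listed above:

```python
heights_mm = [300, 430, 170, 470, 600]  # heights given in the question

# Arithmetic mean: total of all heights divided by the number of dogs.
mean_height = sum(heights_mm) / len(heights_mm)

print(mean_height)
```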
Which one of the following is a measure of dispersion?
Which of the following technologies would be best suited for creating a multiple linear regression model?
A sales manager wants quarterly sales reports broken down by unit and week. Which of the following data output lists includes the most necessary information?
Which of the following would be used to store unstructured data from different sources?
An analyst is explaining the company’s financial systems and reporting tools to a new coworker. Which of the following data quality dimensions are the most important? (Select three).
Which of the following statistical methods requires two or more categorical variables?
A data analyst has been asked to create a sales report that calculates the rolling 12-month average for sales. If the report will be published on November 1, 2020, which of the following months should the report cover?
Which of the following query optimization techniques involves examining only the data that is needed for a particular task?
Angela is aggregating data from a CRM system with data from an employee system.
While performing an initial quality check, she realizes that her employee ID is not associated with her identifier in the CRM system.
What kind of issue is Angela facing?
Choose the best answer.