As a data scientist, you are working on a global health data set that contains data from more than 50
countries. You want to encode three features ('countries', 'race', and 'body organ') as
categories.
Which option would you use to encode the categorical features?
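To make the trade-off behind this question concrete, here is a minimal sketch in plain Python that compares how many columns two common encodings produce for a feature with many distinct values. The data is made up, and the helper names label_encode and one_hot_encode are hypothetical; they are not part of the ADS SDK or any Oracle library.

```python
def label_encode(values):
    """Map each distinct category to an integer code (one output column)."""
    codes = {cat: i for i, cat in enumerate(sorted(set(values)))}
    return [codes[v] for v in values], codes


def one_hot_encode(values):
    """Expand each distinct category into its own 0/1 column."""
    categories = sorted(set(values))
    rows = [[1 if v == cat else 0 for cat in categories] for v in values]
    return rows, categories


# Illustrative data only: 60 hypothetical country codes, 3 observations each.
countries = [f"country_{i:02d}" for i in range(60)] * 3

encoded, mapping = label_encode(countries)
one_hot, columns = one_hot_encode(countries)

print(len(mapping))     # number of distinct categories
print(len(one_hot[0]))  # number of columns one-hot encoding needs per row
```

With more than 50 countries, one-hot encoding adds one column per country, while an integer encoding keeps a single column but implies an ordering that the categories do not have.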
You have just received a new data set from a colleague. You want to quickly find out summary
information about the data set, such as the types of features, the total number of observations, and
distributions of the data. Which Accelerated Data Science (ADS) SDK method from the ADSDataset
class would you use?
Which Oracle Accelerated Data Science (ADS) classes can be used for easy access to data sets from
reference libraries and index websites such as scikit-learn?
As you are working in your notebook session, you find that your notebook session does not have enough compute CPU and memory for your workload. How would you scale up your notebook session without losing your work?
You are attempting to save a model from a notebook session to the model catalog by using the
Accelerated Data Science (ADS) SDK, with resource principal as the authentication signer, and you
get a 404 authentication error. Which two items should you check to ensure permissions are set up
correctly?
As a data scientist, you use the Oracle Cloud Infrastructure (OCI) Language service to train custom
models. Which types of custom models can be trained?
Which two statements are true about published conda environments?
You are building a model and need input that represents data as morning, afternoon, or evening.
However, the data contains a time stamp. What part of the Data Science life cycle would you be in
when creating the new variable?
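The derived variable described in this question can be sketched in plain Python as follows. The hour boundaries (05:00, 12:00, 17:00) and the function name part_of_day are assumptions chosen for illustration; the question does not define them, and night-time hours are folded into "evening" because only three labels are named.

```python
from datetime import datetime


def part_of_day(ts: datetime) -> str:
    """Bucket a timestamp into morning, afternoon, or evening.

    Boundaries are illustrative assumptions, not taken from the question.
    """
    hour = ts.hour
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 17:
        return "afternoon"
    # Everything else, including night-time hours, falls into "evening".
    return "evening"


# Made-up timestamps for demonstration.
timestamps = [
    datetime(2024, 1, 1, 8, 30),
    datetime(2024, 1, 1, 13, 0),
    datetime(2024, 1, 1, 21, 15),
]
labels = [part_of_day(ts) for ts in timestamps]
print(labels)
```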
You are a data scientist leveraging the Oracle Cloud Infrastructure (OCI) Language AI service for
various types of text analyses. Which TWO capabilities can you utilize with this tool?
Using Oracle AutoML, you are tuning hyperparameters on a supported model class and have
specified a time budget. AutoML terminates computation once the time budget is exhausted. What
would you expect AutoML to return in case the time budget is exhausted before hyperparameter
tuning is completed?
You are using Oracle Cloud Infrastructure (OCI) Anomaly Detection to train a model to detect
anomalies in pump sensor data. How does the required False Alarm Probability setting affect an
anomaly detection model?
You are working as a data scientist for a healthcare company. They decide to analyze the data to
find patterns in a large volume of electronic medical records. You are asked to build a PySpark
solution to analyze these records in a JupyterLab notebook. What is the order of recommended
steps to develop a PySpark application in Oracle Cloud Infrastructure (OCI) Data Science?
You are a data scientist with a set of text and image files that need annotation, and you want to use Oracle Cloud Infrastructure (OCI) Data Labeling. Which THREE annotation classes are supported by the tool?
You are a data scientist working inside a notebook session and you attempt to pip install a
package from a public repository that is not included in your conda environment. After running this
command, you get a network timeout error.
What might be missing from your networking configuration?
What preparation steps are required to access an Oracle AI service SDK from a Data Science
notebook session?
You have trained three different models on your data set using Oracle AutoML. You want to
visualize the behavior of each of the models, including the baseline model, on the test set. Which
class should be used from the Accelerated Data Science (ADS) SDK to visually compare the models?
You have trained a machine learning model on Oracle Cloud Infrastructure (OCI) Data Science,
and you want to save the code and associated pickle file in a Git repository. To do this, you have to
create a new SSH key pair to use for authentication. Which SSH command would you use to create
the public/private algorithm key pair in the notebook session?
When preparing your model artifact to save it to the Oracle Cloud Infrastructure (OCI) Data Science model catalog, you create a score.py file. What is the purpose of the score.py file?
You are creating an Oracle Cloud Infrastructure (OCI) Data Science job that will run on a recurring basis in a production environment. This job will pick up sensitive data from an Object Storage bucket, train a model, and save it to the model catalog. How would you design the authentication mechanism for the job?
Six months ago, you created and deployed a model that predicts customer churn for a call
centre. Initially, it was yielding quality predictions. However, over the last two months, users are
questioning the credibility of the predictions.
Which two methods would you employ to verify the accuracy of the model?
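One way to check whether a deployed model's predictive quality has degraded is to compare its accuracy on the original validation data with its accuracy on recently labeled data. The sketch below is a minimal, library-free illustration; the labels and predictions are made up, and the function name accuracy is a hypothetical helper, not an OCI or ADS API.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one observation is required")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up churn labels (1 = churned) and model predictions.
validation_true = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
validation_pred = [1, 0, 0, 1, 0, 1, 0, 0, 0, 0]  # 9 of 10 correct

recent_true = [1, 1, 0, 1, 0, 1, 1, 0, 1, 0]
recent_pred = [0, 1, 0, 0, 0, 1, 0, 0, 0, 0]  # 6 of 10 correct

baseline = accuracy(validation_true, validation_pred)
current = accuracy(recent_true, recent_pred)
drop = baseline - current

print(f"baseline={baseline:.2f} current={current:.2f} drop={drop:.2f}")
```

A noticeable gap between the two figures suggests that the data has shifted since training. In practice the comparison would use a larger recent sample and metrics suited to the class balance.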