Section 2.5 Test Yourself: Identifying Data Problems

According to the text, what must a data scientist do to successfully identify a data problem in a specific domain like farming?

Sit in front of a computer all day and work with a program like R.
No, the text explicitly states that this impression is a mistake.
Focus only on reading books and watching videos about the domain.
No, the text says the best way to gain domain knowledge is to ask subject matter experts.
Immediately begin verifying stories told by professionals.
No, the first step is to get the professional to tell the stories; verification comes later.
Become immersed in the problem domain to think like a subject matter expert.
Correct! The text emphasizes that a data scientist must learn to think like a farmer to identify a farmer’s data problems.

What is the first step in identifying an anomaly?

Look for the exception cases that are far from the center.
No, this is what you do after you have defined the center or typical case.
Focus attention on a small grouping of trees that lost more fruit.
No, this is an example of a potential anomaly, not the first step in the process of identifying one.
Define the central or most typical occurrences.
Right! You first need a yardstick for what is "typical" before you can identify what is "unusual".
Perform a systematic count of lost fruit under a random sample of trees.
No, this is a method used to establish what is typical, but defining "typical" is the actual first step.
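
The order of steps in this question (define what is typical first, then look for cases far from it) can be sketched in a few lines of Python. This is only an illustration, not a method taken from the text: the tree names and fruit counts are made up, `find_anomalies` is a hypothetical helper, and using the median with a cutoff of three median absolute deviations is an assumed rule of thumb.

```python
from statistics import median

# Hypothetical counts of lost fruit under each sampled tree (made-up numbers).
lost_fruit = {
    "tree_a": 4,
    "tree_b": 5,
    "tree_c": 6,
    "tree_d": 5,
    "tree_e": 19,
    "tree_f": 4,
}


def find_anomalies(counts, threshold=3.0):
    """Return (center, unusual_names) for a dict of name -> count.

    Step 1: define the typical case (here, the median count).
    Step 2: flag cases far from that center (here, more than
            `threshold` median absolute deviations away; an assumed cutoff).
    """
    values = list(counts.values())
    center = median(values)
    # Typical distance from the center, used as the yardstick for "far".
    spread = median(abs(v - center) for v in values)
    unusual = [
        name for name, v in counts.items()
        if abs(v - center) > threshold * spread
    ]
    return center, unusual


center, unusual = find_anomalies(lost_fruit)
print("typical loss per tree:", center)
print("trees worth a closer look:", unusual)
```

The flagged trees are only candidates for a closer look; whether they reflect a real problem is a question for the subject matter expert.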
