
Section 12.5 Test Yourself

Checkpoint 12.5.1.

What is a kernel in the context of Data Science and Support Vector Machines?
  • A subset of the data set aside to gauge the accuracy and bias of training data.
  • Incorrect. This data is called testing data.
  • The core part of an operating system that manages system resources and allows software to interact with hardware.
  • While technically correct, this kind of kernel is more of a focus in Operating Systems courses. The definition of kernel in this chapter relates to Data Science and SVMs.
  • A function that separates data into two or more labels.
  • Incorrect. This is the definition of a separation function.
  • A mapping function that transforms data into a higher-dimensional space.
  • Correct!
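
The idea of mapping data into a higher-dimensional space can be sketched in a few lines. This is a minimal illustration in plain Python rather than the chapter's R and kernlab tools, and the data values, `feature_map`, and `separable_by_threshold` are hypothetical names invented for the example. It builds the mapping explicitly; an SVM kernel typically achieves a similar effect without constructing the new coordinates directly.

```python
# Hypothetical 1-D data: class 1 sits between two groups of class 0,
# so no single cut point on x separates the two classes.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
labels = [0, 0, 1, 1, 1, 0, 0]


def feature_map(x):
    """Map a 1-D point to 2-D by adding x squared as a second coordinate."""
    return (x, x * x)


def separable_by_threshold(values, labels):
    """Return True if one cut point puts all 0s on one side and all 1s on the other."""
    zeros = [v for v, y in zip(values, labels) if y == 0]
    ones = [v for v, y in zip(values, labels) if y == 1]
    return max(zeros) < min(ones) or max(ones) < min(zeros)


mapped = [feature_map(x) for x in xs]
second_coord = [point[1] for point in mapped]

print(separable_by_threshold(xs, labels))            # original 1-D values
print(separable_by_threshold(second_coord, labels))  # values along the new x*x axis
```

In the original one-dimensional data no cut point works, but along the added coordinate the class-1 points all fall below the class-0 points, so a straight boundary becomes possible.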

Checkpoint 12.5.2.

Why should data be randomized before being divided into training data and test data?
  • Most randomize functions also clean data.
  • Incorrect. It is best practice to create functions that do one thing and one thing only.
  • The data may be sorted or grouped in some way. This can lead to training data bias.
  • Correct!
  • Randomizing data increases the speed at which the data is processed.
  • Incorrect. Some sorting algorithms have a better time complexity with randomized data, but this isn’t the case with dividing data.
  • There is no need to randomize data before dividing it into training and testing data.
  • Incorrect. Unless it can be confirmed that there are absolutely no patterns or ordering in the data, randomizing the data is necessary.
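
The effect of skipping the randomization step can be seen in a small sketch. It is written in plain Python rather than the chapter's R, and the ten-row dataset, the `split` helper, and the seed value are assumptions made up for illustration.

```python
import random

# Hypothetical dataset: 10 rows sorted by label, as an exported file might be.
rows = [("row%d" % i, "A" if i < 5 else "B") for i in range(10)]


def split(data, n_train):
    """Return (train, test): the first n_train items and the rest, order unchanged."""
    return data[:n_train], data[n_train:]


# Splitting the sorted data directly: the test set holds only label "B".
train_sorted, test_sorted = split(rows, 7)

# Shuffle a copy first; the fixed seed only keeps the example repeatable.
rng = random.Random(42)
shuffled = rows[:]
rng.shuffle(shuffled)
train_random, test_random = split(shuffled, 7)

print(sorted({label for _, label in test_sorted}))
print([label for _, label in test_random])
```

Without shuffling, the model would be tested only on label "B" and trained mostly on label "A", which is the kind of training data bias the correct answer describes.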

Chapter Challenge

Look up the term "confusion matrix" and then follow up on related terms such as Type I error, Type II error, sensitivity, and specificity. Think about how the support vector machine model could be modified to improve either sensitivity or specificity.
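
As a starting point for that reading, the four cells of a two-class confusion matrix and the two rates can be computed by hand. This is a minimal sketch in plain Python rather than the chapter's R; the label vectors and the `confusion_counts` helper are hypothetical and do not come from any model in this chapter.

```python
# Hypothetical true and predicted labels (1 = positive, 0 = negative).
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]


def confusion_counts(actual, predicted):
    """Count true positives, false positives, true negatives, false negatives."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1
        elif a == 0 and p == 1:
            fp += 1  # Type I error: a negative case predicted as positive
        elif a == 0 and p == 0:
            tn += 1
        else:
            fn += 1  # Type II error: a positive case predicted as negative
    return tp, fp, tn, fn


tp, fp, tn, fn = confusion_counts(actual, predicted)

# Sensitivity: share of actual positives that were predicted positive.
sensitivity = tp / (tp + fn)
# Specificity: share of actual negatives that were predicted negative.
specificity = tn / (tn + fp)

print(tp, fp, tn, fn)
print(sensitivity, specificity)
```

Changing a model so that it predicts "positive" more readily tends to raise sensitivity and lower specificity, and the reverse, which is the trade-off the challenge asks about.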
For a super challenge, try using another dataset with the kernlab SVM tools. There is a dataset called promotergene that is built into the kernlab package. You could also load your own dataset and try creating an SVM model from it.