.. Copyright (C) Google, Runestone Interactive LLC
   This work is licensed under the Creative Commons Attribution-ShareAlike 4.0
   International License. To view a copy of this license, visit
   http://creativecommons.org/licenses/by-sa/4.0/.
Module A Preface
================
Even if you have never seen or done any statistics or computational science
before this class (or even if you don’t know what those words mean), you have
definitely encountered them in your daily life.
Maybe you’ve heard of `Nate Silver`_, the founder of the blog `FiveThirtyEight`_
(named after the number of electors in the US Electoral College), who
`predicted`_ the winner in 49 of the 50 states in the 2008 US presidential
election and in all 50 states in 2012. Maybe you’ve read the book `Hidden
Figures`_ or seen `the movie with the same name`_, a true story that follows
`Katherine Johnson`_, `Mary Jackson`_, and `Dorothy Vaughan`_, three
groundbreaking mathematicians and engineers, through their careers at NASA at a
time when the agency was racially segregated and gender discrimination was
pervasive.
Or maybe you’ve read about how pathologists are `using machine learning to detect
cancer in real-time`_, or how DeepMind is `using neural networks to train
computers`_ to beat the best `Go`_ players in the world. Maybe you’ve seen
`Moneyball`_ or read `the non-fiction book of the same name`_, in which `Billy
Beane`_, the general manager of Major League Baseball’s Oakland A’s, uses
`“sabermetrics”`_ (essentially statistics applied to baseball) to assemble
a division-winning team on a low budget from undervalued players.
All of these examples fall within the realm of data science, which encompasses
statistics, decision theory, machine learning, artificial intelligence, and so
much more. But data science doesn’t have to be as complicated as sending
astronauts to the moon or predicting elections! It can be as simple as looking
up the weather before deciding what to wear, or checking the price of a t-shirt
at multiple shops before deciding where to buy it, or reading some reviews of a
hit new movie before deciding whether or not to see it.
The more informed you are about data, the better you can use it to your
advantage. Only by understanding how data is used can you find ways to fix the
flaws in how it is collected and applied today; for some examples of those
flaws, check out `this article`_. Like the success stories above, they all
involve using data to inform predictions or decisions. The purpose of this
class is to give you the tools to *use data to make more informed decisions*,
whether in your classes, at work, or in day-to-day life.
In this course, you will build the following key skills:
- Understand the kinds of questions data science can answer, and learn to
formulate and answer your own research questions using a data-driven
approach.
- Generate your own data through surveys and import/export data to/from
databases.
- Interpret, assess, and handle data, including learning to gather and manage
data ethically, with attention to bias and privacy.
- Understand the basics of using Google Sheets to manipulate data, including
sorting, refining, and grouping your data.
- Visualize your data effectively and summarize your findings in different
forms of reports.
This course is designed to prepare you for further studies and for future
careers. Because data appears in every field of study and every career, the
skills in this textbook will help you use and understand data in any capacity.
.. _Nate Silver: https://en.wikipedia.org/wiki/Nate_Silver
.. _FiveThirtyEight: https://fivethirtyeight.com
.. _predicted: https://venturebeat.com/2012/11/07/nate-silver/
.. _Hidden Figures: https://en.wikipedia.org/wiki/Hidden_Figures_(book)
.. _the movie with the same name: https://en.wikipedia.org/wiki/Hidden_Figures
.. _Katherine Johnson: https://en.wikipedia.org/wiki/Katherine_Johnson
.. _Mary Jackson: https://en.wikipedia.org/wiki/Mary_Jackson_(engineer)
.. _Dorothy Vaughan: https://en.wikipedia.org/wiki/Dorothy_Vaughan
.. _using machine learning to detect cancer in real-time: https://www.youtube.com/watch?v=9Mz84cwVmS0
.. _using neural networks to train computers: https://deepmind.com/research/alphago/
.. _Go: https://en.wikipedia.org/wiki/Go_(game)
.. _Moneyball: https://en.wikipedia.org/wiki/Moneyball_(film)
.. _the non-fiction book of the same name: https://en.wikipedia.org/wiki/Moneyball
.. _Billy Beane: https://en.wikipedia.org/wiki/Billy_Beane
.. _“sabermetrics”: https://en.wikipedia.org/wiki/Sabermetrics
.. _this article: https://www.govtech.com/data/When-Big-Data-Gets-It-Wrong.html