Chapter 1 - Introduction to Data

For exercise 1.48, the following R code will create a vector, scores, that can be used to answer the question:

scores <- c(57, 66, 69, 71, 72, 73, 74, 77, 78, 78, 79, 79, 81, 81, 82, 83, 83, 88, 89, 94)
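Once the vector exists, a few base R calls are enough to get a first look at the distribution. The following is only a sketch of one possible starting point, assuming the exercise asks for numerical summaries or a plot of the scores; check the exercise text for what is actually required. The name five_num is introduced here purely for illustration.

```r
# Exam scores, repeated here so that this block runs on its own
scores <- c(57, 66, 69, 71, 72, 73, 74, 77, 78, 78, 79, 79, 81, 81, 82, 83, 83, 88, 89, 94)

# Tukey's five-number summary: minimum, lower hinge, median, upper hinge, maximum
five_num <- fivenum(scores)  # five_num is an illustrative name, not from the textbook
print(five_num)

# Mean and standard deviation of the scores
print(mean(scores))
print(sd(scores))

# Horizontal box plot of the distribution
boxplot(scores, horizontal = TRUE, xlab = "Score")
```

Note that summary(scores) reports quartiles computed with a slightly different rule than fivenum(), so the two can disagree a little in the first and third quartiles.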

The labs are available in the DATA606 R package. To start the first lab, use the startLab function. This will copy the lab to your current working directory and rename the file according to your computer username (as returned by Sys.info()['user']). If this is incorrect, either provide the file-prefix parameter to startLab or rename the file after it has been copied.




OpenIntro provides a number of videos. You may find these helpful while reading the chapter.

Case Study: Using Stents to Prevent Strokes

Data Basics: Observations, Variables, and Data Matrices

Data Collection Principles

Observational Studies and Sampling Strategies

Designing Experiments

Summarizing and Graphing Numerical Data

Exploring Categorical Data

Using Randomization to Analyze a Gender Discrimination Study

Note about Pie Charts

There is only one pie chart in OpenIntro Statistics (Diez, Barr, & Çetinkaya-Rundel, 2015, p. 48). Consider the following three pie charts that represent the preference of five different colors. Is there a difference between the three pie charts? This is probably difficult to answer.


However, consider the bar plot below. Here, we clearly see that there is a difference between the ratios of the three colors. As John Tukey famously said:

There is no data that can be displayed in a pie chart that cannot better be displayed in some other type of chart.
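The comparison described in this note can be tried out with base R graphics. The sketch below uses made-up preference counts (they are not the data behind the original figures) for five colors in three hypothetical groups, and draws the same numbers first as three pie charts and then as a grouped bar plot. The names color_names, prefs, and props are assumptions introduced for this example.

```r
# Hypothetical data: preference counts for five colors in three groups.
# These values are invented for illustration and are not from the textbook.
color_names <- c("Red", "Blue", "Green", "Yellow", "Purple")
prefs <- rbind(
  A = c(17, 18, 20, 22, 23),
  B = c(20, 20, 19, 21, 20),
  C = c(23, 22, 20, 18, 17)
)
colnames(prefs) <- color_names

# Convert counts to proportions within each group (each row sums to 1)
props <- prop.table(prefs, margin = 1)

# Three pie charts side by side: the slices look nearly identical
old_par <- par(mfrow = c(1, 3))
for (g in rownames(prefs)) {
  pie(prefs[g, ], main = paste("Group", g))
}
par(old_par)  # restore the previous plotting layout

# Grouped bar plot of the same proportions: one cluster of five bars per group
barplot(t(props), beside = TRUE, legend.text = TRUE,
        xlab = "Group", ylab = "Proportion")
```

With counts this close together, the pie slices are hard to rank by eye, while the bar heights make the ordering within each group easy to read.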