Stat problems for Friday, Jan 22
- I've got a data file with some statistics for all 1199 Major League Baseball players in 2010. The first few lines look like this:
| Name | Team | Number | Pos | Games | At bats | Hits | Home Runs | On base % | Bat Avg |
|---|---|---|---|---|---|---|---|---|---|
| I Suzuki | SEA | 51 | OF | 162 | 680 | 214 | 6 | 0.359 | 0.315 |
| D Jeter | NYY | 2 | SS | 157 | 663 | 179 | 10 | 0.340 | 0.270 |
| M Young | TEX | 10 | 3B | 157 | 656 | 186 | 21 | 0.330 | 0.284 |
| J Pierre | CWS | 1 | OF | 160 | 651 | 179 | 1 | 0.341 | 0.275 |
| R Weeks | MIL | 23 | 2B | 160 | 651 | 175 | 29 | 0.366 | 0.269 |
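As a quick sanity check on how two of the columns relate, here is a minimal Python sketch. It assumes the standard definition of batting average (hits divided by at bats, rounded to three decimals), which the data file itself does not state. The `sample` list and the `batting_average` helper are illustrative names; only the five rows shown above are used.

```python
# The five sample rows from the table: (name, at bats, hits, listed batting average).
sample = [
    ("I Suzuki", 680, 214, 0.315),
    ("D Jeter", 663, 179, 0.270),
    ("M Young", 656, 186, 0.284),
    ("J Pierre", 651, 179, 0.275),
    ("R Weeks", 651, 175, 0.269),
]


def batting_average(hits, at_bats):
    """Assumed definition: hits / at bats, rounded to three decimals."""
    return round(hits / at_bats, 3)


# Collect any player whose listed average disagrees with the recomputed one.
mismatches = [
    name for name, at_bats, hits, listed in sample
    if batting_average(hits, at_bats) != listed
]
print(mismatches)
```

An empty list means the Bat Avg column is fully determined by the Hits and At bats columns for these rows, which is worth keeping in mind when thinking about what kind of variable each column is.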
- What does each row represent?
- Identify the variables in this data table and classify each as
  - Continuous numeric,
  - Discrete numeric,
  - Nominal categoric, or
  - Ordinal categoric.
- Baseball teams are made up of players who play a number of different positions while they are on defense in the field:
  - P: Pitcher
  - C: Catcher
  - 1B: First base
  - 2B: Second base
  - 3B: Third base
  - SS: Shortstop
  - OF: Outfield
  - DH: Designated hitter
Statistically speaking, players at some defensive positions tend to be better batters than players at others. To illustrate, the image below uses our 2010 data set to form a collection of side-by-side box plots showing the relationship between a player's position on the horizontal axis and their On Base Percentage (OBP) on the vertical axis.
- Use this image to roughly rank the positions by median OBP.
- How would you compare the OBP of third basemen (3B) to that of designated hitters (DH)?
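The center line of each box in a side-by-side box plot marks that group's median, so the ranking asked for above amounts to computing a median per position and sorting. The following Python sketch shows one way that could be done with the standard library. It uses only the five sample rows from the table, so its output says nothing about the full 1199-player data set, and the name `median_obp_by_position` is hypothetical.

```python
from collections import defaultdict
from statistics import median

# (name, position, on-base percentage) for the five sample rows only.
sample = [
    ("I Suzuki", "OF", 0.359),
    ("D Jeter", "SS", 0.340),
    ("M Young", "3B", 0.330),
    ("J Pierre", "OF", 0.341),
    ("R Weeks", "2B", 0.366),
]


def median_obp_by_position(rows):
    """Group OBP values by position and return (position, median) pairs,
    sorted from highest median to lowest."""
    groups = defaultdict(list)
    for _name, position, obp in rows:
        groups[position].append(obp)
    medians = {pos: round(median(values), 3) for pos, values in groups.items()}
    return sorted(medians.items(), key=lambda item: item[1], reverse=True)


ranking = median_obp_by_position(sample)
for position, med in ranking:
    print(f"{position}: {med:.3f}")
```

With the complete data file, the same grouping would produce one median per position, including P, C, 1B, and DH, which do not appear among the five sample rows.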