An archived instance of Mark's Discourse site as of Tuesday July 18, 2017.

Playing with some random data

mark

(15 points)

We'll work on this in class together on Monday, June 12 and Thursday, June 15.


I've set up a little random data generator for you to download some data into R that you can play with. Everyone gets their very own personalized, randomly generated data. Since everyone's got their own, we can all share our results here on Discourse. You can access your data in R via the following command:

my_data <- read.csv('https://www.marksmath.org/cgi-bin/random_data.csv?id=YOUR_STUDENT_ID#')

Of course, YOUR_STUDENT_ID# should be replaced by your UNCA student ID number. Mine is 987654321, so I can get my data and display the first few rows via the following commands:

my_data <- read.csv('https://www.marksmath.org/cgi-bin/random_data.csv?id=987654321')
head(my_data)

## Out:
#  first_name    last_name age    sex height weight income
#  1      Carol       Massey  46 female  61.96 134.04 108727
#  2      Diana       Wright  35 female  66.42 188.55   1154
#  3      David     Hamilton  25   male  69.69 173.91   3676
#  4     Sherry Eichelberger  39 female  61.08 121.00  15471
#  5     Nicole     Mcclarty  22 female  58.75 136.74      4
#  6      Billy       Flores  32   male  69.48 106.11   7533
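
If you'd rather not type your ID into the URL by hand, here's a minimal sketch that builds the URL from a variable. The names `my_id`, `base_url`, and `data_url` are my own choices for illustration, and `MYID` is only a placeholder; the download line is commented out because it needs network access.

```r
# Placeholder ID; substitute your own locally and keep it out of your post.
my_id <- "MYID"

# Build the request URL by pasting the ID onto the query string.
base_url <- "https://www.marksmath.org/cgi-bin/random_data.csv"
data_url <- paste0(base_url, "?id=", my_id)
print(data_url)

# Uncomment to download the data (requires network access):
# my_data <- read.csv(data_url)
# head(my_data)
```

Keeping the ID in one variable makes it easy to swap in the placeholder before you paste your code here.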

Here's your assignment: Post a reply to this topic with the following:

  1. The first few lines of your data after import, just as I did above.
    (Without your student ID)
  2. A bar chart for the sexes of the people
  3. A box and whisker plot for the weights of the people.
  4. A summary of the ages in your data (via the summary command)
  5. A histogram for the heights of the men.
  6. A normal probability plot for the heights of the men
  7. A histogram for the incomes
  8. A normal probability plot for the incomes

Note: For each problem, I'd like to see both the code and the output.
Also: Be sure to use a placeholder, like MYID, for your student ID rather than the real thing.
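
If you want to try the commands before downloading anything, here's a rough sketch on a made-up data frame. Everything about `fake_data` is an assumption for illustration: the values are simulated with `rnorm` and `rlnorm` and are not output of the generator. Only the column names are chosen to match the real data.

```r
# Simulated stand-in for the downloaded data; every value here is made up.
set.seed(1)
n <- 100
fake_data <- data.frame(
  sex    = sample(c("female", "male"), n, replace = TRUE),
  age    = sample(20:59, n, replace = TRUE),
  height = round(rnorm(n, mean = 66, sd = 3), 2),
  weight = round(rnorm(n, mean = 165, sd = 25), 2),
  income = round(rlnorm(n, meanlog = 9.5, sdlog = 1.2))
)

head(fake_data)                               # first few rows
barplot(table(fake_data$sex))                 # bar chart of a categorical variable
boxplot(fake_data$weight, horizontal = TRUE)  # box and whisker plot
summary(fake_data$age)                        # min, quartiles, median, mean, max

men <- subset(fake_data, sex == "male")       # keep only the rows for the men
hist(men$height)                              # histogram
qqnorm(men$height)                            # normal probability plot
qqline(men$height)                            # line through the first and third quartiles
```

Swapping `fake_data` for `my_data` gives the same sequence of plots on your own data.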

audrey

First, I'll load my data:

my_data <- read.csv('https://www.marksmath.org/cgi-bin/random_data.csv?id=123456789')
head(my_data)

first_name last_name age sex height weight income
1 Carol Massey 46 female 61.96 134.04 108727
2 Diana Wright 35 female 66.42 188.55 1154
3 David Hamilton 25 male 69.69 173.91 3676
4 Sherry Eichelberger 39 female 61.08 121.00 15471
5 Nicole Mcclarty 22 female 58.75 136.74 4
6 Billy Flores 32 male 69.48 106.11 7533





Bar plot of sexes

men = subset(my_data, sex=='male')
women = subset(my_data, sex=='female')
barplot(c(dim(men)[1], dim(women)[1]))
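
The bars from the command above come out unlabeled. Here's a small sketch of two ways to get labels, using a made-up vector `sex` as a stand-in for `my_data$sex` (the values are invented for illustration):

```r
# Made-up stand-in for my_data$sex; the values are for illustration only.
sex <- c("male", "female", "female", "male", "female")

# Count by hand, as above, then label the bars with names.arg.
counts <- c(sum(sex == "male"), sum(sex == "female"))
barplot(counts, names.arg = c("male", "female"))

# table() does the counting and carries the labels along with it.
sex_table <- table(sex)
barplot(sex_table)
```

Note that `table` orders the categories alphabetically, so its bars appear as female, then male.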

Box and whisker plot for the weights

boxplot(my_data$weight, horizontal = T)

Summary of the ages

summary(my_data$age)

## Out:
# Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
# 20.00   27.00   32.50   34.53   40.00   59.00

Histogram of the heights of the men

men = subset(my_data, sex=='male')
hist(men$height)

Normal probability plot of the heights of the men

qqnorm(men$height)
qqline(men$height)

Histogram of the incomes

incomes = my_data$income
hist(incomes)

Normal probability plot of the incomes

qqnorm(incomes)
qqline(incomes)


Or, the last two without the outlier

incomesWithoutOutlier = incomes[incomes<500000]
hist(incomesWithoutOutlier)

qqnorm(incomesWithoutOutlier)
qqline(incomesWithoutOutlier)
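
The cutoff of 500000 above was picked by eye. One possible alternative, sketched here on simulated data, is to let `boxplot.stats` flag the points that fall beyond the whiskers (more than 1.5 times the interquartile range past the hinges, by default). The vector `fake_incomes` is made up with `rlnorm` and is only an assumption for illustration, not real output from the generator.

```r
# Simulated right-skewed incomes; these values are made up.
set.seed(2)
fake_incomes <- round(rlnorm(100, meanlog = 9.5, sdlog = 1.2))

# Values that a box plot would draw as individual points beyond the whiskers.
flagged <- boxplot.stats(fake_incomes)$out

# Keep everything that was not flagged.
trimmed <- fake_incomes[!(fake_incomes %in% flagged)]

hist(trimmed)
qqnorm(trimmed)
qqline(trimmed)
```

Even after trimming, strongly skewed data like this may still stray from the line in the normal probability plot.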

kd95
 my_data <- read.csv('https://www.marksmath.org/cgi-bin/random_data.csv?id=#####')
 head(my_data)

first_name last_name age sex height weight income
1 Laura Robinson 40 female 65.01 178.88 52357
2 Marilyn Baskerville 42 female 61.69 195.04 10187
3 Eleanor Resnick 33 female 63.40 169.37 35045
4 Laura Kuebler 42 female 69.21 154.71 15368
5 Ruben Austin 24 male 69.39 165.77 237911
6 Debra Hart 41 female 63.36 159.08 6866





men = subset(my_data, sex=='male')
women = subset(my_data, sex=='female')
barplot(c(dim(men)[1], dim(women)[1]))

boxplot(my_data$weight)

summary(my_data$age)

Min. 1st Qu. Median Mean 3rd Qu. Max.
20.00 26.00 33.00 34.06 41.00 59.00

hist(men$height)

qqnorm(men$height)
qqline(men$height)

incomes = my_data$income
hist(incomes)

qqnorm(incomes)
qqline(incomes)

ejoy90

First lines of data:

my_data <- read.csv('https://www.marksmath.org/cgi-bin/random_data.csv?id=.........')
head(my_data)


##Out:
#first_name last_name age sex height weight income
#1    Richard  Waller  24   male  64.49 144.43   2954
#2    Nancy Kettner  20 female  72.06 169.47 165497
#3    Iris Clayton  38 female  61.83 165.44   2710
#4    Michael Bull  28   male  65.97 178.31   8448
#5    Steven Morse  25   male  71.66 206.92   9733
#6    Maria Page  44 female  68.25 210.79  63511

Categorical Bar Plot for Sexes:

###Bar Plot of Sexes
men = subset(my_data, sex=='male')
women = subset(my_data, sex=='female')
barplot(c(dim(men)[1], dim(women)[1]))


Box & Whisker Plot for Weights:

###Box and Whisker Plot for Weights
boxplot(my_data$weight, horizontal=T)


Summary of the Ages:

summary(my_data$age)

##Out:
#Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#20.00   28.00   35.00   36.28   43.25   58.00

Histogram for Male Heights:

men = subset(my_data, sex=='male')
hist(men$height)


Normal Probability Plot for Heights of Men:

qqnorm(men$height)
qqline(men$height)


Histogram Representing Income:

incomes=my_data$income
hist(incomes)


Normal Probability Plot for Incomes:

qqnorm(incomes)
qqline(incomes)


Histogram of Incomes Without Outlier:

incomesWithoutOutlier=incomes[incomes<2500000]
hist(incomesWithoutOutlier)


Normal Probability Plot of Income Without Outlier:

qqnorm(incomesWithoutOutlier)
qqline(incomesWithoutOutlier)

Sarcasticswimmer
my_data <- read.csv('https://www.marksmath.org/cgi-bin/random_data.csv?id=#########')

//I didn't want to post my student ID num :wink:

head(my_data)

first_name last_name age sex height weight
1 Michael Deloach 24 male 67.58 157.80
2 Linda Booher 22 female 65.23 148.79
3 Stephen King 43 male 63.20 157.95
4 Meredith Schulz 42 female 60.37 168.73
5 Cheryl Allen 42 female 63.20 213.76
6 Shirley Woods 40 female 64.60 174.87





Bar plot of the sexes

barplot(table(my_data$sex))

Box and whisker plot for the weights

boxplot(my_data$weight, horizontal = T)

Summary of the ages

summary(my_data$age)

Min. 1st Qu. Median Mean 3rd Qu. Max.
20.00 28.00 36.00 35.91 42.00 58.00

Histogram of the heights of the men

men = subset(my_data, sex=='male')
hist(men$height)

Normal probability plot of the heights of the men

qqnorm(men$height)
qqline(men$height)

Histogram of the incomes

incomes = my_data$income
hist(incomes)

Normal probability plot of the incomes

qqnorm(incomes)
qqline(incomes)

incomesWithoutOutlier = incomes[incomes<500000]
hist(incomesWithoutOutlier)



qqnorm(incomesWithoutOutlier)
qqline(incomesWithoutOutlier)

Amelia

My data:

my_data <- read.csv('https://www.marksmath.org/cgi-bin/random_data.csv?id=ID')
head(my_data)


first_name last_name age    sex height weight  income
1    Michael    Henson  20   male  67.29 185.01 1004177
2   Lorraine      Ball  24 female  59.35 157.13  145090
3      Peggy     Perez  53 female  60.35 200.14     387
4     Elaine    Corley  48 female  65.27 164.43    2955
5    Shirley   Sanchez  40 female  61.70 131.99   13717
6     Donnie     Burch  54   male  66.10 190.88  348855

Bar plot of sexes

men = subset(my_data, sex=='male')
women = subset(my_data, sex=='female')
barplot(c(dim(men)[1], dim(women)[1]))

Box and whisker plot for the weights

boxplot(my_data$weight, horizontal = T)


Summary of the ages

summary(my_data$age)

Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
20.0    25.0    34.5    35.2    42.0    59.0

Histogram of men$height

men = subset(my_data, sex=='male')
hist(men$height)

Histogram of incomes

 incomes = my_data$income
 hist(incomes)

Histogram of incomes without outlier

incomesWithoutOutlier = incomes[incomes<500000]
hist(incomesWithoutOutlier)

qqnorm(incomesWithoutOutlier)
qqline(incomesWithoutOutlier)

YOU_SHALL_NEVER_KNOW

TAKE A LOOK AT MY CODE, Y'ALL:

my_data <- read.csv('https://www.marksmath.org/cgi-bin/random_data.csv?id=NOT_FOR_YOUR_EYES')
head(my_data)

Ma Data

    first_name last_name age    sex height weight income
 1      Alvin     Munoz  46   male  62.54 149.59  23659
 2      Steve   Prather  32   male  69.03 146.49  12294
 3        Ana   Goeller  37 female  63.50 160.65 104720
 4       Todd Wilkinson  53   male  71.77 149.33  11796
 5    Barbara    Barker  31 female  62.37 150.29  49810
 6     Jeremy      Bell  40   male  69.90 117.98   8125

Bar Chart of the Sexi People

Males=subset(my_data, sex=='male')
Females=subset(my_data,sex=='female')
barplot(c(dim(Males)[1], dim(Females)[1]))

Kitty Box of Weights

boxplot(my_data$weight)

Summary for the Ages

summary(my_data$age)

Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
20.00   28.00   33.00   34.65   41.00   59.00

Measure the Males

Males=subset(my_data, sex=='male')
hist(Males$height)

But are the Heights of the Males Normal?

qqnorm(Males$height)
qqline(Males$height)

Let's See their Money

income=(my_data$income)
hist(income)

Is that Money Normally Distributed?

Of course not.

qqnorm(income)
qqline(income)

Alison

First, I'll load my data

my_data <- read.csv('https://www.marksmath.org/cgi-bin/random_data.csv?id=930xxxxxx')
head(my_data)

first_name last_name age sex height weight
1 Chris Siegel 45 male 69.20 165.63
2 Calvin Lawrence 30 male 62.72 193.06
3 Jennifer Sawyer 47 female 63.89 189.81
4 Richard Lipscomb 36 male 66.04 204.86
5 Cameron Crenshaw 29 male 71.70 182.34
6 Julius Hollen 31 male 70.92 166.33





Here is a summary of my data:

summary(my_data$age)

Min. 1st Qu. Median Mean 3rd Qu. Max.
20.00 26.00 33.00 34.35 40.25 59.00

Bar plot of sexes:

men = subset(my_data, sex=='male')
women = subset(my_data, sex=='female')
barplot(c(dim(men)[1], dim(women)[1]))

Box and Whiskers Plot for the Weights:

 boxplot(my_data$weight, horizontal = T)

Histogram of the heights of males only:

men = subset(my_data, sex=='male')

hist(men$height)

Normal Probability Plot of the Heights of Men:

qqnorm(men$height)
qqline(men$height)

Histogram of the Incomes:

incomes = my_data$income
hist(incomes)

Normal Probability Plot of the Incomes:

qqnorm(incomes)
qqline(incomes)

monehish

Here's my data:

my_data <- read.csv('https://www.marksmath.org/cgi-bin/random_data.csv?id=YOURID')
head(my_data)

 first_name last_name age    sex height weight
1     Pamela     Allen  46 female  67.32 176.32
2     Jackie      Myer  21 female  62.97 131.42
3      Grant   Thacher  35   male  72.30 149.33
4     Willie  Martinez  25   male  71.14 187.93
5    Richard   Johnson  36   male  67.52  87.68
6     Philip    Downin  34   male  72.77 153.85

Barplot of the sexes:

men = subset(my_data, sex=='male')
women = subset(my_data, sex=='female')
barplot(c(dim(men)[1], dim(women)[1]))

Box and whisker for weights:

boxplot(my_data$weight, horizontal = T)

Summary of ages

summary(my_data$age)
 
## Out:
 #  Min.    1st Qu.  Median    Mean    3rd Qu.    Max. 
 #  20.00   27.75    34.00     35.32   43.00      59.00

Histogram of height of just the men:

men = subset(my_data, sex=='male')
hist(men$height)

Normal Probability Plot of the men's heights

qqnorm(men$height)
qqline(men$height)

Histogram of Incomes

incomes = my_data$income
hist(incomes)

Normal Probability Plot of Incomes

qqnorm(incomes)
qqline(incomes)

Kristian

Here's my data:

my_data <- read.csv('https://www.marksmath.org/cgi-bin/random_data.csv?id=MYID')
head(my_data)

1. The first lines of my data:

 first_name last_name age    sex height weight
1     Bonita     Wiley  44 female  67.57 139.42
2        Ron  Franklin  29   male  71.65 172.11
3     George   Herbert  28   male  74.00 151.89
4     Brenda       Nye  38 female  63.19 189.96
5      Ariel       Lee  30   male  69.85 224.96
6     Teresa  Sorensen  25 female  62.47 160.95

2. Barplot of the Sexes

    men = subset(my_data, sex=='male')
    women = subset(my_data, sex=='female')
    barplot(c(dim(men)[1], dim(women)[1]))

3. A Box and Whisker Plot for the Weights

 boxplot(my_data$weight, horizontal = T)

4. A Summary of the Ages

 summary(my_data$age)

## Out
# Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
# 20.0    27.0    35.0    35.9    42.0    59.0

5. A Histogram for the Heights of the Men

men = subset(my_data, sex=='male')
hist(men$height)

6. A Normal Probability Plot for the Heights of the Men

 qqnorm(men$height)
 qqline(men$height)

7. A Histogram for the Incomes

incomes = my_data$income
hist(incomes)

8. A Normal Probability Plot for the Incomes

qqnorm(incomes)
qqline(incomes)

PaulWall

First I'll load my data, then display the first few lines of it:

my_data <- read.csv('https://www.marksmath.org/cgi-bin/random_data.csv?id=STUDENT_ID')
head(my_data)

## Out:
#   first_name last_name age   sex  height weight income
# 1       Rita      Post  23 female  57.77 156.49  12387
# 2       Mary   Shellum  31 female  60.14 169.33   6467
# 3     Carlos   Higgins  37   male  65.56 206.65  14718
# 4      Sonya    Golden  32 female  61.91 169.66 116619
# 5     Esther     Sulik  22 female  67.52 182.04  17540
# 6     Robert   Simpson  50   male  68.26 189.56  11631

Bar plot of the sexes

men = subset(my_data, sex=='male')
women = subset(my_data, sex=='female')
barplot(c(dim(men)[1], dim(women)[1]))

Box and whisker plot for the weights

boxplot(my_data$weight, horizontal = T)

Summary of the ages

summary(my_data$age)
## Out:
# Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
# 20.00   28.75   34.50   35.54   42.00   59.00

Histogram of the heights of the men

men = subset(my_data, sex=='male')
hist(men$height)

Normal probability plot of the heights of the men

qqnorm(men$height)
qqline(men$height)

Histogram of the incomes

incomes = my_data$income
hist(incomes)

Normal probability plot of the incomes

qqnorm(incomes)
qqline(incomes)

Or, the last two without the three outliers

incomesWithoutOutlier = incomes[incomes<175000]
hist(incomesWithoutOutlier)

qqnorm(incomesWithoutOutlier)
qqline(incomesWithoutOutlier)

Sierra

The following are the first few lines of my data after import:

my_data <- read.csv('https://www.marksmath.org/cgi-bin/random_data.csv?id=MYID')
head(my_data)

##Out:
#   first_name last_name age    sex height weight
# 1      Frank  Troutman  46   male  66.64 184.68
# 2      Kevin      Rowe  25   male  70.35 161.12
# 3     Alonzo   Binegar  54   male  67.51 202.49
# 4      Wilma     Talib  56 female  62.33 154.56
# 5     Edward  Williams  25   male  65.13 145.72
# 6    Suzanne      Mace  44 female  66.03 162.21

The following is a bar chart of the sexes of the people:

barplot(table(my_data$sex))


Below is a box and whisker plot of the weights of the people

boxplot(my_data$weight, horizontal = T)


The following is a summary of the ages in my data:

summary(my_data$age)

## Out: 
# Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
# 21.00   29.00   36.00   36.57   43.25   59.00

The following is a histogram of the heights of the men:

men = subset(my_data, sex=='male')
hist(men$height)


The following is a normal probability plot of the heights of the men:

qqnorm(men$height)
qqline(men$height)


The following is a histogram of the incomes:

incomes = my_data$income
hist(incomes)


The following is a normal probability plot of the incomes:

qqnorm(incomes)
qqline(incomes)


The following is a histogram of the incomes without the outlier:

incomesWithoutOutlier = incomes[incomes<300000]
hist(incomesWithoutOutlier)


The following is a normal probability plot of the incomes without the outlier:

qqnorm(incomesWithoutOutlier)
qqline(incomesWithoutOutlier)

wolfpack77

First, I'll load my data:

my_data <- read.csv('https://www.marksmath.org/cgi-bin/random_data.csv?id=YOUR_STUDENT_ID#')
head(my_data)

## Output:
#    first_name last_name age    sex height weight income
#    1       Mary  Mcsorley  26 female  64.08 144.98  15495
#    2   Virginia     Graig  21 female  65.93 169.18  32364
#    3    Richard     Craig  54   male  69.49 156.07   7712
#    4       Mary     Beall  43 female  64.00 140.12   6784
#    5       Lesa  Benefiel  29 female  65.76 182.29  43851
#    6       Rose  Pershall  56 female  63.12 155.70   7127

Bar plot of sexes:

men = subset(my_data, sex=='male')
women = subset(my_data, sex=='female')
barplot(c(dim(men)[1], dim(women)[1]))

Box and whisker plot for the weights:

boxplot(my_data$weight, horizontal = T)

Summary of the ages:

summary(my_data$age)

## Output:
# Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
# 20.00   25.00   31.00   33.81   40.25   59.00

Histogram of the heights of the men:

men = subset(my_data, sex=='male')
hist(men$height)

Normal probability plot of the heights of the men:

qqnorm(men$height)
qqline(men$height)

Histogram for the incomes:

income = (my_data$income)
hist(income)

Normal probability plot for the incomes:

qqnorm(income)
qqline(income)

FRD

First, I'll load my data:

1. My Data

 my_data <- read.csv('https://www.marksmath.org/cgi-bin/random_data.csv?id=YOURID')
 head(my_data)
##Out:
#  first_name last_name age    sex height weight income
#1      Kevin    Brandt  24   male  65.87 141.06  20831
#2     Janice   Fuhrman  20 female  67.22 166.66   8650
#3   Franklin   Selders  58   male  66.65 180.89    399
#4      Peter     Skeen  43   male  67.91 163.96  15559
#5    William     Eanes  27   male  66.58 143.44  15257
#6   Jennifer    Bigley  30 female  64.26 134.13   4091

2. Bar Chart for Sexes

barplot(table(my_data$sex))



3. Box and Whisker Plot for the Weights

boxplot(my_data$weight)



4. Summary of the Ages

 summary(my_data$age)
##Out:
#Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#  20.00   26.00   35.00   35.71   43.00   59.00

5. Histogram for the Heights of Men

 men<-subset(my_data, sex == 'male')
 hist(men$height)

6. Normal Probability Plot for the Heights of Men

 men<-subset(my_data, sex == 'male')
 qqnorm(men$height)
 qqline(men$height)

7. Histogram for the Incomes

hist(my_data$income)


 incomes<-my_data$income
 sort(incomes)
 outlierRemoved=incomes[incomes<500000]
 hist(outlierRemoved)


8. Normal Probability Plot for the Incomes

 qqnorm(my_data$income)
 qqline(my_data$income)

Andy

Loading my data

my_data <- read.csv('https://www.marksmath.org/cgi-bin/random_data.csv?id=MYID')
head(my_data)

first_name last_name age sex height weight income
1 Damion Lachenauer 38 male 69.98 125.47 132248
2 Bonnie Wu 20 female 64.68 176.99 11351
3 Nathan Major 32 male 72.79 185.01 20504
4 Michelle Pike 25 female 64.60 168.78 73013
5 George Holt 29 male 70.55 152.11 17071
6 Mildred Jacobs 40 female 63.12 142.98 25843






Bar plot of sexes

men = subset(my_data, sex=='male')
women = subset(my_data, sex=='female')
barplot(c(dim(men)[1], dim(women)[1]))


Box and whisker plot for the weights

boxplot(my_data$weight, horizontal = T)


Summary of ages

summary(my_data$age)

Min. 1st Qu. Median Mean 3rd Qu. Max.
20.00 25.00 32.00 33.63 41.00 58.00


Histogram of the heights of men

men = subset(my_data, sex=='male')
hist(men$height)


Normal probability plot of the heights of the men

qqnorm(men$height)
qqline(men$height)


Histogram of the incomes

incomes = my_data$income
hist(incomes)


Normal probability plot of the incomes

qqnorm(incomes)
qqline(incomes)


Or the last two without the outlier

incomesWithoutOutlier = incomes[incomes<500000]
hist(incomesWithoutOutlier)

qqnorm(incomesWithoutOutlier)
qqline(incomesWithoutOutlier)

David
> my_data <- read.csv('https://www.marksmath.org/cgi-bin/random_data.csv?id=YOURID')
> head(my_data)
 first_name   last_name age    sex height weight
1     Willie      Nguyen  31 female  68.69 160.82
2   Patricia Blackwelder  32 female  65.97 189.44
3      Frank    Albright  21   male  71.64 101.66
4  Margarita    Effinger  48 female  64.64 191.34
5  Ernestine     English  54 female  62.78 192.26
6     Leland     Plourde  41   male  69.23 191.75

Prestonw

First, I'll load my data:

my_data <- read.csv('https://www.marksmath.org/cgi-bin/random_data.csv?id=YOURID')
head(my_data)

first_name last_name age sex height weight income
1 Linda Blanton 20 female 61.21 184.30 3195
2 George Brislin 49 male 69.46 177.17 4376
3 Karen Hinson 28 female 66.44 179.17 5497
4 Grant Charles 41 male 66.47 125.58 7865
5 Barbara Ballard 39 female 64.03 164.28 114532
6 Shauna Douglas 25 female 67.13 151.45 3127





Bar plot of sexes

men = subset(my_data, sex == 'male')
women = subset(my_data, sex == 'female')
barplot(c(dim(men)[1], dim(women)[1]))

Box and whisker plot for the weights

boxplot(my_data$weight, horizontal = T)

Summary of the ages

summary(my_data$age)

## Out:
# Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
# 20.00   29.00   36.00   36.14   43.00   59.00

Histogram of the heights of men

height_men = subset(my_data, sex=='male')
hist(height_men$height)

Normal probability plot of the heights of the men

qqnorm(men$height)
qqline(men$height)

Histogram of the incomes

incomesofpeople = my_data$income
hist(incomesofpeople)

Normal probability plot of the incomes

qqnorm(incomesofpeople)
qqline(incomesofpeople)

Nonamaker

Import Data

my_data <- read.csv('https://www.marksmath.org/cgi-bin/random_data.csv?id=00100001')

1.) Header of data

head(my_data)

first_name last_name age sex height weight
1 Rebecca Weaver 42 female 62.71 159.88
2 Jesus Romero 22 male 69.41 130.98
3 Randy Carver 42 male 69.71 145.56
4 Susan Shiner 40 female 65.91 152.22
5 Robert Mcmurry 30 male 72.21 176.88
6 Gwendolyn Traub 22 female 63.58 147.63






2.) Barplot of Sexes

male=subset(my_data,sex=='male')
female=subset(my_data,sex=='female')
barplot(c(dim(male)[1],dim(female)[1]))


3.) Box and Whisker Plot of Weights

boxplot(my_data$weight)

4.) Summary of Ages

summary(my_data$age)

Min. 1st Qu. Median Mean 3rd Qu. Max.
20.00 28.00 33.50 34.89 40.25 58.00


5.) Histogram of Heights of Men

hist(male$height)

6.) Normal Probability of the Heights of Men

qqnorm(male$height)
qqline(male$height)

7.) Histogram of Incomes

income = my_data$income
hist(income)

8.) Normal Probability of Incomes

qqnorm(income)
qqline(income)

not_sam

my_data <- read.csv('https://www.marksmath.org/cgi-bin/random_data.csv?id=#########')
head(my_data)
first_name last_name age sex height weight income
1 Mary Thomas 25 female 63.00 182.80 2059
2 Mandy Lopez 48 female 60.93 125.72 9232
3 Julia Conaughty 38 female 61.83 140.57 57423
4 Eddy Cox 28 male 66.64 193.08 17527
5 Thelma Johnson 31 female 62.92 172.45 20406
6 Dennis Mcconnell 44 male 68.23 161.24 507
men = subset(my_data, sex=='male')
women = subset(my_data, sex=='female')
barplot(c(dim(men)[1], dim(women)[1]))


boxplot(my_data$weight, horizontal = T)














Summary

Min. 1st Qu. Median Mean 3rd Qu. Max.
20.00 27.00 32.50 34.53 40.00 59.00

men = subset(my_data, sex=='male')

hist(men$height)


Normal Probability Plot of the Heights of the Men

qqnorm(men$height)
qqline(men$height)



incomes = my_data$income
hist(incomes)



qqnorm(incomes)
qqline(incomes)



incomesWithoutOutlier = incomes[incomes<500000]
hist(incomesWithoutOutlier)



qqnorm(incomesWithoutOutlier)
qqline(incomesWithoutOutlier)



mark

@Sarcasticswimmer I think you're using quotes instead of code blocks.