Question for Lab 1 - 8:00 AM

edited January 26 in Assignments

(10 pts)

I've got a fun program on my webpage that generates random CSV data for people. You can access it and examine the first few rows via Python like so:

import pandas as pd
df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=mark')
df.head()
first_name last_name age sex height weight income activity_level
0 Donna Dinan 35 female 65.37 164.26 1947 high
1 Antonia Davis 39 female 64.95 140.40 2188 none
2 Stephanie Buss 30 female 60.75 181.83 18108 high
3 Wendell Elmore 26 male 64.68 157.90 1935 moderate
4 Nina Mcilhinney 21 female 59.94 163.38 5675 none
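
If you'd rather not type the URL by hand, here is a minimal sketch of building it with the standard library. The helper name `data_url` is made up for illustration and is not part of my webpage; the point is that the query string must not contain stray spaces or line breaks.

```python
from urllib.parse import urlencode

BASE_URL = 'https://www.marksmath.org/cgi-bin/random_data.csv'

def data_url(username):
    """Return the data URL for a username (hypothetical helper)."""
    # urlencode escapes spaces and other special characters in the value,
    # so the result is always a single well-formed URL.
    return BASE_URL + '?' + urlencode({'username': username})

url = data_url('mark')
print(url)
# The result can then be passed to pd.read_csv(url) as in the code above.
```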

Here's the cool thing: the data is randomly generated, but the random number generator is seeded using the username query parameter in the URL. So if I execute that command several times, I get the same result every time. That result depends on the username, though, so if you do it with your forum username, you'll get a different result. We all have our own randomly generated data file!
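
To illustrate the idea of seeding by username, here is a small sketch using Python's `random` module. This is not the code running on my webpage; the function `fake_heights` and the values 66 and 4 are assumptions chosen only to show that the same seed reproduces the same data.

```python
import random

def fake_heights(username, n=5):
    """Generate n made-up heights, seeded by the username (illustration only)."""
    # A private generator seeded with the username: the same username
    # always yields the same sequence of numbers.
    rng = random.Random(username)
    return [round(rng.gauss(66, 4), 2) for _ in range(n)]

# Same seed, so the two lists are identical.
print(fake_heights('mark') == fake_heights('mark'))

# A different username gives a different seed and, almost surely, different data.
print(fake_heights('mark') == fake_heights('audrey'))
```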

The problem: Using the code above with your username, generate your data file and then

  1. Compute the mean and sample standard deviation of the heights in your data and
  2. create a histogram of the heights.

Be sure to include both the code that you typed, as well as the results in your post.
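
A note on the word "sample" in part 1: the sketch below uses only the five heights shown in the `head()` output above, not a full data file, to show the difference between the sample and population standard deviations. In pandas, `std()` divides by n - 1 by default (`ddof=1`), which is the sample standard deviation the problem asks for.

```python
import pandas as pd

# The five heights from the head() output above (not the full data file).
df = pd.DataFrame({'height': [65.37, 64.95, 60.75, 64.68, 59.94]})

mean = df.height.mean()
sample_sd = df.height.std()            # ddof=1 by default: divides by n - 1
population_sd = df.height.std(ddof=0)  # divides by n instead

print(mean, sample_sd, population_sd)
```

The sample standard deviation is always a bit larger than the population version computed from the same numbers, since it divides by a smaller quantity.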

Comments

  • edited February 1

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=hmcdiarm')
    df.head()

    first_name  last_name   age sex height  weight  income  activity_level
    0   Donna   Dinan   35  female  65.37   164.26  1947    high
    1   Antonia Davis   39  female  64.95   140.40  2188    none
    2   Stephanie   Buss    30  female  60.75   181.83  18108   high
    3   Wendell Elmore  26  male    64.68   157.90  1935    moderate
    4   Nina    Mcilhinney  21  female  59.94   163.38  5675    none
    
    mark
  • edited January 27

    I imported my data like so:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=audrey')
    df.head()
    
    
     first_name last_name   age sex height  weight  income  activity_level
    0   George  Howerton    51  male    70.98   222.96  216670  high
    1   Evan    Cherry  41  male    69.19   159.91  12730   high
    2   Nathan  Gore    36  male    70.62   202.34  42592   high
    

    I can compute the mean and standard deviation directly with the mean and std methods:

    [df.height.mean(), df.height.std()]
    
    # Output:
    # [66.49599999999998, 3.9782326922309807]
    

    Here's my histogram:

    df.hist('height', edgecolor='black', grid=False);
    

    mark
  • edited January 29
    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=SGriffin')
    df.head()
    

    The mean for my heights is 65.69149999999999
    My mean code is below:

    df.height.mean()
    

    My standard deviation is 3.9195194961637934
    My standard deviation code is below:

    df.height.std()
    

    Here is my Histogram:
    My histogram code is also below:

    df.hist('height');
    
    mark
  • edited January 28
    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=lelandflynn')
    

    I can compute the mean and standard deviation using the df.height.describe() command.

    df.height.describe()
    

    mean 66.261000
    std 3.559419

    df.hist('height');
    

    mark
  • edited January 27

    My Data:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=elainanakos')
    df.head()
    
    
    first_name  last_name   age sex height  weight  income  activity_level
    0   Alberta Jackson 56  female  65.29   218.35  23035   moderate
    1   Tammy   Townsend    26  female  64.56   147.14  149948  none
    2   Richard Mcenaney    32  male    68.03   172.60  59903   high
    3   Amber   Barker  30  female  62.12   191.54  756 none
    4   Janna   Nettles 37  female  65.67   157.12  9318    high
    

    Mean and Standard Deviation

    [df.height.mean(), df.height.std()]
    
    (65.94279999999999, 4.085751808221326)
    
    
    df.hist('height');
    

    mark
  • edited January 28

    Here's my dataset

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=moomoo')
    df.head()
    
    first_name  last_name   age sex height  weight  income  activity_level
    0   Jeanette    Williams    37  female  65.31   195.97  2822    high
    1   Carolyn Wheelock    31  female  65.20   161.61  125216  moderate
    2   Raul    Cousins 40  male    72.59   142.87  119219  none
    3   Javier  Royster 40  male    71.44   177.45  5526799 none
    4   Barbara Whipple 22  female  60.77   154.98  88034   moderate
    

    By inserting the lines:

    df.height.mean()
    df.height.std()
    

    It will calculate the mean and standard deviation:
    [66.32059999999997, 4.233640365864199]

    I made the histogram using:

    df.hist('height',  edgecolor='black', grid=False);
    

    mark
  • edited January 27

    I imported my data like so:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=katelynnecampbell')
    df.head()
    
    first_name  last_name   age sex height  weight  income  activity_level
    0   Arthur  Dean    56  male    69.99   139.90  20184   high
    1   Glenn   Pelletier   28  male    69.27   197.71  4242    none
    2   June    Casey   26  female  59.59   190.90  5383    none
    3   William Dinardo 31  male    65.74   226.97  34717   none
    4   Mary    Stewart 25  female  63.04   143.08  2573    high
    

    I can compute the mean and standard deviation directly with the mean and std methods:

    [df.height.mean(), df.height.std()]

    [67.02159999999999, 3.7525925637477564]

    Made a histogram using:

    df.hist('height',  edgecolor='black', grid=False);
    

    mark
  • edited January 27

    Imported my data like so:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=danniarm')
    df.head()
    
    
    first_name  last_name   age sex height  weight  income  activity_level
    0   Nola    Boyce   55  female  67.19   160.92  5016    high
    1   Leslie  Reeves  35  female  66.07   175.77  7864    none
    2   Rebeca  Easter  41  female  62.82   122.33  95978   none
    3   Irene   Boelter 25  female  65.94   208.97  13657   moderate
    4   Robert  Brun    37  male    67.34   186.81  4674    high
    

    I computed the mean and standard deviation using this code:

    [df.height.mean(), df.height.std()]
    [66.42790000000002, 3.887205455404338]
    

    Made a histogram using:

    df.hist('height',  edgecolor='black', grid=False);
    

    mark
  • edited January 29
    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=vzaia')
    df.head()
    
    
    first_name  last_name   age sex height  weight  income  activity_level
     0  Mark    Oyler   23  male    66.10   199.93  16480   high
     1  Johanne Bickel  34  female  64.40   174.69  245267  none
     2  Allan   Orvis   21  male    66.78   165.83  19581   high
    

    I can compute the mean and standard deviation with the describe command.

    df.height.describe()
     mean      66.321900
     std        3.870064
    
    [df.height.mean(), df.height.std()]
    
    # Output:
    # [66.321900, 3.870064] 
    

    Here's my histogram:

    df.hist('height');
    

    mark
  • edited January 29

    My code was:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=jdowning')
    df.head()
    
    first_name  last_name   age sex height  weight  income  activity_level
    0   Marie   Finch   20  female  63.77   181.11  12202   high
    1   Brian   Flores  27  male    69.33   149.62  53407   none
    2   Olivia  Martinez    41  female  63.93   137.77  1681    high
    3   Nancy   Mize    44  female  63.88   165.43  1504    moderate
    4   Julius  Williamson  31  male    72.69   179.37  36567   
    

    The mean and standard deviation are as follows:

    df.height.mean(), df.height.std()
    

    (65.803, 3.9516191541435193)

    Histogram:

    df.hist('height', edgecolor='black')
    

    mark
  • edited January 29

    I imported my data like so:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=jwh1234')
    df.head()
    
        first_name  last_name   age sex height  weight  income  activity_level
    0   Pete    Horsley 28  male    73.06   176.21  683757  high
    1   Marchelle   Joyce   22  female  64.13   158.46  17715   high
    2   Gloria  Thomson 42  female  67.35   167.72  20890   none
    3   Manuel  Miller  30  male    69.32   152.79  13109   high
    4   Lisa    Kennedy 55  female  65.36   142.06  7106    moderate
    

    Mean and Standard Deviation:

    df.height.mean(), df.height.std()
    
    (66.19229999999997, 3.5974201909567127)
    

    I made my histogram using:

    df.hist('height')
    

    mark
  • edited January 29

    I imported my data like this:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=madysongold')
    df.head()
    
    first_name  last_name   age sex height  weight  income  activity_level
    0   Thomas  Walker  33  male    67.90   139.84  24469643    none
    1   Peter   Castleberry 31  male    62.59   202.02  106838  moderate
    2   Juan    Rehkop  33  male    71.48   243.73  36728   moderate
    3   Elsa    Burns   26  female  61.03   191.14  56  high
    4   Melissa Robertson   26  female  63.38   183.01  116265  none
    

    My mean and my standard deviation

      df.height.mean()
    
      66.2591
    
      df.height.std()
    
      3.564852581705578
    

    Here is my Histogram

    df.hist('height', edgecolor='black', grid=False);
    

    mark
  • edited January 29

    I used my school username jgreenba as opposed to justin because I forgot that my name here was Justin....

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=jgreenba')
    df.head()
    
    first_name  last_name   age sex height  weight  income  activity_level
    0   Margaret    Weintraub   20  female  63.59   172.64  1642    moderate
    1   Daniel  Shulman 23  male    72.12   148.89  231130  none
    2   Daniel  Urbanek 21  male    68.21   184.13  2407    moderate
    3   Sonia   Dehart  42  female  62.03   141.22  99475   none
    4   Norman  Stiger  43  female  60.27   127.52  152 none
    
    df.height.mean()
    65.9579
    
    df.height.std()
    3.679784879252558
    
    df.hist('height')
    

    mark
  • edited January 27

    I imported my data like so:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=cbrown15')
    df.head()
    
    first_name  last_name   age sex height  weight  income  activity_level
    0   Derek   Jackson 40  male    65.13   147.88  91963   high
    1   Katharine   Bratton 57  female  59.99   156.99  996 high
    2   William Gipson  38  male    66.65   155.34  33973   high
    3   Mike    Wilson  22  male    65.68   166.44  5217    high
    4   Sasha   Sampson 54  female  67.27   178.54  10203   high
    

    I can compute the mean and standard deviation directly with the mean and std methods:

    df.height.mean(), df.height.std()
    (66.146, 3.8395214874415347)
    

    Here's my histogram:

    df.hist('height',  edgecolor='black', grid=False);
    

    mark
  • edited January 28
    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=lindieclark_')
    df.head()
    

    Here's my histogram:

    df.hist('height', edgecolor='black', grid=False);
    



    Here's the mean and standard deviation:

    df.height.mean(), df.height.std()
    

    (66.0837, 3.859199596451643)

    mark
  • edited January 29

    I imported my data like this:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=driordan')
    df.head()
    

    Mean height

    df.height.mean()
    66.40280000000001
    

    Standard deviation height

    df.height.std()
    4.127169421570119
    

    The command used to generate the histogram...

    df.hist('height');
    

    The histogram for my given data is as follows:

    mark