Question for Lab 1

edited August 17 in Assignments

(10 pts)

I've got a fun program on my webpage that generates random CSV data for people. You can access it and examine the first few rows via Python like so:

import pandas as pd
df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=mark')
df.head()
first_name last_name age sex height weight income activity_level
0 Donna Dinan 35 female 65.37 164.26 1947 high
1 Antonia Davis 39 female 64.95 140.40 2188 none
2 Stephanie Buss 30 female 60.75 181.83 18108 high
3 Wendell Elmore 26 male 64.68 157.90 1935 moderate
4 Nina Mcilhinney 21 female 59.94 163.38 5675 none

Here's the cool thing: the data is randomly generated, but the random number generator is seeded using the username query parameter in the URL. So if I execute that command several times, I get the same result every time. That result depends on the username, however, so if you do it with your forum username, you'll get a different result. We each have our own randomly generated data file!
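
To illustrate the idea of seeding by username, here is a minimal sketch. It is not the actual code behind random_data.csv, which isn't shown in this thread; the function name `fake_heights` and the normal distribution with mean 66 and standard deviation 4 are assumptions made up purely for illustration.

```python
import random

def fake_heights(username, n=5):
    """Return n pseudo-random heights determined entirely by username.

    Hypothetical helper: the real generator on the server is not shown,
    so the distribution parameters here are invented for the example.
    """
    rng = random.Random(username)  # the username string is the seed
    return [round(rng.gauss(66, 4), 2) for _ in range(n)]

# Same username -> same seed -> same data on every call.
print(fake_heights("mark") == fake_heights("mark"))

# A different username gives a different seed, so the data will
# almost certainly differ.
print(fake_heights("mark") == fake_heights("hyoung1"))
```

Because the generator's state depends only on the seed, nothing needs to be stored on the server: the same file can be regenerated on demand for each username.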

The problem: Using the code above with your username, generate your data file and then

  1. Compute the mean and sample standard deviation of the heights in your data and
  2. create a histogram of the heights.

Be sure to include both the code that you typed, as well as the results in your post.
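
As a rough sketch of the computation, the snippet below finds the mean and sample standard deviation of a height column. To stay self-contained it reads the five preview rows shown above from an in-memory string rather than from the URL. The helper names `data_url` and `summarize_heights` are made up for this example and are not part of pandas.

```python
import io
from urllib.parse import urlencode

import pandas as pd

BASE_URL = 'https://www.marksmath.org/cgi-bin/random_data.csv'

def data_url(username):
    """Build the data URL; urlencode supplies the 'username=...' part."""
    return BASE_URL + '?' + urlencode({'username': username})

def summarize_heights(df):
    """Return (mean, sample standard deviation) of the height column."""
    # Series.std() already defaults to ddof=1, i.e. the sample standard
    # deviation with n - 1 in the denominator; it is spelled out here.
    return df.height.mean(), df.height.std(ddof=1)

# The five preview rows from above, written out as CSV text.
sample_csv = """first_name,last_name,age,sex,height,weight,income,activity_level
Donna,Dinan,35,female,65.37,164.26,1947,high
Antonia,Davis,39,female,64.95,140.40,2188,none
Stephanie,Buss,30,female,60.75,181.83,18108,high
Wendell,Elmore,26,male,64.68,157.90,1935,moderate
Nina,Mcilhinney,21,female,59.94,163.38,5675,none
"""
df = pd.read_csv(io.StringIO(sample_csv))
# With the real data: df = pd.read_csv(data_url('your_username'))

mean, sd = summarize_heights(df)
print(data_url('mark'))
print(round(mean, 3), round(sd, 3))
```

Building the URL with a helper avoids dropping the `?` or splitting the string across lines. For the histogram, `df.height.hist()` or `df.hist('height')` draws the plot, provided matplotlib is installed.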


Comments

  • edited August 14

    Here is my data for the assignment:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=hyoung1')
    df.head()
    
    #Output: 
       first_name   last_name   age sex height  weight  income  activity_level
    0   Arthur  Evans   21  male    66.12   187.59  6521    moderate
    1   Dan Robinson    33  male    70.51   181.97  15980   high
    2   Randy   Strickland  32  male    65.17   171.87  3293    high
    3   Brent   Bickel  46  male    71.61   224.98  2227    none
    4   Jeffrey Russ    29  male    67.89   177.04  243252  high
    

    The mean of my data is 66.272

    df.height.mean()
    

    The standard deviation of my data is 3.82

    df.height.std()
    

    Here is the histogram for my data:

    df.hist('height');
    

  • edited August 14

    Code:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=taylordurall')
    df.tail()
    
        first_name  last_name   age sex height  weight  income  activity_level
    95  Billy   Mixon   25  male    68.43   162.73  11168   none
    96  Mark    Brown   41  male    64.47   167.39  25803   none
    97  Simon   Cespedes    43  male    73.51   174.65  784 moderate
    98  Mark    Anglin  55  male    68.85   198.70  6819    moderate
    99  Paige   Salazar 25  female  69.06   141.84  70977   none
    

    Mean of heights is 66.74

    df.height.mean()
    

    Standard deviation of heights is 4.29

    df.height.std()
    

    Histogram:

    df.hist('height');
    

  • edited August 14
    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=JakeDodd')
    df.tail()
    

    Mean:

    df.height.mean()
    
    65.76910000000002
    

    Standard Deviation:

    df.height.std()
    
    4.207242455671896
    

    Histogram:

    df.hist('height');
    

  • edited August 14

    I generated my data with:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=Jackson_L')
    df.head()
    
    first_name  last_name   age sex height  weight  income  activity_level
    0   Cheri   Despain 39  female  68.09   165.83  158204  none
    1   Aida    Lugo    56  female  59.04   120.86  1314    moderate
    2   David   Darby   30  male    70.18   161.92  4808    none
    3   Adria   Fowler  22  female  61.03   148.83  1585    none
    4   Patricia    Schmidt 28  female  66.92   167.60  14823   high
    

    The mean of my data is 66.397 and was obtained from the command:

    df.height.mean()
    

    The standard deviation is 4.156 and was obtained from the command:

    df.height.std()
    

    Histogram:

    df.height.hist()
    

  • edited August 14

    data:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=janeturlington')
    df.tail()
    
    first_name  last_name   age sex height  weight  income  activity_level
    95  Donald  Shultz  32  male    66.96   195.72  70002   moderate
    96  Stephen Bloom   24  male    71.98   139.01  40704   moderate
    97  Dawn    Jacobs  26  female  63.52   193.47  283 high
    98  Bernita Davis   41  female  62.83   202.44  159222  moderate
    99  Joey    Macha   37  male    66.88   183.63  1421    high
    

    mean: 66.74

    df.height.mean()
    

    standard deviation: 3.83

    df.height.std()
    

  • edited August 14

    Code and Data:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=AudreyAlt')
    df.tail()
    
    
    first_name  last_name   age sex height  weight  income  activity_level
    95  Jaime   Olson   26  male    72.69   173.56  28362   none
    96  Pamela  Daniele 26  female  60.44   194.92  694 none
    97  Jeff    Chase   27  male    65.98   118.43  107620  none
    98  Jann    Hamilton    31  female  60.73   150.60  393184  none
    99  Mae Davis   31  female  68.64   179.03  6970    moderate
    

    Mean of height is 66.59

    df.height.mean()
    

    Standard deviation of height is 3.8

    df.height.std()
    

    histogram code:

    df.height.hist()
    

  • edited August 14

    I generated my data with:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=amanda')
    df.tail()
    
    first_name  last_name   age sex height  weight  income  activity_level
    95  Sarah   Leflore 22  female  65.69   174.91  4522    high
    96  Marilyn Jacklin 21  female  62.00   134.42  13371   high
    97  Nora    Woodson 38  female  62.97   185.69  8023    none
    98  Wilma   Taylor  56  female  65.40   199.93  90  moderate
    99  James   Fells   33  male    69.16   172.15  926 high
    

    The mean of my heights is 66.394, obtained from this command:

    df.height.mean()
    

    The standard deviation of my heights is 3.82 , obtained from this command:

    df.height.std()
    

    Histogram code and data:

    df.height.hist()
    

  • edited August 14

    This is how I generated my data:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=pkdimond')
    df.tail()
    
    first_name  last_name   age sex height  weight  income  activity_level
    95  Terrance    Baremore    44  male    66.68   160.10  190 none
    96  Pearl   Girres  42  female  65.22   171.04  7583    moderate
    97  Edith   Davis   28  female  63.88   118.39  30860   high
    98  Al  Whatley 33  male    74.40   151.24  8091    none
    99  James   Martell 41  male    68.57   188.25  62831   high
    

    My mean was also: 66.147
    and I got it from this command:

    df.height.mean()
    

    The standard deviation for my data is: 3.684
    I got this from this command:

    df.height.std()
    

    The histogram looks like:

  • My Data looks like this:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=John_South')
    df.head()
    

    Using the following line of code, I was able to find the mean of my height data:

    df.height.mean()
    

    The code returned 66.1932999999.

    Using the following line of code, I was able to find the standard deviation of my height data:

    df.height.std()
    

    The code returned 3.5443878117.

    This last line of code created a histogram of my height data:

    df.hist('height');
    

  • edited August 14
    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=wheadri1')
    df.head()
    
    first_name  last_name   age sex height  weight  income  activity_level
     0  Jerry   Taft    49  male    67.00   187.16  28030   high
    1   Connie  Wilcox  21  female  61.03   167.82  14546   moderate
    2   Janet   Holland 39  female  67.31   154.52  5713    high
    3   Josephine   Warden  42  female  60.87   200.20  7027    none
    4   Dianne  Goodman 24  female  63.70   197.99  80959   high
    

    Mean of height is 66.74

    df.height.mean()
    

    Standard deviation is 4.281957179004515

    df.height.std()
    

  • edited August 14
    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=SterlingS')
    df.tail()
    

    Output

    first_name  last_name   age     sex     height  weight  income  activity_level
    95  Sam     Fullerton   27  male    70.90   197.60  270     moderate
    96  Elizabeth   Scott   49  female  59.52   159.30  6631    high
    97  Felix   Simms   38  male    71.25   135.55  87  none
    98  Alana   Jones   33  female  65.57   151.70  2780    none
    99  Larry   Cowels  20  male    71.83   174.72  5625    none
    

    Mean:

    df.height.mean()
    

    Mean is 66.92030000000001

    Std deviation:

    df.height.std()
    

    Std deviation is 4.106630867366592

    Histogram:

    df.hist('height');
    

  • edited August 14

    Code and Data:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=ebrady2')
    df.tail()
    
    first_name  last_name   age sex height  weight  income  activity_level
    95  Shirley Garcia  25  female  62.52   119.57  5010    none
    96  Danny   Hymes   26  male            73.81   153.20  57164   high
    97  April   Lebrecque   34  female  69.20   141.89  45640   moderate
    98  Karen   Babcock 21  female  61.40   172.48  44646   moderate
    99  Tony    Benedict    26  male            69.80   174.12  22386   high
    

    Mean of height is 65.93

    df.height.mean()
    

    Standard deviation of height is 3.8

    df.height.std()
    

    Histogram code:

    df.height.hist()
    

  • edited August 14
    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=amberc')
    df.tail()
    
    first_name  last_name   age sex height  weight  income  activity_level
    95  Richard Schmidt 40  male    69.04   184.16  425 moderate
    96  Paul    Fleury  29  male    64.96   206.06  11212   moderate
    97  Agnes   Pollard 39  female  61.78   233.80  12416   moderate
    98  Diane   Morrison    42  female  61.50   179.02  17823   none
    99  Frances Horn    38  female  60.43   132.64  336 moderate
    

    Mean of heights is 66.10199999999998

    df.height.mean()
    

    Standard deviation of heights is 3.7860936399546503

    df.height.std()
    

    Histogram for data:

    df.hist('height');
    

  • import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=lmusial')
    df.head()
    
    first_name  last_name   age sex height  weight  income  activity_level
    0   Barbara Dolan   57  female  57.01   177.34  6631    high
    1   Caren   Walters 22  female  63.84   157.23  8015    high
    2   Wesley  Avery   39  male    66.90   204.50  2201    moderate
    3   Michael Numbers 41  male    67.09   164.74  5184    none
    4   Bruce   Williams    37  male    67.07   180.17  9517    none
    

    My mean is: 65.457

    df.height.mean()
    

    My std is: 3.492

    df.height.std()
    
    df.hist('height');
    

  • edited August 14

    Here's my data:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=mark')
    df.tail()
    
    
    first_name  last_name   age sex height  weight  income  activity_level
    95  Craig   Allison 27  male    68.57   151.03  923 high
    96  Dwight  Tate    59  male    67.87   217.11  48956   high
    97  Colin   Brown   48  male    69.76   177.49  2495    high
    98  April   Ruiz    29  female  62.88   192.76  20849   moderate
    99  Richard Rumberger   28  male    73.70   154.14  617 high
    

    my histogram looks like

    df.height.hist()
    

    The mean of my heights is 66.8194

    df.height.mean()
    

    The standard deviation is 4.0438455

    df.height.std()
    
  • Code:

    import pandas as pd
    
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=hburnett777')
    df.head()
    

    Data:

        first_name  last_name   age sex height  weight  income  activity_level
    0   Jacqueline  Hinton  56  female  62.69   156.69  5379    high
    1   Mary    Saleh   31  female  65.75   134.38  855 none
    2   Richard Harris  33  male    68.98   158.41  13447   moderate
    3   Matthew Hottel  23  male    69.28   154.41  1350    high
    4   Roland  Dunham  41  male    69.38   89.92   55068   moderate
    


    Mean:

    df.height.mean()
    
    66.0693
    

    Standard Deviation:

    df.height.std()
    
    4.223488015870534
    

    Histogram:

    df.hist('height')
    

  • edited August 14

    My data set:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=eallen4')
    df.tail()
    

    histogram:

    df.height.hist()
    

    mean:
    66.14470000000001

    df.height.mean()
    

    standard deviation:
    3.815758249706344

    df.height.std()
    
  • edited August 14

    My data is:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=mark')
    df.tail()
    

    The mean of my heights is 66.2664, as obtained from this computation:

    df.height.mean()
    

    The sample standard deviation of my heights is 3.695405424648998, obtained from this computation:

    df.height.std()
    

    Here is the histogram obtained from this calculation:

    df.hist('height'); 
    

  • Here's where my answer will go.

    My data is:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=audrey_m')
    df.head()
    
        first_name  last_name   age sex height  weight  income  activity_level
    0   Retha   Reese   41  female  63.91   166.41  24811   moderate
    1   Felicia Hamm    41  female  61.88   152.91  11829   high
    2   Lauren  Poindexter  22  female  69.36   219.97  4259    none
    3   Erin    Davis   29  female  60.52   224.19  19351   moderate
    4   Minnie  Bouie   20  female  69.10   140.45  12624   high
    

    The mean of my heights is 66.147, as obtained from this computation:

    df.height.mean()
    

    My histogram looks like:

    df.height.hist()
    

  • edited August 14

    My data:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=ljohns13')
    df.head()
    
        first_name  last_name   age sex height  weight  income  activity_level
    0   David   Krause  21  male    74.85   152.97  7391    moderate
    1   Karen   Liebsch 27  female  70.06   116.25  4694    moderate
    2   Dorothy Hill    35  female  59.56   153.08  11234   high
    3   Brenda  Bott    26  female  66.71   181.44  105242  none
    4   Maria   Flournoy    50  female  67.07   189.09  7789    none
    

    The mean of my heights is 66.44, as computed by:

    df.height.mean()
    

    The Standard Deviation of my heights is 3.47, as computed by:

    df.height.std()
    

    My histogram looks like:

    df.hist('height');
    

  • import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=ccross2')
    df.head()
    
    first_name  last_name   age sex height  weight  income  activity_level
    0   John    Sanchez 22  male    68.97   174.75  1675    none
    1   Dennis  Palmer  22  female  67.04   168.24  176113  high
    2   Jennifer    Walker  32  female  66.34   192.27  2180    moderate
    3   Shannon Clay    38  female  62.52   169.43  3165    high
    4   Leona   Benson  31  female  62.58   211.73  1969    moderate
    

    The mean of my height is 66.486, as obtained from this computation:

    df.height.mean()
    

    The standard deviation of my heights is 4.107, as obtained from this computation:

    df.height.std()
    

    My histogram looks like:

    df.height.hist()
    

  • edited August 17

    Here's my data:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=dawsonsalter')
    df.head()
    
        first_name  last_name   age sex height  weight  income  activity_level
    0   Anthony Kluender    21  male    69.28   185.20  7107    moderate
    1   Steve   Ridenhour   40  male    72.79   186.64  2921    moderate
    2   Rodolfo Allen   44  male    70.27   208.60  38059   moderate
    3   Alba    Aaberg  23  female  62.77   117.89  2331    moderate
    4   Yvette  Sipp    51  female  69.71   188.88  11189   moderate
    

    The mean of my heights is 66.463, as obtained from my computation:

    df.height.mean()
    

    The standard deviation of my heights is 4.097, as obtained from my computation:

    df.height.std()
    

    My histogram, as obtained from my computation:

    df.hist('height');
    

  • My Data:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=ksims')
    df.head()
    
        first_name  last_name   age sex height  weight  income  activity_level
     0  Carl    Gerard  36  male    69.45   215.64  2663    moderate
     1  Lila    Johnson 49  female  61.57   175.93  25959   none
     2  Brenda  Mossien 40  female  61.94   188.08  14445   high
     3  Gary    Lafave  49  male    67.95   169.63  76683   high
     4  Jefferey    Johnson 23  male    71.37   190.35  33960   none
    

    My histogram looks like

    df.hist('height');
    

    the mean of my height is 66.69

    df.height.mean()
    

    The standard deviation is approx 4.11

    df.height.std()
    
  • edited August 14

    Data:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=Salem')
    df.head()
    
    
    first_name  last_name   age sex height  weight  income  activity_level
    0   Crystal Menard  21  female  66.10   184.02  6515    none
    1   Amy         Wise            38  female  57.31   203.48  1683    none
    2   Judith  Monk    30  female  63.31   186.16  1156    moderate
    3   Humberto    Ray         50  male            70.73   211.53  1183    high
    4   Claude  Baker   22  male            70.68   151.08  12819   none
    

    Mean:

    66.51339
    df.height.mean()
    

    Standard deviation:

    4.148
    df.height.std()
    

    Histogram:

    df.hist('height')
    

  • My data:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=AnjuliH')
    df.tail()
    
    first_name  last_name   age sex height  weight  income  activity_level
    95  Kathleen    Allen   30  female  60.70   139.41  520 none
    96  Sarah   Roche   35  female  68.96   183.76  32649   high
    97  Christine   Griffin 29  female  63.36   159.37  7969    moderate
    98  Pamela  Small   30  female  67.62   174.56  5076    high
    99  Betty   Overton 44  female  61.65   168.26  159 high
    

    The mean of my heights is: 66.14279999999998, as obtained by this computation:

    df.height.mean()
    

    The standard deviation of my height is: 3.7008101979447847, as obtained by this computation:

    df.height.std()
    

    My histogram looks like:

    df.height.hist()
    

  • Here's my data

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=ltipton')
    df.tail()
    
    first_name  last_name   age sex height  weight  income  activity_level
    95  Marilyn Parris  29  female  64.68   150.69  115310  high
    96  Robert  Flores  51  male    71.18   136.76  24487   moderate
    97  Daniel  Bohne   36  male    72.07   177.61  833 none
    98  Richard Schubbe 32  male    68.57   145.97  47000   high
    99  David   Smith   41  male    69.95   176.52  3249    high
    

    The mean of my height is 66.8057

    df.height.mean()
    

    My histogram looks like

    df.height.hist()
    

    My standard deviation is 3.8103095807309546

    df.height.std()
    
  • this is my data:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=bgillen')
    df.tail()

    first_name  last_name   age sex height  weight  income  activity_level
    95  Stewart Horton  29  male    68.29   180.10  1616    none
    96  Catherine   Maddox  34  female  65.84   187.60  7534    moderate
    97  Margaret    Martin  28  female  66.34   205.22  229127  high
    98  Robert  Armstrong   21  male    65.95   157.64  55894   none
    99  Sheila  Wallis  41  female  63.27   124.36  4767    moderate
    

    The mean of my heights is: 66.0328
    as obtained from df.height.mean()

    The Standard Deviation of my heights is: 3.7207670220430664
    as obtained from df.height.std()

    My histogram looks like:

  • edited August 14

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=amcbride')
    df.tail()

    Natasha,Dutton,32,female,63.77,190.59,10437,high
    Jeffrey,Kinard,23,male,64.89,104.55,7432,high
    Tammy,Gallagher,20,female,65.55,135.1,1566,none
    Brandi,Grubbs,23,female,65.65,168.62,1033,moderate
    Margie,Nash,31,female,62.85,133.21,3765,moderate
    Mark,Crye,36,male,67.8,165.29,8328,high

  • edited August 14

    My Answer:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=mark')
    df.tail()
    
    95  Michael Vogel   31  male    65.23   145.48  894 moderate
    96  Karl    Cornely 27  male    70.21   215.22  3908    moderate
    97  Phyllis Hoffman 52  female  63.34   162.52  35325   none
    98  Patricia    Winslow 28  female  67.92   185.47  3091    high
    99  Lamont  Morales 40  male    72.57   173.11  4561    high
    

    The mean of my heights is 66.121, as obtained from this computation:

    df.height.mean()
    

    The standard deviation of my heights is 3.838, as obtained from this computation:

    df.height.std()
    

    My histogram looks like:

    df.hist('height');
    

  • edited August 14
    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=tgermann')
    df.tail()
    
    first_name  last_name   age sex height  weight  income  activity_level
    95  Daryl   Graza   24  male    73.26   149.85  76406   moderate
    96  Blake   Neely   35  male    72.52   149.27  1349    moderate
    97  Tiny    East    43  female  65.47   205.34  1161    none
    98  Ben Shaffer 38  male    67.30   177.77  5258    none
    99  Christen    Bailey  25  female  63.92   185.45  5216    high
    

    https://marksmath.org/classes/Fall2020Stat185/StatTalk/uploads/editor/bi/ewuktcg1fc12.png

    My histogram looks like

    df.hist('bachelors');

    The mean is 19.033757556474733

    df.bachelors.mean()
    

    The standard deviation is 8.663062575676577

    df.bachelors.std()
    