Bar plot and proportion from data - 8:00 AM

edited February 1 in Assignments

(10 points)

Using your own random data from last week's forum question, let's use Colab to examine the activity_level variable. Specifically:

  • Generate a bar chart for activity_level and
  • Compute the proportion of folks whose activity level is high.

Note that creative burden is higher in this lab than in the last in that the Colab link above leads to a blank notebook. Nonetheless, you can find sample code that should help in our class presentation on Categorical Data.

Comments

  • I grabbed my data like this:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=audrey')
    

    I then computed my value_counts:

    value_counts = df['activity_level'].value_counts()
    value_counts
    

    high 39
    moderate 33
    none 28

    We can see right away that the proportion of folks with high activity is
    $39/100 = 0.39$.
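
The same proportion can also be computed in code rather than read off by hand. This is only a minimal sketch: the series below is rebuilt from the counts shown above as a stand-in for df['activity_level'], so it runs without downloading anything, and the names activity_level, proportions and high_proportion are illustrative.

```python
import pandas as pd

# Stand-in for df['activity_level'], rebuilt from the counts shown above
# (39 high, 33 moderate, 28 none) so no download is needed.
activity_level = pd.Series(["high"] * 39 + ["moderate"] * 33 + ["none"] * 28)

# normalize=True returns relative frequencies instead of raw counts.
proportions = activity_level.value_counts(normalize=True)
high_proportion = proportions["high"]
print(round(high_proportion, 2))
```

With the real data, the same call on df['activity_level'] should give the proportion for each level at once.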

    Finally, my bar plot looks like so:

    value_counts.plot.bar(figsize=(12,7), rot = 0);
    

    mark
  • edited February 1

    This is how I grabbed my data.

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=lindieclark_')
    df.head()
    

    I then computed my 'value_counts'

    value_counts = df['activity_level'].value_counts()
    value_counts
    

    moderate 36
    none 32
    high 32

    We can see right away that the proportion of folks with high activity is $32/100= 0.32$.

    Finally, my bar plot looks like:

    value_counts.plot.bar(figsize=(12,7), rot = 0);
    

  • edited February 1

    I imported data from the last forum question, and changed my username to SGriffin as follows:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=SGriffin')
    

    Then I generated a value count for the variable "activity level" as follows:

    value_counts = df['activity_level'].value_counts()
    value_counts
    

    It gave me these outputs:
    moderate 36
    high 34
    none 30

    I then used that data to plot a bar chart with the following code:

    value_counts.plot.bar(figsize=(12,7), rot = 0);
    

    Then I analyzed the proportion of people that were listed as "high activity" with the following code:

    value_counts['high']/len(df)
    

    That told me the proportion of those with a high activity level is $34/100 = 0.34$.
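
One caveat with dividing by len(df): value_counts skips missing values by default, while len(df) counts every row, so the two denominators can differ if activity_level has gaps. The data in this thread appears to have none (the counts sum to 100), so what follows is only a hedged sketch on a made-up five-row frame; the name toy and its values are not from the assignment.

```python
import pandas as pd

# Made-up data (not from the assignment) with one missing activity level.
toy = pd.DataFrame({"activity_level": ["high", "none", "high", None, "moderate"]})

# Missing values are dropped by default, so these counts sum to 4, not 5.
counts = toy["activity_level"].value_counts()

share_of_all_rows = counts["high"] / len(toy)      # 2 of 5 rows
share_of_answered = counts["high"] / counts.sum()  # 2 of 4 non-missing rows

# A boolean comparison treats the missing value as False, so its mean
# matches the share computed over all rows.
mask_share = (toy["activity_level"] == "high").mean()
```

Which denominator is appropriate depends on whether rows with no recorded activity level should count toward the total.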

    Finally, my bar plot looks like this:

  • edited February 1

    Using last week's table:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=moomoo')
    df.head()
    

    Using value_counts, we can see how many people have a high activity level:

    value_counts = df['activity_level'].value_counts()
    value_counts
    

    moderate 42
    none 34
    high 24
    Name: activity_level, dtype: int64

    Our proportion with high activity is
    $24/100=0.24$

    To make a bar plot of this use

    value_counts.plot.bar(figsize=(12,7), rot = 0);
    

  • edited February 1

    I grabbed my data like this:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=leland')
    

    I then computed my 'value_counts':

    value_counts = df['activity_level'].value_counts()
    value_counts
    

    high 39
    none 35
    moderate 26
    Name: activity_level, dtype: int64

    Our proportion with high activity is:

    39/100=0.39

    To make a bar plot of this use:

    value_counts.plot.bar(figsize=(12,7), rot = 0);
    

  • I received my data like so:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=driordan')
    

    I then computed my 'value_counts':

    value_counts = df['activity_level'].value_counts()
    value_counts
    

    moderate 37
    high 34
    none 29

    We can see from the collected data that the proportion of people with high activity is:

    $ 34/100 = 0.34 $

    Finally, my bar plot looks like so:

    value_counts.plot.bar(figsize=(12,7), rot = 0);
    

  • edited February 1

    Grabbed my data:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=elainanakos')
    

    My value counts

    value_counts = df['activity_level'].value_counts()
    value_counts
    

    none 43
    high 30
    moderate 27

    The proportion of people with high activity level is:

    $30/100 = 0.30$.

    Finally, the bar plot looks like this:

    value_counts.plot.bar(figsize=(12,7), rot = 0);
    

  • edited February 1

    I used the same data set as last week

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=jgreenba')
    df.head()
    
       first_name  last_name  age     sex  height  weight  income  activity_level
    0    Margaret  Weintraub   20  female   63.59  172.64    1642        moderate
    1      Daniel    Shulman   23    male   72.12  148.89  231130            none
    2      Daniel    Urbanek   21    male   68.21  184.13    2407        moderate
    3       Sonia     Dehart   42  female   62.03  141.22   99475            none
    4      Norman     Stiger   43  female   60.27  127.52     152            none
    

    Used value counts to identify the numbers of people with different activity levels

    value_counts = df['activity_level'].value_counts()
    value_counts
    
    none        37
    moderate    36
    high        27
    Name: activity_level, dtype: int64
    

    Then I made a bar chart using the data

    value_counts.plot.bar(figsize=(12,7), rot = 0);
    

    The proportion with high activity is 27/100=.27

  • edited February 1

    I grabbed my data like this:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=jdowning')
    

    I then computed my value counts:

    value_counts = df['activity_level'].value_counts()
    value_counts
    

    high 39
    moderate 34
    none 27

    We can see that the proportion of folks with high activity is

    39/100 = 0.39

    Finally, my bar plot looks like:

    value_counts.plot.bar(figsize=(12,7), rot = 0);
    

  • edited February 5

    I grabbed my data like this:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=katelynnecampbell')
    

    I then computed my value_counts:

    value_counts = df['activity_level'].value_counts()
    value_counts
    

    none 33
    moderate 33
    high 34

    We can see right away that the proportion of folks with high activity is
    34/100=0.34

    Finally, my bar plot looks like so:

    value_counts.plot.bar(figsize=(12,7), rot = 0);
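
value_counts sorts by count, so the bars can come out in a different order from one data set to the next. Since activity_level has a natural order, one option is to reindex the counts before plotting. This is a sketch only: the counts are typed in from the output above rather than downloaded, and the display order none, moderate, high is an assumption about what reads best.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs outside Colab; omit in a notebook
import pandas as pd

# Counts typed in from the output above (stand-in for the real value_counts).
value_counts = pd.Series({"none": 33, "moderate": 33, "high": 34}, name="activity_level")

# Assumed display order: lowest to highest activity.
order = ["none", "moderate", "high"]
ordered = value_counts.reindex(order)

# Same plotting call as in the thread, with axis labels added.
ax = ordered.plot.bar(figsize=(12, 7), rot=0)
ax.set_xlabel("activity_level")
ax.set_ylabel("count")
```

Reindexing changes only the order of the bars; the counts themselves, and so the proportion, stay the same.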

  • edited February 1

    I grabbed my data like this:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=cbrown15')
    df.head()
    

    I then computed my value_counts:

    value_counts = df['activity_level'].value_counts()
    value_counts
    

    none 35
    high 34
    moderate 31

    We can see right away that the proportion of folks with high activity is

    $34/100 = 0.34$

    Finally my bar chart looks like so:

    value_counts.plot.bar(figsize=(12,7), rot = 0);
    

  • edited February 1

    This is how I collected my data

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=vzaia')
    df.head()
    

    I then computed my value_counts

    value_counts = df['activity_level'].value_counts()
    value_counts
    

    none 34
    high 34
    moderate 32

    We can see right away that the proportion of folks with high activity is 34/100= 0.34.

    Finally, my bar plot looks like this:

    value_counts.plot.bar(figsize=(12,7), rot = 0);
    

  • edited February 1

    I grabbed my data like this:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=jwh1234')
    df.head()
    

    I then computed my value_counts:

    value_counts = df['activity_level'].value_counts()
    value_counts
    

    high 39
    moderate 32
    none 29

    We can see right away that the proportion of folks with high activity is
    39/100 = 0.39

    Finally, my bar plot looks like so:

    value_counts.plot.bar(figsize=(12,7), rot = 0);
    

  • edited February 1

    I got my data like this

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=madysongold')
    

    Then I computed my value_counts:

    value_counts = df['activity_level'].value_counts()
    value_counts
    

    none 38
    moderate 31
    high 31

    The proportion of people with high activity is 0.31.
    I found this using the following code:

    value_counts['high']/len(df)
    

    Lastly, my bar chart looks like:

    value_counts.plot.bar(figsize=(12,7), rot = 0);
    

  • This is how I grabbed my data:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=victoria')
    

    I then computed my value count:

    value_counts = df['activity_level'].value_counts()
    value_counts
    
    high 37
    moderate 34
    none 29

    We can see right away that the proportion of folks with high activity is:

    37/100 = .37

    Finally, my bar plot looks like this:

    value_counts.plot.bar(figsize=(12,7), rot = 0);
    

  • I grabbed my data like this:

    import pandas as pd
    df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=danniarm')
    df.head()
    

    I then computed my value_counts:

    value_counts = df['activity_level'].value_counts()
    value_counts
    

    none 39
    moderate 31
    high 30

    Finally, my bar plot looks like so:

    value_counts.plot.bar(figsize=(12,7), rot = 0);
    
