MML Discourse archived in May, 2026

Lab 1 Hand In

mark

(20 pts)

Submit your results to Lab 1 by responding here. Your response should include:

  • A list of the variables that you used in your model,
  • Any other changes that you made to the code in the lab notebook,
  • A link to the upload of your resulting submission file, and
  • The score obtained from our scoring tool.
mark
numeric_variables = ['GrLivArea']
nominal_variables = ['Neighborhood']
qual_variables = ['KitchenQual']

housing_predictions_demo.csv (19.8 KB)

My Score:
0.196455

User 009

numeric_variables = ['GrLivArea', 'OverallQual', 'YearBuilt', 'OverallCond', 'GarageCars', 'Fireplaces', 'FullBath', 'BsmtFullBath', 'TotalBsmtSF', 'YearRemodAdd', 'TotRmsAbvGrd']

nominal_variables = ['Neighborhood']

qual_variables = ['KitchenQual']

housing_predictions_demo.csv (19.8 KB)

Score: 0.14427434825801208

User 018
numeric_variables = ['GrLivArea', 'OverallQual','YearBuilt','OverallCond','GarageCars','BsmtFullBath','Fireplaces',
                    'TotalBsmtSF', 'YearRemodAdd', 'TotRmsAbvGrd', 'FullBath', 'ScreenPorch', 'LotArea', 'WoodDeckSF', 'BsmtFinSF1']
nominal_variables = ['Neighborhood']
qual_variables = ['KitchenQual']
target_variables = ['SalePrice']

housing_predictions_demo.csv (19.8 KB)

User 020

numeric_variables = ['GrLivArea', 'YearBuilt', 'GarageCars', 'OverallQual', 'OverallCond']

nominal_variables = ['Neighborhood']

qual_variables = ['KitchenQual']

target_variables = ['SalePrice']


housing_predictions_demo (1).csv (19.8 KB)

User 008

numeric_variables = ['GrLivArea', 'OverallQual', 'YearBuilt', 'OverallCond', 'GarageCars']
nominal_variables = ['Neighborhood']
qual_variables = ['KitchenQual']
target_variables = ['SalePrice']
housing_predictions_demo (1).csv (19.8 KB)

User 002

numeric_variables = ['GrLivArea', 'MSSubClass','GarageCars', 'OverallQual','OverallCond']

nominal_variables = ['Neighborhood','YearBuilt']

qual_variables = ['KitchenQual']

target_variables = ['SalePrice']

housing_predictions_demo.csv (19.8 KB)

Score: 0.15877368818470677

User 013
numeric_variables = ['OverallQual','GrLivArea','YearBuilt','OverallCond','GarageCars','BsmtFullBath']
nominal_variables = ['Neighborhood']
qual_variables = ['KitchenQual']

My score was 0.1499

housing_predictions_demo (15).csv (19.8 KB)

User 022

numeric_variables = ['GrLivArea','OverallQual','GarageCars','Fireplaces','ScreenPorch','3SsnPorch','WoodDeckSF','EnclosedPorch','OverallCond','YearRemodAdd','LotArea']

nominal_variables = ['Neighborhood']

qual_variables = ['KitchenQual']

housing_predictions_demo (13).csv (19.8 KB)

User 007

numeric_variables = ['GrLivArea','YearBuilt', 'OverallQual', 'OverallCond', 'GarageCars']

nominal_variables = ['Neighborhood']

qual_variables = ['KitchenQual']

target_variables = ['SalePrice']

housing_predictions_demo (1).csv (19.8 KB)

Score: 0.15417995829350553

User 012
numeric_variables = [ 'GrLivArea', 'OverallQual', 'YearBuilt', 'OverallCond', 
                      'GarageCars', 'BsmtFullBath', 'Fireplaces', 
                      'TotalBsmtSF', 'YearRemodAdd', 'TotRmsAbvGrd',
                      'FullBath', 'ScreenPorch', 'LotArea', 'WoodDeckSF', 
                      'BsmtFinSF1']
nominal_variables = ['Neighborhood', 'BldgType', 'RoofMatl']
qual_variables = ['KitchenQual']

housing_predictions_demo.csv (19.8 KB)

My Score:
0.13724343241678663

User 010

image

housing_predictions_demo (1).csv (19.8 KB)

User 003

Variables

numeric_variables = ['GrLivArea','OverallQual','NHmedian','YearBuilt','OverallCond','GarageCars','BsmtFullBath','Fireplaces','TotRmsAbvGrd','ScreenPorch','TotalBsmtSF','YearRemodAdd','FullBath','LotArea']
nominal_variables = ['Neighborhood']
qual_variables = ['KitchenQual']
target_variables = ['SalePrice']

The NHmedian variable is the median sale price calculated for each neighborhood: location, location, location.
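One way a per-neighborhood median feature like NHmedian could be built is sketched below. This is not necessarily how it was done in this submission: the toy prices are made up, and the choice to compute the medians from the training rows only (then map them onto the test rows, falling back to the overall median for an unseen neighborhood) is an assumption made here to keep test information out of the feature.

```python
import pandas as pd

# Toy data (made-up values) standing in for the lab's train/test frames.
train = pd.DataFrame({
    "Neighborhood": ["NAmes", "NAmes", "NAmes", "StoneBr", "StoneBr"],
    "SalePrice": [120000, 140000, 160000, 300000, 400000],
})
test = pd.DataFrame({"Neighborhood": ["StoneBr", "NAmes", "Blueste"]})

# Median sale price per neighborhood, computed from the training rows only.
nh_median = train.groupby("Neighborhood")["SalePrice"].median()

# Fallback for neighborhoods that never appear in the training data (assumption).
overall_median = train["SalePrice"].median()

# Attach the feature to both frames by looking up each row's neighborhood.
train["NHmedian"] = train["Neighborhood"].map(nh_median)
test["NHmedian"] = test["Neighborhood"].map(nh_median).fillna(overall_median)

print(test)
```

With the column in place on both frames, 'NHmedian' can be listed in numeric_variables like any other numeric column.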

Score
0.14289

housing_predictions_demo.csv (19.8 KB)

User 019
numeric_variables = ['GrLivArea', 'OverallQual', 'YearBuilt', 'OverallCond', 'GarageArea', 'BsmtFullBath', 'Fireplaces', 'TotalBsmtSF',
                     'YearRemodAdd', 'TotRmsAbvGrd', 'FullBath', 'ScreenPorch', 'LotArea']
nominal_variables = ['Neighborhood']
qual_variables = ['KitchenQual']
target_variables = ['SalePrice']

housing_predictions_demo.csv (19.8 KB)

User 001

numeric_variables = ['GrLivArea', 'YearBuilt', 'OverallCond', 'Fireplaces', 'FullBath' ]

nominal_variables = ['Neighborhood']

qual_variables = ['KitchenQual']

target_variables = ['SalePrice']

Our score is 0.1690946121350628.

housing_predictions_demo (3).csv (19.8 KB)

User 014

numeric_variables = ['GrLivArea','OverallQual','YearBuilt','GarageCars','BsmtFullBath',
'Fireplaces','TotalBsmtSF','YearRemodAdd','TotRmsAbvGrd','FullBath','ScreenPorch','LotArea']
housing_predictions_demo.csv (19.8 KB)
0.14423910325793002

User 023

My changes were

numeric_variables = ['OverallQual','OverallCond','GarageCars','LotArea','TotalBsmtSF','GarageArea','WoodDeckSF','OpenPorchSF','PoolArea']
nominal_variables = ['Neighborhood','RoofMatl','Utilities','Foundation','BsmtCond']
qual_variables = ['KitchenQual']

housing_predictions_demo.csv (19.8 KB)

User 005

numeric_variables = ['GrLivArea', 'OverallQual', 'YearBuilt', 'OverallCond', 'GarageCars', 'BsmtFullBath', 'Fireplaces', 'TotalBsmtSF', 'YearRemodAdd', 'TotRmsAbvGrd', 'FullBath', 'ScreenPorch', 'LotArea', 'WoodDeckSF', 'BsmtFinSF1', 'HalfBath', '1stFlrSF', 'EnclosedPorch', '3SsnPorch', 'GarageArea', 'BsmtHalfBath', 'BsmtFinSF2', 'MasVnrArea', 'LowQualFinSF', 'BsmtUnfSF', '2ndFlrSF', 'MoSold', 'BedroomAbvGr', 'OpenPorchSF', 'MiscVal', 'YrSold', 'KitchenAbvGr', 'PoolArea', 'MSSubClass']
nominal_variables = ['Neighborhood']
qual_variables = ['KitchenQual']
target_variables = ['SalePrice']
Copy of HousingPriceRegressionLab2026.ipynb - Colab
Mean Squared Error: 0.13224937762230213

mark

@User 028

What's your score???

User 023

Sorry, forgot to add it. Mine was 0.16678281096727288

User 006


Here are my variables

My score is 0.18985376615729802.

User 026

The easiest way to start improving the score was to add more variables to the regression. I started with numeric variables, which made a massive improvement but left me at a score of roughly 0.1475. Adding nominal variables brought it down to roughly 0.1425.

numeric_variables = ['GrLivArea', 'OverallQual', 'YearBuilt', 'OverallCond',
                     'GarageCars', 'BsmtFullBath', 'Fireplaces',
                     'TotalBsmtSF', 'YearRemodAdd', 'TotRmsAbvGrd',
                     'FullBath', 'ScreenPorch', 'LotArea', 'WoodDeckSF',
                     'BsmtFinSF1', 'MSSubClass', 'PoolArea', 'YrSold', 'MiscVal']
nominal_variables = ['Neighborhood', 'BldgType', 'RoofMatl', 'HouseStyle', 'Foundation', 'SaleCondition']

The next big jump came from adding more quality variables. The initial code would not allow this because of the shape of the categories array, so I repeated the quality order once per quality variable so that every column is encoded with the same ordering. This got me down to 0.1356.

qual_variables = ['KitchenQual', 'ExterQual', 'BsmtQual', 'HeatingQC', 'ExterCond', 'BsmtCond']

qual_encoder = OrdinalEncoder(categories=[quality_order] * len(qual_variables))
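As a small illustration of why the categories list has to be repeated: OrdinalEncoder expects one category list per input column, so a single quality_order list only fits a single quality column. The sketch below uses made-up ratings, and the contents of quality_order are an assumption (the lab notebook's definition is not shown in this thread).

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Assumed ordering from worst to best; the lab notebook's own list may differ.
quality_order = ["Po", "Fa", "TA", "Gd", "Ex"]
qual_variables = ["KitchenQual", "ExterQual", "BsmtQual"]

# Made-up ratings: one row per house, one column per quality variable.
X_qual = np.array(
    [
        ["TA", "Gd", "Ex"],
        ["Ex", "TA", "Fa"],
        ["Gd", "Gd", "TA"],
    ],
    dtype=object,
)

# One copy of the ordering per column, so every column uses the same scale.
qual_encoder = OrdinalEncoder(categories=[quality_order] * len(qual_variables))
encoded = qual_encoder.fit_transform(X_qual)

print(encoded)
```

Each rating becomes its position in quality_order, so 'Po' maps to 0 and 'Ex' maps to 4 in every column. This toy data has no missing ratings; how the lab pipeline treats missing values in columns such as BsmtQual is not covered here.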

Finally, with the increase in variables it turned out to be useful to use RidgeCV, so I uncommented:

regress = RidgeCV(
    alphas=np.logspace(-1, 1, 100)
)

and

regress.alpha_
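For anyone trying other alpha grids, here is a minimal sketch on synthetic data (not the housing data) of one way to judge whether the logspace range is wide enough. The check is a rule of thumb assumed here, not something from the lab notebook: if the selected alpha_ sits at either end of the grid, the best value may lie outside it, so the range is worth widening.

```python
import numpy as np
from sklearn.linear_model import RidgeCV

# Synthetic regression problem standing in for the prepared housing features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
true_coef = rng.normal(size=10)
y = X @ true_coef + rng.normal(scale=0.5, size=200)

# Same grid as in the post: 100 values between 10**-1 and 10**1.
alphas = np.logspace(-1, 1, 100)
regress = RidgeCV(alphas=alphas).fit(X, y)

# If the chosen alpha is the smallest or largest candidate, the grid may be too narrow.
at_edge = bool(np.isclose(regress.alpha_, alphas[0]) or np.isclose(regress.alpha_, alphas[-1]))
print("chosen alpha:", regress.alpha_, "at edge of grid:", at_edge)

# A wider grid to try in that case (the exponents are arbitrary choices).
wide_alphas = np.logspace(-3, 3, 100)
wide = RidgeCV(alphas=wide_alphas).fit(X, y)
print("chosen alpha on wider grid:", wide.alpha_)
```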

I'm pretty sure there is a better way to cut down the noise by consolidating similar attributes, such as using total square footage instead of each square-footage attribute individually, but I couldn't get that to work properly. Different logspace values might also help RidgeCV, but that didn't work out for me either.
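A minimal sketch of the consolidation idea, assuming the features live in a pandas DataFrame: sum the individual area columns into one combined column and swap it into the numeric variable list. The column name TotalSF, the choice of which areas to sum, and filling missing areas with 0 are all assumptions for illustration, and the numbers are made up.

```python
import pandas as pd

# Made-up rows with three of the square-footage columns from the data set.
df = pd.DataFrame({
    "TotalBsmtSF": [856.0, None, 920.0],
    "1stFlrSF": [856, 1262, 920],
    "2ndFlrSF": [854, 0, 866],
})

# Columns to consolidate (an assumed choice).
area_columns = ["TotalBsmtSF", "1stFlrSF", "2ndFlrSF"]

# Treat a missing area as 0 before summing, so one gap does not blank the total.
df["TotalSF"] = df[area_columns].fillna(0).sum(axis=1)

# Replace the individual columns with the combined one in the variable list.
numeric_variables = ["GrLivArea", "OverallQual", "TotalBsmtSF", "1stFlrSF", "2ndFlrSF"]
numeric_variables = [v for v in numeric_variables if v not in area_columns] + ["TotalSF"]

print(df["TotalSF"].tolist())
print(numeric_variables)
```

The new column has to be added to both the training and the test data before the variable lists are used to select columns, otherwise the selection fails on whichever frame lacks it.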

Final score: 0.13166469694390048

Lab01_MML.csv (19.8 KB)

User 016

numeric_variables = ['GrLivArea', 'OverallQual', 'YearBuilt', 'BedroomAbvGr', 'OverallCond', 'GarageCars', 'BsmtFullBath', 'Fireplaces', 'TotalBsmtSF', 'YearRemodAdd', 'FullBath', 'LotArea', 'WoodDeckSF', 'HalfBath', 'MasVnrArea']
nominal_variables = ['Neighborhood','MSSubClass', 'Street', 'Condition1', 'Functional', 'CentralAir', 'RoofStyle', 'Foundation', 'BldgType', 'Heating', 'MSZoning', 'HouseStyle', 'LotShape']
qual_variables = ['KitchenQual', 'ExterQual', 'HeatingQC', 'BsmtQual', 'GarageQual', 'BsmtExposure', 'BsmtFinType1', 'BsmtFinType2']
Maggie.csv (19.8 KB)
score is 0.13383794318864528
Definitely diminishing returns on just adding variables.

User 021

numeric_variables = ['GrLivArea', 'YearBuilt', 'OverallCond', 'GarageCars', 'Fireplaces', 'BsmtFullBath', 'TotalBsmtSF', 'TotRmsAbvGrd', 'KitchenAbvGr']

nominal_variables = ['Neighborhood']

qual_variables = ['KitchenQual']

target_variables = ['SalePrice']

My score was 0.15369876296498602

housing_predictions_demo (4).csv (19.8 KB)

User 004

numeric_variables = ['GrLivArea', 'OverallQual','YearBuilt','OverallCond','GarageCars','BsmtFullBath','Fireplaces',
'TotalBsmtSF', 'YearRemodAdd', 'TotRmsAbvGrd', 'FullBath', 'ScreenPorch', 'LotArea', 'WoodDeckSF', 'BsmtFinSF1']

nominal_variables = ['Neighborhood', 'BldgType', 'RoofMatl', 'HouseStyle']

qual_variables = ['KitchenQual']

target_variables = ['SalePrice']

housing_predictions_demo (1).csv (19.8 KB)

User 024

image

mark