Scenario
You have been hired by the D. M. Pan National Real Estate Company to develop a model to predict housing prices for homes sold in 2019. The CEO of D. M. Pan wants to use this information to help their real estate agents better determine the use of square footage as a benchmark for listing prices on homes. Your task is to provide a report predicting housing prices based on square footage. To complete this task, use the provided real estate data set for all U.S. home sales as well as the national descriptive statistics and graphs provided.
Describe the report: Give a brief description of the purpose of your report.
Define the question your report is trying to answer.
Explain when using linear regression is most appropriate.
When using linear regression, what would you expect the scatterplot to look like?
Explain the difference between predictor (x) and response (y) variables in a linear regression to justify the selection of variables.
Data Collection
Sampling the data: Select a random sample of 50 houses. Describe how you obtained your sample data (provide Excel formulas as appropriate).
Identify your predictor and response variables.
Scatterplot: Create a scatterplot of your predictor and response variables to ensure they are appropriate for developing a linear model.
Data Analysis
Histogram: Create a histogram for each of the two variables.
Summary statistics: For your two variables, create a table to show the mean, median, and standard deviation.
Interpret the graphs and statistics:
Based on your graphs and sample statistics, interpret the center, spread, shape, and any unusual characteristic (outliers, gaps, etc.) for house sales and square footage.
Compare and contrast the center, shape, spread, and any unusual characteristic for your sample of house sales with the national population (under Supporting Materials, see the National Summary Statistics and Graphs House Listing Price by Region PDF). Determine whether your sample is representative of national housing market sales.
Develop Your Regression Model
Scatterplot: Provide a scatterplot of the variables with a line of best fit and regression equation.
Based on your scatterplot, explain if a regression model is appropriate.
Discuss associations: Based on the scatterplot, discuss the association (direction, strength, form) in the context of your model.
Identify any possible outliers or influential points and discuss their effect on the correlation.
Discuss keeping or removing outlier data points and what impact your decision would have on your model.
Calculate r: Calculate the correlation coefficient (r).
Explain how the r value you calculated supports what you noticed in your scatterplot.
Determine the Line of Best Fit. Clearly define your variables. Find and interpret the regression equation. Assess the strength of the model.
Regression equation: Write the regression equation (i.e., line of best fit) and clearly define your variables.
Interpret regression equation: Interpret the slope and intercept in context. For example, answer the questions: what does the slope represent in this situation? What does the intercept represent? Revisit the Scenario above.
Strength of the equation: Provide and interpret R-squared.
Determine the strength of the linear regression equation you developed.

Sample Answer
Report Description
This report details the construction and interpretation of a linear regression model designed to predict the sale price of a home based on its square footage. The model will help D. M. Pan’s real estate agents benchmark listing prices, providing a data-driven approach to their sales strategies.
Question the Report is Trying to Answer
The central question this report aims to answer is: “How can square footage be used to predict the listing price of a home sold in 2019?”
When Linear Regression is Most Appropriate
Linear regression is most appropriate when there is a suspected linear relationship between two continuous variables. This means that as one variable (the predictor) increases or decreases, the other variable (the response) tends to increase or decrease at a relatively constant rate. It’s used to model the relationship between a dependent variable and one or more independent variables by fitting a linear equation to observed data.
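As a rough illustration of what "fitting a linear equation to observed data" means, the sketch below computes an ordinary least-squares line with plain Python. The function name fit_line and the five data points are hypothetical and chosen only so the arithmetic is easy to follow; they are not taken from the D. M. Pan data set.

```python
from statistics import mean


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical values for illustration only (not from the real data set).
square_feet = [1000, 1500, 2000, 2500, 3000]
price = [150000, 200000, 250000, 300000, 350000]

intercept, slope = fit_line(square_feet, price)
print(f"price = {intercept:.0f} + {slope:.0f} * square_feet")
```

With these made-up points the relationship is perfectly linear, so the fitted line passes through every point. Real housing data would scatter around the line instead.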
Expected Scatterplot for Linear Regression
When using linear regression, we would expect the scatterplot to show a roughly linear pattern of points. The points should cluster around a straight line that slopes upward (a positive association) or downward (a negative association). They should not follow a clear curve, fan out as x increases (heteroscedasticity), or split into distinct clusters, since any of these patterns suggests that a single straight line is a poor summary of the data.
Predictor (x) and Response (y) Variables in Linear Regression
In linear regression, the predictor variable (x) is the independent variable that is used to explain or predict changes in the response variable. The response variable (y) is the dependent variable, whose value is being predicted or explained.
In this scenario:
- Predictor Variable (x): Square Footage
- Justification: The CEO wants to use square footage as a benchmark for listing prices. It is a measurable characteristic of a home that is likely to influence its price. We are trying to predict price based on square footage.
- Response Variable (y): Listing Price (House Sales Price)
- Justification: This is the outcome we are trying to predict. The goal is to determine how square footage affects the price at which a home is sold.
Data Collection
Sampling the Data
To obtain a random sample of 50 houses from the provided real estate data set, I would use the following steps in Microsoft Excel:
- Assign a Random Number: In a new column (e.g., Column C) next to the existing data, enter the formula =RAND() in the first data row (e.g., C2).
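For readers working outside Excel, the same idea can be sketched in Python. This is only an illustration under stated assumptions: the source describes just the =RAND() step, so sorting by the random value and keeping the first 50 rows is assumed here as the usual way to finish this kind of sample. The function name sample_rows, the field names, and the generated listings are all hypothetical.

```python
import random


def sample_rows(rows, n=50, seed=None):
    """Draw n rows at random by attaching a random key to each row and sorting."""
    rng = random.Random(seed)
    # Mirrors a =RAND() helper column: one random number per row.
    keyed = [(rng.random(), row) for row in rows]
    # Assumed continuation: sort by the random key and keep the first n rows.
    keyed.sort(key=lambda pair: pair[0])
    return [row for _, row in keyed[:n]]


# Hypothetical stand-in for the real estate data set (values are made up).
listings = [
    {"id": i, "square_feet": 1000 + 10 * i, "price": 100000 + 1500 * i}
    for i in range(1000)
]

sample = sample_rows(listings, n=50, seed=2019)
print(len(sample))
```

Fixing the seed makes the sample reproducible, which Excel's RAND() does not do on its own because it recalculates whenever the sheet changes.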