Realty Sale Price Case Study

Tyler MacDougall
Professor Hovhannisyan
EC 220 Business Statistics
3/20/15
Case Study Part 2: Longwell Realty
Longwell Realty is a realtor based in Columbia, MD, which has provided data
showing the quantity and types of homes it has sold in the past three months. The company
is interested in learning how to appropriately price a house in order to gain a larger share of the
housing market. The data can be used to determine the price at which each type of house should
be sold in order to maximize profitability and, ultimately, increase market share. Multiple
factors influence the sale of houses in the realty business. Over the past three months Longwell
has collected data on variables such as housing type, number of beds, number of
rooms/bathrooms, square feet, parking type, year built, and last sale price. A statistical
analysis of the data provided by Longwell Realty Inc. can determine which variables affect how
a home should be priced.
In order to begin to understand how to appropriately price a house, we first have to get a
feel for what a typical house costs in Longwell’s market. According to the table in (1a), the
mean price of a house that Longwell Realty has sold in the past three months is $339,969.61.
Therefore, it can be estimated that a house Longwell sells will go, on average, for that amount.
Table (1b) organizes the sale price data to show how frequently houses sold in different ranges
of sale price. This distribution shows that the largest share (31%) of houses sold by Longwell
Realty sold at a price in the range of $200,000-$299,999. The statistics in table (1a) show
that the mean sale price is greater than the median sale price ($309,000), which indicates that
the data is skewed to the right. The estimate of the average house sale price, the most
frequent sale price range, and the skewness of the data are all inferential statistics, while
all of the other explicit findings that our graphs provide are descriptive statistics.
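The percent frequencies in table (1b) can be reproduced from the raw counts. The sketch below is only an illustration: Python is an assumption here (the original tables look like spreadsheet output), the variable names are hypothetical, and the counts are copied from (1b).

```python
# Frequency counts per sale price range, copied from table (1b).
price_ranges = [
    "100000-199999", "200000-299999", "300000-399999", "400000-499999",
    "500000-599999", "600000-699999", "700000-799999", "800000-899999",
]
frequencies = [33, 75, 68, 38, 17, 11, 2, 1]

# Total number of homes in the sample.
total_sales = sum(frequencies)

# Percent frequency = count in the range / total count * 100.
percent_frequency = {
    label: round(100 * count / total_sales, 2)
    for label, count in zip(price_ranges, frequencies)
}

# The most common range is the one with the highest percent frequency.
modal_range = max(percent_frequency, key=percent_frequency.get)

for label, pct in percent_frequency.items():
    print(f"{label}: {pct:.2f}%")
print("Most common range:", modal_range)
```

The most common range holds the largest share of sales but not more than half of them, which is why it is better described as the largest share than as a majority.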
The next step in analyzing the housing price data from Longwell Realty would be to
separate the sales into the different types of houses. A non-graphical way to see this can be found
in table (2a) which is a pivot table showing the frequency of sales within a specified range for the
sale of condos, single family residences, and townhouses. Another way to view this data would
be graphically. Table (2b) organizes the house type data into a bar chart which shows the number
of each type of home sold within each sale price range. Judging by the bar chart, we could say
that the majority of homes sold by Longwell are priced between $200,000 and $499,999. Of the
houses sold within that range, Longwell found more success with the sale of Single Family
Residences and Townhomes (about 96%) than with the sale of condos (about 4%). The two most significant
facts that come from this data set are that Longwell Realty sales drop off when houses are priced
above $499,999, and that condos make up a very small portion of Longwell sales.
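The two facts above can be checked directly against the counts in pivot table (2a). The following is a minimal sketch, assuming that blank pivot cells are zeros and that the counts line up with the price ranges so that every row adds up to the frequency in (1b); the list names are hypothetical.

```python
# Sales counts by home type for the eight price ranges in (2a),
# ordered from 100000-199999 up to 800000-899999.
# Assumption: blank cells in the pivot table are treated as 0.
condo = [16, 3, 4, 1, 0, 0, 0, 0]
single_family = [3, 19, 33, 31, 12, 11, 2, 1]
townhouse = [14, 53, 31, 6, 5, 0, 0, 0]

# Row totals should match the Frequency column in (1b).
row_totals = [c + s + t for c, s, t in zip(condo, single_family, townhouse)]
total_sales = sum(row_totals)

# Sales priced between $200,000 and $499,999 (ranges 2 through 4).
mid_range_sales = sum(row_totals[1:4])
# Sales priced above $499,999 (ranges 5 through 8).
above_range_sales = sum(row_totals[4:])

mid_range_share = mid_range_sales / total_sales
above_range_share = above_range_sales / total_sales
condo_share = sum(condo) / total_sales

print(f"Sold at $200,000-$499,999: {mid_range_share:.1%}")
print(f"Sold above $499,999: {above_range_share:.1%}")
print(f"Condos as a share of all sales: {condo_share:.1%}")
```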
Another useful way to analyze the data from Longwell Realty is to find which variables
have the greatest effect on sale price. For example, a regression of sale price on square
footage gives an R-squared of 0.6298, which corresponds to a positive correlation (Multiple R)
of 0.7936 (3a). Under the rule of thumb that treats correlations between -0.7 and 0.7 as weak,
this correlation sits only a little above the cutoff, so it is moderately strong at best. Even
so, it gives us valuable information about what drives sale price. As we see in the Square
Footage Regression table (3a), square footage actually has the strongest relationship with sale
price because its R-squared is the closest to 1.
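Table (3a) reports both an R Square of 0.629756 and a Multiple R of 0.793572. In a regression with a single independent variable these two numbers are linked: the correlation is the square root of R-squared, with the sign of the slope. A minimal sketch of that check, with hypothetical variable names and values copied from (3a):

```python
import math

# Values copied from the Regression Statistics block of table (3a).
r_squared = 0.629756
multiple_r = 0.793572
slope = 130.5796  # SQFT coefficient; positive, so the correlation is positive

# With one independent variable, |correlation| = sqrt(R-squared).
correlation = math.copysign(math.sqrt(r_squared), slope)

print(f"Correlation from R-squared: {correlation:.4f}")
print(f"Multiple R reported in (3a): {multiple_r:.4f}")
```

The distinction matters for any rule of thumb stated in terms of correlation, because such a cutoff applies to Multiple R rather than to R-squared.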
An easier way to interpret the relationship between sale price and square footage is
graphically, with a scatter plot. (4a) is a scatter plot of sale price against square footage.
From the statistics in the square footage regression table (3a) we can write the estimated
regression equation, which can be found in (4b). The equation has two coefficients. The first
is the slope on the independent variable, 130.579(SQFT), and the second is the Y-intercept of
82,172.97. The slope shows that each additional square foot is associated with an increase of
$130.58 in sale price, while the $82,172.97 represents the theoretical price of a house with
zero square feet. The R-squared of the square footage regression shows that 62.98% of the
variation in sale price is explained by the estimated regression model. If we plug the value
from the first observation in the Longwell Realty data, 929 sqft, into our estimated regression
equation, we find an estimated sale price of $203,480.86 (4c). Based on this estimate, it would
be reasonable to price the 929 sqft house at $142,000 because the difference of $61,480.86 is
less than the standard error of $83,375.47.
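The estimate for the 929 sqft house can be reproduced step by step. This is a minimal sketch rather than the original spreadsheet calculation: the function and variable names are hypothetical, the intercept and standard error are taken from (3a), and the slope is rounded to 130.579 as it is written in (4b).

```python
# Coefficients of the simple regression of sale price on square footage.
INTERCEPT = 82172.97       # Intercept coefficient from (3a)
SLOPE = 130.579            # SQFT coefficient as written in (4b)
STANDARD_ERROR = 83375.47  # Standard Error of the regression from (3a)


def predict_price(sqft: float) -> float:
    """Estimated sale price for a house with the given square footage."""
    return INTERCEPT + SLOPE * sqft


# First observation in the data set, as described in the text.
estimate = predict_price(929)
comparison_price = 142000

# How far the comparison price sits from the regression estimate.
gap = estimate - comparison_price
within_one_standard_error = abs(gap) < STANDARD_ERROR

print(f"Estimated sale price: ${estimate:,.2f}")
print(f"Gap to ${comparison_price:,}: ${gap:,.2f}")
print("Within one standard error:", within_one_standard_error)
```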
Out of all of the variables in the Longwell Realty data set, the dependent variable is last
sale price because that value is affected by all of the other variables. The independent
variables in this set are everything else: home type, number of beds, number of baths, square
footage, parking type, and year built. Since I expected all of these independent variables to
increase the value of the house, I estimated that there would be a positive relationship between
each independent variable and last sale price. By performing a multiple regression on all of the
variables, we can find the relationship between each independent variable and sale price (5b).
Since home type and parking type are both qualitative variables, we had to use dummy variables
to include them in the regression. After doing this I discovered that my expectation for the
independent variables was wrong, since both the reserved and garage parking types have negative
coefficients (5b). The coefficient of square footage in the multiple regression equation, 55.98,
differs from the 130.58 found in the previous regression analysis. This is because the multiple
regression includes more independent variables, which makes the equation more accurate: the
square footage coefficient now measures the effect of size with the other variables held
constant. For the type of home variable we used two dummy variables, one for townhouse and one
for single family residence, which leaves condo as the baseline type. The coefficient of
townhouse was 26,960, while the coefficient of single family residence was 121,779. This means
that, compared with a condo with the same other characteristics, a townhouse sells for $26,960
more and a single family residence sells for $121,779 more. This makes sense theoretically
because the cost of a single family home would be significantly higher than that of a townhouse.
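The dummy variable coding for home type can be sketched as follows. This is an illustration under stated assumptions, not the original spreadsheet setup: the function name is hypothetical, the category labels are taken from the legend of (2b), the dummy names follow (5b), and condo is assumed to be the omitted baseline because only two dummies are used for three home types.

```python
def encode_home_type(home_type: str) -> dict:
    """Turn a home type label into 0/1 dummy variables.

    Condo is the baseline category, so it is coded as all zeros and the
    coefficients on the two dummies are read as differences from a condo.
    """
    dummies = {"SingleResidential": 0, "Townhouse": 0}
    if home_type == "Single Family Residential":
        dummies["SingleResidential"] = 1
    elif home_type == "Townhouse":
        dummies["Townhouse"] = 1
    elif home_type != "Condo":
        raise ValueError(f"Unknown home type: {home_type}")
    return dummies


for label in ["Condo", "Single Family Residential", "Townhouse"]:
    print(label, encode_home_type(label))
```

Leaving one category out is what makes each dummy coefficient interpretable as a price difference relative to that category; including a dummy for every type alongside an intercept would make the columns perfectly collinear.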
After conducting a statistical regression analysis on all of the variables in Longwell
Realty’s sample set, we are able to find a useful model that helps us understand how to
appropriately price a house. It would be more useful to have a sample that spans at least
one year, because the housing market fluctuates over time and a sample that reflects those
fluctuations would be most useful. However, the three-month sample still provides us
with helpful information to predict the most profitable sale price for Longwell Realty.
Appendix:
(1a)
Descriptive Statistics: Sale Price
Mean                  339969.6
Standard Error        8736.143
Median                309000
Mode                  245000
Standard Deviation    136742.3
Sample Variance       1.87E+10
Kurtosis              0.972903
Skewness              0.93217
Range                 732000
Minimum               118000
Maximum               850000
Sum                   83292556
Count                 245
(1b)
Last Sale Price    Frequency    Relative Frequency    Percent Frequency
100000-199999      33           0.13                  13.47%
200000-299999      75           0.31                  30.61%
300000-399999      68           0.28                  27.76%
400000-499999      38           0.16                  15.51%
500000-599999      17           0.07                  6.94%
600000-699999      11           0.04                  4.49%
700000-799999      2            0.01                  0.82%
800000-899999      1            0.00                  0.41%
Total              245          1                     100.00%
(2a)
Count of LAST SALE PRICE (Row Labels: price range; Column Labels: home type)
Row Labels       Condo    Single Family Residential    Townhouse    Grand Total
100000-199999    16       3                            14           33
200000-299999    3        19                           53           75
300000-399999    4        33                           31           68
400000-499999    1        31                           6            38
500000-599999             12                           5            17
600000-699999             11                                        11
700000-799999             2                                         2
800000-899999             1                                         1
Grand Total      24       112                          109          245
(2b)
[Bar chart: Frequency of House Type. Number of homes sold in each sale price range, by type (Condo, Single Family Residential, Townhouse); vertical axis from 0 to 60.]
(3a)
Regression Statistics
Multiple R           0.793572
R Square             0.629756
Adjusted R Square    0.628233
Standard Error       83375.47
Observations         245

ANOVA
              df     SS          MS          F           Significance F
Regression    1      2.87E+12    2.87E+12    413.3248    2.39E-54
Residual      243    1.69E+12    6.95E+09
Total         244    4.56E+12

              Coefficients    Standard Error    t Stat      P-value     Lower 95%    Upper 95%
Intercept     82172.97        13753.72          5.974599    8.16E-09    55081.24     109264.7
SQFT          130.5796        6.422877          20.33039    2.39E-54    117.928      143.2312

Estimate for the first observation (929 sqft), as used in (4c)
SQFT coefficient * 929    121307.9
Intercept                 82172.97
Estimated sale price      203480.9
Price discussed in text   142000
Difference                61480.86
(4a)
[Scatter plot: LAST SALE PRICE. Sale Price (0 to 900,000) on the vertical axis against Square Feet (0 to 6,000) on the horizontal axis.]
(4b)
Last Sale Price = 130.579*(Square Feet) + $82,172.97
(4c)
130.579*(929 sqft) + $82,172.97 = $203,480.86
(5b)
Last Sale Price = 3838.81*(BEDS) + 38599.40*(BATHS) + 55.98*(SQFT) +
3352.62*(YEARBUILT) + 121,799.17*(SingleResidential) + 26960.33*(Townhouse) -
36,179.96*(Reserved) - 1,629.90*(Garage)