Copyright American Real Estate Society 2007
Abstract
The housing literature has traditionally employed hedonic price models to investigate the impact of house and neighborhood characteristics on housing prices. These models, however, are not necessarily equipped to take into account the cross-classified, hierarchical nature of housing markets. This paper employs a hierarchical linear model (HLM) to examine the impact that housing characteristics, neighborhood affluence, and school-achievement scores have on housing prices in a cross-classified setting within a single municipality. More specifically, this paper analyzes the impact that differences in affluence across neighborhoods and school-achievement scores across school zones have on the valuation of certain individual housing characteristics in particular and, through them, on housing prices in general.
Researchers have traditionally used hedonic price models to investigate the impact of structural and neighborhood variables on housing prices. [For example, see Sirmans, Macpherson, and Zietz (2005) for an overview of more than 100 empirical hedonic studies published since the early 1990s. Some earlier, well-known studies include Krantz, Weaver, and Alter (1982), Goodman (1983), and Kohlhase (1991), among others.] Hedonic price models, however, fail to address properly a certain peculiarity of housing markets: the fact that a housing market may constitute not simply a hierarchical environment but a cross-classified, hierarchical environment. This paper employs a hierarchical linear model (HLM) to examine and test for the impact on housing prices of the interrelationships between the various hierarchies in a cross-classified spatial setting within a single municipality.
Dwellings are located within blocks, block groups, school zones, school districts, census tracts, towns, etc. Whether administratively defined or not, such "groupings" form hierarchies (sometimes called "levels" or "nestings"). Complicating matters even more, these hierarchies may not always be "purely nested" hierarchies because dwellings do not necessarily and always "map" into higher levels that are mutually exclusive. For instance, dwellings in the same census block group may be located within the boundaries of several different school zones. By the same token, a school zone may encompass dwellings located in several different census block groups.1 Such spatial cross-classifications are more the norm than the exception in housing markets, and traditional hedonic price models will not capture this cross-classification.
The dwellings within a neighborhood share the demographic, economic, environmental, and social attributes specific to that neighborhood, and are also more likely to have similar structural features.2 Thus, because neighborhoods represent a level in the hierarchy, dwellings grouped within a hierarchy are also likely to be more similar to each other than to those sampled from the population of all dwellings. Therefore, hierarchical data violate the assumption of independence of observations. This means that the application of traditional methods of estimation (such as ordinary least squares, or OLS) to hierarchical environments would yield statistically unreliable results since the violation of the assumption increases the risk of committing a Type-I error (e.g., Garner and Raudenbush, 1991; Osborne, 2000; and Raudenbush and Bryk, 2002). One advantage of HLM is that it explicitly corrects for this problem in hierarchical environments.
Hedonic price models have traditionally accounted for the hierarchical nature of housing markets by assigning the attributes for a neighborhood (the "higher" hierarchy) to all the individual dwellings (the "lower" hierarchy) located in that neighborhood. That is, they express the individual dwelling prices (the outcome variable) as a function of the neighborhood-specific attributes, as well as the structural features of that dwelling. In this case, the aforementioned problem concerning the statistical reliability of OLS results still holds. In addition, this approach implicitly assumes that all of the individual dwellings in a neighborhood are affected in the same way and to the same extent by the neighborhood attributes. This is rather unrealistic because it does not allow for proper interaction among the attributes from different hierarchies, and it does not permit the researcher to disentangle the impacts of dwelling-specific and neighborhood-specific variables on individual housing values either. The problem is compounded when the dwellings are spatially cross-classified (e.g., Osborne, 2000; and Raudenbush and Bryk, 2002). Therefore, another advantage of HLM is that it allows for attribute-specific variable interaction across hierarchies.
HLM is an appropriate modeling approach when one has a cross-classified hierarchical environment such as a housing market because as Raudenbush and Bryk (2002, pp. 6-7) state, "With hierarchical linear models, each of the levels in the (hierarchical) structure is formally represented by its own submodel. These submodels express relationships among variables within a given level, and specify how variables at one level influence relations occurring at another," and also account for interactions across levels.3 HLM decomposes the total variance in dwelling prices between hierarchies and allows a researcher to identify which levels in the hierarchy are responsible for how much of the variation in dwelling prices, even when the dwellings are cross-classified. This allows a researcher to account for both individual and joint impacts across hierarchies of variables on housing prices. Obviously, the larger the geographical extent of a housing market is and, consequently, the greater the number and types of neighborhoods are, the more significant such relationships and impacts become.
HLM can perhaps best be explained in the context of "economic rent." Hedonic price models have already well established that, given everything else, dwellings with identical features are likely to be viewed differently by households and command different prices if located in different neighborhoods or school "areas," due to neighborhood- or school-specific factors. The same literature has also found that, for example, given everything else, larger dwellings are likely to command higher prices than smaller dwellings. What the traditional hedonic price models are not necessarily equipped to uncover, though, is the possible locational rent accruing to differences in specific structural features of dwellings across neighborhoods. Using HLM, on the other hand, a researcher can uncover not only whether but, more importantly, to what extent a more affluent neighborhood directly "confers" on a dwelling a higher price for, say, being larger or having more bathrooms than a less affluent neighborhood does. Kahane (2001, pp. 629, 631) provides a succinct discussion of this particular aspect in the context of player salaries in sports.
In summary, HLM is a multilevel modeling approach that partitions the variance in dwelling prices between levels of the hierarchy while accounting for cross-classification as well. As such, it enables a researcher to address a more varied, and even richer, set of questions about housing markets than traditional hedonic price models. At the same time, it helps a researcher to avoid possible structural misspecification when the housing environment is cross-classified.
Brasington (1999), building on the model developed in Haurin and Brasington (1996), extends the traditional hedonic price model by correcting for spatial autocorrelation arising from the influence that neighboring houses have on each others' prices (e.g., Anselin, 1988). In a more recent article published in this journal, Lipscomb (2006, p. 143) proposes a multi-stage estimation technique designed to check and correct for spatial effects "in the hedonic coefficients beyond those captured in the (traditional) hedonic regression itself." While these approaches to estimating hedonic models improve regression estimates, these studies do not specifically focus on cross-classified housing markets. Since HLM allows a researcher to deal with both the hierarchical nature of housing markets and the potential for cross-classification, this paper estimates an HLM model rather than a hedonic price model with correction for spatial autocorrelation.
The widest application of HLM to date has been in the education field (e.g., Garner and Raudenbush, 1991; Raudenbush, 1993; Osborne, 2000; and Raudenbush and Bryk, 2002). Recently it has also been used in the sports economics literature (Kahane, 2001; and Brown and Jepsen, 2004). The most relevant and direct applications of HLM to the determination of dwelling prices that we are aware of are Brown and Uyar (2004) and Goodman and Thibodeau (1998). The Brown and Uyar paper is a very basic pedagogical study whose objective is to introduce HLM to the housing literature. As such, its scope is quite limited in that the model only has one structural (level-1) and one neighborhood (level-2) variable and, more importantly, it assumes a pure hierarchy. The Goodman and Thibodeau paper applies HLM to single-family dwellings within the Dallas metropolitan area. Their HLM model is also a simple one with only two level-1 variables and one level-2 variable. Their contribution to the housing literature is stated in their objective. They use their model to demonstrate how HLM can be used to identify the way market forces and preferences may actually segment housing markets (or markets for other types of properties, such as offices, retail outlets, industrial establishments, etc.) into mutually exclusive environments, independently of any administratively or politically imposed divisions or boundaries. The level-2 variable (the "spatial dimension") they use to identify the way their metropolitan housing market is segmented is "the quality of public education (as measured by student performance on standardized tests)." (Ibid, p. 121.)
In this paper, we use HLM to model housing prices as a function of house-specific and neighborhood-specific variables in a cross-classified housing market. Specifically, we analyze the impact that differences in affluence across neighborhoods and school-achievement scores across school zones have on the valuation of certain individual housing characteristics in particular and, through them, on housing prices in general. Thus, we are using HLM to investigate the impact that the interrelationships between the various hierarchies have on housing prices in a cross-classified setting.
Study Design, Variables, and Data
The study is based on 710 dwellings sold in a mid-size city in the Midwest. The city is approximately 75 miles away from the nearest urban center that is comparable in size, in the type of employment and shopping opportunities provided, and in the kinds of entertainment and cultural amenities offered. As a result, the city can be treated as a "location node," which encompasses and exhausts all the hierarchies relevant for this study. The fact that we have access to a rich database for such a city is a definite advantage for this study. If the city had instead been a part of an "urban-corridor," failure to account for that as a "hierarchy" in the model would have constituted a specification error.
The 1980 Census reports indicate that during the period of study (1984-1985), the city had approximately 45,000 residents in 13 census tracts divided into 43 census block groups (CBGs).4 In most such studies, neighborhoods are defined to be census tracts (or even larger designations). In this study, the neighborhoods are defined as CBGs. As stated earlier, they are smaller and, therefore, are likely to be more homogeneous than census tracts. This permits greater inter-neighborhood variation in the housing market than would have been possible if census tracts had been used instead. The entire city constitutes a single school district, so public spending per student is the same throughout the city. There are, however, ten school "zones," and the children have to attend the school within whose boundaries they reside.
Exhibit 1 shows how the sample of 710 dwellings is cross-classified into 43 CBGs (neighborhoods) and 10 school zones. Looking at the exhibit by "row," of the 13 dwellings in CBG 22, five are in school zone 2 and eight are in school zone 7. Looking at the exhibit by "column," 67 of the dwellings are located in school zone 2, mapped across five CBGs.5
Exhibit 2 shows some basic descriptive statistics for these 710 dwellings. The average dwelling sold for $56,100, had 35 rooms, and had slightly more than 1,250 square feet of heated floor space. It had approximately 63 feet of road frontage, was almost 37 years old, and was located 0.44 miles away from the school within whose (administrative) boundary it was located. Also, of the 710 dwellings, 52% were single-story buildings, 15% had finished basements, 11% had two full bathrooms or more, 17% had sunrooms, 33% had a fireplace, and 98% were on a paved street with a curb and gutter. The variable "bank owned" shows that 5% of the dwellings were in fact in the possession of the lending agency (in foreclosure) when they were bought.
According to the city maps prepared for insurance purposes, 18% of the dwellings were in the 100-year flood zone. However, the fact that virtually all such dwellings were within approximately 150 yards of the river also meant that they enjoyed the riparian opportunities offered by their location, and a number of these dwellings had unobstructed views of the river.
Exhibit 1. Dwellings Cross-Classified by Neighborhoods (Census Block Groups) and School Zones
Exhibit 2. Basic Descriptive Statistics
The city has three employment-shopping-entertainment centers. The variable "access" is a "gravity-type" index. It is computed as the weighted sum of the distances between a dwelling and each of the three centers. The weights are the ratios of each center's taxable sales to the total taxable sales for the three centers combined. The way the index is constructed, and with all else the same, there is an inverse relationship between the value of the index and distance to the centers.6
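The construction of a gravity-type index of this kind can be sketched in a few lines. The sketch below is illustrative only: the function name `access_index`, the distances, and the sales figures are hypothetical, and it assumes that each distance enters as its reciprocal, which is one simple way to obtain the stated inverse relationship between the index and distance. The paper's actual functional form is given in its own note and may differ.

```python
def access_index(distances, taxable_sales):
    """Gravity-type accessibility index for one dwelling (illustrative).

    distances     -- distance from the dwelling to each center
    taxable_sales -- taxable sales of each center, in the same order
    """
    total_sales = sum(taxable_sales)
    # Weights are each center's share of the combined taxable sales.
    weights = [s / total_sales for s in taxable_sales]
    # ASSUMPTION: distance enters as 1/d so that the index falls as
    # distance rises; this is not confirmed by the text above.
    return sum(w / d for w, d in zip(weights, distances))


# Hypothetical sales for three centers and two hypothetical dwellings.
sales = [50.0, 30.0, 20.0]
near = access_index([0.5, 1.0, 2.0], sales)
far = access_index([1.0, 2.0, 4.0], sales)
```

Under this assumption the dwelling that is twice as far from every center receives half the index value, so a higher index means better access.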
The two level-2 variables whose impact on dwelling-specific features (and, therefore, on housing prices) are of interest are shown in the neighborhood and school zone sections of Exhibit 2. The variable "affluence" is an index based on the same idea as the "deprivation index," which is used quite extensively in health and education research. [See, for example, SUNY Downstate Medical Center (2004) and Garner and Raudenbush (1991) for details.] Following the literature, the "affluence" index for each of the neighborhoods (CBGs) was derived as the sum of the standardized z-scores for six socioeconomic variables for that neighborhood. The six variables are: the percentage of dwellings in a CBG that are owner-occupied, the percentage of the residents who are white, the percentage of the residents who are above the poverty line, the percentage of the residents who are age 25 or older with more than high school education, the median income in a CBG, and the median house value. Thus, higher positive values for the index indicate greater neighborhood affluence. In addition to being a commonly agreed upon measure of affluence in general, the index enables us to address two potential problems. One is the problem of collinearity we would have encountered if we had used all six, or even only some, of the component variables in the model instead. The other is that the index allows us to save degrees of freedom which, as will be explained below, can be a very important constraint in HLM.
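The derivation of the index as a sum of standardized scores can be illustrated as follows. This is a minimal sketch, not the authors' code: the function names and the three rows of data are hypothetical, and the use of the population standard deviation is an assumption (the text does not say which estimator was used).

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize one variable across neighborhoods."""
    m = mean(values)
    s = pstdev(values)  # ASSUMPTION: population standard deviation
    return [(v - m) / s for v in values]


def affluence_index(rows):
    """Sum of z-scores across component variables, one value per row.

    rows -- one list per neighborhood, each holding the same
            component variables in the same order
    """
    columns = list(zip(*rows))
    z_columns = [zscores(col) for col in columns]
    return [sum(z) for z in zip(*z_columns)]


# Hypothetical data for three neighborhoods. Column order: % owner-occupied,
# % white, % above poverty line, % aged 25+ with more than high school,
# median income, median house value.
neighborhoods = [
    [45.0, 80.0, 82.0, 30.0, 15000.0, 38000.0],
    [60.0, 90.0, 90.0, 45.0, 21000.0, 52000.0],
    [75.0, 95.0, 97.0, 60.0, 30000.0, 70000.0],
]
index = affluence_index(neighborhoods)
```

Because every component is standardized before summing, the index values sum to zero across neighborhoods, and positive values mark neighborhoods that are above average on the components taken together.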
Finally, the variable "score" for each school is the three-year average of the median reading and math scores on the standardized, national test known as the 6th grade Iowa Test of Basic Skills. It represents the "quality" of education. As Rosen and Fullerton (1978, p. 438) state, the issue here is not "whether or not (such scores) reflect school quality . . . The question is whether or not they represent perceived quality of education better than" other variables.7
Model8
In HLM, each of the hierarchies is represented by its own submodel. These submodels can be used to test for the statistical significance of various cross-hierarchical interactions. As such, HLM can take many forms depending on a number of factors, including the number of hierarchies in the data, what the researcher is interested in testing, whether the submodels have explanatory variables, and whether those submodels account for random effects.
Unconditional Model
The simplest version of the cross-classified HLM model is the "unconditional" model; that is, one with no explanatory variables at any level. The unconditional model enables estimation of the extent to which dwelling prices vary between school zones, between neighborhoods, and within neighborhood-school zone cells (i.e., between dwellings). The results from this model guide the specification of more sophisticated versions of HLM.
It accounts for the correlation between the prices of homes that are located both in the same neighborhood and in the same school zone.
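For orientation, the restricted unconditional model (with the cell-level interaction effect d^sub 0jk^ set to zero) can be written in the standard two-way cross-classified form found in Raudenbush and Bryk (2002). The block below is a sketch of that textbook form using the symbols that appear in this paper; the grand-mean symbol \theta_0 and the layout are assumptions and may not match the paper's numbered equations exactly. Here i indexes dwellings, j neighborhoods, and k school zones.

```latex
\begin{align*}
Y_{ijk} &= \pi_{0jk} + e_{ijk}, \qquad e_{ijk} \sim N(0, \sigma^2) \\
\pi_{0jk} &= \theta_0 + b_{00j} + c_{00k}, \qquad
  b_{00j} \sim N(0, \tau_{b00}), \quad c_{00k} \sim N(0, \tau_{c00}) \\
\mathrm{INC} &= \frac{\tau_{b00}}{\tau_{b00} + \tau_{c00} + \sigma^2} \\
\mathrm{ISC} &= \frac{\tau_{c00}}{\tau_{b00} + \tau_{c00} + \sigma^2} \\
\mathrm{ICC} &= \frac{\tau_{b00} + \tau_{c00}}{\tau_{b00} + \tau_{c00} + \sigma^2}
\end{align*}
```

INC is the correlation between prices of dwellings that share a neighborhood but not a school zone, ISC is the corresponding correlation for dwellings that share only a school zone, and ICC applies to dwellings that share both.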
Conditional Model
The aforementioned correlations from the unconditional model indicate to what extent the neighborhood attributes, the school-zone attributes, and the dwelling-specific features each account for the deviation in dwelling prices. This helps guide the specification of the conditional model at each level. The conditional model enables explanation of the impact that the higher level explanatory variables have on dwelling prices through the intercept and slope coefficients from level-1. It can be difficult to explain the empirical results of HLM when the explanatory variables cannot meaningfully have a value of zero. For that reason, some of the explanatory variables, particularly the nonbinary ones, are centered at their grand means.
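Grand-mean centering simply subtracts the overall sample mean from each observation, so that a value of zero corresponds to the average dwelling. A minimal sketch, with a hypothetical function name and hypothetical ages:

```python
from statistics import mean


def grand_mean_center(values):
    """Subtract the overall sample mean from every observation."""
    grand_mean = mean(values)
    return [v - grand_mean for v in values]


# Hypothetical dwelling ages in years (grand mean = 37).
ages = [10.0, 25.0, 40.0, 73.0]
centered_ages = grand_mean_center(ages)
```

After centering, the model intercept can be read as the expected outcome for a dwelling of average age rather than for a dwelling of age zero.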
Level-2 ["Between-Cell"] Conditional Model
The intercept and slope coefficients (π^sub 0jk^ and π^sub 1jk^) from level-1 become the "response" variables in level-2, and are expressed as functions of neighborhood and school-zone characteristics. Thus, in level-2, we are explaining how the intercept and slope coefficients from level-1 vary across the neighborhood-school zone cells (estimating, in essence, separate regressions). We are essentially tracing through these coefficients (π^sub 0jk^ and π^sub 1jk^) how the impact of a level-1 variable (specific to the structure itself) on the dwelling price may in fact be attenuated or accentuated by the neighborhood and/or school-zone attributes (level-2 variables).
For p = 0, Equation (8) shows how average dwelling prices across neighborhoods and/or school zones may be affected by the differences in the neighborhood (W^sub j^) and school-zone (X^sub k^) attributes. By the same token, for p = 1, Equation (8) shows the impact the neighborhood and/or school-zone variables may have on the households' valuation of certain dwelling-specific features and, therefore, on the dwelling prices, across different neighborhoods and school zones.
Note that Equation (9) is the outcome of a very simple specification, only presented to illustrate the level-by-level evolution of a conditional model in cross-classified HLM. It only has two levels. Each dwelling is only mapped into two "locations," a neighborhood and a school zone. Finally, it has a total of four predictors, including the term (W*X). Yet, Equation (9) already has a combined total of 19 coefficients and residuals that need to be estimated. Obviously, as more predictors are added at different levels of the model, as the number of levels (hierarchies) increases, and as dwellings are cross-classified by more than two location identifiers, the equation becomes increasingly complex; the number of coefficients and residuals to be estimated increases drastically. Whether one can obtain statistically reliable empirical results would then depend on the number of observations available at each level and the number of observations in each cell. Therefore, unless one has a sufficient number of observations in each cell and at each level, one has to impose a priori restrictions on the model. As mentioned before, one such restriction is setting the random neighborhood-by-school-zone interaction effects (d^sub 0jk^) equal to zero. Another is to determine which predictors at which level of the conditional model are most likely to have "fixed" rather than random effects. The conditional model estimated below is specified with these considerations in mind.
Empirical Results
Unconditional Model
The empirical results for the "restricted" version (i.e., with d^sub 0jk^ = 0) of the unconditional model (Equation 3) are shown in Exhibit 3. The total variation in dwelling prices is decomposed into three components: τ^sub b00^ = 0.048, τ^sub c00^ = 0.062, and σ^sup 2^ = 0.103. As stated earlier, these components enable computation of three correlations. Using Equations (4), (5), and (6), we obtain INC = 22.5%, ISC = 29.1%, and ICC = 51.6%, respectively. These correlations indicate that, ceteris paribus, 22.5% of the total variation in the dwelling prices is estimated to be between neighborhoods (CBGs) and, again with all else the same, 29.1% of the total variation is estimated to be between school zones. It can, therefore, be inferred from these that, with all else the same, approximately 48% of the total variation in dwelling prices is associated with differences in the dwelling-features alone.11
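The arithmetic behind these three correlations can be checked directly from the reported variance components. The sketch below assumes the restricted model (no cell-level interaction variance), so that the total variance is the sum of the three components; the function name and dictionary keys are illustrative.

```python
def variance_shares(tau_b00, tau_c00, sigma2):
    """Shares of total price variance under the restricted model."""
    total = tau_b00 + tau_c00 + sigma2
    return {
        "INC": tau_b00 / total,               # between neighborhoods
        "ISC": tau_c00 / total,               # between school zones
        "ICC": (tau_b00 + tau_c00) / total,   # same neighborhood and zone
        "within": sigma2 / total,             # between dwellings in a cell
    }


# Variance components reported for the unconditional model.
shares = variance_shares(tau_b00=0.048, tau_c00=0.062, sigma2=0.103)
percentages = {name: round(100 * value, 1) for name, value in shares.items()}
```

With the reported components, the rounded percentages are 22.5, 29.1, and 51.6 for INC, ISC, and ICC, leaving roughly 48% of the variation within neighborhood-school zone cells.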
Conditional Model
Exhibit 3. Results for the Unconditional Model
The empirical results from this full model are presented in Exhibit 4. Comparing the aggregate statistics for the conditional HLM model in Exhibit 4 with those for the unconditional model in Exhibit 3 shows how much of the variation in dwelling prices is actually explained by the conditional model. The estimated within-cell variance in dwelling prices (σ^sup 2^) has declined from 0.103 to 0.055. This indicates that, given all else (that is, after having accounted for the impact of the differences in neighborhood affluence and school scores), the level-1 variables included in Equation (10) account for almost 47% of the remaining differences in dwelling prices in the same neighborhood and school zone. The estimated variance (τ^sub b00^) in the dwelling prices across neighborhoods has dropped from 0.048 to 0.003. Thus, after accounting for the impact of the differences in dwelling features and school scores, the variable affluence accounts for almost 94% of the remaining variation in average dwelling prices across neighborhoods. Finally, the estimated variance (τ^sub c00^) in dwelling prices across school zones has declined from 0.062 to 0.592E-06. This shows that after accounting for the impact of dwelling features and neighborhood affluence, the score variable accounts for virtually all of the remaining differences in average dwelling prices across school zones. Thus, score appears to capture the (real or perceived) differences in school "quality" in the city quite well. This is not that surprising if one considers that this is a relatively homogeneous city. It consists of a single school district, which means that the public expenditure per pupil is essentially the same for all schools. Therefore, the differences in achievement scores may indeed indicate the differences in the quality of education in different schools, at least as perceived by home buyers. Finally, we compare the "deviance" statistics from the two tables.
We observe that the deviance has declined by 522.458 (= 505.459 - (-16.999)). This difference has a chi-square distribution with 20 degrees of freedom (d.f. = 24 - 4) and is statistically significant (p-value ≈ 0). This indicates the "explanatory power" of the conditional model compared to the unconditional. In other words, the level-1 and level-2 variables included in the conditional model indeed account for a significant portion of the differences in prices across dwellings in the same neighborhood and school zone and also in different neighborhoods and school zones.
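The model-comparison arithmetic above can be reproduced with a short sketch. The function names are hypothetical. The tail probability uses the closed-form expression for the chi-square distribution, which holds only when the degrees of freedom are even (as they are here); a statistics library would normally be used instead.

```python
import math


def proportion_explained(var_unconditional, var_conditional):
    """Proportional reduction in a variance component between models."""
    return (var_unconditional - var_conditional) / var_unconditional


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    # exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


within_cell = proportion_explained(0.103, 0.055)           # level-1 variables
between_neighborhoods = proportion_explained(0.048, 0.003)  # affluence
between_zones = proportion_explained(0.062, 0.592e-06)      # score

deviance_drop = 505.459 - (-16.999)
df = 24 - 4
p_value = chi2_sf_even_df(deviance_drop, df)
```

The three proportions come out at about 0.47, 0.94, and very nearly 1, and the tail probability for a drop of 522.458 on 20 degrees of freedom is far too small to distinguish from zero at any conventional precision.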
Exhibit 4. Results for the Conditional Model
Next, we turn to the results for the individual variables as reported in the upper part of Exhibit 4. As expected, the more affluent a neighborhood is, the higher is the average house price. The variables rooms, averageroomsize, and baths relate to dwelling size. The variable rooms has a positive and significant coefficient indicating that, with all else the same, when the number of rooms increases by one, the dwelling price is expected to also increase by 24.2%.13 The coefficient for averageroomsize implies that when the average size of a room increases by one square foot, the dwelling price is expected to increase by 0.1% (when evaluated at the actual average dwelling price of $56,100 in the municipality, as seen from Exhibit 2, this is equivalent to $56.10 per square foot). The variable baths is a binary variable, assuming a value of 1 for a dwelling with two-or-more full baths and a value of 0 otherwise.14 While it is not statistically significant, its coefficient implies that the price of a dwelling with two or more full baths is expected to be 32% higher than that of a dwelling with fewer than two full baths. With all else the same, dwellings that have a fireplace or a finishedbasement or a (four season) sunroom are each valued (by 11.5%, 6.6%, and 7.6%, respectively) more than those without these features. Also, single-story (onest) homes are expected to be priced 9.8% higher than other types.
The variable frontage is the road frontage of the property, measured in feet. It determines the expanse of view a house has, as well as ease of access to the main street. It may also represent the potential for certain pleasurable activities, such as growing flowers, having cookouts, etc. These would all contribute to "curb appeal." On the other hand, frontage may also represent certain (for most people rather unpleasant) chores of homeownership, such as snow blowing and mowing. In our model, its coefficient is positive and significant, indicating greater frontage is viewed mainly as a desirable attribute for a dwelling; with all else the same, each additional foot of frontage is associated with a price differential of 0.1%. As expected, houses that are located on a street that is paved and has both a curb and a gutter are valued more (by 10.2%) than houses located on other types of streets.
Next we look at the results for the variables age, river, schooldist, and access whose level-1 coefficients are expressed as functions of the level-2 variables, affluence and score, allowing for interaction across hierarchies. The equation estimated for the level-1 slope coefficient for dwelling age shows that both its intercept and its slope, associated with the affluence index, are significantly negative. These results indicate that age has a negative impact on dwelling prices, and the negative impact is exacerbated (made more negative) by neighborhood affluence. Of two dwellings that are located in the same neighborhood and are otherwise identical, the one that is one year older is estimated to be priced 0.7% less than the other. If the older dwelling is instead located in a more affluent neighborhood, its price is estimated to be 0.8% lower when the difference in the affluence index is one unit (set equal to one standard deviation, which is 4.72, as can be seen from Exhibit 2).15 This may be due to affluent neighborhoods mostly encompassing newer developments, and offering potential buyers a wide variety of newer homes with more modern ("up-to-date") features already built-in.16
The variable river accounts both for the potential dangers of being located in a flood zone (albeit a 100-year one) and also all the amenities (e.g., opportunities for picnicking, sailing, boating, fishing, having a nice view, etc.) that being close to a body of water offers. As can be seen from Exhibit 4, the intercept coefficient associated with the slope for river is significantly negative but the (slope) coefficient associated with neighborhood affluence is significantly positive. That is, as the level of neighborhood affluence increases, the slope coefficient for river becomes less negative. The coefficients indicate that, with all else the same, being in the flood zone by itself lowers the value of a dwelling by 7.6%; however, an increase of one unit (one standard deviation) in neighborhood affluence offsets that by 5.8%. After all, more affluent families are in a better position to take advantage of the amenities that being close to a river offers. Therefore, they are more likely to focus on the benefits of having easy access to the river and to discount the potential for flooding.
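The way a level-2 variable shifts a level-1 slope can be made concrete with the river figures quoted above. The sketch below treats the reported percentages as approximate effects in percentage points and measures affluence in the paper's units (one standard deviation per unit) relative to the level at which the intercept applies; both are simplifying assumptions, and the function name is hypothetical.

```python
def river_effect_pct(affluence_units, base_pct=-7.6, per_unit_pct=5.8):
    """Approximate flood-zone price effect, in percentage points.

    affluence_units -- neighborhood affluence, in units of one standard
                       deviation, relative to the reference level
    base_pct        -- reported effect of flood-zone location by itself
    per_unit_pct    -- reported offset per unit of affluence
    """
    return base_pct + per_unit_pct * affluence_units


reference_neighborhood = river_effect_pct(0.0)
more_affluent_neighborhood = river_effect_pct(1.0)
```

At the reference level the effect is the full 7.6% discount, while one unit of additional affluence leaves a discount of roughly 1.8%, which is the sense in which affluence makes the river slope less negative.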
Next we look at the results for the equation estimated for the level-1 slope coefficient of the variable schooldist, representing the distance between a dwelling and the school in whose zone the dwelling is located. For families with school-age children, being close to a school would be desirable since it makes for an easier and faster commute. However, particularly for families with no school-age children, it may also mean an increased likelihood of traffic congestion and noise and even possible vandalism. Which is more important is an empirical issue. Our intercept term shows that the relationship between dwelling price and schooldist by itself is statistically insignificant but negative. The value of the intercept, however, is rather large and indicates that, with all else the same, being one mile farther from a school may lower dwelling price by 5.1%. What is interesting, and particularly revealing, is the negative and significant slope coefficient associated with score. It shows how the relationship between distance to school and dwelling price is exacerbated by school quality. In particular, the negative coefficient on score implies that as score increases (indicating higher-quality schools), schooldist has an increasingly negative impact on the value of the house. In other words, the better the quality of the school, the faster the price falls as distance from the school increases. For example, suppose two dwellings are located in two different school zones; they are the same distance from their respective schools and are also otherwise identical. The dwelling that is in the zone where score is higher by one unit (set equal to one standard deviation, which is 0.32, as can be seen from Exhibit 2) is expected to be valued 9.4% less than the other dwelling. 
These results imply that, on the average and with all else the same, not only do households want to be closer to the school in whose zone they are located but, and more importantly, the higher is the "quality" of that school, the more they are willing to pay to be closer to it. Thus, there is evidence of locational rent where school quality is concerned as well.
The equation estimated for the level-1 slope coefficient of the access variable has an insignificant intercept term. This indicates that, since the city is relatively small, dwelling location vis-à-vis the entertainment-shopping-employment centers by itself is not a statistically important determinant of dwelling price. However, the (slope) coefficient associated with affluence is both negative and significant. This suggests that as affluence increases, access has an increasingly negative impact on the value of the house. For example, suppose two dwellings have the same access value and are also identical in every respect except that one is located in a neighborhood whose affluence-index is higher by one unit (defined to be one standard deviation, 4.72, as before). That dwelling is expected to be valued 12.8% less than the one located in the less affluent neighborhood. Given that access is inversely related to distance, this indicates that the more affluent the neighborhood is, the more the residents value distance from the entertainment-shopping-employment centers. Since the city in question is relatively small, this might be taken to reveal that the monetary value of time involved in commuting is negligible compared to the peace and quiet "purchased" by distance. Finally, the last two variables in the model are control variables. One is for the timing of the sale (yearsold). The other variable (bankowned) controls for foreclosure sales.
In this model, the relationship between neighborhood affluence and dwelling price is assumed to be the same across school zones. By the same token, the relationship between school achievement score and dwelling price is assumed to be the same across neighborhoods. The next logical step is to conduct tests to see whether the relationship between affluence and dwelling price varies across school zones, and whether the relationship between score and dwelling price varies across neighborhoods. This necessitates "activating" in each of the level-2 equations (11)-(15) the coefficients corresponding to those that are designated as $c_{p1k}$ and $b_{p1j}$ in Equation (8), and then reestimating the full conditional model accordingly. Then the aggregate statistics could be compared with those reported in Exhibit 4. This, however, necessitates greater degrees of freedom than we have.
Conclusion
When a housing market is cross-classified, HLM is an appropriate tool for studying the determination of dwelling prices. Even though the degrees-of-freedom constraints prevented us from engaging in more advanced tests, the results still demonstrate how HLM enabled us to identify the extent to which the differences in such variables as neighborhood affluence and school achievement scores are likely to confer "premiums" upon certain dwelling features and, therefore, are likely to affect the value households place on otherwise identical dwellings.
As noted earlier, as the actual geographic extent of a housing market increases, the number of observations also increases and the number of degrees of freedom becomes less of an issue. In addition, both the number and the level of cross-classified hierarchies are quite likely to increase with increasing market size. Therefore, HLM becomes even more relevant and useful in uncovering neighborhood-related "premiums" in larger housing markets.
| [Footnote] |
| Endnotes |
| 1 Census block groups are sub-units of census tracts. The U.S. Census Bureau defines such areas to include 250-550 dwellings (see U.S. Department of Commerce, 1980 Population Census, or the census.gov website). School zones, on the other hand, are administrative areas established by the school districts. Their boundaries are set to match school capacity with the number of school-age children. Since the boundaries for census block groups and school zones are drawn independently, a block group may extend across several school zones and a school zone across several block groups, resulting in cross-classification of dwellings. |
| 2 Relative homogeneity along such dimensions is one of the guidelines used in establishing census tracts and block groups within those tracts. Given their relative sizes, a block group is much more likely to represent a "single" market than a census tract, for instance. As pointed out by an anonymous reviewer, Lipscomb and Farmer (2005) have found some evidence that different household types may also coexist in the same neighborhood. |
| 3 When data are cross-classified hierarchical, observations are no longer independent. One advantage of HLM over, e.g., OLS, in such a setting is that HLM provides efficient estimates (see Raudenbush and Bryk, 2002, Chapters 3 and 4). |
| 4 The source for most of the data is the records kept by the assessors' office, and includes the dwellings sold during 1984-1985. In addition to the data commonly available to everyone, the assessor also shared with us information not usually collected or made publicly available by assessors, on the condition that the city, the state, as well as the names and addresses of the property owners would be kept out of print. Our objective in this paper is not to model a specific housing market in a specific time period. Rather, it is to test for the impact that neighborhood affluence and school quality (as measured by achievement test scores) have on dwelling prices, through their impact on certain dwelling features, in a city with cross-classified hierarchies. The variables for which data were collected by the assessors' office enable us to do just that. For more information, see Uyar and Brown (2005, pp. 429, 431, 440 fn. 10). As we indicated to the editor, we can provide the database used in this paper and the name of the city upon request, provided the name of the city is kept out of print. |
| It is also worth mentioning that crime rate, pollution, green areas, etc., are not significant in this housing market. Its relative socio-economic homogeneity, small size, and the fact that it is essentially a service-commerce center (as opposed to a manufacturing or industrial one) account for this. |
| 5 As discussed below, a number of cells in Exhibit 1 contain no dwellings, and the number of dwellings varies greatly across cells. As the size of the relevant housing market gets larger, there will be fewer blank cells due to the increased likelihood of cross-classification of dwellings across that housing market, and it will be even more important to use a method such as HLM. |
| 7 Jud and Watts (1981, p. 461) concur with Rosen and Fullerton (1977). In their study of the relationship between dwelling value and school quality, they state that their "use of achievement scores as an index of school quality is dictated by (their) belief that many parents, rightly or wrongly, judge school quality by the average achievement level of their child's classmates." Jud and Watts also control for the racial composition of schools. We also accounted for racial make-up in our affluence index, even though our municipality is rather homogeneous with very few minorities. Also see Brasington (1999) for an assessment of various indicators, including achievement scores, that researchers have used as measures of school quality. |
| 8 In order to make it easier for interested researchers to follow up for more detail, we kept our discussion of the HLM model in this section very close to our "root" literature. In addition, and also to facilitate comparison, the notation we used in specifying our HLM equations is very similar (and for some variables, even identical) to the notation used in the literature (see Garner and Raudenbush, 1991; Raudenbush, 1993; Osborne, 2000; Kahane, 2001; Raudenbush and Bryk, 2002, Chapters 2 and 12; and Raudenbush, Bryk, Cheong and Congdon, 2004, Chapters 1, 2, 10, 11). |
| 9 Some researchers use price as their dependent variable. Using ln(price) reduces the impact of heteroscedasticity and accounts for the likely differences in the impact the same variable(s) may have on houses of different value [see Sirmans, Macpherson and Zietz (2005, pp. 4, 6) and some of its references]. More recently, Lipscomb and Farmer (2007) and Zietz, Zietz, and Sirmans (2007) have used quantile regression to investigate how the same variables may affect dwellings of different value. We thank an anonymous reviewer for bringing these two studies to our attention. |
| 10 The authors also emphasize that the lack of observations in this context is more the rule than the exception. Perhaps as a result, the HLM6 software (Raudenbush, Bryk, Cheong, and Congdon, 2004) does not even allow the user the option of including $d_{0jk}$ in the actual estimation of the model. |
| 11 This is (1 - INC - ISC), representing the share of the estimated $\sigma^2$ in total unconditional variation, subject to $d_{0jk} = 0$. |
| 12 The "full" model is the equation corresponding to Equation (9) but including all the coefficients and residuals resulting from the substitution of Equations (11)-(16) into Equation (10). We opted not to write it out in the paper due to space considerations. Instead, we formatted Exhibit 4 to reflect the full model. We can provide the actual equation upon request. |
| 13 The dependent variable is ln(price), so the coefficients reported in Exhibit 4 are "log-scaled" percentage changes in price when a particular independent variable changes (differs) by one unit. The actual percentage change in price implied by a coefficient of 0.217 is $e^{0.217} - 1$, which yields 24.2%. The same interpretation holds for all the level-1 variables except the ones that interact with the level-2 variables affluence and score. |
| 14 The assessors did not collect data on the actual number of bathrooms. They simply coded what kind of bathroom(s) a dwelling had. The categories ranged from "No Regular 3-fixture bath," meaning there were no full-bathrooms, to "More than" two full bathrooms. |
| 15 Again, the actual percentage difference in price is obtained from $e^{-0.007 + (-2.2 \times 10^{-4})(4.72)} - 1$, which yields a decrease of approximately 0.8% for the total impact. |
| 16 Zietz, Zietz, and Sirmans (2007) report that in an overwhelming number of hedonic models of dwelling prices, the coefficient for age is negative and statistically significant, as we have found. Their quantile regression results, however, show that "there is a lower premium for newness for higher priced homes. Lower-priced homes have the highest premium for newness (or discount for age)" (p. 10). This differs from our HLM findings. To the extent that neighborhood affluence and dwelling prices are positively correlated, we find that affluent neighborhoods discount for age more. |
| [Reference] |
| References |
| Anselin, L. Spatial Econometrics: Methods and Models. London and Dordrecht: Kluwer Academic Publishing, 1988. |
| Brasington, D.M. Which Measures of School Quality Does the Housing Market Value? Journal of Real Estate Research, 1999, 18:3, 395-413. |
| Brown, K.H. and L.K. Jepsen. An HLM Model of Baseball Player Value. Unpublished manuscript, University of Northern Iowa, Department of Economics, 2004. |
| Brown, K.H. and B. Uyar. A Hierarchical Linear Model Approach for Assessing the Effects of House and Neighborhood Characteristics on Housing Prices. Journal of Real Estate Practice and Education, 2004, 7:1, 15-23. |
| Garner, C.L. and S.W. Raudenbush. Neighborhood Effects on Educational Attainment: A Multilevel Analysis. Sociology of Education, 1991, 64, 251-62. |
| Goodman, A.C. Capitalization of Property Tax Differentials Within and Among Municipalities. Land Economics, 1983, 59:2, 211-19. |
| Goodman, A.C. and T.G. Thibodeau. Housing Market Segmentation. Journal of Housing Economics, 1998, 7, 121-43. |
| Haurin, D.R. and D. Brasington. School Quality and Real House Prices: Inter- and Intrametropolitan Effects. Journal of Housing Economics, 1996, 5, 351-68. |
| Jud, G.D. and J.M. Watts. Schools and Housing Values. Land Economics, 1981, 57:3, 459-70. |
| Kahane, L.H. Team and Player Effects on NHL Player Salaries: A Hierarchical Linear Model Approach. Applied Economics Letters, 2001, 8, 629-32. |
| Kohlhase, J.E. The Impact of Toxic Waste Sites on Housing Values. Journal of Urban Economics, 1991, 30, 1-26. |
| Krantz, D.P., R.D. Weaver, and T.R. Alter. Residential Property Tax Capitalization: Consistent Estimates Using Micro-Level Data. Land Economics, 1982, 58:4, 488-96. |
| Lipscomb, C.A. An Alternative Spatial Hedonic Estimation Approach. Journal of Housing Research, 2006, 15:2, 143-60. |
| Lipscomb, C.A. and M.C. Farmer. Household Diversity and Market Segmentation within A Single Neighborhood. Annals of Regional Science, 2005, 39, 791-810. |
| Lipscomb, C.A. and M.C. Farmer. An Application of Quantile Regression to Submarket Level Hedonic Estimation. Working Paper, Valdosta State University, 2007. |
| Osborne, J.W. Advantages of Hierarchical Modeling. Practical Assessment, Research & Evaluation, 2000, 7:1, <http://pareonline.net/getvn.asp?v=7&n=1>. |
| Raudenbush, S.W. A Crossed Random Effects Model for Unbalanced Data with Applications in Cross-sectional and Longitudinal Research. Journal of Educational Statistics, 1993, 18:4, 321-49. |
| Raudenbush, S.W. and A.S. Bryk. Hierarchical Linear Models: Applications and Data Analysis Methods. Second edition, Sage Publications, 2002. |
| Raudenbush, S.W., A.S. Bryk, Y.F. Cheong, and R. Congdon. HLM 6: Hierarchical Linear & Nonlinear Modeling. SSI Scientific Software International, Inc., 2004. |
| Rosen, H.S. and D.J. Fullerton. A Note on Local Tax Rates, Public Benefit Levels and Property Values. Journal of Political Economy, 1977, 85, 433-40. |
| Sirmans, S.G., D.A. Macpherson, and E.N. Zietz. The Composition of Hedonic Pricing Models. Journal of Real Estate Literature, 2005, 13:1, 3-47. |
| Song, S. Some Tests of Alternative Accessibility Measures: A Population Density Approach. Land Economics, 1996, 72:4, 474-82. |
| SUNY Downstate Medical Center. Deprivation Index. The Social and Health Landscape of Urban & Suburban America. June 30, 2004, <http://www.downstate.edu/urbansoc_healthdata/>. |
| Uyar, B. and K.H. Brown. Impact of Local Public Services and Taxes on Dwelling Choice within a Single Taxing Jurisdiction: A Discrete Choice Model. Journal of Real Estate Research, 2005, 27:4, 427-43. |
| Zietz, J., E.N. Zietz, and G.S. Sirmans. Determinants of House Prices: A Quantile Regression Approach. Department of Economics and Finance Working Paper Series, Middle Tennessee State University, 2007. |
| [Author Affiliation] |
| Bulent Uyar, University of Northern Iowa, Cedar Falls, IA 50614-0129 or bulent.uyar@uni.edu. |
| Kenneth H. Brown, University of Northern Iowa, Cedar Falls, IA 50614-0129 or ken.brown@uni.edu. |