35.2 Structural Models, Endogeneity
Good Empirical Research requires
Good Data
Original
Cool results
Exogenous
Good Theory
- Interesting Hypotheses
Cool new approach
Analysis
- Descriptive
- Predictive
- Causal
- Prescriptive or policy-oriented.
For both 3 (causal) and 4 (prescriptive), you need structural or experimental research.
Structural Equation Modeling (SEM) is different from the structural models discussed here.
Causal as in experiments, there
In the structural model, we still want to make causal inference.
- Endogeneity
Models:
Instrumental Variables
Joint Estimation of Supply and Demand Models
Empirical Bargaining Models
Ask:
Bounds analyses:
35.2.1 Background
Types of Empirical models in marketing
Descriptive (no need to be concerned about endogeneity): convert data into information
Statements about facts
High-quality and relevant data
Accurate Interpretation
Structural (also known as latent/ path models)
Experimental (including quasi-experimental)
The data and research questions should always determine methodological approach.
Under structural models, we rely on
Formal specification linking Y and X
Stochastic specification connects theoretical model to data. Ex: heterogeneity helps explain the imperfect fit by including
Consumer preference
Consumer decision-making errors
Measurement errors
Structural models help recover counterfactuals.
Structural models differ from descriptive models because they can recover the structural parameters from the reduced form.
A reduced-form regression presupposes that you know the structure of the data-generating process.
A reduced form only exists with an underlying structural model. When researchers say they use “reduced-form analysis” but only run a regression, they erroneously assign a causal interpretation to the regression coefficients.
IV methods even with valid instruments can still have poor sampling properties (finite sample bias, large sampling errors).
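To make the endogeneity problem and the sampling behavior of IV concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration: a linear demand equation, a true price coefficient of -2, a cost shifter as the instrument, and the helper names `ols`, `tsls`, and `simulate`. It is not taken from any of the cited papers.

```python
import numpy as np

def ols(y, X):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def tsls(y, X, Z):
    """Two-stage least squares: project X on Z, then regress y on the projection."""
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    return ols(y, X_hat)

def simulate(n, rng, true_beta=-2.0):
    """Hypothetical data: price responds to an unobserved demand shock xi."""
    cost = rng.normal(size=n)     # instrument: shifts price, excluded from demand
    xi = rng.normal(size=n)       # unobserved demand shock (e.g., quality)
    price = 1.0 + 0.5 * cost + 0.8 * xi + rng.normal(size=n)
    sales = 10.0 + true_beta * price + 2.0 * xi + rng.normal(size=n)
    X = np.column_stack([np.ones(n), price])
    Z = np.column_stack([np.ones(n), cost])
    return sales, X, Z

rng = np.random.default_rng(0)

# Large sample: OLS is biased toward zero, 2SLS is close to the truth.
y, X, Z = simulate(20_000, rng)
beta_ols = ols(y, X)[1]
beta_iv = tsls(y, X, Z)[1]
print(f"OLS: {beta_ols:.3f}   2SLS: {beta_iv:.3f}   truth: -2.000")

# Small samples: the same valid instrument gives very noisy estimates.
small = np.array([tsls(*simulate(50, rng))[1] for _ in range(500)])
print(f"2SLS with n=50: median {np.median(small):.3f}, std {small.std():.3f}")
```

The second part of the sketch repeats the estimation on many small samples to show how dispersed 2SLS estimates can be even when the instrument is valid by construction.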
Problems with Instrumental variables in marketing
It’s hard to find instruments for advertising and promotional variables
Lagged marketing variables are invalid instruments when advertising and promotional variables are unobserved.
Control functions can still work under nonlinear demand model (e.g., choice model).
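The control-function idea can be sketched in a linear setting, where it reproduces the 2SLS estimate exactly: regress the endogenous variable on the instrument, then add the first-stage residual as an extra regressor in the outcome equation. The data-generating process and variable names below are illustrative assumptions. In a nonlinear choice model the residual would enter the utility index instead, which requires further assumptions not shown here.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Hypothetical data: price is endogenous because it responds to xi.
cost = rng.normal(size=n)                    # instrument
xi = rng.normal(size=n)                      # unobserved demand shock
price = 1.0 + 0.5 * cost + 0.8 * xi + rng.normal(size=n)
sales = 10.0 - 2.0 * price + 2.0 * xi + rng.normal(size=n)

ones = np.ones(n)
Z = np.column_stack([ones, cost])

# First stage: price on the instrument; keep fitted values and residuals.
gamma = np.linalg.lstsq(Z, price, rcond=None)[0]
price_hat = Z @ gamma
first_stage_resid = price - price_hat

# Control function: sales on [1, price, first-stage residual].
X_cf = np.column_stack([ones, price, first_stage_resid])
beta_cf = np.linalg.lstsq(X_cf, sales, rcond=None)[0]

# 2SLS for comparison: sales on [1, fitted price].
X_2sls = np.column_stack([ones, price_hat])
beta_2sls = np.linalg.lstsq(X_2sls, sales, rcond=None)[0]

print(f"price coefficient, control function: {beta_cf[1]:.4f}")
print(f"price coefficient, 2SLS:             {beta_2sls[1]:.4f}")
print(f"coefficient on the residual:         {beta_cf[2]:.4f}")
```

A nonzero coefficient on the residual signals that price is correlated with the unobserved demand shock in this simulated data.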
Endogenous variables in marketing:
Price, advertising, promotion, entry order, distributions, market structure, market share, revenue, networks
Instruments:
lagged variables,
costs (input and wholesale prices).
Input costs: theoretically a good instrument for endogenous price, but hard to measure (especially marginal cost, for which BLS measures carry high measurement error) (p. 666).
Wholesale prices are a plausible way to deal with price endogeneity (although one can still argue that wholesalers set prices in anticipation of advertising and promotion). But they have less variation (they change less frequently than retail prices), so using the wholesale price as an instrument picks up the difference between long-run and short-run price effects rather than correcting for endogeneity.
other products. Good instrument for endogenous price when unobserved demand shocks are uncorrelated across markets (exogeneity) but costs are correlated across markets (relevance). This concerns shocks that vary by market and time; shocks that vary only by market, not over time, can be handled with fixed effects.
fixed effects (brand, time dummies). Good but only for linear models.
- Price endogeneity: (Villas-Boas and Winer 1999) uses lagged prices as instruments, but this is a poor choice (the timing is mismatched) and is not supported. Another flaw: no heterogeneity or state dependence for the packaged-goods panel.
demographics (bad instruments),
product characteristics (S. Berry, Levinsohn, and Pakes 1995),
price indices,
display and features.
People tend to use lagged variables to fix endogenous price (price correlates with unobserved quality, which biases the estimated price sensitivity downward).
The Hausman test can only be used to determine the validity of one set of instruments based on the validity of another set of instruments.
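The logic of that statement can be sketched as follows: maintain that a trusted instrument set is valid, then compare the 2SLS estimate that uses only the trusted set with the one that also uses the extra instruments. The sketch below assumes one endogenous regressor, homoskedastic errors, and simulated data; the function names and the data-generating process are hypothetical.

```python
import math
import numpy as np

def tsls_scalar(y, x, Z):
    """2SLS slope for one endogenous regressor (inputs already demeaned).
    Returns the slope and 1 / (x_hat' x_hat)."""
    x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    return (x_hat @ y) / (x_hat @ x_hat), 1.0 / (x_hat @ x_hat)

def hausman_extra_instruments(y, x, Z_trusted, Z_extra):
    """H0: Z_extra is valid, maintaining that Z_trusted is valid."""
    y, x = y - y.mean(), x - x.mean()
    Z_trusted = Z_trusted - Z_trusted.mean(axis=0)
    Z_extra = Z_extra - Z_extra.mean(axis=0)
    Z_all = np.column_stack([Z_trusted, Z_extra])
    b_cons, a_cons = tsls_scalar(y, x, Z_trusted)  # consistent either way
    b_eff, a_eff = tsls_scalar(y, x, Z_all)        # efficient only under H0
    resid = y - x * b_cons
    sigma2 = resid @ resid / len(y)
    H = (b_cons - b_eff) ** 2 / (sigma2 * (a_cons - a_eff))
    p_value = math.erfc(math.sqrt(H / 2.0))        # chi-square(1) upper tail
    return H, p_value

rng = np.random.default_rng(2)
n = 5_000
cost = rng.normal(size=n)                          # trusted instrument
xi = rng.normal(size=n)                            # unobserved demand shock
valid_extra = rng.normal(size=n)                   # unrelated to xi
invalid_extra = rng.normal(size=n) + 0.5 * xi      # correlated with xi
price = (0.5 * cost + 0.5 * valid_extra + 0.5 * invalid_extra
         + 0.8 * xi + rng.normal(size=n))
sales = -2.0 * price + 2.0 * xi + rng.normal(size=n)

H_valid, p_valid = hausman_extra_instruments(
    sales, price, cost[:, None], valid_extra[:, None])
H_invalid, p_invalid = hausman_extra_instruments(
    sales, price, cost[:, None], invalid_extra[:, None])
print(f"valid extra instrument:   H = {H_valid:.2f}, p = {p_valid:.3f}")
print(f"invalid extra instrument: H = {H_invalid:.2f}, p = {p_invalid:.3g}")
```

If the trusted set is itself invalid, both estimates are inconsistent and the comparison says nothing, which is the limitation noted above.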
35.2.2 Examples
(S. Berry, Levinsohn, and Pakes 2004)
Second-choice data as an instrument: if consumers hadn’t purchased their cars, what would have been their second choice? But you still need high variation in this variable to estimate the model.
General Motors data set: second choice = substitution pattern. (This might only help with non-parametric estimate)
Prior models: To estimate substitution coefficient (pattern): match consumer attributes to consumer choices (observables).
Identification: estimation based on changes across markets (or across time).
Assume the distribution of consumers’ underlying tastes, conditional on an observed distribution of consumer incomes and demographics (i.e., observables) is constant across markets and time.
Hence, the substitution coefficient is estimated from the data on changes in (1) the characteristics and number of products, and (2) observed consumer attributes across markets.
In other words, estimation is based on
(1) consumers who switch (i.e., people buy a different product when product prices, the choice set, or other characteristics change).
(2) different people (the distribution of consumer attributes) choosing different products from a given set of products.
But prior models without unobserved heterogeneity, using only observed consumer attributes, are actually bad at replicating the substitution pattern observed in the second-choice data.
This paper’s identification strategy is based on the second-choice data.
Advantages:
(1) Direct, data-driven substitution pattern.
(2) More identification power without the exogenous changes in choice sets.
Disadvantages:
- Since second-choice data is available for a single market (i.e., not across markets), we can’t estimate the across-market pattern of substitution.
Future research:
- Combine across-market data with second-choice data (e.g., SUV buyers switching to minivans).
Baseline model (S. Berry, Levinsohn, and Pakes 1995)
\[ u_{ij} = \sum_{k} x_{jk} \tilde{\beta}_{ik} + \xi_j + \epsilon_{ij} \]
where
\(u_{ij}\) = linear utility of consumer \(i\) consuming product \(j\) (\(j \in \{0, 1, \dots, J\}\), where \(j = 0\) means the consumer did not buy any of the competing products, i.e., the outside good)
\(k\) = index of observed product characteristics
\(r\) = index of observed household attributes
\(x_{jk}\) = value of observed characteristic \(k\) for product \(j\)
\(\xi_j\) = unobserved product characteristics (picks up all the effects that were not observed; it may also correlate with the observed characteristics, in which case the estimated price elasticities are too small).
\(\epsilon_{ij}\) = individual preferences (independent of the product attributes and each other).
\(\tilde{\beta}_{ik} = \bar{\beta}_k + \sum_{r} \mathbf{z}_{ir} \beta_{kr}^o + \beta_k^u \mathbf{v}_{ik}\) (consumer taste)
\(\mathbf{z}_i\) = vectors of observed consumer attributes
\(\mathbf{v}_{ik}\) = vector of unobserved consumer attributes
This model also assumes that there is only one unobserved characteristic per household for each product characteristic (i.e., \(\mathbf{v}_{ik}\) carries no subscript \(r\)).
Substituting the taste equation into the utility equation gives
\[ u_{ij} = \delta_j + \sum_{kr} x_{jk} \mathbf{z}_{ir} \beta_{kr}^o + \sum_{k} x_{jk} \mathbf{v}_{ik} \beta_k^u +\mathbf{\epsilon}_{ij} \]
where
- \(\delta_j = \sum_k x_{jk} \bar{\beta}_k + \xi_j\) (choice-specific constant). (equation 4)
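As a quick sanity check on the algebra, the substitution can be verified numerically: build \(u_{ij}\) once from \(\tilde{\beta}_{ik}\) and once from \(\delta_j\) plus the interaction terms, and confirm the two agree. The dimensions and random draws below are arbitrary assumptions; only the two utility formulas come from the text.

```python
import numpy as np

rng = np.random.default_rng(3)
n_i, n_j, n_k, n_r = 4, 3, 2, 5        # consumers, products, characteristics, attributes

x = rng.normal(size=(n_j, n_k))        # x_{jk}
z = rng.normal(size=(n_i, n_r))        # z_{ir}
v = rng.normal(size=(n_i, n_k))        # v_{ik}
xi = rng.normal(size=n_j)              # xi_j
eps = rng.gumbel(size=(n_i, n_j))      # epsilon_{ij}
beta_bar = rng.normal(size=n_k)        # bar{beta}_k
beta_o = rng.normal(size=(n_k, n_r))   # beta^o_{kr}
beta_u = rng.normal(size=n_k)          # beta^u_k

# Form 1: u_ij = sum_k x_jk * beta_tilde_ik + xi_j + eps_ij
beta_tilde = beta_bar + z @ beta_o.T + v * beta_u      # shape (n_i, n_k)
u_direct = beta_tilde @ x.T + xi + eps                 # shape (n_i, n_j)

# Form 2: u_ij = delta_j + sum_kr x_jk z_ir beta^o_kr + sum_k x_jk v_ik beta^u_k + eps_ij
delta = x @ beta_bar + xi                              # choice-specific constant
u_substituted = (
    delta
    + np.einsum("jk,ir,kr->ij", x, z, beta_o)
    + np.einsum("jk,ik,k->ij", x, v, beta_u)
    + eps
)

print(np.allclose(u_direct, u_substituted))
```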
Without any additional assumption on \(\xi\) (i.e., unobserved product characteristics), we can have consistent estimators of \(\mathbf{\theta = (\delta, \beta^o, \beta^u)}\)
But we need to know the identifying assumption of \(\xi_j\) to be able to estimate \(\bar{\beta}\):
- \(\xi_j\) are mean independent of the nonprice characteristics of all the products.
Estimation
Two options for handling \(\xi_j\):
Estimate \(\mathbf{\theta = (\beta^o, \beta^u, \delta)}\) (always consistent)
Restrict the joint distribution of \((\xi, \mathbf{x})\) and estimate only \((\mathbf{\beta^o, \beta^u, \bar{\beta}})\) (efficient if the restrictions are true, but inconsistent if the restrictions are wrong). Hence, we are better off with the first option.
Choice of estimation methods:
ML: computationally costly
Method of moments: matched on 3 sets of moments
Covariances of the observed first-choice product characteristics (\(\mathbf{x}\)) with the observed consumer attributes (\(\mathbf{z}\)): help identify \(\mathbf{\beta}^o\).
Covariances of first-choice product characteristics and second-choice product characteristics: help identify the distribution of unobserved consumer tastes (\(\mathbf{\beta}^u\)).
Market share of \(J\) products: help identify \(\mathbf{\delta}\) (choice-specific constant).
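The three sets of moments can be illustrated with a small simulation. This is a sketch under strong simplifying assumptions, not the paper's estimator: one product characteristic, five products, no outside good, \(\delta_j = 0\), one observed attribute \(z\), and one unobserved taste \(v\). The function name `simulated_moments` and all parameter values are hypothetical.

```python
import numpy as np

def simulated_moments(beta_o, beta_u, n=50_000, seed=0):
    """Simulate first and second choices and return the three moment sets."""
    rng = np.random.default_rng(seed)
    x = np.arange(5.0)                      # one characteristic, five products
    z = rng.normal(size=n)                  # observed consumer attribute
    v = rng.normal(size=n)                  # unobserved consumer taste
    eps = rng.gumbel(size=(n, x.size))      # i.i.d. extreme value shocks
    taste = beta_o * z + beta_u * v         # consumer-specific coefficient on x
    u = taste[:, None] * x[None, :] + eps
    ranked = np.argsort(-u, axis=1)         # products ordered best to worst
    first, second = ranked[:, 0], ranked[:, 1]
    return {
        "cov_x1_z": np.cov(x[first], z)[0, 1],           # first choice vs. z
        "cov_x1_x2": np.cov(x[first], x[second])[0, 1],  # first vs. second choice
        "shares": np.bincount(first, minlength=x.size) / n,
    }

no_unobs = simulated_moments(beta_o=0.5, beta_u=0.0)
with_unobs = simulated_moments(beta_o=0.5, beta_u=2.0)

for label, m in [("beta_u = 0", no_unobs), ("beta_u = 2", with_unobs)]:
    print(label,
          f"cov(x1, z) = {m['cov_x1_z']:.3f}",
          f"cov(x1, x2) = {m['cov_x1_x2']:.3f}",
          "shares =", np.round(m["shares"], 3))
```

In this toy setup, stronger unobserved taste heterogeneity makes first and second choices more alike, which is why the first-choice/second-choice covariance carries information about \(\mathbf{\beta}^u\).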
(BLP) (S. Berry, Levinsohn, and Pakes 1995)
Question:
Hand-waving: “For computational simplicity, …, \(\epsilon_{ij}\) have an independently and identically distributed extreme value ‘double exponential’ distribution.” Basically, it was modeled this way to obtain a tractable form for the model’s choice probabilities conditional on \((\mathbf{z,v})\): \(P(y_i^1 = j | \mathbf{z}_i, \mathbf{v}_i, \mathbf{\theta}, \mathbf{x})\)
- Closed-form solution: the extreme value distribution is pretty close to the normal distribution (see McFadden).
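The tractability argument can be seen in a few lines: with i.i.d. extreme value errors, the choice probabilities conditional on the mean utilities have the closed logit form, which a brute-force simulation with Gumbel draws reproduces. The mean utilities below are hypothetical numbers standing in for the utility net of \(\epsilon_{ij}\) for one consumer.

```python
import numpy as np

def logit_probs(mean_utility):
    """Closed-form choice probabilities under i.i.d. extreme value errors."""
    shifted = mean_utility - mean_utility.max()   # guards against overflow
    exp_u = np.exp(shifted)
    return exp_u / exp_u.sum()

rng = np.random.default_rng(4)

# Hypothetical mean utilities; index 0 plays the role of the outside good.
mean_utility = np.array([0.0, 1.0, 0.5, -0.3])
closed_form = logit_probs(mean_utility)

# Simulation check: add Gumbel draws and record which option is best.
draws = rng.gumbel(size=(200_000, mean_utility.size))
choices = np.argmax(mean_utility + draws, axis=1)
simulated = np.bincount(choices, minlength=mean_utility.size) / len(choices)

print("closed form:", np.round(closed_form, 4))
print("simulated:  ", np.round(simulated, 4))
```

Without the closed form, every likelihood or moment evaluation would need a simulation like the second half of this sketch for every consumer.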
To construct the choice set, the authors used only the modal vehicle for each model (the combination of options that was most commonly purchased), and the price was the average price of the modal vehicle.
- Defensible thing to do
Python implementation of this paper: (C. Conlon and Gortmaker 2020)
(Draganska, Klapper, and Villas-Boas 2010)
How do we measure power in the distribution channel?
Between manufacturers and retailers
Manufacturers
Bargain over profit margins with retailer
Bounded by agreement with retailer
Bargaining power comes from the size of the manufacturer and from supplying products to retailers
Retailers:
Intense competition in the mature coffee market
bounded by consumer price sensitivity
A shift of bargaining power from manufacturers to retailers
Standard models are good to measure distribution channel power.
Bargaining position: who stands to lose more if negotiations fail (endogenously determined by the substitution patterns on the demand side)
Bargaining power: negotiation skills, patience, risk tolerance (exogenous - depends on negotiation partners).
Channel margin and split = f(bargaining position, bargaining power)
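One way to see how position and power jointly determine the split is a generic asymmetric Nash bargaining sketch. It is an illustration only, not the paper's empirical model: it holds the total channel profit fixed, treats disagreement payoffs as given numbers (bargaining position), and uses a weight `lam` for the manufacturer's bargaining power. All names and values are hypothetical.

```python
import numpy as np

def nash_split(total_profit, d_m, d_r, lam):
    """Closed-form asymmetric Nash bargaining split.
    d_m, d_r: disagreement payoffs of manufacturer and retailer (position).
    lam: manufacturer's bargaining power, between 0 and 1."""
    surplus = total_profit - d_m - d_r
    return d_m + lam * surplus, d_r + (1.0 - lam) * surplus

def nash_split_grid(total_profit, d_m, d_r, lam, n_grid=200_001):
    """Brute-force maximizer of the Nash product, used to check the formula."""
    pi_m = np.linspace(d_m, total_profit - d_r, n_grid)[1:-1]
    pi_r = total_profit - pi_m
    log_product = lam * np.log(pi_m - d_m) + (1.0 - lam) * np.log(pi_r - d_r)
    best = np.argmax(log_product)
    return pi_m[best], pi_r[best]

# Manufacturer has more power (lam = 0.6), retailer has the better position.
pi_m, pi_r = nash_split(total_profit=100.0, d_m=10.0, d_r=30.0, lam=0.6)
grid_m, grid_r = nash_split_grid(total_profit=100.0, d_m=10.0, d_r=30.0, lam=0.6)

print(f"closed form: manufacturer {pi_m:.2f}, retailer {pi_r:.2f}")
print(f"grid search: manufacturer {grid_m:.2f}, retailer {grid_r:.2f}")
```

In this example the retailer ends up with the larger share despite having less bargaining power, because its disagreement payoff is higher.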
Contributions:
Bargaining power is still with manufacturer (manufacturer gets over half of the pie).
Overall profit of the distribution channel is not a zero-sum game
Quantify the effects of bargaining power on channel profits
Bargaining power predominantly affects manufacturers
Bargaining power weakly affects retailers: retailer margins are tied down by their pricing power over consumers
(Ozturk, Chintagunta, and Venkataraman 2019)
Impact of Chapter 11 on consumer demand for the bankrupt firms’ competitors
Possibilities:
Consumers go to the competitors (competitive effect)
Demand also falls for the competitors (negative information about the industry: contagion effect)
Research question: temporally local effect of Chapter 11 on demand for rival firms
Data: dealer-model-day level
Challenge:
General decline in economic condition: Great Recession
“Cash for Clunkers” program: anticipation for the program may decrease demand
Remedies: regression discontinuity in time (RDiT)
Control variables (price, ads, recalls, macroeconomic conditions)
Competitors’ sales patterns in Canada (where Chrysler didn’t file)
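A regression discontinuity in time can be sketched as a local linear regression around the event date, with separate trends before and after and a jump at day 0. The data below are simulated, and the dealer count, trend, noise level, and true jump of -3 are assumptions; the control variables and the Canadian comparison listed above are left out for brevity.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical dealer-day panel: 50 dealers observed 60 days before and after.
n_dealers = 50
days = np.tile(np.arange(-60, 61), n_dealers)   # day 0 = filing date
post = (days >= 0).astype(float)
true_jump = -3.0
sales = 50.0 - 0.05 * days + true_jump * post + rng.normal(size=days.size)

def rdit_estimate(days, sales, bandwidth):
    """Local linear RDiT: jump at day 0 with separate slopes on each side."""
    keep = np.abs(days) <= bandwidth
    t = days[keep].astype(float)
    after = (t >= 0).astype(float)
    X = np.column_stack([np.ones_like(t), after, t, after * t])
    coef = np.linalg.lstsq(X, sales[keep], rcond=None)[0]
    return coef[1]                               # discontinuity at the cutoff

jump_hat = rdit_estimate(days, sales, bandwidth=30)
print(f"estimated jump: {jump_hat:.3f} (true value in the simulation: {true_jump})")
```

Narrowing the bandwidth keeps the comparison temporally local, at the cost of using fewer observations.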
Results: Negative effect on competitors.
The mechanism:
Increased consumer uncertainty about car purchases
Decreased cross-traffic from the bankrupt firm’s dealers to competitors’ dealers
Jayarajan et al. (2021) Changing the Power Equation: A Structural Analysis of the Impact of Used Car Markets on the Automobile Retail Channel
Main idea: study the automobile retail channel where retailers sell new and used cars
Structural model:
Demand: used and new cars, heterogeneity, price endogeneity (IV)
Supply: Oligopolistic structure with multiple retailers and dealers
Outcomes: profits, margins and power in the distribution channel
Counterfactual analysis: What if we change used cars’ quality and availability?
Main result: selling used cars is important for retailers’ profits and bargaining power.