35.2 Structural Models, Endogeneity

Good Empirical Research requires

Good Data
1. Original
2. Cool results
3. Exogenous
Good Theory
1. Interesting Hypotheses
Cool new approach

Analysis

1. Descriptive
2. Predictive
3. Causal
4. Prescriptive or policy-oriented

For both 3 and 4, you need structural or experimental research.

Structural Equation Modeling is different from structural modeling.

Causal as in experiments, there

In the structural model, we still want to make causal inference.

Endogeneity

Models:

Instrumental Variables
Joint Estimation of Supply and Demand Models
Empirical Bargaining Models

Ask:

Bounds analyses:
- (Ellickson and Misra 2011)
- (Manski and Tamer 2002)

35.2.1 Background

(Reiss 2011)

Types of Empirical models in marketing

Descriptive (no need to worry about endogeneity): convert data into information
1. Statements about facts
2. High-quality and relevant data
3. Accurate Interpretation
Structural (also known as latent/ path models)
Experimental (including quasi-experimental)

The data and research questions should always determine the methodological approach.

Under structural models, we rely on

Formal specification linking Y and X
Stochastic specification that connects the theoretical model to the data. Example: unobserved terms explain the imperfect fit by capturing
1. Heterogeneity in consumer preferences
2. Consumer decision-making errors
3. Measurement errors

Structural models help recover counterfactuals.

Structural models differ from descriptive models because they can recover the structural parameters from the reduced form.

Running a reduced-form regression means that you know the structure of the data-generating process.

A reduced form exists only with an underlying structural model. When researchers say they use “reduced-form analysis” but only run a regression, they erroneously assign a causal interpretation to the regression coefficients.
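
To make the link between a structural model and its reduced form concrete, here is a minimal simulation sketch. The linear supply-and-demand system, the parameter values, and the names `a1`, `b2`, `pi_p`, `pi_q` are all assumptions made for illustration; they do not come from (Reiss 2011). The reduced form regresses each endogenous variable on the exogenous shifters only, and the structural demand slope is then recovered as a ratio of reduced-form coefficients (indirect least squares).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Hypothetical structural parameters (illustrative values only).
a0, a1, a2 = 10.0, 1.5, 0.8   # demand: q = a0 - a1*p + a2*y + u
b0, b1, b2 = 2.0, 1.0, 0.5    # supply: q = b0 + b1*p - b2*c + v

y = rng.normal(size=n)        # exogenous demand shifter (e.g., income)
c = rng.normal(size=n)        # exogenous supply shifter (e.g., input cost)
u = rng.normal(size=n)        # unobserved demand shock
v = rng.normal(size=n)        # unobserved supply shock

# Market equilibrium: solve demand = supply for price, then quantity.
p = (a0 - b0 + a2 * y + b2 * c + u - v) / (a1 + b1)
q = a0 - a1 * p + a2 * y + u

def ols(X, target):
    """Least-squares coefficients of target on the columns of X."""
    return np.linalg.lstsq(X, target, rcond=None)[0]

# Reduced form: each endogenous variable on the exogenous variables only.
Z = np.column_stack([np.ones(n), y, c])
pi_p = ols(Z, p)
pi_q = ols(Z, q)

# The cost shifter moves q only through p along the demand curve,
# so the demand slope is the ratio of the two reduced-form cost coefficients.
slope_ils = pi_q[2] / pi_p[2]

# A regression of q on p that ignores the supply side.
slope_naive = ols(np.column_stack([np.ones(n), p, y]), q)[1]

print("true demand slope:", -a1)
print("recovered from reduced form:", slope_ils)
print("naive regression of q on p:", slope_naive)
```

In this sketch the ratio of reduced-form coefficients is close to the true slope, while the naive regression is pulled toward zero because price is correlated with the demand shock. The recovery step works only because the structure (which shifter is excluded from which equation) is assumed known.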

(Rossi 2014)

IV methods even with valid instruments can still have poor sampling properties (finite sample bias, large sampling errors).
Problems with instrumental variables in marketing
It’s hard to find instruments for advertising and promotional variables
Lagged marketing variables are invalid instruments when advertising and promotional variables are unobserved.
Control functions can still work under a nonlinear demand model (e.g., a choice model).
Endogenous variables in marketing:
- Price, advertising, promotion, entry order, distributions, market structure, market share, revenue, networks
- Instruments:
  - lagged variables,
  - costs (input and wholesale prices).
    - Input costs: theoretically a good instrument for endogenous price, but hard to measure (in particular, marginal-cost measures from the BLS have high measurement error) (p. 666).
    - Wholesale prices: plausible for dealing with price endogeneity (although one can still argue that wholesalers set prices in anticipation of advertising and promotion). However, wholesale prices vary less than retail prices (they change less frequently), so using the wholesale price as an instrument captures the difference between long-run and short-run price effects rather than correcting for endogeneity.
  - other products. Good instrument for endogenous price when unobserved demand shocks that vary by market and time are uncorrelated across markets (exogeneity) while costs are correlated across markets (relevance). Shocks that vary only by market, not over time, can be handled with fixed effects.
  - fixed effects (brand, time dummies). Good but only for linear models.
    - Price endogeneity: (Villas-Boas and Winer 1999) uses lagged prices as instruments, but this is a poor choice (unmatched time) and is not supported. Another flaw: no heterogeneity or state dependence for a packaged-goods panel.
  - demographics (bad instruments),
  - product characteristics (S. Berry, Levinsohn, and Pakes 1995),
  - price indices,
  - display and features.
- People tend to use lagged variables to fix endogenous price (price correlates with unobserved quality, which induces a downward bias in estimated price sensitivity).
The Hausman test can only be used to determine the validity of one set of instruments based on the validity of another set of instruments.
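
The points above about price endogeneity and cost-based instruments can be illustrated with a small two-stage least squares (2SLS) sketch. Everything here is simulated and hypothetical: the data-generating process, the variable names (`cost`, `xi`, `price`, `sales`) and the coefficient values are assumptions for illustration, not estimates or code from (Rossi 2014). The first-stage F statistic is included because instrument strength drives the sampling properties of IV estimates.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Simulated data (all values are illustrative assumptions).
cost = rng.normal(size=n)                 # cost shifter, used as the instrument
xi = rng.normal(size=n)                   # unobserved demand shock (e.g., quality)
price = 1.0 + 0.5 * cost + 0.8 * xi + rng.normal(scale=0.5, size=n)
true_beta = -2.0                          # true price coefficient
sales = 5.0 + true_beta * price + 2.0 * xi + rng.normal(size=n)

def ols(X, target):
    """Least-squares coefficients of target on the columns of X."""
    return np.linalg.lstsq(X, target, rcond=None)[0]

X = np.column_stack([np.ones(n), price])  # regressors (price is endogenous)
Z = np.column_stack([np.ones(n), cost])   # instruments (constant + cost)

# OLS ignores the correlation between price and the demand shock.
beta_ols = ols(X, sales)

# 2SLS: project the regressors on the instruments, then regress on the fit.
X_hat = Z @ ols(Z, X)
beta_2sls = ols(X_hat, sales)

# First-stage F statistic for the single excluded instrument.
gamma = ols(Z, price)
resid = price - Z @ gamma
sigma2 = resid @ resid / (n - Z.shape[1])
var_gamma = sigma2 * np.linalg.inv(Z.T @ Z)[1, 1]
first_stage_f = gamma[1] ** 2 / var_gamma

print("OLS price coefficient: ", beta_ols[1])
print("2SLS price coefficient:", beta_2sls[1])
print("first-stage F:", first_stage_f)
```

With a strong instrument and a large sample, as assumed here, the 2SLS estimate is close to the true coefficient and the OLS estimate is not. Shrinking `n` or the coefficient on `cost` in the price equation is one way to see the finite-sample problems discussed above.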

35.2.2 Examples

(S. Berry, Levinsohn, and Pakes 2004)

Second-choice data as an instrument: if consumers hadn’t purchased their cars, what would have been their second choice? But you still need high variation in this variable to estimate the model.
General Motors data set: second choice = substitution pattern. (This might only help with non-parametric estimate)
Prior models: To estimate substitution coefficient (pattern): match consumer attributes to consumer choices (observables).
- Identification: estimation based on changes across markets (or across time).
- Assume the distribution of consumers’ underlying tastes, conditional on an observed distribution of consumer incomes and demographics (i.e., observables) is constant across markets and time.
- Hence, the substitution coefficient is estimated from data on changes in (1) the characteristics and number of products, and (2) observed consumer attributes across markets.
- In other words, estimation is based on
  - (1) consumers who switch (i.e., people buy a different product when product prices, the choice set, or other characteristics change).
  - (2) different people (the distribution of consumer attributes) choosing different products from a given set of products.
But prior models without unobserved heterogeneity, using only observed consumer attributes, are actually bad at replicating the substitution pattern observed in the second-choice data.
This paper’s identification strategy is based on the second-choice data
- Advantages:
  - (1) a direct, data-driven substitution pattern.
  - (2) more identification power without exogenous changes in choice sets.
- Disadvantages:
  - Since second-choice data are available for a single market (i.e., not across markets), we can’t estimate the across-market pattern of substitution.
Future research:
- Combine second-choice data across markets (i.e., SUVs switch to minivans).

Baseline model (S. Berry, Levinsohn, and Pakes 1995)

\[ u_{ij} = \sum_{k} x_{jk} \tilde{\beta}_{ik} + \xi_j + \epsilon_{ij} \]

where

\(u_{ij}\) = linear utility of consumer \(i\) consuming product \(j\) (\(j = 0, 1, \dots, J\), where \(j = 0\) means the consumer did not buy any of the competing products)
\(k\) = index of observed product characteristics
\(r\) = index of observed household attributes
\(x_{jk}\) = observed product characteristics
\(\xi_j\) = unobserved product characteristics (picks up all the impacts that were not observed; it might also correlate with the observed characteristics, in which case ignoring it results in price elasticities that are too small).
\(\epsilon_{ij}\) = individual preferences (independent of the product attributes and of each other).
\(\tilde{\beta}_{ik} = \bar{\beta}_k + \sum_{r} \mathbf{z}_{ir} \beta_{kr}^o + \beta_k^u \mathbf{v}_{ik}\) (consumer taste)
- \(\mathbf{z}_i\) = vector of observed consumer attributes
- \(\mathbf{v}_{i}\) = vector of unobserved consumer attributes, with elements \(\mathbf{v}_{ik}\)
- This model also assumes that there is only one unobserved attribute per household for each product characteristic (i.e., \(\mathbf{v}_{ik}\) has no subscript \(r\)).

Substituting the taste equation into the utility equation gives

\[ u_{ij} = \delta_j + \sum_{kr} x_{jk} \mathbf{z}_{ir} \beta_{kr}^o + \sum_{k} x_{jk} \mathbf{v}_{ik} \beta_k^u +\mathbf{\epsilon}_{ij} \]

where

\(\delta_j = \sum_k x_{jk} \bar{\beta}_k + \xi_j\) (choice-specific constant). (equation 4)

Without any additional assumption on \(\xi\) (i.e., the unobserved product characteristics), we can have consistent estimators of \(\mathbf{\theta = (\delta, \beta^o, \beta^u)}\)

But we need an identifying assumption on \(\xi_j\) to be able to estimate \(\bar{\beta}\):

\(\xi_j\) are mean independent of the nonprice characteristics of all the products.

Estimation

Two options for handling \(\xi_j\):
1. Estimate \(\mathbf{\theta = (\beta^o, \beta^u, \delta)}\) (always consistent)
2. Restrict the joint distribution of \((\xi, \mathbf{x})\) and estimate only \((\mathbf{\beta^o, \beta^u, \bar{\beta}})\) (efficient if the restrictions are true, but inconsistent if the restrictions are wrong). Hence, we are better off with the first option.
Choice of estimation methods:
- ML: computationally costly
- Method of moments: match 3 sets of moments
  1. Covariances of the observed first-choice product characteristics (\(\mathbf{x}\)) with the observed consumer attributes (\(\mathbf{z}\)): help identify \(\mathbf{\beta}^o\).
  2. Covariances of first-choice product characteristics and second-choice product characteristics: help identify the effect of unobserved consumer characteristics (\(\mathbf{\beta}^u\)).
  3. Market shares of the \(J\) products: help identify \(\mathbf{\delta}\) (choice-specific constants).
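
To illustrate how market shares can pin down the choice-specific constants \(\mathbf{\delta}\), the sketch below uses the fixed-point (contraction) iteration associated with (S. Berry, Levinsohn, and Pakes 1995): \(\delta \leftarrow \delta + \ln s^{obs} - \ln s(\delta)\). It is a toy version under assumed settings, not the paper's code: one product characteristic, one random coefficient with hypothetical scale `sigma`, logit errors, and an outside good with utility normalized to zero.

```python
import numpy as np

def predicted_shares(delta, mu):
    """Simulated inside-good shares.

    delta: (J,) choice-specific constants.
    mu:    (R, J) individual deviations from delta for R simulated consumers.
    The outside good (j = 0) has utility normalized to zero (assumption).
    """
    util = delta[None, :] + mu
    m = np.maximum(util.max(axis=1, keepdims=True), 0.0)  # numerical stability
    expu = np.exp(util - m)
    denom = np.exp(-m) + expu.sum(axis=1, keepdims=True)  # exp(-m) = outside good
    return (expu / denom).mean(axis=0)

def invert_shares(obs_shares, mu, tol=1e-12, max_iter=5000):
    """Find delta such that predicted shares match the observed shares."""
    # Start from the plain-logit solution ln(s_j) - ln(s_0).
    delta = np.log(obs_shares) - np.log(1.0 - obs_shares.sum())
    for _ in range(max_iter):
        new = delta + np.log(obs_shares) - np.log(predicted_shares(delta, mu))
        if np.max(np.abs(new - delta)) < tol:
            return new
        delta = new
    return delta

rng = np.random.default_rng(2)
J, R = 4, 500
x = rng.normal(size=J)                   # one observed product characteristic
v = rng.normal(size=R)                   # simulated unobserved consumer tastes
sigma = 0.7                              # hypothetical taste dispersion
mu = sigma * v[:, None] * x[None, :]     # (R, J) taste deviations

true_delta = np.array([0.5, -0.2, 1.0, 0.0])
obs_shares = predicted_shares(true_delta, mu)   # shares treated as "observed"
recovered = invert_shares(obs_shares, mu)

print("true delta:     ", true_delta)
print("recovered delta:", recovered)
```

Because the "observed" shares are generated from the same simulated consumers, the iteration recovers `true_delta` up to the tolerance. With real data the shares come from the market, and \(\mathbf{\delta}\) is recomputed for each trial value of the other parameters.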

(BLP) (S. Berry, Levinsohn, and Pakes 1995)

Question:

Hand-waving: “For computational simplicity, …, \(\epsilon_{ij}\) have an independently and identically distributed extreme value ‘double exponential’ distribution”. Basically, it was modeled this way to obtain a tractable form of the model’s choice probabilities conditional on \((\mathbf{z,v})\): \(P(y_i^1 = j | \mathbf{z}_i, \mathbf{v}_i, \mathbf{\theta}, \mathbf{x})\)
- Closed-form solution: pretty close to the normal distribution (see McFadden).
To construct the choice set (the car characteristics), the authors only used the modal vehicle (the combination of options that was most commonly purchased), and the price was the average price of the modal vehicle.
- Defensible thing to do
Python implementation of this paper: (C. Conlon and Gortmaker 2020)
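
The tractable choice probabilities mentioned above take the logit form once \(\epsilon_{ij}\) is extreme value distributed. The sketch below computes them for one consumer from the utility equation with \(\delta_j\), \(\beta^o\), and \(\beta^u\). The function name `choice_probabilities`, the array shapes, and the normalization of the outside good's utility to zero are assumptions for illustration; this is not the PyBLP API or the authors' code.

```python
import numpy as np

def choice_probabilities(delta, x, z_i, v_i, beta_o, beta_u):
    """Logit choice probabilities for one consumer.

    delta:  (J,)   choice-specific constants
    x:      (J, K) observed product characteristics
    z_i:    (R,)   observed consumer attributes
    v_i:    (K,)   unobserved consumer attributes
    beta_o: (K, R) coefficients on the x-z interactions
    beta_u: (K,)   coefficients on the x-v interactions
    Returns a (J + 1,) vector; index 0 is the outside good, whose
    utility is normalized to zero (assumption).
    """
    taste_shift = beta_o @ z_i + beta_u * v_i   # consumer-specific part of tastes, (K,)
    inside = delta + x @ taste_shift            # mean utility plus interactions, (J,)
    util = np.concatenate(([0.0], inside))
    util -= util.max()                          # guard against overflow
    expu = np.exp(util)
    return expu / expu.sum()

# Hypothetical example with J = 3 products, K = 2 characteristics, R = 2 attributes.
rng = np.random.default_rng(4)
J, K, R = 3, 2, 2
delta = rng.normal(size=J)
x = rng.normal(size=(J, K))
z_i = rng.normal(size=R)
v_i = rng.normal(size=K)
beta_o = rng.normal(scale=0.5, size=(K, R))
beta_u = rng.normal(scale=0.5, size=K)

probs = choice_probabilities(delta, x, z_i, v_i, beta_o, beta_u)
print(probs, probs.sum())
```

Averaging these probabilities over draws of \(\mathbf{v}_i\) (and over the observed distribution of \(\mathbf{z}_i\)) gives model-implied market shares.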

(Draganska, Klapper, and Villas-Boas 2010)

How do we measure power in the distribution channel?
Between manufacturers and retailers
- Manufacturers
  - Bargain over profit margins with retailer
  - Bounded by agreement with retailer
  - Bargaining power comes from the size of the manufacturer and from supplying products to retailers
- Retailers:
  - Intense competition in the mature coffee market
  - Bounded by consumer price sensitivity
A shift of bargaining power from manufacturers to retailers
Standard models are good to measure distribution channel power.
Bargaining position: stand to lose more (endogenously determined by the substitution patterns on the demand side)
Bargaining power: negotiation skills, patience, risk tolerance (exogenous - depends on negotiation partners).
Channel margin and split = f(bargaining position, bargaining power)
Contributions:
- Bargaining power is still with manufacturer (manufacturer gets over half of the pie).
- Overall profit of the distribution channel is not a zero-sum game
- Quantify the effects of bargaining power on channel profits
  - Bargaining power predominantly affects manufacturers
  - Bargaining power weakly affects retailers: retailer margins are tied down by pricing power over consumers
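
The relation "channel margin and split = f(bargaining position, bargaining power)" can be illustrated with a generic generalized Nash bargaining sketch. This is a textbook formulation under assumed settings (a fixed total profit that can be split freely), not necessarily the exact specification in (Draganska, Klapper, and Villas-Boas 2010). The names `nash_split`, `lam`, `d_manufacturer`, and `d_retailer` are hypothetical: `lam` stands for the manufacturer's bargaining power, and the disagreement payoffs stand for each side's bargaining position.

```python
import numpy as np

def nash_split(total_profit, d_manufacturer, d_retailer, lam):
    """Split a fixed channel profit by generalized Nash bargaining.

    lam in [0, 1] is the manufacturer's bargaining power; d_* are the
    payoffs each party gets if negotiations break down.
    Maximizes (pi_m - d_m)**lam * (pi_r - d_r)**(1 - lam)
    subject to pi_m + pi_r = total_profit.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be between 0 and 1")
    surplus = total_profit - d_manufacturer - d_retailer
    if surplus < 0:
        raise ValueError("no gains from trade: disagreement payoffs exceed total profit")
    pi_m = d_manufacturer + lam * surplus
    pi_r = d_retailer + (1.0 - lam) * surplus
    return pi_m, pi_r

# Hypothetical numbers: total channel profit 100, disagreement payoffs 20 and 10.
total, d_m, d_r, lam = 100.0, 20.0, 10.0, 0.6
pi_m, pi_r = nash_split(total, d_m, d_r, lam)

# Numerical check: search over feasible splits for the maximizer of the Nash product.
grid = np.linspace(d_m, total - d_r, 100_001)
nash_product = (grid - d_m) ** lam * (total - grid - d_r) ** (1.0 - lam)
pi_m_grid = grid[np.argmax(nash_product)]

print("closed form:", pi_m, pi_r)
print("grid search manufacturer profit:", pi_m_grid)
```

Raising a party's disagreement payoff (a better bargaining position) or its weight `lam` (more bargaining power) both increase its share, which is why the two need to be separated empirically.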

(Ozturk, Chintagunta, and Venkataraman 2019)

Impact of Chapter 11 on consumer demand for the bankrupt firms’ competitors
Possibilities:
- Consumers go to the competitors (competitive effect)
- Reduced demand for the competitors as well (negative information about the industry: contagion effect)
Research question: temporally local effect of Chapter 11 on demand for rival firms
Data: dealer-model-day level
Challenge:
- General decline in economic condition: Great Recession
- “Cash for Clunkers” program: anticipation for the program may decrease demand
Remedies: regression discontinuity in time (RDiT)
- Control variables (price, ads, recalls, macroeconomic conditions)
- Competitors’ sales patterns in Canada (where Chrysler didn’t file)
Results: Negative effect on competitors.
The mechanism:
- Increased consumer uncertainty about car purchases
- Decreased cross-traffic from the bankrupt firm’s dealers to competitors’ dealers

Jayarajan et al. (2021) Changing the Power Equation: A Structural Analysis of the Impact of Used Car Markets on the Automobile Retail Channel

Main idea: study the automobile retail channel where retailers sell new and used cars

Structural model:

Demand: used and new cars, heterogeneity, price endogeneity (IV)
Supply: Oligopolistic structure with multiple retailers and dealers

Outcomes: profits, margins and power in the distribution channel

Counterfactual analysis: What if we change used cars’ quality and availability?

Main result: selling used cars is important for retailers’ profits and bargaining power.

References

Berry, Steven, James Levinsohn, and Ariel Pakes. 1995. “Automobile Prices in Market Equilibrium.” Econometrica 63 (4): 841. https://doi.org/10.2307/2171802.

———. 2004. “Differentiated Products Demand Systems from a Combination of Micro and Macro Data: The New Car Market.” Journal of Political Economy 112 (1): 68–105. https://doi.org/10.1086/379939.

Conlon, Christopher, and Jeff Gortmaker. 2020. “Best Practices for Differentiated Products Demand Estimation with PyBLP.” The RAND Journal of Economics 51 (4): 1108–61. https://doi.org/10.1111/1756-2171.12352.

Draganska, Michaela, Daniel Klapper, and Sofia B. Villas-Boas. 2010. “A Larger Slice or a Larger Pie? An Empirical Investigation of Bargaining Power in the Distribution Channel.” Marketing Science 29 (1): 57–74. https://doi.org/10.1287/mksc.1080.0472.

Ellickson, Paul B., and Sanjog Misra. 2011. “Structural Workshop Paper: Estimating Discrete Games.” Marketing Science 30 (6): 997–1010. https://doi.org/10.1287/mksc.1110.0675.

Manski, Charles F., and Elie Tamer. 2002. “Inference on Regressions with Interval Data on a Regressor or Outcome.” Econometrica 70 (2): 519–46. https://doi.org/10.1111/1468-0262.00294.

Ozturk, O. Cem, Pradeep K. Chintagunta, and Sriram Venkataraman. 2019. “Consumer Response to Chapter 11 Bankruptcy: Negative Demand Spillover to Competitors.” Marketing Science 38 (2): 296–316. https://doi.org/10.1287/mksc.2018.1138.

Reiss, Peter C. 2011. “Structural Workshop Paper: Descriptive, Structural, and Experimental Empirical Methods in Marketing Research.” Marketing Science 30 (6): 950–64. https://doi.org/10.1287/mksc.1110.0681.

Rossi, Peter E. 2014. “Invited Paper: Even the Rich Can Make Themselves Poor: A Critical Examination of IV Methods in Marketing Applications.” Marketing Science 33 (5): 655–72. https://doi.org/10.1287/mksc.2014.0860.

Villas-Boas, J. Miguel, and Russell S. Winer. 1999. “Endogeneity in Brand Choice Models.” Management Science 45 (10): 1324–38. https://doi.org/10.1287/mnsc.45.10.1324.