35.3 Cross-Category and Store Choice Models
- Models: Restricted Boltzmann Machine Learning Models
How would you name the topic for this week?
Store Choice Model -> Category Choice Model -> Brand choice -> Quantity
35.3.1 Background
Typically outcome variables of interest:
store choice (Which store visited?)
Incidence (whether the product category was purchased)
brand choice (which brand)
quantity (how many?)
Incidence Outcomes in Multiple Categories
- Multi-category “whether to Buy” models
Base Model:
(Manchanda, Ansari, and Gupta 1999): assume a joint distribution for the purchase incidence of two products, rather than the independent normal errors of separate binary probit models. Compared with (Chib, Seetharaman, and Strijnev 2002), this two-category model underestimates cross-category correlations and overestimates the effectiveness of the marketing mix.
(Chib, Seetharaman, and Strijnev 2002): study 12 product categories and find that accounting for unobserved heterogeneity across households corrects the overestimated cross-category correlations and the underestimated effectiveness of the marketing mix.
(Ma, Seetharaman, and Narasimhan 2012) (published 5 years later): address the spurious correlation caused by zero outcomes (i.e., no purchase) using a multivariate logit model.
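To make the idea of a joint (rather than independent) incidence model concrete, the following is a minimal simulation sketch of a two-category probit with correlated latent errors. The intercepts, the correlation `rho`, and the function name `simulate_incidence` are illustrative assumptions, not estimates or code from any of the papers above.

```python
import math
import random

def simulate_incidence(n, rho, intercepts=(0.0, 0.0), seed=0):
    """Simulate purchase incidence in two categories (hypothetical setup).

    Latent utility: u_c = intercept_c + e_c, where (e_1, e_2) is standard
    bivariate normal with correlation rho. Category c is bought when u_c > 0.
    Returns (share buying 1, share buying 2, share buying both).
    """
    rng = random.Random(seed)
    buy1 = buy2 = both = 0
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        e1 = z1
        e2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * z2  # corr(e1, e2) = rho
        y1 = intercepts[0] + e1 > 0
        y2 = intercepts[1] + e2 > 0
        buy1 += y1
        buy2 += y2
        both += y1 and y2
    return buy1 / n, buy2 / n, both / n

p1, p2, p12 = simulate_incidence(n=50_000, rho=0.6)
p1_ind, p2_ind, p12_ind = simulate_incidence(n=50_000, rho=0.0)
print(f"rho=0.6: joint={p12:.3f}, product of marginals={p1 * p2:.3f}")
print(f"rho=0.0: joint={p12_ind:.3f}, product of marginals={p1_ind * p2_ind:.3f}")
```

With a positive `rho`, the joint purchase share exceeds the product of the marginal shares, which is the cross-category dependence that two independent binary probits would miss.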
- Multi-category “When to Buy” models
Multivariate Hazard model
(P. K. Chintagunta and Haldar 1998): bivariate hazard model with only positive correlation between two timing outcomes
Ma and Seetharaman (2004) used Multivariate Proportional Hazard Model to account for both positive and negative pair-wise correlations in the outcomes.
- Bundle Choice Models
whether or not to buy a bundle
(Chung and Rao 2003) use a nested logit model whose error terms follow a joint Gumbel distribution, and assume:
Degree of comparability among product categories
Fully comparable attributes (e.g., brand reliability)
Partially comparable attributes
Non-comparable attributes
Two types of attributes:
Non-balancing attributes
balancing attributes
(Jedidi, Jagpal, and Manchanda 2003): consumer’s (random) utility = sum of reservation price + random component
- Multinomial probit
Brand choice outcome models in multiple categories
- Correlated marketing mix sensitivities across categories
(Ainslie and Rossi 1998): Multinomial Probit model of brand choice. Found correlation between responsiveness to price and feature advertising across product categories
(Seetharaman, Ainslie, and Chintagunta 1999) found household inertia is correlated among product categories
(Iyengar, Ansari, and Gupta 2003): high coefficients across categories, leveraging information across categories (the household is observed in one category but not in the focal one)
- Correlated Brand Preferences across categories
(Russell and Kamakura 1997): Poisson model for a brand’s purchase volume; they found inter-category correlation in purchase volume
(Tulin Erdem 1998) (Tülin Erdem and Winer 1998): using a multinomial logit brand choice model, the signaling theory of umbrella branding explains correlated quality perceptions among product categories
Other papers: (V. P. Singh, Hansen, and Gupta 2005) (Hansen, Singh, and Chintagunta 2006)
Models of Multiple Outcomes in Multiple Categories
Incidence and Brand Choice
Incidence as an alternative in a multiple choice model:
Deepak et al. (2002): used Multivariate Probit (MVP) of incidence and brand choice outcomes.
(Manchanda, Ansari, and Gupta 1999) found cross-category correlations in marketing mix sensitivities of household
Ma, Seetharaman and Narasimhan (2005): used Multivariate Logit Model to model incidence and brand choice outcome.
Incidence and Brand choice as 2 decision stages:
(Mehta 2007): Simultaneous model of incidence and brand choice
Chib et al. (2005): Brand choice within each product category
Incidence and Quantity
- (Niraj, Padmanabhan, and Seetharaman 2008) Two-stage bivariate logit model
Incidence, brand choice and quantity
- (Song and Chintagunta 2007): simultaneous model: cross-category effects come from the incidence and brand choice outcomes, not from the quantity outcomes
Estimation: a Bayesian framework is a better fit for these types of models (see (Albert and Chib 1993)).
Store Choice Outcomes:
35.3.2 Examples
35.3.2.1 (Bucklin, Siddarth, and Silva-Risso 2008)
Changes in the intensity of mature distribution networks (by car make) influence consumer choice.
Three measures for intensity level (for each make)
Dealer accessibility (buyer’s distance to the nearest outlet): prefer closer
Dealer concentration (the distance required to encircle a given number of same-make dealers around a given buyer, i.e., how many dealers are near the buyer): prefer more dealers
Dealer spread (dispersion of the multiple dealers relative to the buyer’s location): prefer dispersion skewed toward the buyer (think of the circle). Measured using the Gini coefficient from the Lorenz curve.
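The three measures can be sketched for a single buyer and a single make as follows. This is an illustration under stated assumptions, not the authors' computation: locations are planar `(x, y)` points, the number of dealers to encircle `k` is hypothetical, and the spread measure is approximated here by a Gini coefficient over buyer-to-dealer distances, since these notes do not spell out how the paper builds its Lorenz curve.

```python
import math

def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly even)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1) / n

def dealer_intensity(buyer, dealers, k=3):
    """Illustrative intensity measures for one buyer and one make.

    buyer:   (x, y) location of the buyer
    dealers: list of (x, y) locations of same-make dealers
    k:       hypothetical number of dealers to encircle
    """
    dists = sorted(math.dist(buyer, d) for d in dealers)
    accessibility = dists[0]                        # distance to nearest outlet
    concentration = dists[min(k, len(dists)) - 1]   # radius enclosing k dealers
    spread = gini(dists)                            # ASSUMPTION: Gini of distances
    return accessibility, concentration, spread

# Hypothetical example: one buyer at the origin, four same-make dealers
acc, conc, spr = dealer_intensity((0, 0), [(3, 4), (6, 8), (0, 1), (5, 12)], k=3)
print(acc, conc, round(spr, 4))
```

A smaller accessibility or concentration distance means denser coverage around the buyer; the Gini-based value only summarizes how unevenly the dealers' distances are distributed.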
Used a logit choice model to relate the three measures to new car choices.
Found a significant correlation between the measures and car choice.
Motivations:
Want to infer causation between distribution coverage/ intensity and sales
- It’s hard. It might depend on product categories (e.g., convenience, shopping, or specialty goods).
Focus: relationship between distribution intensity and buyer choice in consumer durables market
Leveraging slow changes in the distribution channel, the authors probe the effect of distribution intensity on choice.
But because the data are cross-sectional, the model needs to include heterogeneity in preferences and other marketing mix effects as controls to avoid confounds.
Data: individual-level purchase records from the Power Information Network (PIN), under J.D. Power and Associates, from 1997 to 2004 in California.
Different from previous literature: instead of store choice, brand choice was modeled as a function of outlet locations.
Utility:
\[ U_{it}^h = \alpha_i^h + \sum_j \beta_j^h X^h_{ijt} \]
where
\(U_{it}^h\) = buyer \(h\)’s utility for make \(i\) at time \(t\)
\(X_{ijt}^h\) = value of attribute \(j\) of make \(i\) at time \(t\) for buyer \(h\)
\(\beta_j^h\) = buyer \(h\)’s sensitivity to attribute \(j\)
\(\alpha_i^h\) = product-specific constant (varies by household), i.e., brand preference
Heterogeneity is modeled at the zip-code level (buyers in the same zip code share \(\alpha, \beta\))
Endogeneity:
Measurement Level: individual data, less measurement error.
Simultaneity: unlikely, because the distribution network changed little over the period (supported by empirical evidence).
Sample selection: large and representative sample of the California market.
Omitted variable bias:
Include heterogeneity at the disaggregate level (to capture unobserved geographical effects)
Since the model is at the make level, there is less correlation with unobserved model-level factors
Individual makes have less correlation with manufacturer unobserved variables.
Logit choice probability
\[ P_{it}^h = \frac{\exp(U^h_{it})}{\sum_k\exp(U_{kt}^h)} \]
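As a small numerical illustration of the utility and logit probability formulas, the sketch below computes choice probabilities for one buyer. The attribute values, coefficients, and function names are hypothetical and chosen only for the example.

```python
import math

def utility(alpha_i, beta, x_i):
    """U_i = alpha_i + sum_j beta_j * x_ij for one buyer and one make."""
    return alpha_i + sum(b * x for b, x in zip(beta, x_i))

def logit_probabilities(utilities):
    """P_i = exp(U_i) / sum_k exp(U_k); subtracting the max avoids overflow."""
    m = max(utilities)
    expu = [math.exp(u - m) for u in utilities]
    total = sum(expu)
    return [e / total for e in expu]

# Hypothetical inputs: 3 makes, 2 attributes (say, a price index and the
# distance to the nearest dealer), with negative sensitivities to both.
alphas = [0.5, 0.0, -0.2]
beta = [-1.0, -0.3]
X = [[1.2, 2.0], [1.0, 5.0], [0.8, 1.0]]

utils = [utility(a, beta, x) for a, x in zip(alphas, X)]
probs = logit_probabilities(utils)
for i, (u, p) in enumerate(zip(utils, probs)):
    print(f"make {i}: utility={u:.2f}, probability={p:.3f}")
```

In this made-up example the second make has the farthest dealer and therefore the lowest utility and choice probability, which is the direction of the accessibility effect described above.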
Using Hierarchical Bayes
Probability that buyer \(h\) in zip code \(z\) picks make \(i\) at time \(t\):
\[ \text{Prob}_t^h(i | \mathbf{\beta}^{z}, \mathbf{X}_{it}^h) = \frac{\exp(\mathbf{\beta}^{z}\mathbf{X}^h_{it})}{\sum_j\exp(\mathbf{\beta}^{z}\mathbf{X}^h_{jt})} \]
where
\(\mathbf{\beta}^{z}\) = a zip-code-specific parameter vector (\(\mathbf{\beta}^{z} \sim MVN (\mathbf{\mu}, \mathbf{\Sigma})\))
\(\mathbf{\mu} \sim MVN (\mathbf{\eta}, \mathbf{C})\)
\(\mathbf{\Sigma}^{-1} \sim \text{Wishart}[(\rho R)^{-1}, \rho]\)
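The hierarchy above can be read as a generative recipe: draw the population mean, draw the precision matrix, then draw one coefficient vector per zip code. The sketch below simulates draws from this prior only; it is not the posterior sampler used for estimation. It assumes two coefficients, an integer \(\rho\), and that the first Wishart argument is the scale matrix and the second the degrees of freedom. All hyperparameter values, zip-code labels, and helper names are hypothetical.

```python
import math
import random

def cholesky(a):
    """Lower-triangular L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def draw_mvn(rng, mean, cov):
    """One draw from MVN(mean, cov) via the Cholesky factor of cov."""
    L = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(L[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mean)]

def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def draw_wishart(rng, scale, df):
    """One Wishart(scale, df) draw for integer df >= dimension, built as a
    sum of df outer products of MVN(0, scale) vectors."""
    n = len(scale)
    w = [[0.0] * n for _ in range(n)]
    for _ in range(df):
        x = draw_mvn(rng, [0.0] * n, scale)
        for i in range(n):
            for j in range(n):
                w[i][j] += x[i] * x[j]
    return w

rng = random.Random(1)
eta, C = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]   # hypothetical hyperparameters
rho, R = 5, [[1.0, 0.2], [0.2, 1.0]]            # hypothetical hyperparameters

mu = draw_mvn(rng, eta, C)                              # mu ~ MVN(eta, C)
scale = inv2([[rho * r for r in row] for row in R])     # (rho R)^{-1}
Sigma = inv2(draw_wishart(rng, scale, rho))             # Sigma^{-1} ~ Wishart
betas = {z: draw_mvn(rng, mu, Sigma) for z in ["zip_A", "zip_B", "zip_C"]}
for z, b in betas.items():
    print(z, [round(v, 3) for v in b])
```

Because every zip code's vector is drawn around the same `mu` with the same `Sigma`, zip codes with few buyers borrow strength from the rest when the model is actually estimated.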
Structural model:
Demand: sensitivity to travel distance and taste for new product
Supply: responses to changes in store locations.
Outlets focus on lower-value consumers with lower desire for newness (correlation between travel sensitivity and taste for new products).
Outlets help regular stores introduce more new products (possibly improving quality).
35.3.2.2 (Donnelly et al. 2021)
Model for estimating single product choice from alternatives:
Heterogeneity in Individual preferences for product attributes and price sensitivity (across products).
Account for time-varying product attributes, and out-of-stock.
Improvement from traditional model due to:
estimate heterogeneity in individual preferences.
estimate preferences of infrequent (purchase) customers
35.3.2.3 (Gabel and Timoshenko 2021)
Deep network model accounts for
cross-product relationships,
time-series filters to capture purchase dynamics for products with varying inter-purchase times