35.3 Cross-Category and Store Choice Models

Models: Restricted Boltzmann Machine Learning Models

How would you name the topic for this week?

Store Choice Model -> Category Choice Model -> Brand choice -> Quantity

35.3.1 Background

(Seetharaman et al. 2005)

Typical outcome variables of interest:
- Store choice (which store was visited?)
- Incidence (was the product category purchased?)
- Brand choice (which brand?)
- Quantity (how many?)

Incidence Outcomes in Multiple Categories

Multi-category “whether to Buy” models

Base Model:
- (Manchanda, Ansari, and Gupta 1999): assume a joint distribution for the incidence outcomes of two products, rather than the independent normal errors of separate binary probit models. Relative to (Chib, Seetharaman, and Strijnev 2002), this model overestimates cross-category correlation and underestimates the effectiveness of the marketing mix.
- (Chib, Seetharaman, and Strijnev 2002): study 12 product categories and find that accounting for unobserved heterogeneity across households corrects the overestimated cross-category correlation and the underestimated effectiveness of the marketing mix.
- (Ma, Seetharaman, and Narasimhan 2012) (published 5 years later): address the spurious correlation caused by zero outcomes (i.e., no purchase) with a multivariate logit model.
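
To make the "joint distribution versus independent probits" distinction above concrete, the following is a minimal simulation sketch of a two-category multivariate probit. It is not the estimation procedure of any of the cited papers; the function name `simulate_mvp_incidence`, the mean utilities, and the error correlation of 0.6 are all made-up assumptions for illustration. It only shows that correlated latent errors produce correlated purchase incidence, which independent binary probits cannot reproduce.

```python
import numpy as np


def simulate_mvp_incidence(n_trips, mean_utility, error_cov, seed=0):
    """Simulate purchase incidence from a multivariate probit.

    Latent utility z = mean_utility + e, with e ~ MVN(0, error_cov).
    A category is purchased on a trip when its latent utility is positive.
    """
    rng = np.random.default_rng(seed)
    mean_utility = np.asarray(mean_utility, dtype=float)
    errors = rng.multivariate_normal(
        np.zeros(mean_utility.size), error_cov, size=n_trips
    )
    return (mean_utility + errors > 0).astype(int)


# Hypothetical two-category setting (all numbers are illustrative assumptions).
mean_utility = [0.0, 0.0]
joint_cov = np.array([[1.0, 0.6], [0.6, 1.0]])   # correlated errors
independent_cov = np.eye(2)                      # independent binary probits

y_corr = simulate_mvp_incidence(20_000, mean_utility, joint_cov)
y_ind = simulate_mvp_incidence(20_000, mean_utility, independent_cov)

# Correlation between the two binary incidence outcomes across trips.
corr_joint = np.corrcoef(y_corr.T)[0, 1]
corr_indep = np.corrcoef(y_ind.T)[0, 1]
print(f"incidence correlation, joint errors:       {corr_joint:.2f}")
print(f"incidence correlation, independent errors: {corr_indep:.2f}")
```

In an actual application the error covariance is estimated from basket data, typically with the Bayesian data-augmentation approach mentioned later in this section.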

Multi-category “When to Buy” models

Multivariate Hazard model
- (P. K. Chintagunta and Haldar 1998): bivariate hazard model with only positive correlation between two timing outcomes
- Ma and Seetharaman (2004) used Multivariate Proportional Hazard Model to account for both positive and negative pair-wise correlations in the outcomes.

Bundle Choice Models

Whether or not to buy a bundle
- (Chung and Rao 2003): use a nested logit whose error terms follow a joint Gumbel distribution, and assume:
  - Degree of comparability among product categories
    - Fully comparable attributes (e.g., brand reliability)
    - Partially comparable attributes
    - Non-comparable attributes
  - Two types of attributes:
    - Non-balancing attributes
    - balancing attributes
- (Jedidi, Jagpal, and Manchanda 2003): consumer’s (random) utility = sum of reservation price + random component
  - Multinomial probit

Brand choice outcome models in multiple categories

Correlated marketing mix sensitivities across categories

- (Ainslie and Rossi 1998): multinomial probit model of brand choice. Found that household responsiveness to price and to feature advertising is correlated across product categories.
- (Seetharaman, Ainslie, and Chintagunta 1999): found that household inertia is correlated across product categories.
- (Iyengar, Ansari, and Gupta 2003): coefficients are highly correlated across categories, so information can be leveraged across categories (e.g., from a category where the household is observed to a focal category where it is not).

Correlated Brand Preferences across categories

(Russell and Kamakura 1997): Poisson model of brand purchase volume; found inter-category correlation in purchase volume.

(Tulin Erdem 1998) (Tülin Erdem and Winer 1998): using a multinomial logit brand choice model, show that the signaling theory of umbrella branding explains correlated quality perceptions across product categories.

Other papers: (V. P. Singh, Hansen, and Gupta 2005) (Hansen, Singh, and Chintagunta 2006)

Models of Multiple Outcomes in Multiple Categories

Incidence and Brand Choice
1. Incidence as an alternative in a multiple choice model:
  1. Deepak et al. (2002): used a multivariate probit (MVP) model of incidence and brand choice outcomes.
  2. (Manchanda, Ansari, and Gupta 1999): found cross-category correlations in households' marketing mix sensitivities.
  3. Ma, Seetharaman and Narasimhan (2005): used a multivariate logit model of incidence and brand choice outcomes.
2. Incidence and brand choice as two decision stages:
  1. (Mehta 2007): simultaneous model of incidence and brand choice
  2. Chib et al. (2005): brand choice within each product category

Incidence and Quantity
1. (Niraj, Padmanabhan, and Seetharaman 2008): two-stage bivariate logit model

Incidence, Brand Choice, and Quantity
1. (Song and Chintagunta 2007): simultaneous model in which cross-category effects come from the incidence and brand choice outcomes, not from the quantity outcomes

Estimation: a Bayesian framework is a better fit for this type of model (see (Albert and Chib 1993)).

Store Choice Outcomes:

(Bell and Lattin 1998)
(Bell, Ho, and Tang 1998)

35.3.2 Examples

35.3.2.1 (Bucklin, Siddarth, and Silva-Risso 2008)

Changes in the intensity of mature distribution networks (by car make) influence consumer choice.
Three measures for intensity level (for each make)
- Dealer accessibility (buyer’s distance to the nearest outlet): prefer closer
- Dealer concentration (the distance required to encircle a given number of same-make dealers around a given buyer, which reflects the number of dealers near the buyer): prefer more dealers
- Dealer spread (dispersion of the multiple dealers relative to the buyer’s location, measured with the Gini coefficient from the Lorenz curve): prefer dealers skewed toward the buyer (think of the circle)
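
The three measures can be sketched numerically as follows. This is only one plausible reading of the definitions above, not the authors' exact construction: it assumes planar coordinates with Euclidean distance, takes concentration as the distance to the \(n\)-th nearest dealer, and takes spread as the Gini coefficient of the buyer-to-dealer distances. The function names and the example coordinates are hypothetical.

```python
import numpy as np


def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly equal)."""
    x = np.sort(np.asarray(values, dtype=float))
    n = x.size
    if n == 0 or x.sum() == 0:
        return 0.0
    ranks = np.arange(1, n + 1)
    # Standard closed form based on the ordered values (Lorenz curve).
    return float(2.0 * np.sum(ranks * x) / (n * x.sum()) - (n + 1) / n)


def dealer_intensity(buyer_xy, dealer_xy, n_encircle=3):
    """Illustrative intensity measures for one buyer and one make.

    Assumptions: Euclidean distance on planar coordinates; concentration is
    the radius needed to encircle `n_encircle` dealers; spread is the Gini
    coefficient of the distances.
    """
    diffs = np.asarray(dealer_xy, dtype=float) - np.asarray(buyer_xy, dtype=float)
    distances = np.sort(np.linalg.norm(diffs, axis=1))
    k = min(n_encircle, distances.size)
    return {
        "accessibility": float(distances[0]),      # distance to nearest outlet
        "concentration": float(distances[k - 1]),  # radius enclosing k dealers
        "spread_gini": gini(distances),            # inequality of distances
    }


# Hypothetical buyer and four same-make dealers (coordinates are made up).
buyer = (0.0, 0.0)
dealers = [(3.0, 4.0), (6.0, 8.0), (0.0, 1.0), (9.0, 12.0)]
measures = dealer_intensity(buyer, dealers, n_encircle=3)
print(measures)
```

With real data the distances would more likely be travel or great-circle distances computed from buyer and dealer addresses.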
A logit choice model relates the three measures to new car choices.
The authors find a significant association between the measures and car choice.
Motivations:
- Want to infer causation between distribution coverage/intensity and sales
  - This is hard, and the effect may depend on the product category (e.g., convenience, shopping, or specialty goods).
Focus: relationship between distribution intensity and buyer choice in consumer durables market
Leveraging slow changes in the distribution channel, the authors probe the effect of distribution intensity on choice.
But because the data are cross-sectional, the model needs to include time-invariant heterogeneity in preferences and other marketing mix effects to avoid confounds.
Data: individual-level purchase records from the Power Information Network (PIN), under J.D. Power and Associates, from 1997 to 2004 in California.
Different from previous literature: instead of store choice, brand choice was modeled as a function of outlet locations.

Utility:

\[ U_{it}^h = \alpha_i^h + \sum_j \beta_j^h X^h_{ijt} \]

where

\(U_{it}^h\) = buyer \(h\)’s utility for make \(i\) at time \(t\)
\(X_{ijt}^h\) = value of attribute \(j\) of make \(i\) at time \(t\) for buyer \(h\)
\(\beta_j^h\) = buyer \(h\)’s sensitivity to attribute \(j\)
\(\alpha_i^h\) = make-specific constant that varies by household (i.e., brand preference)

Heterogeneity is modeled at the zip-code level (buyers in the same zip code share \(\alpha\) and \(\beta\)).

Endogeneity:

Measurement Level: individual data, less measurement error.
Simultaneity: unlikely, because the distribution network changes very little (supported by empirical evidence).
Sample selection: large and representative sample of the California market.
Omitted variable bias:
1. Include heterogeneity at the disaggregate level (captures unobserved geographical effects)
2. Since model at the make level, we have less correlation with the unobserved model-level factors
3. Individual makes have less correlation with manufacturer unobserved variables.

Logit choice probability

\[ P_{it}^h = \frac{\exp(U^h_{it})}{\sum_k\exp(U_{kt}^h)} \]
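
As a small numerical illustration of the utility and logit probability above, the sketch below computes \(U_{it}^h\) for three makes and turns the utilities into choice probabilities. The intercepts, coefficients, and attribute values are made up for illustration, and the function names are hypothetical; subtracting the maximum utility before exponentiating is a standard numerical safeguard that leaves the probabilities unchanged.

```python
import numpy as np


def utilities(alpha, beta, X):
    """U_i = alpha_i + sum_j beta_j * X_ij for one buyer at one time.

    alpha: (n_makes,) make-specific intercepts
    beta:  (n_attributes,) attribute sensitivities
    X:     (n_makes, n_attributes) attribute values
    """
    return np.asarray(alpha, dtype=float) + np.asarray(X, dtype=float) @ np.asarray(beta, dtype=float)


def logit_choice_probabilities(u):
    """P_i = exp(U_i) / sum_k exp(U_k), computed in a numerically stable way."""
    u = np.asarray(u, dtype=float)
    exp_u = np.exp(u - u.max())  # shifting by a constant does not change P
    return exp_u / exp_u.sum()


# Hypothetical values: 3 makes, 2 attributes (e.g., price and dealer access).
alpha = [0.0, 0.5, -0.2]
beta = [-1.0, 0.3]
X = [[1.0, 2.0],
     [2.0, 1.0],
     [0.5, 0.0]]

u = utilities(alpha, beta, X)
probs = logit_choice_probabilities(u)
print(np.round(u, 3), np.round(probs, 3))
```

Because only utility differences matter, adding the same constant to every make's utility leaves the probabilities unchanged, which is why one intercept is usually normalized in estimation.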

Using Hierarchical Bayes

Probability that buyer \(h\) in zip code \(z\) picks make \(i\) at time \(t\):

\[ \text{Prob}_t^h(i | \mathbf{\beta}^{z}, \mathbf{X}_{it}^h) = \frac{\exp(\mathbf{\beta}^{z}\mathbf{X}^h_{it})}{\sum_j\exp(\mathbf{\beta}^{z}\mathbf{X}^h_{jt})} \]

where

\(\mathbf{\beta}^{z}\) = a zip-code-specific parameter vector, with \(\mathbf{\beta}^{z} \sim MVN (\mathbf{\mu}, \mathbf{\Sigma})\)
- \(\mathbf{\mu} \sim MVN (\mathbf{\eta}, \mathbf{C})\)
- \(\mathbf{\Sigma}^{-1} \sim \text{Wishart}[(\rho R)^{-1}, \rho]\)
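
To show how the three levels of this hierarchy fit together, here is a minimal sketch that draws zip-code coefficient vectors from the prior: first \(\mathbf{\mu}\), then \(\mathbf{\Sigma}\), then one coefficient vector per zip code. It is not the authors' sampler and does no estimation. The function name, the hyperparameter values, and the restriction to an integer \(\rho\) at least as large as the number of coefficients (so the Wishart draw can be built as a sum of outer products of normal vectors) are assumptions made for illustration.

```python
import numpy as np


def draw_zip_code_coefficients(n_zip, eta, C, R, rho, seed=0):
    """Draw (mu, Sigma, beta) from the hierarchical prior.

    mu          ~ MVN(eta, C)
    Sigma^{-1}  ~ Wishart((rho * R)^{-1}, rho)
    beta^z      ~ MVN(mu, Sigma), one draw per zip code

    Assumes rho is an integer >= len(eta).
    """
    rng = np.random.default_rng(seed)
    eta = np.asarray(eta, dtype=float)
    dim = eta.size
    if int(rho) != rho or rho < dim:
        raise ValueError("this sketch needs an integer rho >= number of coefficients")

    mu = rng.multivariate_normal(eta, C)

    # Wishart(scale, rho) draw as G'G, where the rho rows of G are N(0, scale).
    scale = np.linalg.inv(rho * np.asarray(R, dtype=float))
    G = rng.multivariate_normal(np.zeros(dim), scale, size=int(rho))
    precision = G.T @ G
    Sigma = np.linalg.inv(precision)
    Sigma = (Sigma + Sigma.T) / 2.0  # remove tiny numerical asymmetry

    beta = rng.multivariate_normal(mu, Sigma, size=n_zip)
    return mu, Sigma, beta


# Hypothetical hyperparameters for a two-coefficient model.
mu, Sigma, beta = draw_zip_code_coefficients(
    n_zip=50, eta=[0.0, 0.0], C=np.eye(2), R=np.eye(2), rho=5
)
print(mu.shape, Sigma.shape, beta.shape)
```

In a hierarchical Bayes estimation these priors are combined with the logit likelihood, and the posterior of each \(\mathbf{\beta}^{z}\) is typically simulated by MCMC.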

(Ngwe 2017)

Structural model:
- Demand: sensitivity to travel distance and taste for new product
- Supply: responses to changes in store locations.
Outlets focus on lower-value consumers with lower desire for newness (correlation between travel sensitivity and taste for new products).
Outlets help regular stores introduce more new products (possibly improving quality).

35.3.2.2 (Donnelly et al. 2021)

Model for estimating single product choice from alternatives:
- Heterogeneity in Individual preferences for product attributes and price sensitivity (across products).
- Account for time-varying product attributes, and out-of-stock.
Improvements over traditional models come from:
- estimating heterogeneity in individual preferences.
- estimating the preferences of customers who purchase infrequently.

35.3.2.3 (Gabel and Timoshenko 2021)

Deep network model accounts for
- cross-product relationships,
- time-series filters to capture purchase dynamics for products with varying inter-purchase times

References

Ainslie, Andrew, and Peter E. Rossi. 1998. “Similarities in Choice Behavior Across Product Categories.” Marketing Science 17 (2): 91–106. https://doi.org/10.1287/mksc.17.2.91.

Albert, James H., and Siddhartha Chib. 1993. “Bayesian Analysis of Binary and Polychotomous Response Data.” Journal of the American Statistical Association 88 (422): 669–79. https://doi.org/10.1080/01621459.1993.10476321.

Bell, David R., Teck-Hua Ho, and Christopher S. Tang. 1998. “Determining Where to Shop: Fixed and Variable Costs of Shopping.” Journal of Marketing Research 35 (3): 352. https://doi.org/10.2307/3152033.

Bell, David R., and James M. Lattin. 1998. “Shopping Behavior and Consumer Preference for Store Price Format: Why ‘Large Basket’ Shoppers Prefer EDLP.” Marketing Science 17 (1): 66–88. https://doi.org/10.1287/mksc.17.1.66.

Bucklin, Randolph E., S. Siddarth, and Jorge M. Silva-Risso. 2008. “Distribution Intensity and New Car Choice.” Journal of Marketing Research 45 (4): 473–86. https://doi.org/10.1509/jmkr.45.4.473.

Chib, Siddhartha, P. B. Seetharaman, and Andrei Strijnev. 2002. “Analysis of Multi-Category Purchase Incidence Decisions Using IRI Market Basket Data.” In, 57–92. Emerald (MCB UP). https://doi.org/10.1016/s0731-9053(02)16004-x.

Chintagunta, Pradeep K., and Sudeep Haldar. 1998. “Investigating Purchase Timing Behavior in Two Related Product Categories.” Journal of Marketing Research 35 (1): 43. https://doi.org/10.2307/3151929.

Chung, Jaihak, and Vithala R. Rao. 2003. “A General Choice Model for Bundles with Multiple-Category Products: Application to Market Segmentation and Optimal Pricing for Bundles.” Journal of Marketing Research 40 (2): 115–30. https://doi.org/10.1509/jmkr.40.2.115.19230.

Donnelly, Robert, Francisco J. R. Ruiz, David Blei, and Susan Athey. 2021. “Counterfactual Inference for Consumer Choice Across Many Product Categories.” Quantitative Marketing and Economics 19 (3-4): 369–407. https://doi.org/10.1007/s11129-021-09241-2.

Erdem, Tülin. 1998. “An Empirical Analysis of Umbrella Branding.” Journal of Marketing Research 35 (3): 339. https://doi.org/10.2307/3152032.

Erdem, Tülin, and Russell S. Winer. 1998. “Econometric Modeling of Competition: A Multi-Category Choice-Based Mapping Approach.” Journal of Econometrics 89 (1-2): 159–75. https://doi.org/10.1016/s0304-4076(98)00059-1.

Gabel, Sebastian, and Artem Timoshenko. 2021. “Product Choice with Large Assortments: A Scalable Deep-Learning Model.” Management Science, April. https://doi.org/10.1287/mnsc.2021.3969.

Hansen, Karsten, Vishal Singh, and Pradeep Chintagunta. 2006. “Understanding Store-Brand Purchase Behavior Across Categories.” Marketing Science 25 (1): 75–90. https://doi.org/10.1287/mksc.1050.0151.

Iyengar, Raghuram, Asim Ansari, and Sunil Gupta. 2003. “Leveraging Information Across Categories.” Quantitative Marketing and Economics 1 (4): 425–65. https://doi.org/10.1023/b:qmec.0000004845.25649.6c.

Jedidi, Kamel, Sharan Jagpal, and Puneet Manchanda. 2003. “Measuring Heterogeneous Reservation Prices for Product Bundles.” Marketing Science 22 (1): 107–30. https://doi.org/10.1287/mksc.22.1.107.12850.

Ma, Yu, P. B. Seetharaman, and Chakravarthi Narasimhan. 2012. “Modeling Dependencies in Brand Choice Outcomes Across Complementary Categories.” Journal of Retailing 88 (1): 47–62. https://doi.org/10.1016/j.jretai.2011.04.003.

Manchanda, Puneet, Asim Ansari, and Sunil Gupta. 1999. “The ‘Shopping Basket’: A Model for Multicategory Purchase Incidence Decisions.” Marketing Science 18 (2): 95–114. https://doi.org/10.1287/mksc.18.2.95.

Mehta, Nitin. 2007. “Investigating Consumers’ Purchase Incidence and Brand Choice Decisions Across Multiple Product Categories: A Theoretical and Empirical Analysis.” Marketing Science 26 (2): 196–217. https://doi.org/10.1287/mksc.1060.0214.

Ngwe, Donald. 2017. “Why Outlet Stores Exist: Averting Cannibalization in Product Line Extensions.” Marketing Science 36 (4): 523–41. https://doi.org/10.1287/mksc.2017.1031.

Niraj, Rakesh, V. Padmanabhan, and P. B. Seetharaman. 2008. “Research Note: A Cross-Category Model of Households’ Incidence and Quantity Decisions.” Marketing Science 27 (2): 225–35. https://doi.org/10.1287/mksc.1070.0299.

Russell, Gary J, and Wagner A Kamakura. 1997. “Modeling Multiple Category Brand Preference with Household Basket Data.” Journal of Retailing 73 (4): 439–61. https://doi.org/10.1016/s0022-4359(97)90029-4.

Seetharaman, P. B., Andrew Ainslie, and Pradeep K. Chintagunta. 1999. “Investigating Household State Dependence Effects Across Categories.” Journal of Marketing Research 36 (4): 488. https://doi.org/10.2307/3152002.

Seetharaman, P. B., Siddhartha Chib, Andrew Ainslie, Peter Boatwright, Tat Chan, Sachin Gupta, Nitin Mehta, Vithala Rao, and Andrei Strijnev. 2005. “Models of Multi-Category Choice Behavior.” Marketing Letters 16 (3-4): 239–54. https://doi.org/10.1007/s11002-005-5888-y.

Singh, Vishal P., Karsten T. Hansen, and Sachin Gupta. 2005. “Modeling Preferences for Common Attributes in Multicategory Brand Choice.” Journal of Marketing Research 42 (2): 195–209. https://doi.org/10.1509/jmkr.42.2.195.62282.

Song, Inseong, and Pradeep K. Chintagunta. 2007. “A Discrete-Continuous Model for Multicategory Purchase Behavior of Households.” Journal of Marketing Research 44 (4): 595–612. https://doi.org/10.1509/jmkr.44.4.595.