Chapter 2 Introduction

This module introduces market basket analysis.

The name of this technique, market basket analysis, probably derives from its frequent application to analyzing the buying behavior of consumers in supermarkets. Upon check-out, the items in the cart or basket are recorded. Especially when linked to customer or membership cards, the data provides valuable insight into buying patterns.

The same holds true for e-commerce organizations, like Amazon. Compared to traditional bookstores, Amazon has much better insight into buying patterns. From purchasing patterns, Amazon is able to detect which titles are frequently ordered by the same buyers. This knowledge enables Amazon to cross-sell items effectively, by recommending titles to buyers under headings like "customers who bought this item also bought ...".
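To make the idea of "titles frequently ordered by the same buyers" concrete, the following is a minimal sketch that counts how often each pair of titles appears in the same order. The order data and the function name `count_pairs` are hypothetical, chosen only for illustration; this is not a description of how Amazon actually computes its recommendations.

```python
from collections import Counter
from itertools import combinations

# Hypothetical orders: each set holds the titles in one customer's order.
orders = [
    {"Dune", "Foundation", "Neuromancer"},
    {"Dune", "Foundation"},
    {"Dune", "Emma"},
    {"Emma", "Persuasion"},
]


def count_pairs(orders):
    """Count how many orders contain each unordered pair of titles."""
    counts = Counter()
    for order in orders:
        # Sorting gives every pair one canonical (alphabetical) form.
        for pair in combinations(sorted(order), 2):
            counts[pair] += 1
    return counts


pair_counts = count_pairs(orders)

# List the pairs, most frequently co-ordered first.
for pair, n in pair_counts.most_common():
    print(pair, n)
```

Pairs with high counts are natural candidates for an "also bought" recommendation. With realistic data volumes, counting every pair this way becomes expensive, which is one reason dedicated algorithms are used in practice.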

However, as we will show, the technique can be applied in all cases where we are looking for patterns, or relationships between characteristics of objects or persons. Some examples:

  1. Purchases in supermarkets
  2. Behavioral traits of criminals
  3. DNA patterns in diseases
  4. Or the Titanic data ...

The popularity of the technique has everything to do with the vast amount of data that large organizations (supermarkets; online sellers, such as Amazon) have about people (customers; members, etc.) and their transactions.

Traditionally, marketing specialists rely on intuitive knowledge of buying behavior, sometimes supported by information from consumer surveys. Typical questions are: what are the expected sales if the price of a product is reduced by 5%? Or, for supermarkets, what is the optimal store layout (order of products and shelves; placement on the shelf)?

Of course, it is possible to use experiments, but frequent changes in store layout, for example, will easily lead to customer confusion and annoyance. With the emergence of barcodes and scanning systems that are linked to customer cards, companies can better map who buys what, when and in what quantities.

Online stores such as Amazon have an enormous competitive advantage over traditional stores. To traditional stores, most customers are just anonymous buyers. But online stores know the profiles of their customers and their buying habits. Amazon knows the genres of books its customers enjoy, and can recommend books by other authors.

**Unsupervised Learning**

Like cluster analysis via K-means clustering, market basket analysis is an example of unsupervised learning. That is, there is no labeled outcome to predict, so it is not necessary to first train the algorithm and then check whether it provides good predictions on a test set of data not used in the training.

What we are looking for is a set of rules that relate characteristics to one another. We can then use these rules for all types of decisions. In the setting of the supermarkets, the characteristics are the groceries (items) in the shopping basket of the customer.
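As a rough illustration of what such a rule looks like, the sketch below evaluates a single rule of the form "if a basket contains X, it also contains Y" on a handful of hypothetical baskets. The basket contents and the function name `rule_strength` are assumptions made for this example. The number it returns is the share of baskets containing X that also contain Y, a measure commonly called the confidence of the rule.

```python
# Hypothetical shopping baskets: each set holds the items of one customer.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
]


def rule_strength(baskets, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    # `<=` on sets tests whether the left set is a subset of the right one.
    with_antecedent = [b for b in baskets if antecedent <= b]
    if not with_antecedent:
        return 0.0  # the rule never applies in this data
    with_both = [b for b in with_antecedent if consequent <= b]
    return len(with_both) / len(with_antecedent)


# Rule: "if bread, then butter".
strength = rule_strength(baskets, {"bread"}, {"butter"})
print(f"bread -> butter holds in {strength:.0%} of the baskets with bread")
```

Here three baskets contain bread and two of those also contain butter, so the rule holds in two thirds of the cases where it applies. A full analysis would evaluate many candidate rules and keep only those that apply often enough and hold reliably enough to act on.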

If you want to read more about market basket analysis, Lantz (2013) and Dietrich, Heller, and Yang (2015) are worth reading.


Dietrich, David, Barry Heller, and Beibei Yang. 2015. Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data. Indianapolis: Wiley.

Lantz, Brett. 2013. Machine Learning with R. 2nd ed. Birmingham, UK: Packt Publishing.