22.3 RFM

RFM stands for recency, frequency, and monetary value. Each customer receives three scores:

  • A recency score is assigned to each customer based on the date of their most recent purchase.
  • A frequency score is assigned based on how often the customer purchases.
  • A monetary score is assigned based on the total revenue generated by the customer over the period under analysis.
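The three scores can be sketched in base R as follows. This is a minimal illustration on made-up data, assuming equal-sized rank bins and a combined score that concatenates the three digits; the helper `score_bins` and the `customers` data frame are hypothetical, and the rfm package may bin values differently.

```r
# Toy customer-level data (hypothetical values, for illustration only)
customers <- data.frame(
  customer_id      = 1:10,
  recency_days     = c(5, 12, 30, 45, 60, 90, 120, 200, 300, 400),
  number_of_orders = c(20, 18, 15, 12, 10, 8, 6, 4, 2, 1),
  revenue          = c(2000, 1500, 1200, 900, 800, 600, 400, 300, 150, 50)
)

# Hypothetical helper: rank the values and split the ranks into `bins`
# equal-sized groups, so that larger values receive higher scores.
score_bins <- function(x, bins = 5) {
  as.integer(ceiling(rank(x, ties.method = "first") * bins / length(x)))
}

# Fewer days since the last purchase is better, so recency is ranked on -days.
customers$recency_score   <- score_bins(-customers$recency_days)
customers$frequency_score <- score_bins(customers$number_of_orders)
customers$monetary_score  <- score_bins(customers$revenue)

# Assumed convention: combine the three scores into one three-digit number.
customers$rfm_score <- customers$recency_score * 100 +
  customers$frequency_score * 10 +
  customers$monetary_score

customers
```

With this toy data, the most recent, most frequent, and highest-spending customers land in bin 5 on every dimension, and the least active customers land in bin 1.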
library("rfm")
rfm_data_customer
## # A tibble: 39,999 × 5
##    customer_id revenue most_recent_visit number_of_orders recency_days
##          <dbl>   <dbl> <date>                       <dbl>        <dbl>
##  1       22086     777 2006-05-14                       9          232
##  2        2290    1555 2006-09-08                      16          115
##  3       26377     336 2006-11-19                       5           43
##  4       24650    1189 2006-10-29                      12           64
##  5       12883    1229 2006-12-09                      12           23
##  6        2119     929 2006-10-21                      11           72
##  7       31283    1569 2006-09-11                      17          112
##  8       33815     778 2006-08-12                      11          142
##  9       15972     641 2006-11-19                       9           43
## 10       27650     970 2006-08-23                      10          131
## # ℹ 39,989 more rows
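# customer_id: unique customer id
# revenue: total revenue from the customer
# most_recent_visit: date of the most recent visit
# number_of_orders: number of transactions/orders
# recency_days: number of days since the last visit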


rfm_data_orders # order-level data; to score it, use rfm_table_order()
## # A tibble: 4,906 × 3
##    customer_id         order_date revenue
##    <chr>               <date>       <dbl>
##  1 Mr. Brion Stark Sr. 2004-12-20      32
##  2 Ethyl Botsford      2005-05-02      36
##  3 Hosteen Jacobi      2004-03-06     116
##  4 Mr. Edw Frami       2006-03-15      99
##  5 Josef Lemke         2006-08-14      76
##  6 Julisa Halvorson    2005-05-28      56
##  7 Judyth Lueilwitz    2005-03-09     108
##  8 Mr. Mekhi Goyette   2005-09-23     183
##  9 Hansford Moen PhD   2005-09-07      30
## 10 Fount Flatley       2006-04-12      13
## # ℹ 4,896 more rows
# columns: unique customer id, date of transaction, and transaction amount

# arguments of rfm_table_order():
# customer_id: name of the customer id column
# order_date: name of the transaction date column
# revenue: name of the transaction amount column
# analysis_date: date of analysis
# recency_bins: number of rankings for recency score (default is 5)
# frequency_bins: number of rankings for frequency score (default is 5)
# monetary_bins: number of rankings for monetary score (default is 5)
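# rfm_data_customer is already aggregated per customer, so score it with
# rfm_table_customer(), passing the id, order count, recency, and revenue columns
analysis_date <- lubridate::as_date('2007-01-01')
rfm_result <-
    rfm_table_customer(
        rfm_data_customer,
        customer_id,
        number_of_orders,
        recency_days,
        revenue,
        analysis_date
    )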
rfm_result
## # A tibble: 39,999 × 8
##    customer_id recency_days transaction_count amount recency_score
##          <dbl>        <dbl>             <dbl>  <dbl>         <int>
##  1       22086          232                 9    777             2
##  2        2290          115                16   1555             4
##  3       26377           43                 5    336             5
##  4       24650           64                12   1189             5
##  5       12883           23                12   1229             5
##  6        2119           72                11    929             5
##  7       31283          112                17   1569             4
##  8       33815          142                11    778             3
##  9       15972           43                 9    641             5
## 10       27650          131                10    970             3
## # ℹ 39,989 more rows
## # ℹ 3 more variables: frequency_score <int>, monetary_score <int>,
## #   rfm_score <dbl>
# customer_id: unique customer id
# date_most_recent: date of most recent visit (not among the 8 columns printed above)
# recency_days: days since the most recent visit
# transaction_count: number of transactions of the customer
# amount: total revenue generated by the customer
# recency_score: recency score of the customer
# frequency_score: frequency score of the customer
# monetary_score: monetary score of the customer
# rfm_score: RFM score of the customer
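The customer-level columns used above (revenue, most_recent_visit, number_of_orders, recency_days) can be derived from order-level data. The following is a minimal base R sketch on made-up orders. The data frame `orders` and its values are hypothetical, and the code only illustrates the aggregation, not how the rfm package performs it internally.

```r
# Toy order-level data (hypothetical customers and amounts)
orders <- data.frame(
  customer_id = c("A", "A", "B", "C", "C", "C"),
  order_date  = as.Date(c("2006-01-10", "2006-11-01", "2006-06-15",
                          "2006-03-01", "2006-09-20", "2006-12-20")),
  revenue     = c(50, 70, 120, 30, 40, 60)
)
analysis_date <- as.Date("2007-01-01")

# Split the orders by customer and summarise each group into one row.
customer_level <- do.call(rbind, lapply(
  split(orders, orders$customer_id),
  function(d) {
    data.frame(
      customer_id       = as.character(d$customer_id[1]),
      revenue           = sum(d$revenue),           # total revenue
      most_recent_visit = max(d$order_date),        # last order date
      number_of_orders  = nrow(d),                  # order count
      recency_days      = as.numeric(analysis_date - max(d$order_date))
    )
  }
))
rownames(customer_level) <- NULL

customer_level
```

Recency here is the number of days between the analysis date and the customer's last order, which matches the relationship between most_recent_visit and recency_days in rfm_data_customer for an analysis date of 2007-01-01.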

22.3.1 Visualization

The heat map shows the average monetary value for each combination of recency and frequency scores.

rfm_heatmap(rfm_result)

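The bar chart shows the distribution of monetary scores for each combination of recency and frequency scores.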

rfm_bar_chart(rfm_result)

The histograms show the distributions of recency, frequency, and monetary value.

rfm_histograms(rfm_result)

The customers-by-orders chart shows how many customers placed each number of orders.

rfm_order_dist(rfm_result)

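The scatter plots show the pairwise relationships among the three measures: recency vs. monetary value (rfm_rm_plot), frequency vs. monetary value (rfm_fm_plot), and recency vs. frequency (rfm_rf_plot).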

rfm_rm_plot(rfm_result)

rfm_fm_plot(rfm_result)

rfm_rf_plot(rfm_result)

22.3.2 RFMC

RFMC adds a fourth dimension to RFM: clumpiness (C), defined as the degree of nonconformity to equal spacing (Yao Zhang, Bradlow, and Small 2015).

In finance, clumpiness can indicate high growth potential but also large risk; hence, it can be incorporated into firm acquisition decisions. The concept originates from a sports phenomenon, the hot hand effect, where success leads to more success.

In statistics, clumpiness is the serial dependence or “non-constant propensity, specifically temporary elevations of propensity— i.e. periods during which one event is more likely to occur than the average level.” (Yao Zhang, Bradlow, and Small 2013)

Properties of clumpiness:

  • Minimum (maximum): the measure is at its minimum when events are equally spaced and at its maximum when events are close to one another
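  • Continuity: shifting event times by a small amount changes the measure only slightly
  • Convergence: as events move closer together (farther apart), the measure approaches its maximum (minimum)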
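To make the equal-spacing property concrete, here is a minimal sketch of an entropy-style clumpiness measure. The formula H = 1 + sum(x_i * log(x_i)) / log(n + 1), where the x_i are the gaps between consecutive events (including the gaps to the start and end of the observation window) scaled to sum to 1, is stated here as an assumption modeled on the measures in the cited papers, not quoted from them. The function name `clumpiness_entropy` and the event times are hypothetical.

```r
# Hypothetical helper: entropy-style clumpiness of n events observed at
# distinct integer times in 1..N. Assumed formula (see lead-in):
#   H = 1 + sum(x * log(x)) / log(n + 1)
# where x holds the n + 1 scaled gaps between events and the window edges.
clumpiness_entropy <- function(event_times, N) {
  t <- sort(event_times)
  n <- length(t)
  gaps <- diff(c(0, t, N + 1))   # n + 1 gaps, all >= 1 for distinct times in 1..N
  x <- gaps / (N + 1)            # scale so the gaps sum to 1
  1 + sum(x * log(x)) / log(n + 1)
}

# Three events in an 11-period window (made-up times)
h_equal   <- clumpiness_entropy(c(3, 6, 9), N = 11)  # equally spaced
h_clumped <- clumpiness_entropy(c(1, 2, 3), N = 11)  # bunched together

h_equal
h_clumped
```

Under this assumed formula, equally spaced events give a value of 0 (the minimum), and bunching the same number of events together raises the value.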

References

Zhang, Yao, Eric T. Bradlow, and Dylan S. Small. 2013. “New Measures of Clumpiness for Incidence Data.” Journal of Applied Statistics 40 (11): 2533–48. https://doi.org/10.1080/02664763.2013.818627.
———. 2015. “Predicting Customer Value Using Clumpiness: From RFM to RFMC.” Marketing Science 34 (2): 195–208. https://doi.org/10.1287/mksc.2014.0873.