6/16/2019
The email A/B test we will analyze was conducted by an online wine store.
Source: Total Wine & More
Test setting: email to retailer email list
Unit: email address
Treatments: email version A, email version B, holdout
Response: open, click, and 1-month purchase ($)
Selection: all active customers
Assignment: randomly assigned (1/3 each)
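# stringsAsFactors = TRUE keeps group as a factor (the read.csv default before R 4.0)
d <- read.csv("test_data.csv", stringsAsFactors = TRUE)
head(d)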
##   user_id   cpgn_id   group email open click  purch  chard sav_blanc syrah
## 1 1000001 1901Email    ctrl FALSE    0     0   0.00   0.00      0.00 33.94
## 2 1000002 1901Email email_B  TRUE    1     0   0.00   0.00      0.00 16.23
## 3 1000003 1901Email email_A  TRUE    1     1 200.51 516.39      0.00 16.63
## 4 1000004 1901Email email_A  TRUE    1     0   0.00   0.00      0.00  0.00
## 5 1000005 1901Email email_A  TRUE    1     1 158.30 426.53   1222.48  0.00
## 6 1000006 1901Email email_B  TRUE    1     0   0.00   0.00      0.00  0.00
##     cab past_purch days_since visits
## 1  0.00      33.94        119     11
## 2 76.31      92.54         60      3
## 3  0.00     533.02          9      9
## 4 41.21      41.21        195      6
## 5  0.00    1649.01         48      9
## 6  0.00       0.00        149      6
Everything measured after the randomization that could possibly be affected by the treatment is an outcome.
summary(d$group)
##    ctrl email_A email_B
##   41330   41329   41329
This is a completely randomized experiment.
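A completely randomized design with equal-sized arms can be sketched in a few lines of R. This is only an illustration, not the retailer's actual assignment procedure: the helper `assign_groups` and the seed are hypothetical, and only the group labels and the total number of rows are taken from the output above.

```r
# Hypothetical helper: completely randomized assignment into (near-)equal arms.
assign_groups <- function(n, arms = c("ctrl", "email_A", "email_B")) {
  # Build a balanced vector of labels (arm sizes differ by at most one),
  # then shuffle it so every unit is equally likely to land in any arm.
  labels <- rep(arms, length.out = n)
  factor(sample(labels), levels = arms)
}

set.seed(1)                      # arbitrary seed, for reproducibility only
group <- assign_groups(123988)   # 123988 = 41330 + 41329 + 41329 rows in d
table(group)
```

Because the labels are balanced before shuffling, the arm sizes are fixed in advance (here 41330, 41329 and 41329) rather than varying from one random draw to the next.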
summary(d[,c("open", "click", "purch")])
##       open            click             purch
##  Min.   :0.0000   Min.   :0.00000   Min.   :   0.00
##  1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:   0.00
##  Median :0.0000   Median :0.00000   Median :   0.00
##  Mean   :0.4567   Mean   :0.07503   Mean   :  21.30
##  3rd Qu.:1.0000   3rd Qu.:0.00000   3rd Qu.:  21.86
##  Max.   :1.0000   Max.   :1.00000   Max.   :1607.40
summary(d[,c("days_since", "visits", "past_purch")])
##    days_since         visits         past_purch
##  Min.   :  0.00   Min.   : 0.000   Min.   :   0.00
##  1st Qu.: 26.00   1st Qu.: 4.000   1st Qu.:   0.00
##  Median : 63.00   Median : 6.000   Median :  91.22
##  Mean   : 89.98   Mean   : 5.946   Mean   : 188.79
##  3rd Qu.:125.00   3rd Qu.: 7.000   3rd Qu.: 246.87
##  Max.   :992.00   Max.   :51.000   Max.   :9636.92
summary(d[, c("chard", "sav_blanc", "syrah", "cab")])
##      chard           sav_blanc           syrah              cab
##  Min.   :   0.00   Min.   :   0.00   Min.   :   0.00   Min.   :   0.00
##  1st Qu.:   0.00   1st Qu.:   0.00   1st Qu.:   0.00   1st Qu.:   0.00
##  Median :   0.00   Median :   0.00   Median :   0.00   Median :   0.00
##  Mean   :  73.31   Mean   :  72.45   Mean   :  26.68   Mean   :  16.35
##  3rd Qu.:  54.06   3rd Qu.:  57.42   3rd Qu.:  20.91   3rd Qu.:  12.96
##  Max.   :9636.92   Max.   :6609.92   Max.   :2880.15   Max.   :2365.90
Whoa! That’s a lot of chardonnay for one customer!
What is the first question you should ask about an A/B test?
Not "Did the treatment affect the response?"
First ask: was the randomization done correctly?
How could we check the randomization with the data?
Randomization checks confirm that the baseline variables are distributed similarly for the treatment and control groups.
Averages of baseline variables by treatment group
library(dplyr)  # provides %>%, group_by() and summarize()
d %>% group_by(group) %>% summarize(mean(days_since), mean(visits), mean(past_purch))
## # A tibble: 3 x 4
##   group   `mean(days_since)` `mean(visits)` `mean(past_purch)`
##   <fct>                <dbl>          <dbl>              <dbl>
## 1 ctrl                  90.0           5.95               188.
## 2 email_A               90.2           5.95               188.
## 3 email_B               89.8           5.94               190.
The means of the baseline variables are similar across the three groups.
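One way to put a number on "similar" is to test whether the group means are equal. The sketch below is an assumption-laden illustration rather than the analysis from the original test: it uses simulated stand-in data (the data frame `sim` and its distributions are made up) and a one-way ANOVA F-test via `oneway.test()`.

```r
# Simulated stand-in for two baseline columns of d (NOT the real data).
set.seed(42)
n <- 3000
sim <- data.frame(
  group      = factor(sample(rep(c("ctrl", "email_A", "email_B"), length.out = n))),
  days_since = rexp(n, rate = 1 / 90),  # assumed right-skewed recency
  visits     = rpois(n, lambda = 6)     # assumed count of site visits
)

# Group means: base-R equivalent of a group_by() / summarize() call.
means <- aggregate(cbind(days_since, visits) ~ group, data = sim, FUN = mean)
print(means)

# One-way ANOVA F-test of "all group means are equal" for each baseline variable.
p_values <- c(
  days_since = oneway.test(days_since ~ group, data = sim, var.equal = TRUE)$p.value,
  visits     = oneway.test(visits ~ group, data = sim, var.equal = TRUE)$p.value
)
print(p_values)
```

When the randomization is sound, these p-values behave like uniform random draws, so an occasional small value across many baseline variables is expected by chance. A pattern of very small p-values would be a reason to look more closely at how units were assigned.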
Past purchase incidence by group is also similar.
d %>% group_by(group) %>% summarize(mean(past_purch > 0))
## # A tibble: 3 x 2
##   group   `mean(past_purch > 0)`
##   <fct>                    <dbl>
## 1 ctrl                     0.744
## 2 email_A                  0.741
## 3 email_B                  0.741
About 3/4 of the email list has purchased in the past, and this share is similar across the randomized treatment groups.
The full distributions of baseline variables should also be the same between treatment groups.
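Comparing whole distributions, rather than only means, can be sketched as follows. This is a hedged illustration on simulated stand-in data (`sim` and its exponential recency distribution are assumptions). Quantiles by group and a two-sample Kolmogorov-Smirnov test are one possible choice of check, not necessarily the one used in the original analysis.

```r
# Simulated stand-in for one baseline column of d (NOT the real data).
set.seed(7)
n <- 3000
sim <- data.frame(
  group      = factor(sample(rep(c("ctrl", "email_A", "email_B"), length.out = n))),
  days_since = rexp(n, rate = 1 / 90)   # assumed right-skewed recency
)

# Selected quantiles of days_since within each group (rows: quantiles, cols: groups).
probs <- c(0.10, 0.25, 0.50, 0.75, 0.90)
q_by_group <- sapply(split(sim$days_since, sim$group), quantile, probs = probs)
print(round(q_by_group, 1))

# Two-sample Kolmogorov-Smirnov test comparing the ctrl and email_A distributions.
ks <- ks.test(sim$days_since[sim$group == "ctrl"],
              sim$days_since[sim$group == "email_A"])
print(ks$p.value)
```

If the groups were formed by proper randomization, the quantile columns should line up closely and the test should show no systematic evidence of a difference. Histograms or density plots by group give the same comparison visually.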