sklift.datasets.fetch_hillstrom
Kevin Hillstrom Dataset: MineThatData
Data description
This is a copy of MineThatData E-Mail Analytics And Data Mining Challenge dataset.
This dataset contains 64,000 customers who last purchased within twelve months. The customers were involved in an e-mail test.
1/3 were randomly chosen to receive an e-mail campaign featuring Mens merchandise.
1/3 were randomly chosen to receive an e-mail campaign featuring Womens merchandise.
1/3 were randomly chosen to not receive an e-mail campaign.
During a period of two weeks following the e-mail campaign, results were tracked. Your job is to tell the world if the Mens or Womens e-mail campaign was successful.
Fields
Historical customer attributes at your disposal include:
Recency: Months since last purchase.
History_Segment: Categorization of dollars spent in the past year.
History: Actual dollar value spent in the past year.
Mens: 1/0 indicator, 1 = customer purchased Mens merchandise in the past year.
Womens: 1/0 indicator, 1 = customer purchased Womens merchandise in the past year.
Zip_Code: Classifies zip code as Urban, Suburban, or Rural.
Newbie: 1/0 indicator, 1 = New customer in the past twelve months.
Channel: Describes the channels the customer purchased from in the past year.
Another variable describes the e-mail campaign the customer received:
Segment
Mens E-Mail
Womens E-Mail
No E-Mail
Finally, we have a series of variables describing activity in the two weeks following delivery of the e-mail campaign:
Visit: 1/0 indicator, 1 = Customer visited website in the following two weeks.
Conversion: 1/0 indicator, 1 = Customer purchased merchandise in the following two weeks.
Spend: Actual dollars spent in the following two weeks.
Key figures
Format: CSV
Size: 433KB (compressed) 4,935KB (uncompressed)
Rows: 64,000
Response Ratio:
Average visit Rate: .15,
Average conversion Rate: .009,
the values in the spend column are unevenly distributed from 0.0 to 499.0
Treatment Ratio: The parts are distributed evenly between the three classes
About Hillstrom
The dataset was provided by Kevin Hillstorm. Kevin is President of MineThatData, a consultancy that helps CEOs understand the complex relationship between Customers, Advertising, Products, Brands, and Channels.
Link to the blog: https://blog.minethatdata.com/