CON6545_Jagannath

Market Basket & Advanced Analytics
at Dunkin Brands
Mahesh Jagannath, Prasanna Palanisamy
Oct 1, 2014
Confidential information: Copying, dissemination or distribution of this information is strictly prohibited.
Agenda
• About Dunkin Brands Inc.
• BI Program at Dunkin Brands
• BI Architecture at Dunkin Brands
• Advanced Analytics Architecture & Methodology
• Advanced Analytics Use Cases at Dunkin
• Market Basket
• Customer Analytics
• Q&A
Disclaimer
All data used is sample data for presentation purposes only and is not actual corporate sales or consumer data
About Dunkin Brands
BI Program At Dunkin Brands
• First launched at DBI in 2007
• 1350 BI users today with role-based access to 504 dashboard pages
• Mature governance process
• Domestic POS sales analysis to increase comparable store sales and profitability of DD and BR in U.S.
• Store development dashboards to identify opportunities to continue DD U.S. contiguous store expansion
• International reported sales analysis to drive accelerated international growth across both brands.
BI/DW Architecture at Dunkin Brands
[Architecture diagram. Labels, in extraction order: Other DBI Data; Hyperion Users; Exadata; Exalytics; Enterprise Data Warehouse; Oracle EBS; Hyperion; R; Radiant; Sales Data; Oracle BI; DBI Corporate Users; Intl. POS; Franchisees (above store); Social Media; Loyalty / CRM; Steton; SMG; PAR; RPS Bluecube; PAR Terminals; RPS Archive]
Advanced Analytics platform
• Products Considered
  • Oracle Advanced Analytics / Oracle R Enterprise (ORE)
  • Open Source R
  • IBM SPSS
• Chose Oracle Advanced Analytics
  • Excellent fit with existing analytics infrastructure
  • All the benefits of Open source R
  • Scalability of Oracle 11G on engineered systems
R—Widely Popular
R is a statistics language similar to Base SAS or SPSS statistics
R environment
Strengths
• Powerful & Extensible
• Graphical & Extensive statistics
• Free, open source
Challenges
• Memory constrained
• Single threaded
• Outer loop: slows down process
• Not industrial strength
Oracle Advanced Analytics
Oracle R Enterprise Component Architecture
Oracle Advanced Analytics
Oracle R Enterprise Compute Engines
Advanced Analytics Methodology
[Cycle diagram]
1. Identify Business Objective
2. Understand Data
3. Prepare data
4. Develop model
5. Test Model
6. Deploy Model
7. Monitor Performance & re-calibrate
ORE Advanced Analytics Framework
Market Basket Analysis
Methodology stages: Identify Business Objective, Understand Data, Prepare data, Develop model, Test Model, Deploy Model, Monitor Performance & re-calibrate
• Understand role of category and purchase behavior
• Identify category marketing opportunities
• Get richer insight into behavioral changes from promotions
• Apply data validation rules
• Transform POS data into MB input format
• Output to Star schema suitable for OBIEE consumption
• Pairwise association model similar to Apriori, custom SQL implementation
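The pairwise association idea can be sketched in a few lines. This is an illustration only, not the custom SQL implementation described on the slide: the sample baskets, the function name `pairwise_association`, and the choice of support, confidence, and lift as output measures are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample baskets: transaction id -> set of items purchased.
transactions = {
    1: {"coffee", "donut"},
    2: {"coffee", "bagel"},
    3: {"coffee", "donut", "bagel"},
    4: {"donut"},
    5: {"coffee"},
}

def pairwise_association(transactions):
    """Count item pairs per transaction and derive pair-level measures."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for items in transactions.values():
        item_counts.update(items)
        # Sorting gives each pair one canonical (a, b) ordering.
        pair_counts.update(combinations(sorted(items), 2))
    rules = {}
    for (a, b), both in pair_counts.items():
        support = both / n
        rules[(a, b)] = {
            "support": support,                        # share of all transactions with both items
            "confidence_a_to_b": both / item_counts[a],  # share of a-transactions that also have b
            "confidence_b_to_a": both / item_counts[b],
            "lift": support / ((item_counts[a] / n) * (item_counts[b] / n)),
        }
    return rules

rules = pairwise_association(transactions)
print(rules[("coffee", "donut")])
```

Restricting the model to pairs keeps the number of candidate combinations quadratic in the number of items, which is one plausible reason to prefer it over full itemset mining for a nightly batch.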
Market Basket Business Questions
Choose a Category: (Sub Category Level)
Answer the following questions for that Item in a particular region last week.
• What % of all transactions include [Product]?
• What related items are sold most frequently with [Product]?
• What is the average ticket $ amount when [Product] is present?
• On average, how many [Product] are sold in each transaction?
• What beverages are consumers buying most with [Product]?
• In what % of [Product] transactions is [Product] the only product purchased?
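As a rough illustration of how these questions map to measures, the sketch below answers most of them for one product of interest over a handful of made-up transactions. The data layout, the function name `product_measures`, and the result keys are hypothetical. The beverage question would additionally need a category lookup for each item, which is omitted here.

```python
from collections import Counter

# Hypothetical sample: each transaction is (ticket_amount, {item: quantity}).
transactions = [
    (5.50, {"bagel": 1, "coffee": 1}),
    (2.00, {"bagel": 2}),
    (3.25, {"coffee": 1}),
    (7.75, {"bagel": 1, "coffee": 1, "juice": 1}),
]

def product_measures(transactions, product):
    """Compute basket measures for the transactions that contain `product`."""
    with_product = [(amt, items) for amt, items in transactions if product in items]
    n = len(with_product)
    if n == 0:
        return None
    related = Counter()
    for _, items in with_product:
        # Count each related item once per transaction it shares with the product.
        related.update(i for i in items if i != product)
    return {
        "pct_of_transactions": 100 * n / len(transactions),
        "avg_ticket": sum(amt for amt, _ in with_product) / n,
        "avg_units": sum(items[product] for _, items in with_product) / n,
        "top_related": related.most_common(3),
        "pct_sold_alone": 100 * sum(1 for _, items in with_product if len(items) == 1) / n,
    }

measures = product_measures(transactions, "bagel")
print(measures)
```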
Data Analysis & Design Considerations
• 8M daily transactions, ~25M transaction detail lines
• 20 TB data warehouse size, sales data about 10 TB
• Hierarchies: 5 level Product, 2x4 level Org, 4 level regional
• ~1000 SKUs @Item Group/Size level
• Exponential growth in combinations with each hierarchy
• 2 years of pre-computed Market Baskets and associated sales measures for reporting
• Nightly compute within ETL window, data with 1 day latency
• Measures are non-additive along the Product Hierarchy
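To put a number on the growth in combinations, the arithmetic below counts the possible item pairs and triples at a single hierarchy level. It assumes exactly 1000 SKUs, a round figure standing in for the "~1000" quoted above.

```python
from math import comb

skus = 1000  # assumed round figure for the ~1000 SKUs on the slide

pairs = comb(skus, 2)    # unordered two-item combinations
triples = comb(skus, 3)  # unordered three-item combinations

print(f"pairs:   {pairs:,}")    # pairs:   499,500
print(f"triples: {triples:,}")  # triples: 166,167,000
```

These are upper bounds on candidate combinations at one level, before multiplying by the organizational and regional hierarchies. Only combinations that actually occur in transactions need to be stored.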
Design: Approaches considered
1. Use Oracle Data Mining / Oracle R Enterprise Association Rules
2. Use Frequent Itemset table function in Oracle 11g to compute Item-sets
3. Custom SQL Development
• Approach Chosen
  • Oracle Advanced Analytics for exploration / Ad-Hoc
  • Custom SQL for repeatable basket computation
  • OBIEE for reporting
High-level Design
Transaction Data → Data Model / Preprocessing → Rule Development → Measure Calculation → UI / Reports
4 Key Reports
• % of Transactions containing related items
• Transaction Detail: Product of Interest
• Related Product Pairings
• Single Item Transactions: % of transactions when products are purchased alone.
Related Item
What beverages are sold most often with PM Flats?
POI Transaction Detail
Transaction Detail: Product of Interest
Related Purchases
Related Product Pairings
Related Transactions
Non-additive measures
The child counts 5 + 3 + 3 do not roll up to a parent total of 11 in this case, because some medium and small coffees might be sold in the same transaction.
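A small worked example of the non-additivity: counting distinct transactions per size and then for the parent category gives different totals whenever one transaction contains more than one size. The line items and variable names below are made up for illustration.

```python
from collections import defaultdict

# Hypothetical line items: (transaction_id, coffee_size).
# Transaction 8 contains both a medium and a small coffee.
lines = [
    (1, "large"), (2, "large"), (3, "large"), (4, "large"), (5, "large"),
    (6, "medium"), (7, "medium"), (8, "medium"),
    (8, "small"), (9, "small"), (10, "small"),
]

transactions_by_size = defaultdict(set)
for txn_id, size in lines:
    transactions_by_size[size].add(txn_id)

# Distinct transactions per child (size) level.
child_counts = {size: len(txns) for size, txns in transactions_by_size.items()}

# Distinct transactions at the parent (coffee) level: union, not sum.
parent_count = len(set().union(*transactions_by_size.values()))

print(child_counts)                # {'large': 5, 'medium': 3, 'small': 3}
print(sum(child_counts.values()))  # 11
print(parent_count)                # 10
```

This is why measures at each hierarchy level have to be computed from the transaction detail rather than summed up from the level below.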
Single Item Transactions
Click to drill down for more detail
Current Areas Of Interest
• Customer Profiling
• Clustering / Segmentation
• Customer Churn Prediction
• Targeted Promotions
Customer Profiling
Methodology stages: Identify Business Objective, Understand Data, Prepare data, Develop model, Test Model, Deploy Model, Monitor Performance & re-calibrate
• Compute behavioral variables
• Create Customer record
• Data Exploration in R
Customer Profiling: Attributes
List of customer attributes used as-is or derived from their transactional history
Descriptive
1. Customer ID
2. City
3. State
4. DMA
5. Age
6. Profession
Spend/ Check
1. Min Check
2. Max Check
3. Total Spend
4. Average Weekly Spend
5. Total points earned
6. % Points redeemed
7. Total No. of coupons redeemed
8. Total discount amount (Coupons)
9. Avg weekly coupon redeemed
Transaction/Frequency
1. Start Date
2. Last transaction date
3. Days since last transaction
4. Total transactions/Visits
5. Average weekly visits
6. % discounted visits
7. Top Day part
8. Daypart - % Visits
9. Preferred Store
10. Multi Store flag
11. Average DD Card Recharge Amount
12. Average DD Card Recharge Frequency
13. Days since last recharge
14. Current card balance
15. Transaction Activity in weeks
Store Features
1. POS: drive thru or not
2. Combo or not
3. Wifi
Historical Purchase
1. Total Spend /Category
2. % spend on each Category
3. % spend Sub category
4. Average number of items per transaction
5. Preferred item combo
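A handful of the derived attributes can be computed from a customer's visit history as sketched below. The record layout, the `build_profile` function, and the definition of the active period (start date through an `as_of` date, inclusive, in weeks) are assumptions for illustration, not the definitions used in the actual customer record.

```python
from collections import Counter
from datetime import date

# Hypothetical history for one customer: (visit_date, store_id, check_amount, discounted).
visits = [
    (date(2014, 9, 1), "S1", 4.50, False),
    (date(2014, 9, 3), "S1", 6.00, True),
    (date(2014, 9, 10), "S2", 3.25, False),
    (date(2014, 9, 22), "S1", 5.25, False),
]

def build_profile(visits, as_of):
    """Derive a few transaction/frequency and spend/check attributes."""
    dates = [v[0] for v in visits]
    amounts = [v[2] for v in visits]
    store_counts = Counter(v[1] for v in visits)
    start, last = min(dates), max(dates)
    # Assumed definition: weeks from first visit through as_of, at least one week.
    weeks = max(((as_of - start).days + 1) / 7, 1.0)
    return {
        "start_date": start,
        "last_transaction_date": last,
        "days_since_last_transaction": (as_of - last).days,
        "total_visits": len(visits),
        "avg_weekly_visits": len(visits) / weeks,
        "pct_discounted_visits": 100 * sum(1 for v in visits if v[3]) / len(visits),
        "preferred_store": store_counts.most_common(1)[0][0],
        "multi_store_flag": len(store_counts) > 1,
        "min_check": min(amounts),
        "max_check": max(amounts),
        "total_spend": sum(amounts),
        "avg_weekly_spend": sum(amounts) / weeks,
    }

profile = build_profile(visits, as_of=date(2014, 9, 28))
print(profile)
```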
Customer Segmentation / Clustering
Methodology stages: Identify Business Objective, Understand Data, Prepare data, Develop model, Test Model, Deploy Model, Monitor Performance & re-calibrate
• Re-run the model periodically to update the new clusters
• Indicates any shift in the customer behavior
• To understand your customers
• Targeted Marketing
• Design Promotions
• Compute behavioral variables
• Create Customer record
• Data Exploration in R
• Model displays cluster means: Cluster properties
• Number of Customers in a cluster
• Deployed for targeted Marketing and Monitoring Customer behavior
• Identify variables for clustering
• Normalize data for Clustering
• K-Means Clustering used to cluster Customers and find individual cluster characteristics
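The normalize-then-cluster steps can be sketched as follows. This is a plain-Python illustration, not the Oracle R Enterprise workflow from the talk: the customer values are made up, z-score scaling is one assumed choice of normalization, and the initial centroids are fixed by index (`init`) only to keep the example deterministic, where real K-Means implementations typically use random or k-means++ starts.

```python
from statistics import mean, pstdev

# Hypothetical customer records: (avg_weekly_visits, avg_weekly_spend).
customers = [(5.0, 12.0), (5.5, 13.0), (4.8, 11.5),
             (7.4, 35.0), (7.6, 36.0), (7.2, 34.0)]

def zscore(rows):
    """Scale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [tuple((v - m) / s for v, m, s in zip(row, mus, sds)) for row in rows]

def nearest(point, centroids):
    """Index of the centroid with the smallest squared Euclidean distance."""
    dists = [sum((p - q) ** 2 for p, q in zip(point, c)) for c in centroids]
    return dists.index(min(dists))

def kmeans(points, k, init, iters=20):
    """Lloyd's algorithm with caller-supplied initial centroid indices."""
    centroids = [points[i] for i in init]
    labels = []
    for _ in range(iters):
        labels = [nearest(pt, centroids) for pt in points]
        updated = []
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(tuple(mean(col) for col in zip(*members)) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels

def cluster_means(rows, labels, k):
    """Cluster means in the original (un-normalized) units."""
    return [tuple(mean(col) for col in zip(*[r for r, lab in zip(rows, labels) if lab == c]))
            for c in range(k)]

labels = kmeans(zscore(customers), k=2, init=[0, 3])
means = cluster_means(customers, labels, k=2)
print(labels)
print(means)
```

Reading the cluster means back in original units is what allows a cluster to be described in business terms such as average weekly visits or spend.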
Customer Segmentation / Clustering
Analyze Cluster means to Derive Cluster Properties
Customer Data → Clustering Algorithm → Profiles
- Regulars
  - Avg weekly visits are 5
  - 78.2% of visits in the morning
  - Mostly coffee drinkers, but food buyers 25% of the time
- Coffee Regulars
  - Avg weekly visits are 5.45
  - Avg coffee transactions 80.29%
- High Spenders, Frequent visitors
  - Avg weekly spend ($35.12)
  - Avg weekly visits (7.44)
  - Coffee and Food in basket (Avg items per transaction 2.4)
Customer Churn Analysis
Methodology stages: Identify Business Objective, Understand Data, Prepare data, Develop model, Test Model, Deploy Model, Monitor Performance & re-calibrate
• Monitor the response and re-calibrate by updating training data or model parameters
• Calculate the metrics for model evaluation
• Define Churn & Active Customer
• Identify Churn Customer patterns
• Is the churn pattern localized or National?
• Model will calculate the churn score for existing customers
• Flag customers with high risk, low risk based on churn score
• Test the model on test data set, for which outcome is known
• Select threshold for model selection
• Confusion Matrix for the best Model
Class    Active    Churn
Active   71.93%    28.07%
Churn    15.37%    84.63%
• Compute behavioral variables
• Create Customer record
• Data Exploration in R
• Create Training data set
  • Should have equal distribution of churn and usual customers
• Model to derive churn risk score.
  • SVM
  • Logistic regression
  • Naïve Bayes Classifier
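The threshold and confusion-matrix steps can be illustrated on a toy test set. The scores, the 0.5 threshold, and the function names are hypothetical. The row-percentage layout mirrors the confusion matrix on this slide under the assumption that its rows are actual classes and its columns are predicted classes, which the slide does not state.

```python
# Hypothetical scored test set: (churn_score, actually_churned).
scored = [(0.91, True), (0.78, True), (0.40, True), (0.55, True),
          (0.65, False), (0.30, False), (0.12, False), (0.20, False)]

def confusion(scored, threshold):
    """Flag a customer as churn when score >= threshold, then count outcomes."""
    tp = sum(1 for s, y in scored if s >= threshold and y)      # churners flagged
    fn = sum(1 for s, y in scored if s < threshold and y)       # churners missed
    fp = sum(1 for s, y in scored if s >= threshold and not y)  # actives flagged
    tn = sum(1 for s, y in scored if s < threshold and not y)   # actives passed
    return tp, fn, fp, tn

def row_percentages(tp, fn, fp, tn):
    """Each actual class as (% predicted Active, % predicted Churn)."""
    return {
        "Active": (100 * tn / (tn + fp), 100 * fp / (tn + fp)),
        "Churn": (100 * fn / (tp + fn), 100 * tp / (tp + fn)),
    }

tp, fn, fp, tn = confusion(scored, threshold=0.5)
matrix = row_percentages(tp, fn, fp, tn)
print(matrix)  # {'Active': (75.0, 25.0), 'Churn': (25.0, 75.0)}
```

Sweeping the threshold and comparing the resulting matrices is one simple way to choose the cut-off between high-risk and low-risk customers.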
Possible Future initiatives
• Periodic Churn Rate Modeling: measure churn over time
• Customer Segments based on buying pattern: what they buy, when they buy?
• Identify customers who are more likely to respond to offers
• Personalized promotions for retention
• Customer Lifetime value
• Customer Sentiment Analysis
• Enrich customer profiles with modeling scores
Q&A