A GACP and GTMCP company

How to perform predictive analysis on your web analytics tool data
FREE Webinar by Tatvic
June 19th, 2013
#tatvicwebinar

Before we start...
• Q & A
• #tatvicwebinar

Our speakers
• Carolina Araripe, Inbound Marketing Strategist @Tatvic, http://linkd.in/YazvVn
• Amar Gondaliya, Data Model Engineer @Tatvic, http://linkd.in/16cpDQI
• Kushan Shah, Web Analyst @Tatvic, http://linkd.in/18rfFfV

Talking about Analytics…
• Descriptive: What has happened?
• Predictive: Predicts the outcome or future
• Prescriptive: What should happen?

In other words…
Predictive Analytics: “Technology that learns from experience (data) to predict the future behavior of individuals in order to drive better decisions.”
Source: Siegel, E. (2013), “Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die.”

Outline of this webinar
Predictive Analytics
• Tool: R
• Data: Google Analytics
• Model: Logistic Regression
• Visualization

Introduction to R
What
• Open source statistical computing language, widely used by organizations to solve business problems.
• Data Analysis
• Statistical Tests
• Data Visualization
• Predictive Model
• Forecasting Applications

Why
• Easy to integrate
• Data frame
• Pre-developed packages

How to get started
• Download and install R
• Choose and download a user-friendly GUI: RStudio

R Packages
Categories of packages: Data Extraction, Data Visualization, Time Series, Machine Learning

For this webinar:
• Data Extraction: RGoogleAnalytics
  Usage: Extract Google Analytics data into R
  Contributors: Michael Pearmain, Nick Mihailovski, Amar Gondaliya and Vignesh Prajapati
• Data Visualization: ggplot2
  Usage: Build plots and charts
  Contributor: Hadley Wickham

Google Analytics data
Extracting your GA data into R
Flow between the user performing data extraction, the Google OAuth2 Authorization Server and the Google Analytics API:
• Access Token Request (user to Google OAuth2 Authorization Server)
• Access Token Response (Google OAuth2 Authorization Server to user)
• Call API for list of profiles (user to Google Analytics API)
• Call API for query (user to Google Analytics API)

Business Problem
Projected Growth of Retail eCommerce in US
US Retail eCommerce Sales 2011-2016 (in billion $)

Year    Sales
2011    $194.70
2012    $225.50
2013    $258.90
2014    $296.70
2015    $338.90
2016    $384.90

Source: http://www.emarketer.com/Article/Retail-Ecommerce-Set-Keep-Strong-Pace-Through-2017/1009836

Business Problem
Product return
“Returns are on the rise, up 19% from 2007. For every US$1 spent on merchandise, 9¢ are returned.”
“Average return rate for ecommerce retailers varies from 3-12%.”
Source: Time Magazine, Sept. 04th, 2012

Product Return Impact (per day)

Average Return Rate        9%         7%
Average Order Value        $100       $100
Orders Per Day             500        500
Total Income               $50,000    $50,000
Loss due to returns        $4,500     $3,500
Revenue post loss          $45,500    $46,500
Increase in Revenue/day    -----      $1,000

Increase in revenue with recovered returns in the long run:
• Month (x30): $30,000
• Year (x365): $365,000

Data Introduction
Transactional Data
• Pre Purchase Data: Browsing behavior up to the shopping cart
• In Purchase Data: Purchase behavior from the shopping cart to the thank you page

Modeling
• Loading Input Data
• Introducing Model Variables
• Model Creation
• Model Performance
• Applying Model to Test Data

Machine Learning Tech.
Supervised Learning: Generates a function that maps inputs (labeled data) to desired outputs (e.g. Spam Detection)
• Training Data (Variables + Labels) → Machine Learning Algorithm → Supervised Learning Model (Predictive Model)
• Test Data (Variables) → Predictive Model → Predicted Outcome labels
• Labels are the right answers from historical data
• E.g. Spam Detection: the input data contains emails marked Spam/No Spam

Feature engineering
Going beyond algorithms and using domain knowledge to add new variables to the model
• E.g.: Products purchased as gifts are less likely to be returned
  Create a new variable with binary values: 1 = product purchased as gift, 0 = otherwise
• Products purchased in the holiday season are more likely to be returned
  Based on the purchase date, create a new variable with binary values: 1 = product purchased in the months Nov-Dec, 0 = otherwise

Predictor/Response Variables
[Figure: scatter plot of Price of House ($), the response variable, against Size of House (sq ft), the predictor variable]

Generalized Linear Models
glm(formula, family, data)
• Formula: Response ~ Predictor. This argument specifies which variables are the independent (predictor) variables and which variable is the dependent (response) variable.
• Family: Binomial. Since the output variable (product return) is defined as a binary value, 0 or 1, we use the binomial family.
• Data: Train data set. This data set contains values of all 18 variables (i.e. the values of both the dependent and the independent variables are given). This data set is also called labeled data.

Summary
[Figure: Number of Transactions plotted against Probability of Product Returns, split at 60%]
• Probability of product return > 60%: Call the customer before shipping
• Probability of product return ≤ 60%: Send a discount coupon to encourage the customer to make a future purchase

ggplot2
• Geometric Shapes
• Scales and Coordinate Systems
• Plot Annotations

Q&A Round

Thank you!
Carolina Araripe
[email protected]
+91 7600-515-354
+1 276-644-0456
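
Worked example (not part of the original slides): the modeling steps in this deck, that is feature engineering, model creation with glm and the binomial family, and the 60% cut-off from the Summary slide, can be sketched in base R roughly as follows. This is a minimal sketch on simulated data. The column names (order_value, gift_option, purchase_date, returned), the simulated effect sizes and the 70/30 train/test split are all assumptions made for illustration; they are not the 18 variables or the data used in the webinar.

```r
# Minimal sketch on simulated data. All column names and effect sizes
# below are hypothetical and only illustrate the workflow.
set.seed(42)

n <- 1000
orders <- data.frame(
  order_value   = round(runif(n, 20, 300), 2),
  gift_option   = sample(c("yes", "no"), n, replace = TRUE, prob = c(0.2, 0.8)),
  purchase_date = as.Date("2012-01-01") + sample(0:364, n, replace = TRUE)
)

# Feature engineering: binary variables derived from domain knowledge
# gift    = 1 if the product was purchased as a gift, 0 otherwise
# holiday = 1 if the product was purchased in Nov-Dec, 0 otherwise
orders$gift    <- as.integer(orders$gift_option == "yes")
orders$holiday <- as.integer(format(orders$purchase_date, "%m") %in% c("11", "12"))

# Simulated label (assumption): gifts are returned less often,
# holiday and high-value purchases more often
log_odds <- -1.5 - 1.0 * orders$gift + 1.5 * orders$holiday +
  0.008 * orders$order_value
orders$returned <- rbinom(n, 1, plogis(log_odds))

# Split into labeled training data and test data (70/30, an arbitrary choice)
train_idx <- sample(seq_len(n), size = 0.7 * n)
train <- orders[train_idx, ]
test  <- orders[-train_idx, ]

# Model creation: logistic regression via glm with the binomial family
model <- glm(returned ~ gift + holiday + order_value,
             family = binomial, data = train)
print(coef(summary(model)))

# Applying the model to test data: predicted probability of return
test$p_return <- predict(model, newdata = test, type = "response")

# Summary rule from the deck: act differently above and below 60%
test$action <- ifelse(test$p_return > 0.6,
                      "call customer before shipping",
                      "send discount coupon")
print(table(test$action))
```

Passing type = "response" to predict returns probabilities between 0 and 1 rather than log-odds, which is what the 60% cut-off needs. On real data the cut-off itself would be chosen from the model performance step, not fixed in advance.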