EMM Regional User Group Breakout Session: Flowcharts Faster

EMM Regional User Group
Breakout Session:
How to Build High-Performing IBM (Unica) Campaign
Flowcharts Faster
© 2012 IBM Corporation
Agenda
Flowchart Performance Tuning Approaches
Unica Campaign and Databases
Campaign Configuration Parameters
Leveraging in-database processing
Custom Macros
2
© 2012 IBM Corporation
Target Audience
Organization Roles
– Database marketer/flowchart developer
– Application owner
– Technical application lead
– Application administrator
Responsibilities
– Application Configuration
– Advanced flowchart design and performance tuning
– Creating reusable application objects (catalogs, custom macros, etc)
– Support and training new users
3
© 2012 IBM Corporation
Flowchart Performance Tuning Approaches
© 2012 IBM Corporation
Flowchart Performance Tuning Approaches
in General
Optimize workload distribution between Campaign and the Database
– Table joins
– Merge process vs. database Union, Join, Minus operation
– Data sorts
Optimize Calculation Timing
– Calculate later on smaller audiences
– Calculate earlier as summary table or strategic segments
Partition or simplify the problem
– Break complex logic into simpler components
– Process one chunk at a time or in parallel
Reduce the number of processing steps
– Combine multiple queries accessing the same data into one
– Replace multiple levels of segmentation with derived fields
5
© 2012 IBM Corporation
Tuning Approaches, Cont’d
Optimize the use of available resources and tools
– Memory, CPU, I/O
– Virtual memory
– Connections and query threads
– Using DB bulk loaders
– Using DB statistics collection commands
Identify techniques that work well in your environment and package them for reuse
– Product configurations
– Flowchart templates
– Custom macros
Identify techniques that do not perform and develop alternatives
– Frequent audience switches could be avoided with base-dimension table
mappings
6
© 2012 IBM Corporation
IBM Unica Campaign and Databases
© 2012 IBM Corporation
Terminology
In Data Marts
“Dimension” tables contain normalized lists of values (States, Zips, Products, etc)
“Fact” tables contain quantitative characteristics of business transactions with foreign
key for dimensions
A table is either a fact or a dimension
8
© 2012 IBM Corporation
Terminology Cont’d
In IBM Unica Campaign
“Audience” represents a marketable entity, is associated with one or more key columns
“Base” tables contain audience keys
“Dimension” tables could be joined to base tables or other dimensions
A table could be mapped in multiple times in different roles
9
© 2012 IBM Corporation
Best Practices
Have a decision support schema.
– Easy to understand and navigate for the user
– Fewer table joins for performance
Propagate audience keys as widely as possible across the schema.
Calendar based summary tables are not as useful for marketing as they are for
other applications.
© 2012 IBM Corporation
The Role of Table Mapping
To provide Unica Campaign with metadata about audience levels and table
relationships
To control the structure of SQL generated by Unica Campaign:
– Joins between base and dimension tables are executed in the database
– Joins between base tables are performed on Unica Campaign server *)
– By selecting appropriate table mapping the users can change the structure of
SQL
Mapping tables at the right audience level(s) could simplify flowchart logic
*) In UC 7.5.3+ this behavior is configurable
11
© 2012 IBM Corporation
Table Mapping: Example 1
•Normalized Base Table
•Dimension of Transactions
Customer
CustID
(PK)
Name
Address
…
•De-normalized Base Table
•Dimension of Customer
Transactions
Product
ProdID
Audience Level: Customer
(PK)
SKU
Brand
…
TxnID
(PK)
CustID
ProdID
TimeID
ChannelID
Quantity
Amount
Discount
(FK)
(FK)
(FK)
(FK)
Channel
ChannelID (PK)
Description
…
Dimension
Time
TimeID
(PK)
Date
…
Dimension
Dimension
12
© 2012 IBM Corporation
Table Mapping: Example 2
Normalized Base Table
Household
HH ID (PK)
Address
Ciy
State
ZIP
…
Audience Level: Household
Dimension of HH
Denormalized Base Table
Customer
Dimension of Customer
CustID (PK)
First Name
Last name
HH ID
…
Product
ProdID (PK)
Brand
SKU
Brand ID
…
BrandID (PK)
Brand Name
Description
…
Dimension of Transaction
Transactions
TxnID
(PK)
CustID
ProdID
TimeID
ChannelID
Quantity
Amount
Discount
(FK)
(FK)
(FK)
(FK)
Channel
ChannelID (PK)
Description
…
Time
TimeID
(PK)
Date
…
Dimension of Transaction
Dimension of Transaction
Dimension of Product
13
© 2012 IBM Corporation
© 2012 IBM Corporation
Flowchart Execution with In-db Optimization
Campaign Server
Database
Universe and Suppression Queries
Merge Query
Process
Insert into
Select
Queries
Process
Insert into
Select Queries
Query for Derived Field Data
DF Calc
Data Retrieval
ID List
Segment
Process
Select
Query
Temp Table Creation
ID List Upload
Query for Output Fields
14
© 2012 IBM Corporation
Process
Insert into
Select
Query
© 2012 IBM Corporation
Know Your Flowchart Log
The flowchart log is the only way to know how a flowchart is actually
performing.
What is a good logging level?
© 2012 IBM Corporation
Log File Structure
Timestamp
PID
Level (I, W, E)
Category
Process Name
Message Body
04/20/2005 17:14:20.667
(1752)
[I] [PROCESS]
SESSION_RUN_START
04/20/2005 17:14:20.797
(1752)
[I] [PROCESS]
[Active]
Select PROCESS_RUN_START
04/20/2005 17:14:20.907
(1752)
[I] [DB QUERY]
[Active]
Northwind (thread 000004B8): SELECT
04/20/2005 17:14:20.957
(1752)
[I] [TABLE ACC] [Active]
Northwind (thread 000004B8): Query completed;
04/20/2005 17:14:22.069
(1752)
[I] [TABLE ACC] [Active]
Northwind (thread 000004B8): Data retrieval
04/20/2005 17:14:22.089
(1752)
[I] [PROCESS]
[Active]
Select: N_RECORDS = 89
04/20/2005 17:14:22.099
(1752)
[I] [PROCESS]
[Active]
Select PROCESS_RUN_DONE
© 2012 IBM Corporation
16
© 2012 IBM Corporation
Understanding Log Files
Flowchart
Run Info
Process
Run Info
[PROCESS]
SESSION_RUN_START
[PROCESS]
[Active]
Select PROCESS_RUN_START
[DB QUERY]
[Active]
Northwind (thread 000004B8): SELECT CustomerID FROM dbo.Orders ORDER BY CustomerID
[TABLE ACC]
[Active]
Northwind (thread 000004B8): Data retrieval completed; 830 records retrieved and returned to caller.
[PROCESS]
[Active]
Select: N_RECORDS = 89
[PROCESS]
[Active]
Select PROCESS_RUN_DONE
[PROCESS]
[Universe]
Merge PROCESS_RUN_START [sprocrun:757]
[CELL ACC]
[Universe]
CellAccess: Data is ready.
[PROCESS]
[Universe]
Merge: N_RECORDS 74
[PROCESS]
[Universe]
Merge PROCESS_RUN_DONE [sprocrun:757]
[PROCESS]
[Value Segment] Segment PROCESS_RUN_START [sprocrun:757]
[CELL ACC]
[Value Segment] CellAccess: Pre-computing derived fields.
[CELL ACC]
[Value Segment] CellAccess: Derived field computation completed. Data is ready.
[CELL ACC]
[Value Segment] Cell: Creating segments
[PROCESS]
[Value Segment] Segment: MAX_SEG_SIZE 3
[PROCESS]
[Value Segment] Segment PROCESS_RUN_DONE [sprocrun:757]
[PROCESS]
[Output List]
Snapshot PROCESS_RUN_START [sprocrun:757]
[DB QUERY]
Address, City)
[Output List]
Northwind (thread 00000288): INSERT INTO dbo.OUTPUT_TEST1(customer_ID, CompanyName,
Derived
Fields
SQL
SELECT UAC_27_1.CustomerID, dbo.Customers.CompanyName, dbo.Customers.Address, dbo.Customers.City FROM (UAC_27_1 LEFT
OUTER
JOIN dbo.Customers ON UAC_27_1.CustomerID = dbo.Customers.CustomerID)
17
[TABLE ACC]
[Output List]
Northwind (thread 00000288): Statement successful; 4 records affected. [sdbtacc:2269]
[DB QUERY]
[Output List]
Northwind (thread 00000ED8): SELECT COUNT(*) FROM dbo.OUTPUT_TEST1 [sdbtacc:2195]
[PROCESS]
[Output List]
Snapshot: Records Written to Table "OUTPUT_TEST1": 9
[PROCESS]
[Output List]
Snapshot PROCESS_RUN_DONE
[PROCESS]
SESSION_RUN_DONE
© 2012 IBM Corporation
© 2012 IBM Corporation
IBM Unica Campaign Configuration
Parameters
© 2012 IBM Corporation
Setting Configuration Parameters
In Campaign 7.x and AM 7.0.x
– Edit affinium_config.xml
In Campaign 7.x and AM 7.1+
– Edit parameter settings in AM configuration tab
– If a parameter not exposed in AM GUI
• Export current AM settings to XML file by using configTool
• Add parameters to XML file
• Re-import XML file into AM by using configTool again
In Campaign 7.5+ and UC 8+ the installer adds new parameters to
configuration automatically
19
© 2012 IBM Corporation
Server Level Configuration
maxVirtualMemory
– Default, can be modified for each flowchart
– Maximum recommended settings
• 1024 for 32-bit versions
• 2048 for 64-bit versions
useInDbOptimization
– Default, can be modified for each flowchart
– Recommended for environments with
• High data volumes
• Powerful databases
20
© 2012 IBM Corporation
Database Loaders
Database loaders improve the performance of moving data from the
campaign server back to the Database.
Loaders are a shared resource and other applications can take priority
– Managing parallelism can be a non – trivial exercise
Campaign invokes the loader through OS scripting and sets up data and
control file for the DB loader to consume.
Teradata uses 2 loader utilities
– FastLoad for entering data into an empty table
– Mload for appending data to a table
Performance increases especially seen in Contact History and Response
Processing writes
21
© 2012 IBM Corporation
Database Loader Execution
Flowchart
Generates
the Control
File
Flowchart
Generates
the Data
File
Flowchart
Invokes the
Loader
Script
Loader Script
Get in
the
queue
First In
queue?
Yes
Load
Data
and
Exit
No
Wait X Seconds
22
© 2012 IBM Corporation
© 2012 IBM Corporation
Database Tuning Configurations*
Many knobs and levers exist to tune Campaigns
interactions with your database…
Connections, Query Threads/Parallelism, I/O
– MaxQueryThreads
SQL Generation
– EnableSelectDistinct
DB Table Statistics Through Scripts
– PostTempTableCreateRunScript
DB Table Statistics Through SQL
– TempTablePostExecutionSQL
DB Tables Suffixes
– SuffixOnTempTableCreation
* Additional Settings included in Addendum
23
© 2012 IBM Corporation
Leveraging In Database Processing
© 2012 IBM Corporation
Eligibility for In-DB Processing
• With In-DB processing turned on, Campaign evaluates
every process box for “In-DB eligibility”
Features do not qualify
– Cross data source table joins
– Audience switching (*)
– Output de-duping
– Cell size limits (*)
– Data sampling (*)
– Campaign macros (*)
– Persistent derived fields (*)
– MailList, Response, Track
• Features that work well
–
–
–
–
–
Select
Segment
Merge
Snapshot
Derived fields with raw SQL
macros
© 2012 IBM Corporation
Design Recommendations
Scenario
Recommendation
Audience switching
- Use a combination of Select and Snapshot to create a work
table
- Map the work table at a desired audience
Cell size limits
- Create a raw SQL custom macro using database function
rownum()
- Use the custom macro in Select or Segment to limit the
number of IDs in the cell
Data sampling
- Create a raw SQL custom macro using database function
random()
- Use the custom macro in Select or Segment to generate a
random sample of IDs in the cell
DFs with Campaign macros
- Create a raw SQL custom macro using equivalent database
function
- Create a derived field referencing the custom macro
Persistent Derived Fields
Use Extract processes to store calculated data in the database
© 2012 IBM Corporation
SQL Macros + In-DB Processing
As of Campaign 7.3+, SQL custom macros work seamlessly with In-DB
optimization
– SQL custom macros in Selects and Segments are now eligible for In-DB
processing
– Both “ID” and “ID + Value” custom macros can be processed in the database
– Logical and arithmetic expressions built out of custom macros can be
processed in the database
– Derived fields using SQL custom macros in Snapshots and MailLists are now
eligible for In-DB processing
27
© 2012 IBM Corporation
Unica Macros vs. DB Functions
Unica
GROUPBY
SQL Server
Oracle
DB2
Netezza
Teradata
Aggregation Functions
AvgOf
AVG
AVG
AVG
AVG
AVG
CountOf
COUNT
COUNT
COUNT
COUNT
COUNT
DistinctOf
COUNT (DISTINCT)
COUNT (DISTINCT)
COUNT (DISTINCT)
COUNT (DISTINCT)
COUNT (DISTINCT)
MaxOf
MAX
MAX
MAX
MAX
MAX
MinOf
MIN
MIN
MIN
MIN
MIN
SumOf
SUM
SUM
SUM
SUM
SUM
StdDevOf
STDDEV
STDDEV
STDDEV
STDDEV
STDDEV_POP
Analytic Functions
28
IndexOf
ROW_NUMBER
ROW_NUMBER
ROW_NUMBER
ROW_NUMBER
ROW_NUMBER
MedianOf
ROW_NUMBER /
COUNT = 0.5
PERCENTILE_DISC
(0.5)
ROW_NUMBER /
COUNT = 0.5
ROW_NUMBER /
COUNT = 0.5
ROW_NUMBER /
COUNT = 0.5
RankOf
DENSE_RANK
DENSE_RANK
DENSE_RANK
DENSE_RANK
RANK
© 2012 IBM Corporation
Utilizing Custom Macros
© 2012 IBM Corporation
Custom Macros - Definition
Tools Custom Macros
Name and parameter list
Description/Help
Expression
Includes TEMPTABLE token
Uses {} for conditional query
generation
Expression type
Data source
Return value type
© 2012 IBM Corporation
Custom Macros - Usage
Used in query definitions
Listed in Query Helper…
Syntax help is available
Top level Select SQL:
select a.CustomerID,
sum(a.UnitPrice*a.Quantity – a.Discount)
from dbo.vOrderDetails a
group by a.CustomerID
Mid stream SQL:
select a.CustomerID,
sum(a.UnitPrice*a.Quantity – a.Discount)
from dbo.vOrderDetails a, UAC_34 b
where a.CustomerID=b.CustomerID
group by a.CustomerID
© 2012 IBM Corporation
Custom Macros – Data Mapping
Columns
of interest
© 2012 IBM Corporation
Custom Macros - Creation
Name and
parameters
Description/Help
Macro expression
with parameterized
SQL
Note alias for the
calculated column
33
© 2012 IBM Corporation
Custom Macros - Queries
34
© 2012 IBM Corporation
© 2012 IBM Corporation
Custom Macros – Single Execution
1.
Unica Campaign creates temp table for the results of Order_Aggr(Amount,
sum)
SELECT * INTO UAC_7_8 FROM (
SELECT vOrderDetails.CustomerID, sum(Amount) value
FROM vOrderDetails, UAC_7_0 where
vOrderDetails.CustomerID=UAC_7_0.CustomerID
GROUP BY vOrderDetails.CustomerID ) a
2.
Unica Campaign evaluates selection criteria and creates output cell
SELECT * INTO UAC_7_a FROM (
SELECT DISTINCT UAC_7_8.CustomerID
FROM UAC_7_8 INNER JOIN
UAC_7_0 ON (UAC_7_8.CustomerID=UAC_7_0.CustomerID)
WHERE (UAC_7_8.value > 10000) ) a
© 2012 IBM Corporation
Custom Macros – Expression Execution
1.
UC creates temp table for each custom macro
SELECT * INTO UAC_7_e FROM (SELECT vOrderDetails.CustomerID, sum(Amount) value
FROM vOrderDetails, UAC_7_0 where
vOrderDetails.CustomerID=UAC_7_0.CustomerID group by
vOrderDetails.CustomerID ) a
SELECT * INTO UAC_7_d FROM (
2.
SELECT vOrderDetails.CustomerID, sum(Discount) value
FROM vOrderDetails, UAC_7_0 where
OrderDetails.CustomerID=UAC_7_0.CustomerID
GROUP BY vOrderDetails.CustomerID ) a
UC evaluates expression and creates output cell
SELECT * INTO UAC_7_f FROM (
SELECT DISTINCT CustomerID FROM (
UAC_7_d FULL OUTER JOIN UAC_7_e ON UAC_7_d.CustomerID=UAC_7_e.CustomerID)
INNER JOIN UAC_7_0 ON
(UAC_7_d.CustomerID=UAC_7_0.CustomerID) OR (UAC_7_e.CustomerID=UAC_7_0.CustomerID)
WHERE ((UAC_7_d.value / UAC_7_e.value) > 0.1) ) a
© 2012 IBM Corporation
Q&A
Contact your Account Manager
for a product demonstration:
[email protected]
Learn more at
www.ibm.com/software/marketing-solutions/cafe
37
© 2012 IBM Corporation
Addendum
Campaign Configuration Settings
Loader Settings
Data Source Settings
© 2012 IBM Corporation
Datasource Configuration Level
Connections, Query Threads/Parallelism, I/O
– BulkInsertBlockSize
– BulkReaderBlockSize
– ConnectionCacheSize
– MaxQueryThreads
– QueryThreadSleep
– ShareConnection
39
© 2012 IBM Corporation
Datasource Configuration Level
SQL Generation
– AllowBaseJoinsInSelect
– AllowSegmentUsingSQLCase
– EnableBaseDimSelfJoin
– EnableSelectDistinct
– EnableSelectOrderBy
– PrefixOnSelectSQL
– SQLOnConnect
– UseExceptForMerge
– UseMergeForTrack
– UseNonANSIJoin
– UseNotInForMerge
– UseSQLToProfile
40
© 2012 IBM Corporation
Datasource Configuration Level
DB Loaders
– LoaderCommand
– LoaderCommandForAppend
– LoaderControlFileTemplate
– LoaderControlFileTemplateForAppend
– LoaderDelimiter
– LoaderDelimiterForAppend
– LoaderDelimiterAtEnd
– LoaderDelimiterAtEndForAppend
– LoaderUseLocaleDP
– MinReqForLoaderCommand
– MinReqForLoaderCommandForAppend
© 2012 IBM Corporation
Datasource Configuration Level
DB Loaders - Tokens Available
Token
Description
<AMUSER>
IBM Unica user name associated with the flowchart being run.
<CAMPAIGNCODE>
Code for the campaign associated with the flowchart.
<CAMPAIGNNAME>
Name of the campaign associated with the flowchart being run.
<CONTROLFILE>
Full path and file name to the temporary control file.
<DATABASE>
Name of the data source that Campaign is loading data into.
<DATAFILE>
Full path and filename to the temporary data file created by Campaign.
<DBUSER>
DB user name for the database.
<DSN>
Value of the DSN property.
<FLOWCHARTNAME>
Name of the flowchart being run.
<NUMFIELDS>
Number of fields in the table.
<PASSWORD>
DB password from the current flowchart connection to the data source.
<TABLENAME>
DB table name that Campaign is loading data into.
<USER>
DB user from the current flowchart connection to the data source.
© 2012 IBM Corporation
Datasource Configuration Level
DB Loaders - Tokens Available
Token
Description
<DBCOLUMNNUMBER>
Column ordinal in the database.
<FIELDLENGTH>
Length of the field being loaded into the database.
<FIELDNAME>
Name of the field being loaded into the database.
<FIELDNUMBER>
Number of the field being loaded into the database.
<FIELDTYPE>
Literal "CHAR( )". The length of this field is specified between the ().
* If your database happens to not understand the field type, CHAR,
you can manually specify the appropriate text for the field type and
use the <FIELDLENGTH> token.
<NATIVETYPE>
Actual database type that this field is loaded into.
<xyz>
Places the specified character(s) on all fields being loaded into the
database, except the last. A typical use is <,> which repeats a comma
all fields except the last.
<~xyz>
Places the specified characters only on the last repeated line.
<!xyz>
Places the specified character(s), including the angle brackets < >, on
all lines.
© 2012 IBM Corporation
Datasource Configuration Level
DB Tables Statistics
First Group
– PostExtractTableCreateRunScript
– PostSegmentTableCreateRunScript
– PostSnapshotTableCreateRunScript
– PostTempTableCreateRunScript
– PostUserBaseTableCreateRunScript
– PostUserTableCreateRunScript
– TempTablePreTruncateRunScript (only Teradata)
44
© 2012 IBM Corporation
Datasource Configuration Level
DB Tables Statistics
Second Group
– ExtractTablePostExecutionSQL
– SegmentTablePostExecutionSQL
– SnapshotTablePostExecutionSQL
– TempTablePostExecutionSQL
– UserBaseTablePostExecutionSQL
– UserTablePostExecutionSQL
– TempTablePreTruncateExecutionSQL (only Teradata)
45
© 2012 IBM Corporation
Datasource Configuration Level
DB Tables Statistics
Third Group
– SuffixOnAllOtherSQL
– SuffixOnCreateDateField (Teradata FORMAT 'YYYY-MM-DD')
– SuffixOnExtractTableCreation
– SuffixOnInsertSQL
– SuffixOnSegmentTableCreation
– SuffixOnSelectSQL
– SuffixOnSnapshotTableCreation
– SuffixOnTempTableCreation
– SuffixOnUserBaseTableCreation
– SuffixOnUserTableCreation
46
© 2012 IBM Corporation
Datasource Configuration Level
Temporary Tables
– AllowTempTables
– DeleteAsRecreate
– DeleteAsTruncate
– DisallowTempTableDirectCreate
– DoNotCreateServerBinFile
– MaxRowFetchRecords
– MaxTempTableJoinPctSelectAll
– MaxTempTableJoinPctWithCondition
– UseTempTablePool (only Teradata; intention to extend to DB2)
47
© 2012 IBM Corporation
Datasource Configuration Level
DB Tables Statistics - Tokens Available
Token
48
Description
<AMUSER>
IBM Unica Marketing user name associated with the flowchart for
which temp/Extract/Segment/Snapshot table was created.
<CAMPAIGNCODE>
Code for the campaign associated with the flowchart for which
temp/Extract/Segment/Snapshot table was created.
<CAMPAIGNNAME>
Name of the campaign associated with the flowchart for which
temp/Extract/Segment/Snapshot table was created.
<DBUSER>
Database user name for the database where the
temp/Extract/Segment/Snapshot table was created.
<FLOWCHARTNAME>
Name of the flowchart associated with the
temp/Extract/Segment/Snapshot table creation.
<USER>
Campaign user name of the user running the flowchart.
<PASSWORD>
DB password from the current flowchart connection to the data source.
<KEYCOLUMNS>
Temp/Extract/Segment/Snapshot table column name(s).
<TABLENAME>
Temp/Extract/Segment/Snapshot table name.
© 2012 IBM Corporation
Q&A
Contact your Account Manager
for a product demonstration:
[email protected]
Learn more at
www.ibm.com/software/marketing-solutions/cafe
49
© 2012 IBM Corporation