How to Improve Planning Data Collections Performance

Solution Highlights
1.
Why do we need Data
Collection?
2.
Definitions
3.
Overview of Data Collection
Process
4.
How Refresh collection
Snapshots work and use of
MLOG$ tables?
5.
Performance of Data
Collection Process
How to Improve Planning Data Collections Performance
Oracle ASCP Collection Process Insight
About the Author
This whitepaper is written by Shailesh Kumar Gupta, a Value Chain Planning (VCP)
Senior Consultant with Rapidflow Apps. In addition to MBA in Supply Chain
Management, Shailesh brings solid consulting background in Oracle Manufacturing
and VCP product suites.
About the Company
Rapidflow Apps was founded with the sole objective of providing high quality,
innovative and practical solutions to the real world business problems of our clients in
the areas of Supply Chain Planning, Product Lifecycle Management, Master Data
Management and Business Intelligence.
Please send us your comments/feedback at [email protected]
For more information on ASCP data collection and plan performance issues, please
feel free to contact us at [email protected]. Rapidflow Apps bring years of
experience in handling very large data volumes and has drastically improved
planning processes performance at many clients using Oracle Advanced Supply Chain
Planning(ASCP) and Inventory Optimization(IO).
Executive Summary
Most of the companies using Oracle Advanced Planning Suite face performance issue with data collection. To
improve performance of data collection it is important to understand the process of data collection. This
document provides an insight to data collection process and then discusses some of the techniques to
improve data collection performance.
Contents of this white paper are as follows:
•
•
•
•
•
Why do we need Data Collection?
Definitions
Overview of Data Collection Process
How Refresh collection Snapshots work and use of MLOG$ tables?
Performance of Data Collection Process
Why do we need Data Collection?
Oracle planning engine uses fairly complex algorithms to suggest solutions to a business scenario. These
functions give best results when dealing with a static data, where as in real world scenarios the transactional
data can be highly dynamic. Thus, it may become infeasible for the planning engine to handle this type of
dynamic data. So, to deal with this situation static data is provided to planning engine. This is done through
data collection process which captures snapshot of all the transactional data that is relevant to planning and
stores in database table so that it can be used by planning engine.
Definitions
Applications Data Store (ADS): The set of source data tables in each transaction instance that contain
data relevant to planning.
Data Pull: The data collection process consists of the Data Pull and the Operational Data Store (ODS) Load.
The collection process lets you collect across several Oracle application versions. It supports several
configurations. The two types of collections process are standard and continuous.
Operational Data Store (ODS): The planning data tables in the planning server that act as destination for
the collected data from each of the data sources (both ADS and Legacy).
Planning Data Store (PDS): The outputs of the planning process. The PDS resides in the same data tables
Data Collection Process White Paper
Page 2 of 6
as the ODS. However, PDS data are marked with plan IDs that show which plans they correspond to, while
ODS data are marked with plan ID = -1.
Pull program: Collects the data from the ADS and stores the data into the staging tables. This pull program
is a registered AOL concurrent program that could be scheduled and launched by a system administrator. If
you are using a legacy program, you must write your own pull program.
ODS Load: A PL/SQL program which performs the data transform and moves the data from the staging
tables to the ODS. This collection program is a registered AOL concurrent program that could be scheduled
and launched by the system administrator.
Snapshots: Snapshots are the set of tables and materialized which are refreshed by Refresh Collection
Snapshots request. They store the data that is relevant to planning in source instance.
MLOG$ Tables: An MLOG$ table is the Snapshot Log table created against a base table. A row is inserted in
to the table each time a change (insert/update/delete) is made to the base table.
Overview of Data Collection Process
Data collection process starts with the launch of Planning Data Collection Request. This request triggers
Planning data pull Request. Planning Data Pull launches a spawned program called Refresh Collection
Snapshots. Refresh collection snapshots gathers data from source instance (ADS) and stores that data in
materialized views or snapshot tables. These objects are present at source instance. After all data has been
gathered in the snapshots, Planning Data Pull Workers are launched which collects data from snapshots and
populates planning staging tables in the Planning Server.
After data is staged at staging tables all the data is populated to relevant planning tables in ODS by Planning
ODS Load process with the help of ODS Load Workers. This data in ODS is characterized by plan_id = -1. Same
tables are also used by ASCP plans but in that case data is characterized by plan_id = plan_id of the plan.
Planning Data Collection
Source Instance
Planning Instance
Planning Data Pull
ADS
Refresh Collection
Snapshots
Snapshots
Planning Data
Pull Workers
Data Collection Process White Paper
Staging
Tables
Page 3 of 6
Planning ODS
Load
ODS Load workers
ODS
PDS
How Refresh collection Snapshots work and use of MLOG$ tables?
Every table in ADS that stores transactional data to be used in planning process has a snapshot associated
with it. Each snapshot has log table i.e. MLOG$ associated with it. For each change made in base table, a row
is inserted in MLOG$. When refresh collection snapshot is launched the programs looks for the rows present
in MLOG$ and for each row in MLOG$ the program inserts a row in snapshots from corresponding base
tables. All the details regarding data to be inserted like base table name, changes made etc. are present in
MLOG$ table.
Refresh Collection Snapshots can be run in three modes.
a) Fast- In this mode programs looks at the MLOG$ table and inserts a row in respective snapshot
corresponding to the row in MLOG$. After the data is inserted program deletes the row from MLOG$.
b) Complete- In this mode program by passes the MLOG$ and inserts all the data in snapshots directly
from base tables and program deletes all the data from MLOG$.
c) Automatic- In this mode programs looks at the MLOG$ table and inserts a row in respective
snapshot corresponding to the row in MLOG$. After the data is inserted program truncates the
MLOG$.
Performance of Data Collection Process
Size of MLOG$ govern performance of Data collection. Thus, every method that reduces size of MLOG$
improves the performance of Data Collection Process.
Various methods to improve data collection performance are:
a) Run Refresh Collection Snapshot once before Data Collection is launched.
Reason: Usually refresh collection snapshot is run in fast mode. This mode works on incremental
basis i.e it creates snapshot of the data that was created or updated after the last successful
snapshot. Thus, running this request before collection is launched will ensure that there is lesser data
to be refreshed at time of collection, hence the performance of collection will improve.
Pros: Improves Data Collection performance.
Cons: No disadvantages.
b) Run Gather Schema Statistics regularly
Reason: When large changes to the data or database take place, the system performance may get
slow. Gather Schema Statistics gathers information about the tables and the data in the tables so that
it can determine the best approach for querying the data in the tables. So, if we are experiencing
Data Collection Process White Paper
Page 4 of 6
poor performance then try running this process as it will usually help
significantly improve performance.
Pros: Apart from performance of Data Collection process it also improves performance of other
programs.
Cons: No disadvantages.
c) Run Refresh Collection Snapshot in Automatic mode on periodic basis, weekly or monthly.
Reason: When Refresh is performed in fast or complete mode then the RDBMS deletes the rows
from MLOGs while running Refresh in Automatic mode truncates the MLOGs.
When there are many rows being inserted in the MLOG on a regular basis, then the delete performed
by the RDBMS will not recover space and the size of the MLOG can grow. Data collections in
Automatic Refresh mode truncate the MLOG and then perform a Complete Refresh of the Snapshot.
Pros: In addition to improving performance of Refresh Collection Snapshots, it also recovers database
storage space.
Cons: While dealing with very high volumes sometimes it is found that after executing Automatic
Refresh performance of Refresh Snapshots is not improved or degenerates quickly. Therefore, if the
Automatic Refresh does not resolve your performance issues with Refresh Collection Snapshots , we
recommend the advanced Strategy for handling MLOGs .
Data Collection Process White Paper
Page 5 of 6
© 2010 Rapidflow Apps Inc. All rights reserved.
No part of this document may be reproduced in any form by any means, nor may it be distributed without
the permission of Rapidflow Apps Inc., nor may it be resold or distributed by any entity other than Rapidflow
Apps Inc., without prior written authorization of Rapidflow Apps Inc.
THIS DOCUMENT IS PROVIDED “AS-IS” WITHOUT WARRANTY OF ANY KIND. ALL EXPRESS OR IMPLIED
REPRESENTATIONS, CONDITIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF
MERCHANTIBILITY OR FITNESS FOR A PARTICULAR PURPOSE, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT
SUCH DISCLAIMERS ARE DETERMINED TO BE ILLEGAL.
Data Collection Process White Paper
Page 6 of 6