The Tuning Effect: Lightning-Fast PeopleSoft EPM (Without the Exas)
Session #33261
March 10, 2014

Your Presenter
• Reggie Gentle
• Extensive BI experience (before BI was BI, circa 1999)
• Former BI architecture consultant for a major Oracle BI competitor
• Responsible for ensuring consistent, efficient, and accurate delivery of BI/DW information to the campus community
• Architect for the full BI/ETL/DW stack
  • New environment build-out/performance tuning
  • End-user empowerment
• Experience with data warehouses exceeding 25 TB in size, on both Oracle and Netezza, and tables in excess of 10 billion rows
• Oracle training/certifications include:
  • DBA Performance Tuning Expert
  • Oracle SQL Expert

Florida State University is a premier, comprehensive, graduate research university with both law and medical schools.
• Annual operating budget: $1.1B
• Over 41,000 students
• Over 13,000 employees
• Over 13,000 biweekly paychecks
• Over $18 million in biweekly payroll
• "Most Efficient" high-quality university two years in a row (2013/2014, US News)

FSU's BI Profile
• Deployed the BI/DW solution in 2008
• Solution meets the reporting needs of our major administrative and academic organizations on campus
• Consists of 225 structured dashboards and 86 subject areas
• Over 1,855 distinct users and 7+ million requests
• Dashboard consumption and self-service reporting
• Currently completing Phase V of our CS deployment (CSW Admissions implementation)
• 2009 Oracle Innovation Award recipient

The Vision for BI @ FSU
FSU BI Vision: To deliver world-class analytic and business intelligence solutions that empower and inform the user community.
BI Primary Objectives:
• Change the paradigm of reporting to one of enabling power users and analysts to analyze structured summary and detailed information in an easy-to-access format
• Establish a unified analytic information platform for Financial, Human Resource, and Student data based on OBIEE
• Maintain accurate, high-quality, timely, and trustworthy information within the BI solutions so that University business decisions can be made and actions taken

FSU and Oracle
• Implemented Financials 8.4, Portal 8.8, and EPM 8.8 in June 2004
• Implemented HR/Payroll 8.8 in Dec 2004
• Upgraded the EPM & Portal suites to 9.0 in Nov 2007
• Upgraded the HR suite to 9.0 in Oct 2008
• Upgraded the FI suite to 9.1 in Jan 2011
• Deployed EPM 9.0 & OBIEE 10.1.3.3 (Windows) in Mar 2008
• Upgraded to OBIEE 10.1.3.4 (Linux) in April 2009
• Upgraded the EPM suite from 9.0 to 9.1 in Feb 2012
• Upgraded DataStage from 7.5 to 8.1, including a platform migration from Windows to Linux, in Feb 2012
• Upgraded OBIEE 10.x to 11.1.1.6 in July 2013
• Currently upgrading to IBM DataStage 8.5
• Q2/Q3 upgrades planned: OBIEE 11.1.1.8.x (if available) and a migration from AIX to Red Hat Enterprise Linux (RHEL) for the data warehouse

The Challenge...
Financials Nightly Load
• Must complete in under 6 hours each night to meet the Service Level Objective (SLO)
• Data must be accurate
• Data must reconcile to the penny against the transactional system
• Must be done without the use of any Exas
• Must be done with minimal to no human interaction
OBIEE Dashboards
• All university-approved dashboards must display data in 15 seconds or less (excluding real-time dashboards)
• Data must be timely and accurate
• Ad-hoc analyses (Answers) must be speedy, providing similar response times when building new documents
• Must be done without the use of any Exas
The Exas (The Competition)

What Makes Them So Fast?
• Flash Cache – generally comprised of internal SSDs; stores "hot" (frequently accessed) data on flash
• Storage Indexes – track min/max values for up to 8 columns per storage region. Queries consult the storage index first to determine which regions must be read to satisfy the query
• Hybrid Columnar Compression (HCC) – Oracle compresses the data, leaving a much smaller amount of data to scan during query execution
• Figures based on a quarter-rack X3-2: 16 DB cores, 36 storage (SQL offload) cores, 512 GB RAM, 2.4 TB flash cache, 7 TB SAS disks

What Makes Them So Fast?
In summary, the biggest key point from the previous slide: smaller data coupled with proper, equally small indexes = faster results.

The Home Team

Current Architecture

Data Warehouse Size
• ~1.5 TB on disk (excluding undo, archive, etc.)
• 1,920,996,115 rows spread across 902 tables (PSFT schema only; excludes EPM system tables)
• B-tree indexes (excluding unique) – 936 total, 2,220,448,600 rows
• Bitmap indexes – 472 total, 183,819,844 rows

Data Warehouse Size (cont.)
• The two largest tables:
  • PS_KK_ACTIVITY_LOG – 88,337,954 rows (growing ~1 million/month)
  • PS_HR_ACCTG_LINE – 85,497,468 rows (growing ~1 million/month)
• Largest fact table: PS_FSU_F_LEDKK_SUM – 50,820,071 rows (fact for the KK ledger)
• Largest dimension table: PS_D_JRNL_SOURCE – 38,271,118 rows
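Counts like these can be pulled straight from the data dictionary. A minimal sketch, assuming the warehouse objects live under an owner named PSFT (substitute your own schema) and that optimizer statistics are reasonably fresh, since NUM_ROWS reflects the last stats gather:

-- Table count and total rows for the warehouse schema
SELECT COUNT(*) AS table_count, SUM(num_rows) AS total_rows
FROM   dba_tables
WHERE  owner = 'PSFT';

-- Non-unique index counts and row totals, split by type (NORMAL = b-tree)
SELECT index_type, COUNT(*) AS index_count, SUM(num_rows) AS total_rows
FROM   dba_indexes
WHERE  owner = 'PSFT'
AND    uniqueness = 'NONUNIQUE'
GROUP  BY index_type;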
Is this speed achievable without the Exas?

First Things First!
• One must first understand WHAT goals are being sought.
• EPM ETL loads are NOT data-intensive loads, contrary to popular belief.
• Generally speaking, a faster DW does not equate to faster load times during the ETL window.

The E-T-L
Let's take a look at what the delivered ETLs actually do:
• Extract – siphon off new data from the respective transactional systems
• Transform – transform the data into a usable format geared toward the warehouse
• Load – load the newly enriched data into the target warehouse environment

The E-T-L Breakdown
Assuming a well-tuned environment, an average breakdown of EPM load time by stage:
• Extract – 15%
• Transform – 70%
• Load – 15%
Hmmm... that's interesting!

The Logic Test
If 70% of the ETL load happens during the transform stage, and the transform stage happens on the DataStage host, why would we expect to see 100x improvements by implementing Exadata for the data warehouse environment?

DataStage Job Tuning

Efficient Loading: Ledger KK
• Load times were generally inconsistent, with a tendency to time out (mutex timeouts)
• Heavily dependent on disk I/O speed due to heavy hashed-file usage
• Delivered with one hashed file for all prior-year data; once a hashed file exceeds 2 GB it must be converted to a 64-bit hashed file
• Sees volumes of 40k-100k rows nightly

Ledger KK (delivered)

Ledger KK (current)

Ledger KK cont.
• Avg. ~3 min run time nightly
• Volume of 40k-100k rows nightly
• Stage statistics:
46498 rows read from Trans_Sid_Lkps1_out
2008 rows read from HASH_PS_FSU_F_LEDGER_KK_2014_Exists
5620 rows read from HASH_PS_FSU_F_LEDGER_KK_2013_Exists
6445 rows read from HASH_PS_FSU_F_LEDGER_KK_2012_Exists
9199 rows read from HASH_PS_FSU_F_LEDGER_KK_2011_Exists
10171 rows read from HASH_PS_FSU_F_LEDGER_KK_2010_Exists
7213 rows read from HASH_PS_FSU_F_LEDGER_KK_2009_Exists
2902 rows read from HASH_PS_FSU_F_LEDGER_KK_2008_Exists
483 rows read from HASH_PS_FSU_F_LEDGER_KK_2007_Exists
156 rows read from HASH_PS_FSU_F_LEDGER_KK_2006_Exists
62 rows read from HASH_PS_FSU_F_LEDGER_KK_2005_Exists
2239 rows written to Trans_Assign_values_out_insert
44259 rows written to Trans_Assign_values_out_update
46498 rows written to UPDATE_PREV_ROWS_BATCH_SID
0 rows written to Trans_Load_Error_in
11.740 CPU seconds used, 13.000 seconds elapsed.

Ledger KK cont.
• Insert/update job
  • The insert takes ~12 seconds to run, with average volumes of 2k-50k rows
  • The update takes ~3 minutes to run, with average volumes of 10k-100k rows
44259 rows read from Trans_Assign_values_out
44259 rows written to Trans_Assign_values_out
3.200 CPU seconds used, 193.000 seconds elapsed.
• Total job (transform + load) run time: ~7-10 minutes for Ledger KK

Ledger KK (future)

Ledger KK (future)
• Each XFM will run as an individual process, allowing for "poor man's" parallel processing
• Reduces the memory footprint of each XFM process
• Data flows through the stream faster, reducing timeout issues
• Also removing all IPCs and using sequential files to buffer data

Database Tuning

Financials Milestones

Database Storage
• 32k-block-size tablespaces
• Tables with 1-5 million rows share a 32k tablespace
• Tables with more than 5 million rows get a dedicated tablespace, with a corresponding index tablespace for their indexes
• All surrogate ID (SID) columns get a single bitmap index (with more than 26 SID columns, determine the best candidates by query patterns)
• Housekeeping fields (update dttm, create dttm, data origin) on OWS tables get bitmap indexes (see the appendix for gotchas with bitmaps and deadlocking)
• Partition OWS tables based on the ETL tool's load needs, and MDW tables based on the BI tool's query needs
• SQL Profiles for query-path optimization
• Frequently review Tuning Advisor recommendations and implement profiles/recommendations whenever warranted (see the OBIEE trick in the appendix that allows OBIEE to utilize SQL profiles)
• Partitioned tables get dedicated tablespaces for the table and its indexes

Partitioning
• Take the approach of reducing query run times based on need; i.e., OWS tables are partitioned differently than MDW tables would be
• The WRH$/ASH tables are your best friend when determining frequently used queries
• Partition similar tables on the same key; e.g., header/line/source tables should share a partition key
• Recommended: compile all queries against a given table in a spreadsheet and identify the items commonly used in WHERE clauses
• Use local bitmap indexes on partitions when needed
• Goal: ~100k-250k rows per partition
• Over a million partitions per table are supported as of Oracle 11.2.0.4
• Interval partitions are auto-managed, so the old DBA management nightmare is no longer a reason to have fewer partitions with more data

Do we need medical assistance for any of the DBAs in the room?

Partitioning
• Yes, you heard me right: 100k-250k rows per partition.
• Oracle 11g R2 supports over 1 million partitions per table, so why not utilize what you've been given?
• If interval partitioning with auto-range is used, new partitions are added automatically, without any DBA involvement at all (see the sketch below).
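To make the interval idea concrete, here is a minimal sketch modeled on PS_KK_ACTIVITY_LOG. The _PART table name, seed boundary, index name, and monthly interval are illustrative assumptions, not delivered EPM DDL; in practice you would swap the rebuilt table in via a rename or DBMS_REDEFINITION.

-- Rebuild the table range-partitioned on KK_TRAN_DT; Oracle then creates
-- one new partition per month automatically as data arrives.
CREATE TABLE ps_kk_activity_log_part
PARTITION BY RANGE (kk_tran_dt)
INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
( PARTITION p_seed VALUES LESS THAN (DATE '2005-01-01') )  -- only the seed partition is declared
AS SELECT * FROM ps_kk_activity_log;

-- Local bitmap index: one small bitmap segment per partition, so a query
-- pruned to one month scans a correspondingly tiny index. (Bitmap indexes
-- on partitioned tables must be LOCAL.)
CREATE BITMAP INDEX psb_kk_activity_log_fy
  ON ps_kk_activity_log_part (fiscal_year) LOCAL;

At ~1 million new rows per month on this table, a monthly interval overshoots the 100k-250k per-partition goal; something like NUMTODSINTERVAL(7, 'DAY') would land closer.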
Partitioning
First up: KK Activity Log
• ~88 million rows
• Heavily utilized for selecting "changed data" after OWS loads have pushed new rows into the table
• May be selected against multiple times throughout the load window
• Used heavily for reconciliation

Partitioning

TABLE_NAME          TABLESPACE_NAME       PART?  NUM_ROWS    CANDIDATE?  EXECS (LAST 90 DAYS)  AVG EXEC SECS  TOTAL EXEC SECS  POTENTIAL PART KEY  INDEXED FIELDS (assume a separate index per field)
PS_KK_ACTIVITY_LOG  FSU_OWS_KKACTLOG_32K  NO     88,337,954  Yes         2022                  379            764,013.43       KK_TRAN_DT          KK_TRAN_ID, KK_TRAN_LN, DEPT_ID, FISCAL_YEAR, ACCOUNTING_PERIOD, FUND_CODE, LASTUPD_EW_DTTM
PS_KK_SOURCE_LN     FSU_KKSRCLN_32K       NO     27,977,243  Yes         635                   832            121,540.00       KK_TRAN_DT          KK_TRAN_ID, KK_TRAN_LN, DATA_ORIGIN, SRC_SYS_ID, LASTUPD_EW_DTTM
PS_KK_SOURCE_HDR    FSU_KKSRCHDR_32K      NO     12,444,537  Yes         2274                  335            760,836.22       KK_TRAN_DT          KK_TRAN_ID, KK_DATETIME_STAMP, LASTUPD_EW_DTTM
PS_HR_ACCTG_LINE    FSU_OWS_HR_ACCTG_32K  NO     85,497,468  Yes         440                   343            162,043.00       RUN_DT              SEQNUM, KK_AMOUNT_TYPE, LASTUPD_EW_DTTM

Partitioning Cont.
PS_KK_ACTIVITY_LOG – top statements over the last 90 days, with the columns each uses in its predicates (Y); predicate usage of DEPT_ID, FISCAL_YEAR, ACCOUNTING_PERIOD, and FUND_CODE was tracked as well:

SQL_ID         EXECS (90 DY)  AVG ELAPSED SECS  TOTAL SECS   KK_TRAN_DT  KK_TRAN_ID  KK_TRAN_LN
bx68tnvc9uyfc  771            117.0859883       90273.297    Y           Y           Y
as745pchbankc  475            160.1717642       76081.588    Y           Y           Y
18vdbyt0hpq04  439            405.3001207       177926.753
gdf92xbmb84ua  110            98.41661818       10825.828    Y           Y
gbdrdma44d0s9  68             2749.091735       186938.238   Y           Y
20n0bndtk7bbc  66             2880.288136       190099.017   Y           Y
ctr1twfqc8t20  56             559.9732321       31358.501
2xzs117sd358s  17             1.766             30.022       Y
35sxczd4au591  16             5.6280625         90.049       Y
83czfyy7aa1mv  2              195.07            390.14

Initial Thoughts
• Partition key: KK_TRAN_DT
• Indexes: KK_TRAN_ID, KK_TRAN_LN, DEPT_ID, FISCAL_YEAR, ACCOUNTING_PERIOD, FUND_CODE

Partitioning
-- Pull the ASH samples for statements that touch PS_KK_ACTIVITY_LOG
select distinct sh.snap_id,
       h.sql_id,
       dbms_lob.substr(sql_text, 100, 1) sqlbit,
       dbms_lob.substr(sql_text, 100, dbms_lob.getlength(sql_text) - 100) namech,
       u.username,
       sh.sql_plan_hash_value,
       sh.sample_time,
       sh.session_id
from   sys.wrh$_active_session_history sh,
       dba_hist_sqltext h,
       dba_users u,
       dba_hist_ash_snapshot dhas
where  dbms_lob.instr(h.sql_text, 'PS_KK_ACTIVITY_LOG', 1, 1) > 1
and    h.sql_id = sh.sql_id
and    u.user_id = sh.user_id
--and  h.sql_id = <the sql_id you'd like to analyze further>
and    dhas.snap_id = sh.snap_id
and    u.username in (<the user ID your ETLs run under>)
and    sh.machine = <your university ETL server name>
and    dhas.begin_interval_time between to_date('06-01-2013:00', 'MM-DD-YYYY:HH24')
                                    and to_date('07-15-2013:23', 'MM-DD-YYYY:HH24')
order  by sample_time desc

Partitioning

OBIEE Tuning

OBIEE Tuning
• Store the web catalog on SSD/high-speed storage
• Store user/query cache files on a RAM drive
• Turn off the query identifier to reduce hard parsing and plan flopping
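The SQL profiles mentioned under Database Storage come out of exactly this kind of analysis. A minimal sketch of the Tuning Advisor/profile workflow (the Oracle Tuning Pack license is required); the SQL_ID is taken from the table above, and the task and profile names are illustrative:

DECLARE
  l_task VARCHAR2(64);
BEGIN
  -- Create and run a tuning task against a captured SQL_ID
  l_task := DBMS_SQLTUNE.CREATE_TUNING_TASK(
              sql_id    => 'bx68tnvc9uyfc',
              task_name => 'tune_kk_activity_log');
  DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name => l_task);
END;
/

-- Review the advisor's findings
SELECT DBMS_SQLTUNE.REPORT_TUNING_TASK('tune_kk_activity_log') FROM dual;

-- If a profile is recommended, accept it so the optimizer starts using it
BEGIN
  DBMS_SQLTUNE.ACCEPT_SQL_PROFILE(
    task_name => 'tune_kk_activity_log',
    name      => 'profile_kk_activity_log');
END;
/

A profile only matches future executions whose SQL text normalizes to the same signature, which is why the OBIEE query-identifier trick in the appendix matters.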
The Future?

Future Architecture

Questions?

Contacts
• Reggie Gentle
• BI/EDW Architect
• Reporting and Information Services
• Florida State University
• E-mail: [email protected]

This presentation and all Alliance 2014 presentations are available for download from the conference site at www.alliance-conference.com. Presentations from previous meetings are also available.

Appendix – DW Config
• Heavy usage of profiles – around 150+ SQL profiles are used to influence the optimizer
• Profiles may differ between environments, but are generally migratable
• Memory settings:
  • SGA target set to 0 (automatically managed)
  • PGA target set to 0 (automatically managed)
• Auto parallel is disabled, but manual parallel statement hinting is allowed and is utilized for large table scans where index usage is not possible; also used for rebuilding bitmap indexes and materialized views

Appendix – Partitioning SQL
SELECT sql_id, COUNT(*), AVG(seconds_diff), sqlbit
FROM (
  SELECT session_id,
         sql_plan_hash_value,
         sql_id,
         MIN(sample_time) start_time,
         -- Elapsed time per (session, plan, statement) as seconds
         ((EXTRACT(HOUR   FROM (MAX(sample_time) - MIN(sample_time))) * 60 * 60)
        + (EXTRACT(MINUTE FROM (MAX(sample_time) - MIN(sample_time))) * 60)
        +  EXTRACT(SECOND FROM (MAX(sample_time) - MIN(sample_time)))) seconds_diff,
         (MAX(sample_time) - MIN(sample_time)) time_diff,
         sqlbit
  FROM (
    SELECT DISTINCT sh.snap_id,
           h.sql_id,
           DBMS_LOB.substr(sql_text, 100, 1) sqlbit,
           DBMS_LOB.substr(sql_text, 100, DBMS_LOB.getlength(sql_text) - 100) namech,
           u.username,
           sh.sql_plan_hash_value,
           sh.sample_time,
           sh.session_id
    FROM   sys.wrh$_active_session_history sh,
           sys.dba_hist_sqltext h,
           dba_users u,
           dba_hist_ash_snapshot dhas
    WHERE  sh.snap_id BETWEEN 1 AND 100
    AND    DBMS_LOB.instr(h.sql_text, 'PS_KK_ACTIVITY_LOG', 1, 1) > 0
    AND    h.sql_id = sh.sql_id
    AND    u.user_id = sh.user_id
    AND    dhas.snap_id = sh.snap_id
  ) a
  GROUP BY session_id, sql_plan_hash_value, sqlbit, sql_id
  HAVING (MAX(sample_time) - MIN(sample_time)) < INTERVAL '0 06:00:00.49' DAY TO SECOND
  ORDER BY MIN(sample_time) DESC, session_id DESC
)
GROUP BY sql_id, sqlbit

Appendix – Index Rebuild SPROC
-- Mark the table's bitmap indexes unusable before the load
SET SERVEROUTPUT ON SIZE 1000000
BEGIN
  FOR t IN (SELECT index_name
            FROM   all_indexes
            WHERE  table_name IN ('PS_F_CLASS_ENRLMT')
            AND    index_type = 'BITMAP')
  LOOP
    -- Loop through the list of indexes and mark each bitmap unusable
    EXECUTE IMMEDIATE 'ALTER INDEX ' || t.index_name || ' UNUSABLE';
    dbms_output.put_line(t.index_name);
  END LOOP;
END;
/

-- Rebuild the previously unusable indexes after the load
SET SERVEROUTPUT ON SIZE 1000000
BEGIN
  FOR t IN (SELECT index_name
            FROM   all_indexes
            WHERE  table_name IN ('PS_F_CLASS_ENRLMT')
            AND    index_type = 'BITMAP')
  LOOP
    -- Loop through the list of indexes and rebuild those which are bitmaps
    EXECUTE IMMEDIATE 'ALTER INDEX ' || t.index_name || ' REBUILD';
    dbms_output.put_line(t.index_name);
  END LOOP;
END;
/

Appendix – OBIEE SQL Profile Usage
• Config setting to remove the per-request identifier from the SQL sent to the database (also described in Oracle bug 14722496): the variable DISABLE_SQL_HASH is set in opmn.xml for the OBIS process.

<ias-component id="coreapplication_obis1" inherit-environment="true">
  <environment>
    <variable id="DISABLE_SQL_HASH" value="1"/>
    <!-- remaining environment entries unchanged -->
  </environment>
</ias-component>
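For reference, the manual parallel hinting described in the DW Config appendix looks like the following. This is a sketch only; the table, index, column, and degree of parallelism are illustrative assumptions, not FSU's actual objects or settings.

-- Manual parallel hint for a large full scan where no index can help
SELECT /*+ FULL(f) PARALLEL(f 8) */ fiscal_year, SUM(posted_total_amt)
FROM   ps_fsu_f_ledkk_sum f
GROUP  BY fiscal_year;

-- Parallel rebuild of a bitmap index, then reset the index's degree so
-- ordinary queries are not parallelized unintentionally
ALTER INDEX psb_ledkk_sum_fy REBUILD PARALLEL 8;
ALTER INDEX psb_ledkk_sum_fy NOPARALLEL;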