
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.

Implement Best Practices for Extreme Performance with Oracle Data Warehousing
Maria Colgan
Principal Product Manager

Agenda
• The three Ps of Data Warehousing
  – Power
  – Partitioning
  – Parallel Execution
• Data Loading
• Workload Management
  – Statistics management
  – Initialization Parameters
  – Workload Monitoring

The Three Ps

3 Ps - Power, Partitioning, Parallelism
• Balanced Hardware Configuration
  – Weakest link defines the throughput
• Larger tables or fact tables should be partitioned
  – Facilitates data load, data elimination and join performance
  – Enables easier Information Lifecycle Management
• Parallel Execution should be used
  – Instead of one process doing all the work, multiple processes working concurrently on smaller units
  – Parallel degree should be a power of 2

Balanced Configuration - "The weakest link" defines the throughput
[Diagram: four database servers, each with two HBAs, connected through two FC switches to eight disk arrays]
• CPU quantity and speed dictate the number of HBAs and the capacity of the interconnect
• HBA quantity and speed dictate the number of disk controllers and the speed and quantity of switches
• Controller quantity and speed dictate the number of disks and the speed and quantity of switches
• Disk quantity and speed

Sun Oracle Database Machine - A Balanced Hardware Configuration for Extreme Performance
RAC Database Server Grid
• 8 high-performance low-cost compute servers
• 2 Intel quad-core Xeons each
InfiniBand Network
• 3 36-port InfiniBand switches
• 880 Gb/sec aggregate throughput
Exadata Storage Server Grid
• 14 high-performance low-cost storage servers
• 100 TB raw SAS disk storage
• 5 TB of Flash storage
• 21 GB/sec disk bandwidth
• 50 GB/sec flash bandwidth
• 100 GB/sec memory bandwidth

Partitioning
• Range partition large fact tables, typically on a date column
  – Consider data loading frequency
    • Is an incremental load required?
    • How much data is involved - a day, a week, a month?
  – Partition pruning for queries
    • What range of data do the queries touch - a quarter, a year?
• Subpartition by hash to improve join performance between fact tables and/or dimension tables
  – Pick the common join column
  – If all dimensions have different join columns, use the join column for the largest dimension or the most common join in the queries
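A minimal sketch of a fact table following these recommendations (table, column, and partition names are hypothetical, not from the slides):

  CREATE TABLE sales
  ( sales_date   DATE
  , cust_id      NUMBER
  , sales_amount NUMBER
  )
  PARTITION BY RANGE (sales_date)
  SUBPARTITION BY HASH (cust_id) SUBPARTITIONS 8   -- power of 2, on the common join column
  ( PARTITION p_2008_05 VALUES LESS THAN (TO_DATE('01-06-2008','DD-MM-YYYY'))
  , PARTITION p_2008_06 VALUES LESS THAN (TO_DATE('01-07-2008','DD-MM-YYYY'))
  );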

Partition Pruning
Q: What was the total sales for the weekend of May 20 - 22 2008?

  Select sum(sales_amount)
  From SALES
  Where sales_date between
        to_date('05/20/2008','MM/DD/YYYY')
    and to_date('05/23/2008','MM/DD/YYYY');

The Sales table is partitioned by day (May 18th - May 24th 2008); only the 3 relevant partitions are accessed.

Partition-Wise Join

  Select sum(sales_amount)
  From SALES s, CUSTOMER c
  Where s.cust_id = c.cust_id;

Both tables have the same degree of parallelism and are partitioned the same way on the join column (cust_id): each range partition of Sales (e.g. May 18th 2008) is hash subpartitioned into Sub part 1 - 4, and Customer is hash partitioned into the matching Sub part 1 - 4.

A large join is divided into multiple smaller joins; each joins a pair of partitions in parallel.

Execution plan for partition-wise join
PX PARTITION HASH ALL above the join & a single PQ set indicate a partition-wise join

  ID  Operation                    Name       Pstart  Pstop  TQ     PQ Distrib
   0  SELECT STATEMENT
   1   PX COORDINATOR
   2    PX SEND QC (RANDOM)        :TQ10001                  Q1,01  QC (RAND)
   3     SORT GROUP BY                                       Q1,01
   4      PX RECEIVE                                         Q1,01
   5       PX SEND HASH            :TQ10000                  Q1,00  HASH
   6        SORT GROUP BY                                    Q1,00
   7         PX PARTITION HASH ALL             1      128    Q1,00
   8          HASH JOIN                                      Q1,00
   9           TABLE ACCESS FULL   Customers   1      128    Q1,00
  10           TABLE ACCESS FULL   Sales       1      128    Q1,00

How Parallel Execution works
• User connects to the database; a background process is spawned
• When the user issues a parallel SQL statement, the background process becomes the Query Coordinator (QC)
• QC gets parallel servers from the global pool and distributes the work to them
• Parallel servers - individual sessions that perform work in parallel - are allocated from a pool of globally available parallel server processes & assigned to a given operation
• Parallel servers communicate among themselves & the QC using messages that are passed via memory buffers in the shared pool

Parallel Execution Plan

  SELECT c.cust_name, s.purchase_date, s.amount
  FROM sales s, customers c
  WHERE s.cust_id = c.cust_id;

  ID  Operation                 Name       TQ     IN-OUT  PQ Distrib
   0  SELECT STATEMENT
   1   PX COORDINATOR                                       <- Query Coordinator
   2    PX SEND QC (RANDOM)                Q1,01  P->S
   3     HASH JOIN                         Q1,01  PCWP
   4      PX RECEIVE                       Q1,01  PCWP
   5       PX SEND BROADCAST               Q1,00  P->P    BROADCAST
   6        PX BLOCK ITERATOR              Q1,00  PCWP
   7         TABLE ACCESS FULL  CUSTOMERS  Q1,00  PCWP
   8      PX BLOCK ITERATOR                Q1,01  PCWP
   9       TABLE ACCESS FULL    SALES      Q1,01  PCWP

The parallel servers do the majority of the work.

Parallel Execution of a Query

  SELECT c.cust_name, s.date, s.amount
  FROM sales s, customers c
  WHERE s.cust_id = c.cust_id;

[Diagram: one set of parallel servers (Producers) scans the tables and passes rows to a second set (Consumers), which performs the join]

Producers and Consumers in the execution plan

  ID  Operation                 Name       TQ     IN-OUT  PQ Distrib
   0  SELECT STATEMENT                                      <- Query Coordinator
   1   PX COORDINATOR
   2    PX SEND QC (RANDOM)                Q1,02  P->S
   3     HASH JOIN                         Q1,02  PCWP    <- Consumers
   4      PX RECEIVE                       Q1,02  PCWP
   5       PX SEND HASH                    Q1,00  P->P    HASH
   6        PX BLOCK ITERATOR              Q1,00  PCWP    <- Producers
   7         TABLE ACCESS FULL  CUSTOMERS  Q1,00  PCWP
   8      PX RECEIVE                       Q1,02  PCWP
   9       PX SEND HASH                    Q1,01  P->P    HASH
  10        PX BLOCK ITERATOR              Q1,01  PCWP    <- Producers
  11         TABLE ACCESS FULL  SALES      Q1,01  PCWP

Parallel Execution of a Scan
Example: full scan of the sales table, performed by parallel servers PQ 1, PQ 2 and PQ 3
• Data is divided into Granules - block range or partition
• Each Parallel Server is assigned one or more Granules
• No two Parallel Servers ever contend for the same Granule
• Granules are assigned so that the load is balanced across all Parallel Servers
• Dynamic Granules chosen by the optimizer
• Granule decision is visible in execution plan

Identifying Granules of Parallelism during scans in the plan

Controlling Parallel Execution on RAC
1. Use RAC Services
Create two services, e.g. one for ETL and one for ad-hoc queries:

  srvctl add service -d database_name -s ETL -r sid1,sid2
  srvctl add service -d database_name -s ADHOC -r sid3,sid4

2. PARALLEL_FORCE_LOCAL - new parameter that forces a parallel statement to run on just the node it was issued on. Default is FALSE.

Use Parallel Execution with common sense
• Parallel execution provides a performance boost but requires more resources
• General rules of thumb for determining the appropriate DOP:
  – objects smaller than 200 MB should not use any parallelism
  – objects between 200 MB and 5 GB should use a DOP of 4
  – objects beyond 5 GB should use a DOP of 32
• Mileage may vary depending on concurrent workload and hardware configuration
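One way to apply these rules of thumb is to set a default DOP on the object or use a statement-level hint; a minimal sketch (the table name is hypothetical, and the 32 assumes an object beyond 5 GB):

  -- Default DOP on the object
  ALTER TABLE sales PARALLEL 32;

  -- Or per statement, via a hint
  SELECT /*+ PARALLEL(s, 32) */ COUNT(*)
  FROM sales s;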

Data Loading

Access Methods
Ranked by bulk performance: Flat Files, TTS, Data Pump, XML Files, DB Links, Web Services, Heterogeneous

Data Loading Best Practices
• External Tables
  – Allows a flat file to be accessed via SQL or PL/SQL as if it were a table
  – Enables complex data transformations & data cleansing to occur "on the fly"
  – Avoids space wastage
• Pre-processing
  – Ability to specify a program that the access driver will execute to read the data
  – Specify gunzip to decompress a .gzip file "on the fly" while it is being read
• Direct Path in parallel
  – Bypasses buffer cache and writes data directly to disk via multi-block async I/O
  – Use parallel to speed up the load
  – Remember to use ALTER SESSION ENABLE PARALLEL DML
• Range Partitioning
  – Enables partition exchange loads
• Data Compression

SQL*Loader or External Tables?
And the winner is… External Tables. Why:
• Full usage of SQL capabilities directly on the data
• Automatic use of parallel capabilities (just like a table)
• No need to stage the data again
• Better allocation of space when storing data
  – High watermark brokering
  – Autoallocate tablespace will trim extents after the load
• Interesting capabilities like
  – The usage of data pump
  – The usage of pre-processing

Tips for External Tables
• File locations and size
  – When using multiple files the file size should be similar
  – List largest to smallest in the LOCATION clause if not similar in size (see the sketch below)
• File formats
  – Use a format allowing position-able and seek-able scans
  – Delimit clearly and use a well-known record terminator to allow for automatic granulation
  – Always specify the character set if it is different to the database
• Consider compressing data files and uncompressing during loading
• Run all queries before the data load to populate column usage for histogram creation during statistics gathering
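A minimal sketch of these tips in DDL (directory, file, and column names are hypothetical):

  CREATE TABLE sales_ext
  ( sales_date   VARCHAR2(10)
  , cust_id      NUMBER
  , sales_amount NUMBER
  )
  ORGANIZATION EXTERNAL
  ( TYPE ORACLE_LOADER
    DEFAULT DIRECTORY data_dir
    ACCESS PARAMETERS
    ( RECORDS DELIMITED BY NEWLINE      -- well-known record terminator
      CHARACTERSET WE8ISO8859P1         -- state it if it differs from the database
      FIELDS TERMINATED BY ','
    )
    -- dissimilar file sizes: list largest to smallest
    LOCATION ('sales_big.dat', 'sales_mid.dat', 'sales_small.dat')
  )
  PARALLEL;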

Pre-Processing in an External Table
• New functionality in 11.1.0.7 and 10.2.0.5
• Allows flat files to be processed automatically during load
  – e.g. decompression of large zipped files
• Pre-processing doesn't support automatic granulation
  – Need to supply multiple data files - the number of files will determine the DOP
• Need to grant read, execute privileges on the directories

  CREATE TABLE sales_external ( … )
  ORGANIZATION EXTERNAL
  ( TYPE ORACLE_LOADER
    DEFAULT DIRECTORY data_dir1
    ACCESS PARAMETERS
    ( RECORDS DELIMITED BY NEWLINE
      PREPROCESSOR exec_dir:'gunzip'
      FIELDS TERMINATED BY ' '
    )
    LOCATION ( … )
  );

Direct Path Load
• Data is written directly to the database storage using multiple blocks per I/O request using asynchronous writes
• A CTAS command always uses direct path
• An Insert As Select needs an APPEND hint to go direct:

  Insert /*+ APPEND */ into Sales partition(p2)
  Select * From ext_tab_for_sales_data;

• Only one direct path operation can occur on an object
  – By specifying a specific partition name in the table you can do multiple concurrent direct path loads into a partitioned table

Parallel Load
• Ensure direct path loads go parallel
  – Specify parallel degree either with a hint or on both tables
  – Enable parallelism by issuing the alter session command
• CTAS will go parallel automatically when a DOP is specified
• IAS will not - it needs parallel DML to be enabled:

  ALTER SESSION ENABLE PARALLEL DML;
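Putting these together, a parallel direct-path load might look like the following sketch (table names hypothetical, DOP of 8 as an example):

  ALTER SESSION ENABLE PARALLEL DML;

  -- IAS: needs the APPEND hint plus parallel DML enabled
  INSERT /*+ APPEND PARALLEL(sales, 8) */ INTO sales
  SELECT /*+ PARALLEL(e, 8) */ * FROM sales_ext e;
  COMMIT;

  -- CTAS: goes parallel automatically once a DOP is specified
  CREATE TABLE sales_copy PARALLEL 8 NOLOGGING
  AS SELECT * FROM sales;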

Partition Exchange Loading
1. Create external table for the flat files
2. Use a CTAS command to create a non-partitioned table TMP_SALES
3. Create indexes on TMP_SALES
4. ALTER TABLE Sales EXCHANGE PARTITION May_24_2008 WITH TABLE tmp_sales
   - the Sales table now has all the data
5. Gather statistics
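A sketch of steps 2 - 4 in SQL (partition, index, and table names hypothetical):

  -- 2. Load into a non-partitioned staging table via CTAS
  CREATE TABLE tmp_sales NOLOGGING PARALLEL
  AS SELECT * FROM sales_ext;

  -- 3. Create matching indexes on the staging table
  CREATE INDEX tmp_sales_cust_idx ON tmp_sales (cust_id);

  -- 4. Swap the staging table in as the May 24th partition (a fast, metadata-only operation)
  ALTER TABLE sales
    EXCHANGE PARTITION p_2008_05_24 WITH TABLE tmp_sales
    INCLUDING INDEXES WITHOUT VALIDATION;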

Data Compression
• Use if data being loaded will be read / used more than once
• Works by eliminating duplicate values within a database block
• Reduces disk and memory usage, often resulting in better scale-up performance for read-only operations
• Requires additional CPU during the initial data load
• But what if the workload requires conventional DML access to the data after it has been loaded?
  – Use COMPRESS FOR ALL OPERATIONS
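In 11g the clause goes on the table definition; a minimal sketch (table and column names hypothetical):

  CREATE TABLE sales_history
  ( sales_date   DATE
  , cust_id      NUMBER
  , sales_amount NUMBER
  )
  COMPRESS FOR ALL OPERATIONS;  -- stays compressed under conventional DML
  -- Plain COMPRESS (direct-load compression) is enough for bulk-load-only workloads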

Workload Monitoring

Statistics gathering
• You must gather optimizer statistics
  – Using dynamic sampling is not an adequate solution
• Run all queries against empty tables to populate column usage
  – This helps identify which columns automatically get histograms created on them
• Optimizer statistics should be gathered after the data has been loaded but before any indexes are created
  – Oracle will automatically gather statistics for indexes as they are being created

Statistics Gathering
• By default DBMS_STATS gathers the following stats for each table:
  – global (table level)
  – partition level
  – sub-partition level
• Optimizer uses global stats if a query touches two or more partitions
• Optimizer uses partition stats if queries do partition elimination and only one partition is necessary to answer the query
  – If queries touch two or more partitions the optimizer will use a combination of global and partition level statistics
• Optimizer uses sub-partition level statistics if your queries do partition elimination and only one sub-partition is necessary to answer the query
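DBMS_STATS exposes these levels through its granularity parameter; a sketch of gathering at a specific level (schema, table, and partition names hypothetical):

  -- Default granularity 'AUTO' maintains global, partition, and sub-partition stats
  EXEC DBMS_STATS.GATHER_TABLE_STATS('SH', 'SALES', granularity => 'AUTO');

  -- Gather statistics for a single partition only
  EXEC DBMS_STATS.GATHER_TABLE_STATS('SH', 'SALES', partname => 'P_2008_05_24', granularity => 'PARTITION');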

Efficient Statistics Management
• How do I gather accurate statistics?
  – "Compute statistics gives accurate results but takes too long"
  – "Sampling is fast but not always accurate"
  – "AUTO_SAMPLE_SIZE does not always work with data skew"
• New groundbreaking implementation for AUTO_SAMPLE_SIZE
  – Faster than sampling
  – Accuracy comparable to compute statistics
• Gathering statistics on one partition (e.g. after a bulk load) causes a full scan of all partitions to gather global table statistics
  – Extremely time and resource intensive
• Use incremental statistics
  – Gather statistics for touched partition(s) ONLY
  – Table (global) statistics are built from partition statistics

Incremental Global Statistics
1. Partition level stats are gathered & a synopsis is created for each partition of the Sales table (May 18th - May 23rd 2008); synopses are stored in the Sysaux tablespace
2. Global stats generated by aggregating the partition synopses

Incremental Global Statistics Cont'd
3. A new partition (May 24th 2008) is added to the Sales table & data is loaded
4. Gather statistics for the new partition
5. Retrieve the synopsis for each of the other partitions from Sysaux
6. Global stats generated by aggregating the original partition synopses with the new one

Steps necessary to gather accurate statistics
• Turn on the incremental feature for the table:

  EXEC DBMS_STATS.SET_TABLE_PREFS('SH','SALES','INCREMENTAL','TRUE');

• After the load, gather table statistics using the GATHER_TABLE_STATS command - no need to specify many parameters:

  EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','SALES');

• The command will collect statistics for partitions and update the global statistics based on the partition level statistics and synopses
• Possible to set incremental to true for all tables using:

  EXEC DBMS_STATS.SET_GLOBAL_PREFS('INCREMENTAL','TRUE');

Initialization parameters
Only set what you really need to

  Parameter                        Value       Comments
  compatible                       11.1.0.7.0  Needed for Exadata
  db_block_size                    8 KB        Larger size may help with compression ratio
  db_cache_size                    5 GB        Large enough to hold metadata
  parallel_adaptive_multi_user     FALSE       Can cause unpredictable response times as it is based on concurrency
  parallel_execution_message_size  16 KB       Improves parallel server processes communication
  parallel_min_servers             64          Avoids query startup costs
  parallel_max_servers             128         Prevents systems from being flooded by parallel servers
  pga_aggregate_target             18 GB       Tries to keep sorts in memory
  shared_pool_size                 4 GB        Large enough for PX communication and SQL Area
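These are ordinary initialization parameters; as a sketch, the dynamic ones can be set with ALTER SYSTEM (values taken from the table above):

  ALTER SYSTEM SET parallel_adaptive_multi_user = FALSE SCOPE=BOTH;
  ALTER SYSTEM SET parallel_min_servers = 64  SCOPE=BOTH;
  ALTER SYSTEM SET parallel_max_servers = 128 SCOPE=BOTH;
  -- parallel_execution_message_size is static: set it in the spfile and restart
  ALTER SYSTEM SET parallel_execution_message_size = 16384 SCOPE=SPFILE;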

Using EM to monitor Parallel Query
Click on the Performance tab

Parallel Execution screens
Click on the SQL Monitoring link

Using EM to monitor Parallel Query
Click on a SQL ID to drill down to more details
Shows the parallel degree used and the number of nodes used in the query

SQL Monitoring Screens - PWJ
Only one set of parallel servers

Using EM to monitor Parallel Query
The screen shows the Coordinator, Consumers and Producers

SQL Monitoring screens
Click on the Parallel tab to get more info on PQ
The green arrow indicates which line in the execution plan is currently being worked on

SQL Monitoring Screens
By clicking on the tab you can get more detail about what each individual parallel server is doing. You want to check that each slave is doing an equal amount of work.

Disk Configuration with ASM

For More Information
search.oracle.com
Best Practices for Data Warehousing:
http://www.oracle.com/technology/products/bi/db/11g/pdf/twp_dw_best_practies_11g11_2008_09.pdf

Exadata Sessions

  Date       Time     Room               Session Title
  Mon 10/12  5:30PM   Moscone South 307  S311436 - Implement Best Practices for Extreme Performance with Oracle Data Warehouses
  Tue 10/13  11:30AM  Moscone South 307  S311385 - Extreme Backup and Recovery on the Oracle Database Machine
  Tue 10/13  1:00PM   Moscone South 307  S311437 - Achieve Extreme Performance with Oracle Exadata and Oracle Database Machine
  Tue 10/13  1:00PM   Moscone South 102  S311358 - Oracle's Hybrid Columnar Compression: The Next-Generation Compression Technology
  Tue 10/13  2:30PM   Moscone South 102  S311386 - Customer Panel 1: Exadata Storage and Oracle Database Machine Deployments
  Tue 10/13  4:00PM   Moscone South 102  S311387 - Top 10 Lessons Learned Implementing Oracle and Oracle Database Machine
  Tue 10/13  5:30PM   Moscone South 308  S311420 - Extreme Performance with Oracle Database 11g and In-Memory Parallel Execution
  Tue 10/13  5:30PM   Moscone South 104  S311239 - The Terabyte Hour with the Real-World Performance Group
  Tue 10/13  5:30PM   Moscone South 252  S310048 - Oracle Beehive and Oracle Exadata: The Perfect Match
  Wed 10/14  4:00PM   Moscone South 102  S311387 - Top 10 Lessons Learned Implementing Oracle and Oracle Database Machine
  Wed 10/14  5:00PM   Moscone South 104  S311383 - Next-Generation Oracle Exadata and Oracle Database Machine: The Future Is Now
  Thu 10/15  12:00PM  Moscone South 307  S311511 - Technical Deep Dive: Next-Generation Oracle Exadata Storage Server and Oracle Database Machine
