Parallel Query in PostgreSQL: How Not To (mis)use It?

Transcription

Parallel Query in PostgreSQL: How not to (mis)use it?
Amit Kapila (Senior DB Architect)
Rafia Sabih (Software Engineer)
© 2013 EDB All rights reserved.

Overview

- Intra-query parallelism till PG v11
- Parallel-query flow in PG
- Supported parallel operators
- How to get the most from parallel query
  - Tuning parameters
  - Dos and don'ts of parallel operators
- Comparison with contemporary database engines for "parallel infrastructure"
- Take away

Parallel Query flow: Scans and Aggregates

1. The master spawns the required number of workers and also works as one of the workers.
2. Each worker scans part of the relation, and together they scan the complete table.
3. The nodes below are the parallel ones, and the nodes above are the serial ones.

[Figure: master coordinating three workers]

Parallel Query flow: Scans and Aggregates

1. A number of workers are spawned once the decision to use a parallel operator is made.
2. The leader backend that spawns the workers runs the Gather node, which coordinates the task.

[Figure: the leader's Gather node collecting from workers scanning the table]

Parallel Query flow: Scans and Aggregates

1. Each worker performs the scan, applies the filter, etc. on the tuples of the pages received by that worker.
2. When completed, it transfers the resultant tuples to the master.
3. In case of aggregates, workers can only perform the aggregate on the tuples they received; hence the master performs the final aggregate over the workers' partial results.

Parallel Query flow: Scans and Aggregates

1. In case of joins, at least one of the tables is scanned by a set of parallel workers.
2. Each worker then scans the inner table for the join.
3. The resultant tuples are then passed to the leader.

[Figure: serial join (Scan on T1, Scan on T2) vs. join with a Parallel Scan on T1 and a Scan on T2]

Intra-query parallel support

- Parallel executor support
  - Pass the plan state tree for execution to workers
  - Execute a plan by a set of workers
- Parallel-aware executor nodes
  - Different behaviours when run in parallel and otherwise
- Gather (Merge)
  - Collects results across all workers and merges them into a single result stream
  - Collects all the instrumentation information across all workers and shows the aggregated information

Parallel operators in PostgreSQL

- Parallel access methods
  - Parallel seq scan - PG v9.6
  - Parallel index, index-only, and bitmap-heap scans - PG v10
- Parallel joins
  - Nested Loop and Hash joins - PG v9.6
  - Merge join - PG v10; improved parallel hash join - PG v11
- Other parallel operators
  - Parallel aggregate - PG v9.6
  - Gather Merge, sub/init plans pushed to workers - PG v10; Parallel Append - PG v11
  - Parallel CREATE INDEX - PG v11
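A typical plan combining these operators can be inspected with EXPLAIN. This is a sketch against a hypothetical lineitem table (costs omitted), assuming the table is large enough for the planner to choose parallelism:

```sql
EXPLAIN SELECT count(*) FROM lineitem;
-- Finalize Aggregate
--   ->  Gather
--         Workers Planned: 2
--         ->  Partial Aggregate
--               ->  Parallel Seq Scan on lineitem
```

The Partial Aggregate runs in every worker below the Gather; the Finalize Aggregate above it combines the partial results, matching the flow described earlier.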

Performance evaluation of PG v10 on TPC-H

- Experimental setup
  - RAM: 512 GB
  - Number of cores: 32
- Parameter settings
  - work_mem = 64 MB
  - shared_buffers = 8 GB
  - effective_cache_size = 10 GB
  - random_page_cost = seq_page_cost = 0.1
  - max_parallel_workers_per_gather = 4
- Database setup
  - Scale factor: 300
  - Additional indexes on l_shipmode, l_shipdate, o_orderdate, and o_comment

Performance evaluation of PG v10 on TPC-H

[Figure: results on scale factor 300]

Tuning parallelism

- Parallel-query-specific parameters
  - max_parallel_workers_per_gather
  - parallel_tuple_cost
  - parallel_setup_cost
  - min_parallel_table_scan_size
  - min_parallel_index_scan_size
  - parallel_leader_participation
  - parallel_workers
- Other parameters
  - work_mem
  - effective_cache_size
  - random_page_cost

Tuning parallelism

- max_parallel_workers_per_gather
  - Number of workers per Gather node for a parallel operator
  - Recommended value: 1 to 4
  - The ideal value is determined by the number of cores in the system and the work required at a node
    - E.g. if the number of cores is 8 but the work required at a node is enough for only 2 workers, increasing this parameter will not help
    - Similarly, if the number of cores is 2 and we increase this parameter to 10, it is likely to cause a degradation in performance
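The parameter can be changed per session; a minimal sketch (the value 4 follows the slide's recommended range):

```sql
-- Session-local override; setting it to 0 disables parallel query for the session.
SET max_parallel_workers_per_gather = 4;
SHOW max_parallel_workers_per_gather;
```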

Tuning parallelism

- parallel_tuple_cost
  - The planner's estimate of the cost of transferring one tuple from a parallel worker process to another process
- parallel_setup_cost
  - The planner's estimate of the cost of launching parallel workers and initializing dynamic shared memory
- We can lower the values of these parameters to diagnose the performance of parallel operators
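For such diagnostics, a common sketch is to make parallelism artificially cheap in a test session so the planner chooses parallel plans even for small inputs (not recommended for production):

```sql
SET parallel_setup_cost = 0;   -- default is 1000
SET parallel_tuple_cost = 0;   -- default is 0.1
-- Now EXPLAIN the query of interest and check whether a Gather node appears.
```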

Tuning parallelism

- min_parallel_table_scan_size
  - Minimum size of relations to be considered for a parallel sequential scan
  - The default value of this parameter is 8MB
  - If the database mostly has large tables, then it is better to increase this parameter
  - For diagnostic purposes we can decrease it to lower values to analyse the query plans, etc.
- min_parallel_index_scan_size
  - The minimum size of an index to be considered for a parallel scan
  - The default value is 512kB
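Both thresholds accept size units; a sketch of lowering them for plan analysis:

```sql
-- Lower the thresholds so even small relations qualify for parallel scans.
SET min_parallel_table_scan_size = '1MB';   -- default '8MB'
SET min_parallel_index_scan_size = '64kB';  -- default '512kB'
```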

Tuning parallelism

- parallel_leader_participation
  - Manages the involvement of the leader process
  - Queries which need to maintain the order of tuples coming from workers might need the leader to work more on that than on scanning a part of the relation itself
  - When there are too many workers, it might be good to keep the leader free for the management of the workers
- parallel_workers
  - ALTER TABLE table_name SET (parallel_workers = n);
  - Controls the degree of parallelism for each table, if required
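A sketch of both knobs, using a hypothetical lineitem table:

```sql
-- Keep the leader free to merge and manage instead of scanning (PG v11+).
SET parallel_leader_participation = off;

-- Pin the planned worker count for one table, overriding the size-based default.
ALTER TABLE lineitem SET (parallel_workers = 8);
```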

Tuning parallelism

- work_mem
  - Amount of memory given per worker per node
- effective_cache_size
  - If random_page_cost is low, then this parameter should be large enough to accommodate the secondary indexes
- random_page_cost
  - Estimated cost of accessing a random page on disk

Tuning parallelism: "Function"ing in parallel query!

- Functions are parallel unsafe if they
  - Modify any database state
  - Make changes to transaction(s) using subtransaction(s)
  - Access sequences
  - Make persistent changes to settings, e.g. setval
- Functions are parallel restricted if they
  - Access temporary tables, client connection state, or backend-local state which the system cannot synchronize in parallel mode (e.g. setseed)
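Parallel safety is declared when the function is created (the default is PARALLEL UNSAFE, which blocks parallelism for queries that call it); a minimal sketch with a made-up function:

```sql
CREATE FUNCTION discounted(price numeric, pct numeric)
RETURNS numeric
LANGUAGE sql IMMUTABLE PARALLEL SAFE
AS $$ SELECT price * (1 - pct) $$;
```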

When not to expect performance improvements from...

Parallel Sequential Scan

- Table is too small
  - Smaller than min_parallel_table_scan_size
- Too few tuples are filtered out
  - Additional costs: dividing the work among workers, and collecting tuples from workers
  - If the number of workers is not high enough, the additional cost of parallelism could be more than that of the non-parallel scan

Parallel Index Scan

- Size of the index is too small
  - The number of leaf pages in the index range is small
- All the tuples qualify the index filter
- The index is not a B-tree
  - Currently not supported

Parallel Bitmap Heap Scan

- Size of the bitmap is small
- The bitmap index scan is costlier than the bitmap heap scan
  - The bitmap index scan is not supported in parallel; only the bitmap heap scan can be divided among workers
- Most of the tuples satisfy the qual
  - Though the size of the bitmap is big enough, the benefit of parallelism cannot be achieved if most of the tuples are sent to the Gather

Parallel Merge Join

- The inner relation is not small enough
  - Ideally, the size of the inner relation should be less than or equal to work_mem divided by the total number of workers; otherwise the amount of memory used might be unexpected
  - Every worker keeps a copy of the inner relation
  - If the copy does not fit in memory, it will be spilled to disk, which will increase the I/O cost
  - This issue also applies to parallel hash join without a shared hash table

Parallel Aggregates

- Number of groups is too high
  - The final aggregate is to be performed at the Gather node
  - So it is almost the same as the Gather performing the aggregate alone
- Early aggregation is not possible
  - E.g. average of an attribute
- The aggregate function is parallel unsafe or restricted

Gather Merge

- Workers contain mutually exclusive tuple ranges
  - Gather Merge can only accept rows from one worker at a time to maintain the order of tuples
  - The remaining workers need to halt until the shared queue is empty to continue their further processing
  - The processing is similar to the workers supplying the tuples in a serial fashion

More workers may not mean better performance

Amdahl's law
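The slide invokes Amdahl's law; stated as a formula, if a fraction $p$ of a query's work can be parallelised across $n$ workers, the speedup is bounded by

```latex
S(n) = \frac{1}{(1 - p) + p/n},
\qquad
\lim_{n \to \infty} S(n) = \frac{1}{1 - p}
```

so the serial fraction of the plan caps the benefit no matter how many workers are added.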

Too many cooks spoil the broth!

- More workers may not translate to better performance
  - If the amount of work is limited enough to be distributed among n workers, then n + k workers are not going to help
- Don't believe us, see the results for yourself
  - TPC-H scale factor 50
  - Additional indexes on l_shipmode, l_shipdate, o_orderdate, and o_comment
  - Server settings: random_page_cost = seq_page_cost = 0.1, effective_cache_size = 10GB, shared_buffers = 8GB, work_mem = 1GB

Parallel-query architecture: PostgreSQL vs other engines

Parallel infrastructure: PG vs other engines

PostgreSQL:
1. Exchange operators: Gather, Worker
2. Each operator needs to have a parallel version as a new operator: parallel scans, joins, aggregates
3. Tuples can only flow between workers and the Gather: a new node called Gather Merge is used to maintain the order of rows; final aggregation can be done only by the Gather
4. Well-suited for the multi-process architecture of PostgreSQL

Other DB engines:
1. Exchange operators: distribute, gather, and repartition
2. Exchange operators can be placed over any operator: any scan can be parallelised by placing distribute and gather operators over it; an aggregate can be parallelised by repartition and gather operators
3. Tuples can be routed among the streams: to maintain the order of rows, and for efficient aggregation
4. Advocated for any multi-threaded-architecture-based model

Parallel scan: PG vs other engines

[Figure: in PostgreSQL, a Finalize Aggregate over Gather/Gather Merge collects from Parallel Scans on the table; in other engines, stream aggregates run over scans fed by a page supplier, with the master finalising the aggregate]

Case study: Parallel Hash join

PostgreSQL:
1. All the workers work together to create the complete hash table: the total work of creating the hash table is divided among the workers
2. Once the hash table is prepared, each worker can probe it to perform the join for the tuples it received: this probing is lock- and contention-free; the tuples of the outer relation are divided among the workers, hence dividing the total work
3. A smart move to overcome the tuple-routing mechanism

Other DB engines:
1. Each stream performs a small hash join which is later combined for the final result: each stream gets a well-defined range of the join; any stream can route a tuple if it receives one that belongs to the range of another stream; the total work of the hash join is divided among the streams

Parallel query architectures: at a glance

- In PostgreSQL, the parallel-query architecture allows less communication among worker nodes, but more work per node. This is better suited to a process-based architecture, where inter-process communication cost is higher.
- The other architecture described has more communication among workers, but less work per node, which could be better suited to a thread-based architecture, where there is almost no inter-process communication cost.
- This conclusion is just based on our understanding of the systems.

Scope of enhancements in parallel query

- Parallel bitmap index scan
- Parallel sort
  - Can improve performance for lengthy sorts
  - Costly ORDER BY is common in OLAP environments
- Parallel Materialize node
  - Like the shared hash, we can have one shared copy of the inner table for efficient (merge/nested-loop) joins
- Improvements in parallel aggregate
  - Perform the final aggregate in shared memory
  - This could remove the bottleneck of the finalize aggregate
- Improvements in the query optimizer
  - Improve the costing for parallel operators

Conclusion

- Remarkable performance improvements with parallel operators on TPC-H
  - Till v9.6, out of 22 TPC-H queries, performance improved for 15 queries, of which 3 queries are at least 4 times faster and 11 queries are 2 times faster
  - Further, in v10, around 10 of the 22 TPC-H queries show significant improvement in performance, of which around 4 queries show more than 2x improvement
- Tuning parameters for parallelism
  - Parallel-query-specific parameters
  - Other server-side parameters
- Cases where parallel operators fail to improve performance
- Too many workers may not always improve query performance
- Comparison with other database engines for parallel query architecture

People who contributed

Robert Haas, Amit Kapila, Thomas Munro, Dilip Kumar, Peter Geoghegan, Andres Freund, David Rowley, Rushabh Lathia, Rahila Syed, Amit Khandekar, Kuntal Ghosh, Rafia Sabih

Special thanks to Tom Lane for helping in the fixes of many bugs

Output: Thank You

Gather
  Workers Planned: 2
  Workers Launched: 2
  ->  Parallel Index Scan on Common phrases
        Index Cond: (value = 'Thank You')
        Filter: (Language = 'English')

Slide credits:
[1] ...html
[2] ...2016/session/1360-parallel-query-in-postgresql/
[3] ...tgresql-expectations-and-opportunities/
