Distributed Database Management Systems - PoliTO

Transcription

Database Management SystemsDistributed Database Management SystemsDBMG1

Distributed architecturesData and computation are distributed overdifferent machinesDifferent levels of complexityDepending on the independence level of nodesTypical advantagesPerformance improvementIncreased availabilityStronger reliabilityDBMG2

Distributed architecturesClient/serverSimplest and more widespreadServer manages the databaseClient manages the user interfaceDistributed database systemDifferent DBMS servers on different network nodesautonomousable to cooperateGuaranteeing the ACID properties requires morecomplex techniquesDBMG3

Distributed architecturesData replicationA replica is a copy of the data stored on a differentnetwork nodeThe replication server autonomously managescopy updateSimpler architecture than distributed databaseDBMG4

Distributed architecturesParallel architecturesPerformance increase is the only objectiveDifferent architecturesMultiprocessor machinesCPU clustersDedicated network connectionsData warehousesServers specialized in decision supportPerform OLAP (On Line Analytical Processing)different from OLTP (On Line TransactionProcessing)DBMG5

Relevant propertiesPortabilityCapability of moving a program from a system to adifferent systemGuaranteed by the SQL standardInteroperabilityCapability of different DBMS servers to cooperateon a given taskInteraction protocols are neededODBCX-Open-DTPDBMG6

Database Management SystemsClient/server ArchitecturesDBMG7

Client/server architectures2-TierThick clientswith some application logicDBMS serverprovides access to dataCLIENT1CLIENTnDBMSSERVERDBDBMG8

Client/server architectures3-TierThin clientbrowserApplication serverimplements business logictypically also a web serverDBMS Serverprovides access to G9

SQL executionCompile & GoThe query is sent to the serverThe query is preparedgeneration of the query planThe query is executedThe result is shippedThe query plan is not stored on the serverEffective for one-shot query executionsprovides flexible execution of dynamic SQLDBMG10

SQL executionCompile & StoreThe query is sent to the serverThe query is preparedgeneration of the query planthe query plan is stored for future usagemay continue with executionthe query is executedthe result is shippedEfficient for repeated query executionsparametric executions of the same queryDBMG11

Database Management SystemsDistributed Database SystemsDBMG12

Distributed database systemsClient transactions access more than one DBMSserverDifferent complexity of available distributedservicesLocal autonomyEach DBMS server manages its local data in anautonomous waye.g., concurrency control, recoveryDBMG13

Distributed database systemsFunctional advantagesAppropriate localization of data and applicationse.g., geographical distributionTechnological advantagesIncreased data availabilityTotal block probability is reducedLocal blocks may be more frequentEnhanced scalabilityProvided by the modularity and flexibility of thearchitectureDBMG14

Database Management SystemsDistributed Database DesignDBMG15

Data fragmentationGiven a relation R, a data fragment is a subset ofR in terms of tuples, or schema, or bothDifferent criteria to perform fragmentationhorizontalsubset of tuplesverticalsubset of schemamixedboth horizontal and vertical togetherDBMG16

Horizontal fragmentationThe horizontal fragmentation of a relation Rselects a subset of tuples in R withsame schema of Robtained by means of spp is the partitioning predicateFragments are not overlappedDBMG17

ExampleThe following table is givenEmployee (Emp#, Ename, DeptName, Tax)Horizontal fragmentation on attribute DeptNamecard(DeptName) NE1 sDeptName ‘Production’ Employee EN sDeptName ‘Marketing’ EmployeeReconstruction of the original tableEmployee E1 E2 ENDBMG18

Vertical fragmentationThe vertical fragmentation of a relation R selectsa subset of schema of RObtained by means of pXX is a subset of the schema of RThe primary key should be included in X to allowrebuilding RAll tuples are includedFragments are overlapping on the primary keyDBMG19

ExampleThe following table is givenEmployee (Emp#, Ename, DeptName, Tax)Vertical fragmentationE1 p Emp#, Ename, DeptName EmployeeE2 p Emp#, Ename, Tax EmployeeReconstruction of the original tableEmployee E1DBMGE22020

Fragmentation propertiesCompletenesseach information in relation R is contained in atleast one fragment RiCorrectnessthe information in R can be rebuilt from itsfragmentsDBMG21

Distributed database designIt is based on data fragmentationData distribution over different serversEach fragment of a relation R is usually storedin a different filepossibly, on a different serverRelation R does not existit may be rebuilt from fragmentsDBMG22

Allocation of fragmentsThe allocation schema describes how fragmentsare stored on different server nodesNon redundant mapping if each fragment is storedon one single nodeSITE 1SITE 2SITE 1DBMG23

Allocation of fragmentsRedundant mapping if some fragments arereplicated on different serversincreased data availabilitycomplex maintenancecopy synchronization is neededSITE 1SITE 2SITE 1 SITE 2DBMG24

Transparency levelsTransparency levels describe the knowledge ofdata distributionAn application should operate differentlydepending on the transparency level supportedby the DBMSTransparency levelsfragmentation transparencyallocation transparencylanguage transparencyDBMG25

Transparency levelsThe following table is givenSupplier S (S#, SName, City, Status)Horizontal fragmentation on the City attributedomain of city {Torino, Roma}Allocation schemaDBMGHorizontal fragmentAllocation schemaS1 scity ‘Torino’ SS1@xxx.torino.itS2 scity ‘Roma’ SS2@xxx.roma1.itS2@xxx.roma2.it26

Fragmentation transparencyApplications know the existence of tables and notof their fragmentsdata distribution is not visibleExampleThe programmer only accesses table Snot its fragmentsSELECT SNameFROM SWHERE S# :CODEDBMG27

Allocation transparencyApplications know the existence of fragments,but not their allocationnot aware of replication of fragmentsmust enumerate all fragmentsExampleDBMGSELECT SNameFROM S1WHERE S# :CODEIF (NOT FOUND)SELECT SNameFROM S2WHERE S# :CODE28

Language transparencyThe programmer should select both the fragmentand its allocationNo SQL dialects are usedThis is the format in which higher level queriesare transformed by a distributed DBMSExample SELECT SNameDBMGFROM S1@xxx.torino.itWHERE S# :CODESelection of aIF (NOT FOUND)specific replica of S2SELECT SNameFROM S2@xxx.roma1.itWHERE S# :CODE29

Database Management SystemsTransaction classificationDBMG30

Transaction classificationThe client requests the execution of a transactionto a given DBMS serverthe DBMS server is in charge of redistributing itClasses define different complexity levels in theinteraction among DBMS serversThey are based on the type of SQL instructionwhich the transaction is allowed to containDBMG31

Transaction classificationRemote requestRead only requestonly select statementSingle remote serverRemote transactionAny SQL commandSingle remote serverDBMG32

Transaction classificationDistributed transactionAny SQL commandEach SQL statement is addressed to one singleserverGlobal atomicity is needed2 phase commit protocolDistributed requestEach SQL command may refer to data on differentserversDistributed optimization is neededFragmentation transparency is in this class onlyDBMG33

ExampleThe following table is givenAccount (Acc#, Name, Balance)Fragments and allocation schemaHorizontal fragmentationDBMGAllocation schemaA1 sacc# 10000 AccountNode 1A2 sacc# 10000 AccountNode 234

ExampleMoney transfer transactionBoT(Beginning of transaction)UPDATE AccountSET Balance Balance - 100WHERE Acc# 3000UPDATE AccountSET Balance Balance 100WHERE Acc# 13000EoT(End of transaction)DBMG35

ExampleWhat is the class of the transaction?Distributed request because Account is notexplicitly partitionedIf instead the update instructions referenceexplicitly A1 and A2Distributed transactionIf both update instructions reference only A1e.g., second update with WHERE Acc# 9000Remote transactionDBMG36

Database Management SystemsDistributed DBMS TechnologyDBMG37

ACID propertiesAtomicityIt requires distributed techniques2 phase commitConsistencyConstraints are currently enforced only locallyIsolationIt requires strict 2PL and 2 Phase CommitDurabilityIt requires the extension of local procedures tomanage atomicity in presence of failureDBMG38

Other issuesDistributed query optimization is performed bythe DBMS receiving the query execution requestIt partitions the query in subqueries, eachaddressed to a single DBMSIt selects the execution strategyorder of operations and execution techniqueorder of operations on different nodestransmission cost may become relevant(optionally) selection of the appropriate replicaIt coordinates operations on different nodes andinformation exchangeDBMG39

AtomicityAll nodes (i.e., DBMS servers) participating to adistributed transaction must implement the samedecision (commit or rollback)Coordinated by 2 phase commit protocolFailure causesNode failureNetwork failure which causes lost messagesAcknowledgement of messages (ack)Usage of timeoutNetwork partitioning in separate subnetworksDBMG40

2 Phase Commit protocolObjectiveCoordination of the conclusion of a distributedtransactionParallel with a weddingPriest celebrating the weddingCoordinates the agreementCouple to be marriedParticipate to the agreementDBMG41

2 Phase Commit protocolDistributed transactionOne coordinatorTransaction Manager (TM)Several DBMS servers which take part to thetransactionResource Managers (RM)Any participant may take the role of TMAlso the client requesting the transaction executionDBMG42

New log recordsTM and RM have separate private logsRecords in the TM logPrepareit contains the identity of all RMs participating to thetransaction (Node ID Process ID)Global commit/abortfinal decision on the transaction outcomeCompletewritten at the end of the protocolDBMG43

New log recordsNew records in the RM logReadyThe RM is willing to perform commit of thetransactionThe decision cannot be changed afterwardsThe node has to be in a reliable stateWAL and commit precedence rules are enforcedResources are lockedAfter this point the RM loses its autonomy for thecurrent transactionDBMG44

2 Phase Commit protocolRMTMPrepareLOGDBMG45

Phase I1. The TMWrites the prepare record in the logSends the prepare message to all RM(participants)Sets a timeout, defining maximum waiting time forRM answerDBMG46

2 Phase Commit protocolRMReadyTMPrepareLOGLOGDBMG47

Phase I2. The RMsWait for the prepare messageWhen they receive itIf they are in a reliable stateWrite the ready record in the logSend the ready message to the TMIf they are not in a reliable stateSend a not ready message to the TMTerminate the protocolPerform local rollbackIf the RM crashedNo answer is sentDBMG48

2 Phase Commit GDBMG49

Phase I3. The TMCollects all incoming messages from the RMsIf it receives ready from all RMsThe commit global decision record is written in thelogIf it receives one or more not ready or thetimeout expiresThe abort global decision record is written in thelogDBMG50

Phase II1. The TMSends the global decision to the RMsSets a timeout for the RM answersDBMG51

2 Phase Commit GCommit/AbortLOGDBDBMG52

Phase II2. The RMWaits for the global decisionWhen it receives itThe commit/abort record is written in the logThe database is updatedAn ACK message is sent to the TMDBMG53

2 Phase Commit GCommit/AbortLOGCompleteDBDBMGLOG54

Phase II3. The TMCollects the ACK messages from the RMsIf all ACK messages are receivedThe complete record is written in the logIf the timeout expires and some ACK messagesare missingA new timeout is setThe global decision is resent to the RMs which didnot answeruntil all answers are receivedDBMG55

2 Phase Commit it/AbortPhase IReadyTMLOGCommit/AbortCompleteDBDBMGLOGPhase IILOG56

Uncertainty windowEach RM is affected by an uncertainty windowStart after ready msg is sentEnd upon receipt of global decisionLocal resources in the RM are locked during theuncertainty windowIt should be smallDBMG57

Failure of a participant (RM)The warm restart procedure is modified with anew caseIf the last record in the log for transaction T is“ready”, then T does not know the global decisionof its TMRecoveryREADY listnew list collecting the IDs of all transactions in readystateFor all transactions in the ready list, the globaldecision is asked to the TM at restartRemote recovery requestDBMG58

Failure of the coordinator (TM)Messages that can be lostPrepare (outgoing)I PhaseReady (incoming)Global decision (outgoing)II PhaseRecoveryIf the last record in the TM log is prepareThe global abort decision is written in the log andsent to all participantsAlternative: redo phase I (not implemented)If the last record in the TM log is the globaldecisionDBMGRepeat phase II59

Network failuresAny network problem in phase I causes globalabortThe prepare or the ready msg are not receivedAny network problem in phase II causes therepetition of phase IIThe global decision or the ACK are not receivedDBMG60

Database Management SystemsX-Open-DTPDBMG61

X-Open-DTPProtocol for the coordination of distributedtransactionsIt guarantees interoperability of distributedtransactions on heterogeneous DBMSsi.e., different DBMS productsBased onOne clientOne TMSeveral RMsDBMG62

InterfacesX-Open-DTP defines interfaces for thecommunicationbetween client and TMTM interfacebetween TM and RMXA interfaceDBMS servers provide the XA interfaceSpecialized products implement the TM andprovide the TM interfaceE.g., BEA tuxedoDBMG63

Standard featuresRMs are passive and only answer to remoteprocedure invocations from the TMThe control of the protocol is embedded in theTMThe protocol implements two optimizations of 2Phase CommitPresumed abortRead onlyHeuristic decision to allow controlled transactionevolution in presence of failuresDBMG64

Presumed abortThe TM, when no information is available in thelog, answers abort to a remote recovery requestby a RMReduces the number of synchronous log writesprepare, global abort, complete are not synchronousSynchronous writes are still neededglobal commit in TM logready, commit in RM logDBMG65

Read onlyExploited by a RM that did not modify itsdatabase during the transactionThe RManswers read only to the prepare requestdoes not write the loglocally terminates the protocolThe TM will ignore the RM in phase II of theprotocolDBMG66

Heuristic decisionAllows transaction evolution in presence of TMfailuresDuring the uncertainty window, a RM may beblocked because of a TM failureLocked resources are blocked until TM recoveryThe blocked transaction evolves locally underoperator controlTransaction end is forced by the operatorTypically rollback, rarely commitHeuristic decision, because actual transactionoutcome is not knownBlocked resources are releasedDBMG67

Heuristic decisionDuring TM recovery, decisions are compared tothe actual TM decisionsIf TM decision and RM heuristic decision aredifferent, atomicity is lostThe protocol guarantees that the inconsistency isnotified to the client processResolving inconsistencies caused by a heuristicdecision is up to user applicationsDBMG68

Protocol interactionClientTM(TM Interface)RM(XA actionTM.OpenSessionClient - TM M.Exit()DBMG69

Database Management SystemsParallel DBMSDBMG70

ParallelismParallel computation increases DBMS efficiencyQueries can be effectively parallelizedExampleslarge table scan performed in parallel on differentportions of datadata is fragmented on different disksgroup by on a large datasetpartitioned on different processorsgroup by result mergedDifferent technological solutions are availableMultiprocessor systemsComputer clustersDBMG71

Inter-query parallelismDifferent queries are scheduled on differentprocessorsUsed in OLTP systemsAppropriate for workloads characterized bysimple, short transactionshigh transaction load100-1000 tpsLoad balancing on the pool of availableprocessing unitsDBMG72

Intra-query parallelismSubparts of the same query are executed ondifferent processorsUsed in OLAP systemsAppropriate for workloads characterized bycomplex queriesreduced query loadComplex queries are partitioned in subquerieseach subquery performs one or more operationson a subset of datagroup by and join are easily parallelizablepipelining operations is possibleDBMG73

Database Management SystemsDBMS benchmarksDBMG74

DBMS benchmarksBenchmarks describe the conditions in whichperformance is measured for a systemDBMS benchmarks are standardized by the TPC(Transaction Processing Council)Each benchmark is characterized byTransaction loaddistribution of arrival time of transactionsDatabase size and contentrandomized data generationTransaction codeTechniques to measure and certify performanceDBMG75

Types of benchmarksTPC COrder entry transactionsIt simulates the behavior of an OLTP systemNew evolution is TPC ETPC HDecision support (OLAP)It is a mix of complex queriesAlso TPC-DI and TPC-DSTPCx-HSBig data managementAssessment of implementation of Hadoop clustersDBMG76

Distributed Database Management Systems 1. D B M G 2 Distributed architectures Data and computation are distributed over different machines Different levels of complexity . Capability of moving a program from a system to a different system Guaranteed by the SQL standard Interoperability