Automated Load Balancing Invocation Based On Application Characteristics

Transcription

Meta-BalancerAutomated Load Balancing Invocation based onApplication CharacteristicsHarshitha Menon, Nikhil Jain, Gengbin Zheng, Laxmikant Kalé25th SeptemberCluster 2012, Beijing, China1 / 30

Meta-BalancerOutline1 IntroductionMotivationLoad Balancing Challenges2 Background3 Meta-BalancerStatistics CollectionDecision Making4 Evaluation5 Conclusion and Future Work2 / 30

Meta-BalancerIntroductionOutline1 IntroductionMotivationLoad Balancing Challenges2 Background3 Meta-BalancerStatistics CollectionDecision Making4 Evaluation5 Conclusion and Future Work3 / 30

n parallel applications on large systemsDifficult to program and extract best performancePerformance is limited by most overloaded processorThe chance that one processor is severely overloaded getshigher as no of processors increases4 / 30

n parallel applications on large systemsDifficult to program and extract best performancePerformance is limited by most overloaded processorThe chance that one processor is severely overloaded getshigher as no of processors increasesLoad imbalance in parallel applicationsLeads to drop in system utilizationHampers scalability of the application4 / 30

Meta-BalancerIntroductionLoad Balancing ChallengesLoad Balancing ChallengesLoad balancing has to be profitable!5 / 30

Meta-BalancerIntroductionLoad Balancing ChallengesLoad Balancing ChallengesLoad balancing has to be profitable!Determining factorsIncurred overheads - collection of statistics, execution ofstrategy to find the new mapping of tasks/work units, movingthe tasksWhen to perform load balance?Load balancing strategy selection5 / 30

Meta-BalancerIntroductionLoad Balancing ChallengesLoad Balancing ChallengesLoad balancing has to be profitable!Determining factorsIncurred overheads - collection of statistics, execution ofstrategy to find the new mapping of tasks/work units, movingthe tasksWhen to perform load balance?Load balancing strategy selectionAdaptive load balancing is needed in a dynamic applications5 / 30

Meta-BalancerIntroductionLoad Balancing ChallengesMeta-BalancerAutomating load balancing related decision making6 / 30

Meta-BalancerIntroductionLoad Balancing ChallengesMeta-BalancerAutomating load balancing related decision makingMonitors the application continuously and predicts loadbehavior6 / 30

Meta-BalancerIntroductionLoad Balancing ChallengesMeta-BalancerAutomating load balancing related decision makingMonitors the application continuously and predicts loadbehaviorIdentifies when to invoke load balancing for optimalperformance based onPredicted load behavior and guiding principlesPerformance in recent past6 / 30

Meta-BalancerBackgroundOutline1 IntroductionMotivationLoad Balancing Challenges2 Background3 Meta-BalancerStatistics CollectionDecision Making4 Evaluation5 Conclusion and Future Work7 / 30

Meta-BalancerBackgroundCharm Message-driven parallel programming paradigm based onoverdecomposition and migratable objects8 / 30

Meta-BalancerBackgroundCharm Message-driven parallel programming paradigm based onoverdecomposition and migratable objectsProgrammer decomposes the problem into tasksCharm RTS manages the scheduling of tasks on theprocessors8 / 30

Meta-BalancerBackgroundCharm Message-driven parallel programming paradigm based onoverdecomposition and migratable objectsProgrammer decomposes the problem into tasksCharm RTS manages the scheduling of tasks on theprocessorsSystem implementationUser View8 / 30

Meta-BalancerBackgroundDynamic Load Balancing Framework in Charm Based on principle of persistence9 / 30

Meta-BalancerBackgroundDynamic Load Balancing Framework in Charm Based on principle of persistenceInstruments the application tasks at fine-grained level9 / 30

Meta-BalancerBackgroundDynamic Load Balancing Framework in Charm Based on principle of persistenceInstruments the application tasks at fine-grained levelRelies on application user to invoke load balancer and selectload balancing strategy9 / 30

Meta-BalancerBackgroundDynamic Load Balancing Framework in Charm Based on principle of persistenceInstruments the application tasks at fine-grained levelRelies on application user to invoke load balancer and selectload balancing strategyWhen the load balancing is invokedGathers the statistics based on the strategy (centralized orhierarchical)Executes load balancing strategyMigrates objects based on new mapping9 / 30

Meta-BalancerMeta-BalancerOutline1 IntroductionMotivationLoad Balancing Challenges2 Background3 Meta-BalancerStatistics CollectionDecision Making4 Evaluation5 Conclusion and Future Work10 / 30

Meta-BalancerMeta-BalancerDesign OverviewModule to control load balancing related decision making11 / 30

Meta-BalancerMeta-BalancerDesign OverviewModule to control load balancing related decision makingImplemented on top of Charm load balancing framework11 / 30

Meta-BalancerMeta-BalancerDesign OverviewModule to control load balancing related decision makingImplemented on top of Charm load balancing frameworkKey responsibilitiesMonitor the application: collect minimal statisticsIdentify the iteration to invoke load balancing to optimizeperformanceForm a consensus among participating processors on when toinvoke load balancing11 / 30

Meta-BalancerMeta-BalancerStatistics CollectionStatistics CollectionROOTAsynchronous collectionStats Red 1PE0a1a2b1c1Stats Red 2a9b2d1c2c3d2c8b10Overlaps with application executionSupported using Charm ’s treebased reductionNo barrier for statistics collectiond7PE1PE2e1e2e3e4e11e12e1312 / 30

Meta-BalancerMeta-BalancerStatistics CollectionStatistics CollectionROOTAsynchronous collectionStats Red 1PE0a1a2b1Stats Red 2a9b2b10Overlaps with application executionSupported using Charm ’s treebased reductionNo barrier for statistics collectionMinimal statisticsc1d1c2c3d2c8d7PE1PE2e1e2e3e4e11e12Max loadAverage loadUtilization of processorse1312 / 30

Meta-BalancerMeta-BalancerDecision MakingDecision MakingConsider the load imbalance given byζ Lmax LavgLavg13 / 30

Meta-BalancerMeta-BalancerDecision MakingDecision MakingConsider the load imbalance given byζ Lmax LavgLavgζ 0 means load imbalance; leads to performance lossShould load balancing be invoked when ζ 0?13 / 30

Meta-BalancerMeta-BalancerDecision MakingDecision MakingConsider the load imbalance given byζ Lmax LavgLavgζ 0 means load imbalance; leads to performance lossShould load balancing be invoked when ζ 0?Goal - minimize total execution time (application loadbalancing overheads)13 / 30

Meta-BalancerMeta-BalancerDecision MakingModel to Predict Ideal LB PeriodConsider a linear model for load prediction based on collectedstatistics14 / 30

Meta-BalancerMeta-BalancerDecision MakingModel to Predict Ideal LB PeriodConsider a linear model for load prediction based on collectedstatisticsAverage load is represented byLavg a t la14 / 30

Meta-BalancerMeta-BalancerDecision MakingModel to Predict Ideal LB PeriodConsider a linear model for load prediction based on collectedstatisticsAverage load is represented byLavg a t laMax load is represented byLmax m t lm14 / 30

Meta-BalancerMeta-BalancerDecision MakingModel to Predict Ideal LB PeriodApplication execution time is sum ofTime spent on running applicationLoad Balancing overhead15 / 30

Meta-BalancerMeta-BalancerDecision MakingModel to Predict Ideal LB PeriodApplication execution time is sum ofTime spent on running applicationLoad Balancing overheadΓ Z τZ ηη ( (mt lm )dt ) (at la )dtτ00τ be the ideal LB period,η be the total iterations an application executes,Γ be the total application execution time, and be the cost associated with load balancing15 / 30

Meta-BalancerMeta-BalancerDecision MakingModel to Predict Ideal LB PeriodEquating the differential oftotal time to zero to minimizeit, we obtainm d(Γ) η ( 2 ) 0dττr 22 τ m16 / 30

Meta-BalancerMeta-BalancerDecision MakingModel to Predict Ideal LB PeriodEquating the differential oftotal time to zero to minimizeit, we obtainm d(Γ) η ( 2 ) 0dττr 22 τ m16 / 30

Meta-BalancerMeta-BalancerDecision MakingConsensus MechanismLOAD BALANCEROOT1PE0a10e112MaxIterationb10d8e123Final LB PeriodBCast 13b11PAUSEd7PE1PE2LB PeriodBCast 10e13c8d9c94b13d10c13PAUSE17 / 30

Meta-BalancerEvaluationOutline1 IntroductionMotivationLoad Balancing Challenges2 Background3 Meta-BalancerStatistics CollectionDecision Making4 Evaluation5 Conclusion and Future Work18 / 30

D: molecular dynamics simulation programFractography: used to study fracture surfaces of materials19 / 30

D: molecular dynamics simulation programFractography: used to study fracture surfaces of materialsMachines usedRanger: SUN constellation cluster at TACCJaguar: Cray system at ORNL19 / 30

D: molecular dynamics simulation programFractography: used to study fracture surfaces of materialsMachines usedRanger: SUN constellation cluster at TACCJaguar: Cray system at ORNLThree sets of ExperimentsNo Load BalancingPeriodic Load BalancingUsing Meta-Balancer19 / 30

Meta-BalancerEvaluationLeanMD with No Load BalancingOverall processorutilization is 65%No significant variation inprocessor loads during therun20 / 30

Meta-BalancerEvaluationLeanMD with Periodic Load BalancingElapsed time vs LB Period (Jaguar)Elapsed time (s)10000Frequent load balancingincreases execution time1000Periodic load balancingmay not give performancebenefit100128 cores256 cores512 cores108161024 cores2048 cores4096 cores32641282565121024LB Period21 / 30

Meta-BalancerEvaluationLeanMD with Meta-BalancerInvoked load balancer atthe beginningThereafter frequency ofload balancing is lowImproved performance by31% and the overallutilization to 95%22 / 30

Meta-BalancerEvaluationLeanMD - Comparison of Execution TimeCore128256512102420484096No LB (s)1945.161005.22516.47264.15135.9270.68Periodic LB (Period) (s)1451.30 (200)750.11 (200)393.30 (400)209.64 (400)116.69 (400)69.6 (700)Meta-Balancer r outperforms periodic load balancing23 / 30

Meta-BalancerEvaluationFractography with No Load BalancingLarge variation inprocessor utilizationLow utilization leading toresource wastage24 / 30

Meta-BalancerEvaluationFractography with Periodic Load BalancingFrequent load balancingleads to high overheadand no benefitInfrequent load balancingleads to load imbalanceand results in no gains25 / 30

Meta-BalancerEvaluationFractography with Meta-BalancerIdentifies the need forfrequent load balancing inthe beginningFrequency of loadbalancing decreases asload becomes balancedIncreases overall processorutilization and gives gainof 31%26 / 30

Meta-BalancerConclusion and Future WorkOutline1 IntroductionMotivationLoad Balancing Challenges2 Background3 Meta-BalancerStatistics CollectionDecision Making4 Evaluation5 Conclusion and Future Work27 / 30

Meta-BalancerConclusion and Future WorkConclusionDifficult to find the optimum load balancing periodDepends on the application characteristicsDepends on the machine the application is run on28 / 30

Meta-BalancerConclusion and Future WorkConclusionDifficult to find the optimum load balancing periodDepends on the application characteristicsDepends on the machine the application is run onMeta-Balancer automates the decision of when to invoke loadbalancing based on application characteristics28 / 30

Meta-BalancerConclusion and Future WorkConclusionDifficult to find the optimum load balancing periodDepends on the application characteristicsDepends on the machine the application is run onMeta-Balancer automates the decision of when to invoke loadbalancing based on application characteristicsMeta-Balancer adaptively identifies load balancing period28 / 30

Meta-BalancerConclusion and Future WorkConclusionDifficult to find the optimum load balancing periodDepends on the application characteristicsDepends on the machine the application is run onMeta-Balancer automates the decision of when to invoke loadbalancing based on application characteristicsMeta-Balancer adaptively identifies load balancing periodMeta-Balancer obtains substantial gains and avoids repetitiveexperimentation28 / 30

Meta-BalancerConclusion and Future WorkFuture WorkExtend Meta-Balancer to select load balancing strategyComputation vs Communication strategyRefinement vs Comprehensive strategyCentralized vs Distributed strategy29 / 30

Meta-BalancerConclusion and Future WorkFuture WorkExtend Meta-Balancer to select load balancing strategyComputation vs Communication strategyRefinement vs Comprehensive strategyCentralized vs Distributed strategyBetter models for predicting load29 / 30

Meta-BalancerConclusion and Future WorkThank you!30 / 30

Meta-Balancer Background Dynamic Load Balancing Framework in Charm Based on principle of persistence Instruments the application tasks at ne-grained level Relies on application user to invoke load balancer and select load balancing strategy When the load balancing is invoked Gathers the statistics based on the strategy (centralized or .