Resource And Deadline-Aware Job Scheduling In Dynamic Hadoop Clusters

Transcription

2015 IEEE 29th International Parallel and Distributed Processing Symposium

Resource and Deadline-aware Job Scheduling in Dynamic Hadoop Clusters

Dazhao Cheng*, Jia Rao*, Changjun Jiang† and Xiaobo Zhou*
*Department of Computer Science, University of Colorado, Colorado Springs, USA
†Department of Computer Science & Technology, Tongji University, Shanghai, China
Emails: {dcheng, jrao, xzhou}@uccs.edu, cjjiang@tongji.edu.cn

Abstract—As Hadoop is becoming increasingly popular in large-scale data analysis, there is a growing need for providing predictable services to users who have strict requirements on job completion times. While earliest deadline first (EDF) like scheduling algorithms are popular in guaranteeing job deadlines in real-time systems, they are not effective in a dynamic Hadoop environment, i.e., a Hadoop cluster with dynamically available resources. As a growing number of Hadoop clusters are deployed on hybrid systems, e.g., infrastructure powered by a mix of traditional and renewable energy, and cloud platforms hosting heterogeneous workloads, variable resource availability becomes common when running Hadoop jobs. In this paper, we propose RDS, a Resource and Deadline-aware Hadoop job Scheduler that takes future resource availability into consideration when minimizing job deadline misses. We formulate the job scheduling problem as an online optimization problem and solve it using an efficient receding horizon control algorithm. To aid the control, we design a self-learning model to estimate job completion times and use a simple but effective model to predict future resource availability. We have implemented RDS in the open-source Hadoop implementation and performed evaluations with various benchmark workloads. Experimental results show that RDS substantially reduces the penalty of deadline misses by at least 36% and 10% compared with Fair Scheduler and EDF scheduler, respectively.

I. INTRODUCTION

As industries are confronting an unprecedented volume of data every day, Hadoop, the open source implementation of the MapReduce programming model, has become the de facto standard technique for storing and analyzing petascale data in a cost-efficient way. For example, the data warehouse Hadoop cluster at Facebook contains 3000 machines and hosts on average 25000 MapReduce jobs per day [4]. However, a study [11] has shown that current use of Hadoop in research and enterprises still has significant room for improvement in the performance of Hadoop jobs and the utilization of Hadoop clusters. There is a growing need for providing predictable services to Hadoop users who have strict requirements on job completion times (i.e., deadlines).

However, meeting job deadlines is difficult in current Hadoop platforms. First, because jobs have diverse resource demands, it is hard to determine how much resource is needed for each job to avoid a deadline miss. Second, Hadoop clusters are usually shared by multiple jobs, and the scheduling order of these jobs can affect job completion time [24]. Thus, allocating sufficient resources alone may not guarantee job completion time effectively. While existing schedulers in Hadoop, such as the default FIFO scheduler, Fair scheduler, Capacity scheduler, the RAS scheduler [20], and their variations [9], [23], optimize job completion time without considering deadlines, there are recent studies that aim to guarantee job deadlines in Hadoop workloads by estimating job completion time and manipulating job queue ordering [24] or task scheduling [8].

A recent trend of running Hadoop in hybrid environments further complicates the problem. To pursue cost-efficiency, Hadoop clusters can be powered by a mix of renewable energy and the traditional power grid [6], [9], [17], share the same cloud infrastructure with interactive workloads [17], [23], or run opportunistically on transient resources, e.g., Amazon Spot Instances. In these scenarios, the resources available to the Hadoop cluster are quite dynamic due to the variable supply of renewable energy, the changing intensity of co-located workloads, or the abrupt termination of market-based resources. The dynamics in the capacity of Hadoop clusters pose significant challenges to satisfying job deadlines. First, it is hard to estimate job completion time with dynamically available resources; the prediction models should be robust to the varying cluster capacity. Second, job execution and task scheduling become more complicated. When the amount of available resources drops, high priority jobs or jobs with approaching deadlines should be prioritized to improve application performance or revenue.

In this work, we find that deadline misses in Hadoop workloads can be minimized by exploiting the dynamics in resource availability and the flexibility in Hadoop task scheduling. To this end, we propose RDS, a Resource and Deadline-aware Hadoop job Scheduler that dynamically allocates resources to different jobs based on the prediction of resource availability and job completion times. RDS temporarily delays low priority jobs or jobs with distant deadlines in hopes that there will be sufficient resources in the future to compensate for the slowdown caused to these jobs. More specifically, we make the following technical contributions:

- We develop a self-learning fuzzy model for estimating a job's completion time. The fuzzy model performs fine-grained job completion time estimation and self-adaptation at every measurement interval. We use a simple but effective model to predict future resource availability based on recent history.
- We formulate Hadoop job scheduling as an optimization problem based on the prediction models of job completion time and resource availability. We design an efficient receding horizon control (RHC) algorithm to derive the online solution. Its solution is a task resource allocation matrix that minimizes job deadline misses considering future resource availability.
- We develop and implement the RDS scheduler in the open-source Hadoop implementation and perform comprehensive evaluations with various Hadoop workloads. Experimental results show that RDS effectively reduces job deadline misses by at least 36% and 10% compared to Fair scheduler and the earliest deadline first (EDF) scheduler, respectively.

The rest of this paper is organized as follows. Section II gives motivations on resource and deadline-aware Hadoop scheduling. Section III describes the design of RDS. Section IV gives details on system implementation. Section V presents the experimental results. Section VI reviews related work. Section VII concludes the paper.

II. MOTIVATION

We first introduce two dynamic clusters, one in a production datacenter and one in a research institution. We then show that Hadoop job scheduling should be deadline aware in order to provide predictable services to users. We use concrete examples to demonstrate that existing Hadoop job schedulers are ineffective in meeting deadlines. Finally, we discuss the practical issues when applying EDF, the theoretically optimal scheduler, in real and dynamic Hadoop systems.

A. Cluster trace analysis

To understand the resource dynamics in production cloud datacenters, we have conducted an analysis of the time-varying capacity trace from a production cluster at Google [22]. Figure 1(a) demonstrates that the number of machines available in the cluster can fluctuate significantly over time. This is due to the application of a power-aware resource provisioning approach, based on turning machines on and off, for energy saving. We aim to provide a solution for this dynamic capacity environment by finding the optimal task scheduling approach to improve application performance while considering the cost of scheduling reconfigurations.

As the environmental impact of datacenters rapidly grows, the industry has started to explore building green datacenters that are powered by renewable energy. For example, HP Labs recently built the Net-Zero Energy datacenter [17]. Unlike traditional energy, the intermittency of renewable energy makes it very hard to maintain stable cluster resource availability to process workloads. Goiri et al. proposed GreenHadoop [9], a MapReduce framework for Parasol, a prototype green cluster built at Rutgers University. Figure 1(b) shows that GreenHadoop has dynamic resource availability over a 24-hour period, since it is powered by solar energy and uses the electrical grid as a backup. It demonstrates that the available resources of a Hadoop cluster can be highly dynamic due to the time-varying power supply.

[Fig. 1. Dynamic production and Hadoop clusters. (a) Google trace: available machines (k) over 24 hours. (b) GreenHadoop trace: available nodes over 24 hours.]

The above analysis suggests that the benefit of dynamic capacity provisioning is apparent for production datacenters from the perspective of both economics and the environment. However, it also brings up the challenging task of managing dynamic datacenter clusters. In order to further explore this problem, we conduct a case study as follows.

B. Case study

We created a 5-node virtual Hadoop cluster in our university cloud testbed and ran two jobs using different schedulers. The cluster was configured with one master and four slave nodes. All the slave nodes shared a pool of CPU resources. We dynamically changed the resources allocated to the cluster using VMware's distributed resource scheduler to emulate the dynamics in resource availability of the cluster. The total resource was evenly distributed to each slave node. The master node was allocated a fixed capacity. We ran two different word-count [1] jobs. The two jobs have different input sizes, resource demands and deadlines. Table I gives their information.

TABLE I
JOB 1 AND 2 SUBMISSION INFORMATION.

Jobs | Input | CPU demand | Arrival time | Deadline
J1   | 18 GB | 120 GHz    | 0th min      | 40th min
J2   |  9 GB |  60 GHz    | 10th min     | 30th min

Figure 2 shows the performance of three schedulers in the dynamic cluster. Note that the CPU demand in Table I is the cumulative resource requirement for running an individual job, while the GHz values in Figures 2 and 3 are the instantaneous CPU allocations of jobs. For example, the size of the region with slanting lines (i.e., J1's allocation) in Figure 2(b) should equal the demand of J1 in Table I, which is 120 GHz. In the 10th-to-20th and 30th-to-40th minute intervals, the cluster has double the resources of the other intervals.

Figure 2(a) shows the trace of dynamic resource availability. Figures 2(b), 2(c) and 2(d) show the job execution under three schedulers, namely First In First Out (FIFO), Fair scheduler, and an ideal scheduler that considers job deadlines. FIFO schedules jobs based on their arrival times: job J2 was delayed until job J1 finished, leading to the miss of J2's deadline. Fair scheduler allocates an equal amount of resources to each job. However, fairness in resource allocation does not guarantee that J2 meets its deadline. The ideal scheduler knew the future resource availability and determined that fair allocations between J1 and J2 would lead to J2's deadline miss, because the resources in the 20th-to-30th minute interval are not sufficient.
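To make the arithmetic behind these schedules concrete, the toy Java check below (our illustration, not code from the paper) tests whether a job's cumulative CPU demand fits within the capacity available between its arrival and its deadline. The per-interval capacity values are placeholders that mirror the shape, not the exact values, of Figure 2(a).

```java
// Toy feasibility check for the Section II-B case study. Capacity is
// given per 10-minute interval; demand is cumulative, as in Table I.
// The capacity numbers below are illustrative placeholders only.
public class FeasibilityCheck {
    // Best case: the job receives the entire cluster capacity in every
    // interval between its arrival and its deadline.
    static boolean canFinishAlone(double demand, int arrivalInterval,
                                  int deadlineInterval, double[] capacity) {
        double supplied = 0.0;
        for (int i = arrivalInterval; i < deadlineInterval; i++) {
            supplied += capacity[i];
        }
        return supplied >= demand;
    }

    public static void main(String[] args) {
        // Four 10-minute intervals; capacity doubles in intervals 1 and 3,
        // mirroring the shape (not the values) of Figure 2(a).
        double[] capacity = {30, 60, 30, 60};
        System.out.println("J1 feasible alone: " + canFinishAlone(120, 0, 4, capacity));
        System.out.println("J2 feasible alone: " + canFinishAlone(60, 1, 3, capacity));
    }
}
```

Each job is feasible in isolation; the scheduling question is whether one allocation satisfies both jobs at once, which is exactly where the three schedulers differ.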

[Fig. 2. Performance of FIFO Scheduler, Fair Scheduler and an ideal scheduler in a dynamic Hadoop cluster. (a) Resource availability. (b) FIFO Scheduler. (c) Fair Scheduler. (d) Ideal Scheduler.]

Knowing that more resources would be available in the 30th-to-40th minute interval, the ideal scheduler prioritized J2 with more resources from the 10th to the 30th minute. The flexible resource allocation effectively guaranteed the deadlines of both jobs.

We make two observations. First, deadline-oblivious schedulers perform poorly for jobs with deadlines in a dynamic cluster. Second, knowledge of future resource availability is critical to avoiding deadline misses. For example, if the resource level in the 20th-to-30th minute interval drops further, a deadline-aware scheduler should allocate more resources to J2 in the 10th-to-20th minute interval. In summary, information on future resource availability affects a deadline-aware scheduler's decisions on individual job resource allocations.

For this job setting, there exists a schedule that meets both jobs' deadlines. For such a schedulable job set, earliest deadline first (EDF) scheduling can also meet the deadlines; EDF is optimal on preemptive uniprocessors [19]. However, EDF has practical issues when used as a Hadoop job scheduler. First, in a resource constrained scenario or an overloaded system, where not all job deadlines can be met, the performance of EDF is unpredictable and often quite bad. Figure 3(a) shows the scheduling order of two jobs under EDF. The system became overloaded as the available resources dropped at the 10th minute. For this system, EDF is clearly not optimal, because J2 would have met its deadline if it had been scheduled before J1 (as shown in Figure 3(b)). Second, EDF does not determine how much resource is needed to meet job deadlines [24]; it may lead to over-provisioning or under-provisioning of resources. Finally, EDF overrides user-defined job priorities, making it less attractive in a multi-user environment.

[Fig. 3. Performance comparison in an overloaded system. (a) EDF-preemptive. (b) Ideal scheduler.]

[Summary] We have shown that deadline-aware schedulers have a significant advantage over deadline-oblivious ones. However, obtaining an optimal or near-optimal job scheduler is not trivial in clusters with dynamically variable resources. We have also shown that resource-oblivious schedulers such as EDF only perform scheduling optimizations based on the current observation of resource availability. Looking forward into future resource availability can guide schedulers in making wise resource allocation decisions. These findings motivated us to develop a resource and deadline-aware scheduler for dynamic Hadoop clusters.

III. RDS DESIGN

In this section, we present the design of RDS, a resource and deadline-aware Hadoop job scheduler. RDS determines the resource allocations to individual jobs based on estimations of job completion time and predictions of future resource availability. To realize fine-grained resource control, RDS divides job execution into control intervals and builds performance models for estimating job progress, from which the overall job completion time can be inferred. Based on the job completion time estimation and the resource availability prediction, RDS derives the optimal job scheduling via an online receding horizon control (RHC) algorithm.

Figure 4 shows the architecture of RDS. We describe the functionality of each component as follows:

- Fuzzy performance model takes the allocated resources and job size as inputs and generates the estimated job completion time as output. The model is updated periodically based on the measured job progress at each control interval.
- Resource predictor takes the history information on resource availability and predicts the amount of available resources for the next few intervals.
- Scheduling optimizer adjusts the number of slots allocated to each running job based on an online receding horizon control algorithm.

[Fig. 4. The architecture of RDS.]

We formulate resource and deadline-aware scheduling as an optimization problem that minimizes the job deadline miss penalty. We present the detailed design of the self-adaptive fuzzy model, the resource model and the scheduling optimizer in the following subsections.
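As a rough sketch of how these three components could fit together at each control interval, consider the Java interfaces below. The names and signatures are ours, not from the RDS source, which embeds this logic in Hadoop's task scheduler.

```java
// Illustrative component interfaces for the RDS control loop.
// All names are hypothetical; this is a sketch of the architecture,
// not the actual implementation described in Section IV.
interface PerformanceModel {
    // Estimated progress of a job in the next interval, given a
    // candidate resource allocation (map + reduce resources).
    double estimateProgress(int jobId, double mapRes, double reduceRes);

    // Self-learning step: fold in the measured progress of the
    // interval that just ended.
    void update(int jobId, double measuredProgress);
}

interface ResourcePredictor {
    // Predicted total available resources for the next h intervals.
    double[] predict(int h);
}

interface SchedulingOptimizer {
    // Solve the RHC problem; returns per-job resource adjustments
    // (map and reduce) for the upcoming control horizon.
    double[][] solve(PerformanceModel model, double[] predictedCapacity);
}

class RdsControlLoop {
    void onIntervalBoundary(PerformanceModel m, ResourcePredictor p,
                            SchedulingOptimizer opt) {
        double[] capacity = p.predict(3);          // e.g., a 3-interval horizon
        double[][] adjustments = opt.solve(m, capacity);
        // adjustments feed the two-level scheduler of Section III-D
    }
}
```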

A. Problem Formulation

We consider a Hadoop cluster with dynamic resource availability r_a. Consider that there are J jobs running in the system and that the control interval is t (t ∈ [1, ..., T]). Each job j has map tasks allocated u_j^m resource and reduce tasks allocated u_j^r resource. y_j is the actual completion time of job j and y_j^{ref} is the reference time that meets its deadline. The optimization problem is formulated as follows:

\min \sum_{t=1}^{T} \sum_{j=1}^{J} \omega_j \left[ \frac{y_j - y_j^{\mathrm{ref}}}{y_j^{\mathrm{ref}}} \right]^{+}   (1)

\text{s.t.} \quad \sum_{j=1}^{J} \left( u_j^{m} + u_j^{r} \right) \le r_a,   (2)

where [x]^{+} denotes max(x, 0).

The goal of the RDS scheduler is to minimize the penalty of deadline misses. Objective Eq. (1) captures the lost revenue due to deadline misses; the penalty remains zero until the job misses its deadline. ω_j is a constant that represents the priority of job j. Constraint Eq. (2) ensures that the sum of resources assigned to all running jobs is bounded by the total available resources in the cluster.

However, a job's completion time y_j can only be measured when the job finishes, and it is often too late for the Hadoop scheduler to intervene if a job has already missed its deadline. To this end, we break down a job's execution into small intervals and apply calibrated deadlines for each interval. Consider a job whose deadline is 100 minutes away and whose execution is divided into 10 intervals. If the job can finish one tenth of the total work in each interval, it can meet the overall deadline. The Hadoop scheduler can adjust the resource allocation if a job's execution is considered slow based on its progress in individual intervals.
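The objective and the calibrated per-interval targets reduce to simple arithmetic. The fragment below is a minimal Java sketch of both (our illustration, not RDS source; the clamp at zero reflects the statement that the penalty stays zero until the deadline is missed).

```java
// Illustrative helpers for Eq. (1) and the per-interval targets.
class DeadlinePenalty {
    // Deadline-miss penalty of Eq. (1): zero until the deadline is
    // missed, then growing with the relative overrun, weighted by
    // the job's priority.
    static double penalty(double weight, double completion, double reference) {
        return weight * Math.max(0.0, (completion - reference) / reference);
    }

    // Calibrated per-interval target: a job whose deadline is
    // totalIntervals intervals away must complete this fraction of
    // its work by the end of the given interval.
    static double referenceProgress(int interval, int totalIntervals) {
        return (double) interval / totalIntervals;
    }
}
```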
Such a breakdown of job execution also allows the scheduler to look forward into future intervals and apply optimization considering future resource availability. Specifically, we transform the optimization problem into a receding horizon control (RHC) problem that minimizes the following objective function:

J(t) = \sum_{i=0}^{H_p} \sum_{j=1}^{J} \left\| y_j(t+i) - y_j^{\mathrm{ref}}(t+i) \right\|_W^2 + \sum_{i=0}^{H_c} \sum_{j=1}^{J} \left( \left\| \Delta u_j^{m}(t+i) \right\|_P^2 + \left\| \Delta u_j^{r}(t+i) \right\|_Q^2 \right),   (3)

where t is the measurement (or control) interval, y_j(t) is the actual progress of the job in interval t and y_j^{ref}(t) is the expected progress that ensures meeting the job's deadline. W is the job priority matrix. The optimization looks forward into the future H_p intervals and tries to derive the optimal resource allocation considering future resource availability. Δu_j^m(t) and Δu_j^r(t) are the resource adjustments for map and reduce tasks needed to meet the expected job progress in each interval, respectively. Together they represent the control penalty and are weighted by the penalty matrices P and Q. The first part of the function serves as a penalty for not meeting a job's progress target, and the second part requires that the cost of changing a job's resource allocations be minimized.

To consider future resource availability, the RHC controller predicts a job's performance for the next H_p control intervals. It then computes a sequence of control actions Δu(t), Δu(t+1), ..., Δu(t+H_c) over H_c control periods, called the control horizon, to keep the predicted performance close to the expected targets. Thus, a performance model is needed to predict the progress of a job in a control interval given a certain amount of resources, and the total amount of resources available in future intervals needs to be predicted.
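The receding-horizon pattern itself is compact: optimize over the prediction horizon, apply only the first control move, then re-optimize at the next interval with fresh measurements. A minimal Java skeleton is sketched below, with the quadratic-programming solver left as a placeholder (the paper solves the QP with MATLAB's quadprog, as described in Section IV).

```java
// Receding horizon control skeleton: at each interval, plan over the
// next Hp intervals, apply only the first interval's adjustments,
// then re-plan at the next interval.
class RhcController {
    final int Hp = 3;   // prediction horizon (intervals)
    final int Hc = 3;   // control horizon (intervals)

    // state: current per-job progress vs. reference;
    // predictedCapacity: available resources for the next Hp intervals.
    double[] step(double[] state, double[] predictedCapacity) {
        double[][] plan = solveQp(state, predictedCapacity, Hp, Hc);
        return plan[0];   // apply only the first control move
    }

    // Placeholder for a QP solver minimizing Eq. (3) subject to the
    // capacity constraint of Eq. (2); not shown here.
    double[][] solveQp(double[] state, double[] cap, int hp, int hc) {
        throw new UnsupportedOperationException("QP solver not shown");
    }
}
```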

B. Estimating Job Execution Progress

We use a multiple-input-single-output fuzzy model to predict a job's execution progress based on its input size and resource allocation in each control interval. Fuzzy models are often used to capture the complex relationship between resource allocations and a job's fine-grained execution progress [14]. However, a job's progress can be affected by many factors. First, job progress is not uniform across execution phases, e.g., the map and reduce phases. Second, even within the same phase, data skew among tasks leads to different task execution speeds at different intervals. Finally, co-running jobs may unpredictably interfere with a job's execution, making the mapping from resources to job progress variable. Therefore, we design an online self-adaptive fuzzy model based on real-time measurements of job progress.

1) Fuzzy Model: The execution progress of job j in control interval t is represented in the input-output NARX form (Nonlinear Auto-Regressive model with eXogenous inputs):

y_j(t) = R_j(u(t), d_j, \xi(t)).   (4)

R is the relationship between the input variables and the output variable. The input variables are the current resource allocation u(t), the job input size d_j, and the regression vector ξ(t). Here, the resource allocation u(t) = [u^m(t), u^r(t)] includes both the map resource allocation u^m(t) and the reduce resource allocation u^r(t). The regression vector ξ(t) contains a number of lagged outputs and inputs from the previous control periods. It is represented as

\xi(t) = [(y(t-1), y(t-2), \cdots, y(t-n_y)), (u(t), u(t-1), \cdots, u(t-n_u))]^T,   (5)

where n_y and n_u are the numbers of lagged values for outputs and inputs, respectively. Let ρ denote the number of elements in the regression vector ξ(t), that is,

\rho = n_y + n_u.   (6)

R is a rule-based fuzzy model that consists of Takagi-Sugeno rules [5]. A rule R_j is represented as

R_j: IF ξ_1(t) is Ω_{j,1}, ξ_2(t) is Ω_{j,2}, ..., and ξ_ρ(t) is Ω_{j,ρ}, u(t) is Ω_{j,ρ+1} and d_j is Ω_{j,ρ+2}, THEN y_j(t) = ζ_j ξ(t) + η_j u(t) + ω_j d_j + θ_j.   (7)

Here, Ω_j is the antecedent fuzzy set of the jth rule, which is composed of a series of subsets Ω_{j,1}, Ω_{j,2}, ..., Ω_{j,ρ+2}. ζ_j, η_j and ω_j are parameters, and θ_j is the offset; their values are obtained by offline training. Each fuzzy rule characterizes the nonlinear relationship between allocated resources and performance for a specific job type.

2) Online Self-Learning: Due to the dynamics of MapReduce job behaviors (e.g., data skew, different phases and multi-tenant interference), we design an online self-learning module to adapt the fuzzy model. It aims to minimize the prediction error e(t) of the fuzzy model, which is the error between the actual measured job progress and the predicted value. If e(t) ≠ 0, we apply a recursive least squares (RLS) method [2] to adapt the parameters of the current fuzzy rule. The technique updates the model parameters as new measurements are sampled from the runtime system. It applies exponentially decaying weights to the sampled data so that higher weights are assigned to more recent observations.

We express the fuzzy model output in Eq. (4) as follows:

y(t) = \varphi(t) X + e(t),   (8)

where e(t) is the error between the actual output and the predicted output. φ(t) = [φ_1^T, φ_2^T, ..., φ_ρ^T] is a vector composed of the model parameters. X = [σ_1 X(t), σ_2 X(t), ..., σ_ρ X(t)], where σ_j is the normalized degree of fulfillment, or firing strength, of the jth rule and X(t) = [ξ(t)^T, u(t)] is a vector containing the previous outputs and inputs of the control system. The parameter vector φ(t) is estimated so that the error function in Eq. (9) is minimized. We apply both the current error e(t) and the previous error e(t−1) to estimate the parameter vector:

\mathrm{Error} = \sum_{t} \left( e(t)^2 + \tau \, e(t-1)^2 \right),   (9)

where τ is called the discount factor, as it gives higher weights to more recent samples in the optimization. It determines in what manner the current prediction error and old errors affect the update of parameter estimation.
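For reference, a generic recursive least squares update with exponential forgetting is sketched below in Java. This is the standard RLS form, not the RDS implementation; the regressor x corresponds to the vector X in Eq. (8), and the forgetting factor plays the role of the discount described above.

```java
// Generic recursive least squares with exponential forgetting, the
// standard technique cited for adapting fuzzy-rule parameters online.
class RlsEstimator {
    final int n;            // number of parameters
    final double lambda;    // forgetting factor in (0, 1], e.g., 0.95-0.99
    final double[] theta;   // parameter estimates
    final double[][] P;     // covariance-like matrix

    RlsEstimator(int n, double lambda) {
        this.n = n;
        this.lambda = lambda;
        this.theta = new double[n];
        this.P = new double[n][n];
        for (int i = 0; i < n; i++) P[i][i] = 1e4;   // large initial value
    }

    // One update with regressor x and measured output y (the observed
    // job progress for the interval that just ended).
    void update(double[] x, double y) {
        double[] Px = new double[n];                 // P * x
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) Px[i] += P[i][j] * x[j];
        double denom = lambda;                       // lambda + x' P x
        for (int i = 0; i < n; i++) denom += x[i] * Px[i];
        double[] k = new double[n];                  // gain vector
        for (int i = 0; i < n; i++) k[i] = Px[i] / denom;
        double err = y;                              // e(t) = y - x' theta
        for (int i = 0; i < n; i++) err -= x[i] * theta[i];
        for (int i = 0; i < n; i++) theta[i] += k[i] * err;
        // P <- (P - k x' P) / lambda, using symmetry of P
        double[][] newP = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                newP[i][j] = (P[i][j] - k[i] * Px[j]) / lambda;
        for (int i = 0; i < n; i++) P[i] = newP[i];
    }
}
```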
C. Predicting Resource Availability

We use a simple but effective Auto-Regressive Integrated Moving Average (ARIMA) model [3] to predict the resource availability in future intervals based on history information. The ARIMA model has been used to predict resource consumption [26] and dynamic power supply [4].

In the ARIMA model, the available resource of the current interval, r_a(t), is predicted based on the last n observations of resource availability, i.e., r_a(t−1), ..., r_a(t−n):

r_a(t) = a_1 r_a(t-1) + a_2 r_a(t-2) + \cdots + a_n r_a(t-n),   (10)

where a_1, a_2, ..., a_n are coefficients obtained via model fitting. RDS predicts future resource availability over a time window h ∈ N. Let r_a(t+h|t) denote the hth-step prediction of r_a(t) given the last n observations, i.e., r_a(t−n), ..., r_a(t−1). The available resource series r_a(t|t), r_a(t+1|t), ..., r_a(t+h|t) is obtained by iterating the one-step prediction. Table II illustrates how the one-step prediction is iterated to obtain a 3-step prediction. We study the impact of different prediction horizons in Section V.

TABLE II
3-STEP PREDICTION OF ARIMA.

Steps | Inputs of ARIMA                     | Output
1     | r_a(t−3|t), r_a(t−2|t), r_a(t−1|t)  | r_a(t|t)
2     | r_a(t−2|t), r_a(t−1|t), r_a(t|t)    | r_a(t+1|t)
3     | r_a(t−1|t), r_a(t|t), r_a(t+1|t)    | r_a(t+2|t)
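The iterated prediction of Table II amounts to feeding each one-step forecast back into the input window. The sketch below assumes the plain AR(n) form of Eq. (10); a full ARIMA model would add differencing and moving-average terms, and the coefficients would come from offline model fitting.

```java
// Iterated multi-step prediction as in Table II: each one-step
// forecast is appended to the window and used for the next step.
class ArPredictor {
    // history: past resource-availability observations (oldest first);
    // a: AR coefficients a1..an; steps: prediction horizon h.
    static double[] predictAhead(double[] history, double[] a, int steps) {
        int n = a.length;                        // AR order
        double[] window = new double[n];         // last n observations
        System.arraycopy(history, history.length - n, window, 0, n);
        double[] out = new double[steps];
        for (int s = 0; s < steps; s++) {
            double next = 0.0;
            for (int i = 0; i < n; i++) {
                // a1 * ra(t-1) + a2 * ra(t-2) + ... + an * ra(t-n)
                next += a[i] * window[n - 1 - i];
            }
            out[s] = next;
            System.arraycopy(window, 1, window, 0, n - 1); // slide window
            window[n - 1] = next;                // feed the forecast back
        }
        return out;
    }
}
```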

D. Scheduling Optimizer

The scheduling optimizer is invoked at each control interval. It solves the RHC control problem using quadratic programming and outputs a sequence of resource adjustments that minimize fine-grained job deadline misses at each interval. RDS then applies the resource adjustments to individual jobs via two-level scheduling.

Job management: RDS maintains two separate queues for currently running jobs and waiting jobs. By default, incoming jobs enter the waiting queue. The scheduling optimizer determines which job is to be moved to the running queue based on the solution of the RHC control problem. For example, low priority jobs or jobs with distant deadlines may not be immediately allocated resources by the RHC control algorithm, i.e., u_j(t) = 0. These jobs wait until they receive a resource allocation from the scheduling optimizer. Since the resource allocation is determined once per control interval, it is possible that short jobs whose deadlines are earlier than the next control interval miss their deadlines due to the late allocation of resources. To this end, we provide a fast path for short jobs in the job queue management: RDS immediately moves a short job to the running queue.

Task scheduling: RDS applies the resource adjustment to individual jobs by changing the number of execution slots assigned to each job. Assigning more homogeneous slots to a job leads to more resources allocated to that job. Although the actual resources allocated with different slots may vary, we found that the total number of slots assigned to a job is a good approximation of the job's allocated resources. RDS uses a minimally invasive approach to realize a dynamic number of slots assigned to each job. Algorithm 1 shows how dynamic slots are realized via task scheduling. First, jobs are sorted according to their resource adjustments in the next control interval. At each heartbeat, the job with the largest resource adjustment (line 5) is selected to run its task. After assigning one slot to this job, the job's resource adjustment is updated by subtracting the amount of resource equivalent to the size of one slot, r_slot (line 9). We calculate r_slot based on the total available resource r_a and the total number of slots in the cluster. We discuss two scenarios in Section V, where r_slot is treated differently.

Algorithm 1 Task scheduling with dynamic slots.
1: Update the running job queue
2: repeat
3:   if any slot is available then
4:     /* Select a job in the running job queue */
5:     j ← arg max_j [Δu_j(t)], j ∈ {1, ..., J}
6:     Select a local task from job j
7:     Assign the selected task to the available slot
8:     /* Update the control adjustment scheme */
9:     Δu_j(t) ← Δu_j(t) − r_slot
10:  end if
11: until Σ_{j=1}^{J} (u_j^m(t) + u_j^r(t)) ≥ r_a

The algorithm effectively changes the number of slots of each job. If Δu_j(t) > 0, the job will eventually be assigned free slots. If Δu_j(t) ≤ 0, the job gives up the opportunity to run tasks, which is equivalent to reducing its number of allocated slots.
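A Java rendering of Algorithm 1 might look as follows. Job, Task and the selection helpers are simplified stand-ins for the corresponding Hadoop scheduler structures, and the loop stops early when no job has a positive adjustment, matching the observation that jobs with Δu_j(t) ≤ 0 give up their slots.

```java
import java.util.List;

// Simplified stand-ins for Hadoop's scheduler structures.
class Task { void assignToFreeSlot() { /* bind the task to the slot */ } }
class Job {
    double deltaU;                                // pending adjustment
    Task selectLocalTask() { return new Task(); } // prefer data-local tasks
}

class DynamicSlotScheduler {
    // Sketch of Algorithm 1: at each heartbeat, give the free slot to
    // the job with the largest outstanding resource adjustment, then
    // charge it one slot's worth of resource (rSlot).
    void assignSlots(List<Job> running, int freeSlots, double rSlot) {
        while (freeSlots > 0) {
            Job best = null;                      // arg max over deltaU
            for (Job j : running)
                if (best == null || j.deltaU > best.deltaU) best = j;
            if (best == null || best.deltaU <= 0) break; // nobody wants a slot
            Task t = best.selectLocalTask();
            if (t == null) break;                 // no runnable local task
            t.assignToFreeSlot();
            best.deltaU -= rSlot;                 // update adjustment scheme
            freeSlots--;
        }
    }
}
```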
IV. SYSTEM IMPLEMENTATION

A. Testbed

We built a dynamic Hadoop cluster in our university cloud, which consists of 108 CPU cores and 704 GB of memory. VMware vSphere 5.1 was used for server virtualization. The VMware vSphere module controls the CPU usage limits in MHz allocated to the virtual machines (VMs); it also provides an API to support the remote management of VMs. Hadoop version 1.2.1 was deployed on the cluster with 21 VMs, i.e., one master node and 20 slave nodes. We configured each slave node with two map slots and one reduce slot. The block size was configured as 64 MB in our experiments. Each VM was allocated 1 virtual CPU and 2 GB memory. All VMs ran Ubuntu Server 10.04 with Linux kernel 2.6.32.

B. RDS Implementation

To implement RDS in the Hadoop environment, we added a new member, mapred.job.deadline, to the class JobConf to store the deadline of a job. We applied the idea of the Hadoop Capacity scheduler and refactored its QueueManager class to implement the RDS waiting and running job queues. We implemented the core components of RDS in the class SchedulerTaskScheduling.

Resource Predictor: We used a sensor program provided by VMware vSphere 5.1 to collect the resource availability of individual VMs. The cluster information, e.g., the number of slots, was monitored by the function getClusterStatus in the JobTracker. We applied the ARIMA approach, combined with the collected resource information of the last 3 horizons, to obtain the predicted resources for the next 3 intervals.

Performance Modeling: We used the MATLAB Fuzzy Logic Toolbox to apply subtractive clustering and the ANFIS modeling technique to the data collected from the cluster. At runtime, the performance models were updated based on new measurements collected from the JobTracker using the RLS algorithm.

Scheduling Optimizer: We used MATLAB to compute the local control solution. We used MATLAB Builder JA to create a Java class from the MATLAB program calling quadprog. This Java class was integrated into RDS and deployed on the master node of the cluster. Based on our observations, we empirically set the control penalty weight matrices P = [0.0107, 0.0096, 0.0132] and Q = [0.0173, 0.0168, 0.0194] for map and reduce tasks, respectively. We set the control interval to 10 minutes.

C. Workloads

For performance evaluations, we used a set of representative MapReduce applications from the PUMA benchmark [1], i.e., Wordcount, Terasort and Grep. By default, we set the same priority for each benchmark; we study the impact of different job priorities in Section V-D. For each application, we submit multiple copies with different input sizes, as shown in Table III, which contains jobs with widely varying execution times and data set sizes, emulating a scenario where the cluster is used to run many different types of MapReduce applications. Similarly to Natjam et al. [8], we set the expected execution time of individual jobs to 2.5 times the job completion time in a dedicated cluster with sufficient resources. We then derived the jobs' deadlines according to their arrival times. Following the study in [25], we set the inter-arrival time of jobs to 10 minutes.

D. Dynamic Resource Trace

[Fig. 5. A dynamic Hadoop cluster. (a) Energy supply: brown and green energy, power (kW), over control intervals. (b) Dynamic resource: CPU resource availability (GHz) over control intervals.]
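To illustrate the deadline interface described in Section IV-B, a client might tag a job as follows. Only the property name mapred.job.deadline comes from the paper; the encoding (absolute milliseconds) and the surrounding submission code are our assumptions.

```java
import org.apache.hadoop.mapred.JobConf;

// Illustrative client-side use of the deadline knob from Section IV-B.
// The property name is the one the paper adds to JobConf; the value
// encoding here is a hypothetical choice for the sketch.
public class SubmitWithDeadline {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(SubmitWithDeadline.class);
        conf.setJobName("wordcount-with-deadline");
        // Deadline for RDS, e.g., 40 minutes from submission.
        conf.setLong("mapred.job.deadline",
                     System.currentTimeMillis() + 40L * 60 * 1000);
        // ... set input/output paths, mapper and reducer as usual ...
        org.apache.hadoop.mapred.JobClient.runJob(conf);
    }
}
```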
