ULE: A Modern Scheduler For FreeBSD - Zeus.cs.pacificu.edu

Transcription

ULE: A Modern Scheduler forFreeBSDChadd WilliamsFebruary 19, 2004

Motivation Why write a new scheduler? Excellent interactive performance!– With small loads– Old scheduling algorithms are O(n) Improve SMP support– no support for processor affinity or binding– Symmetric Multi-Threading support

History 4.3BSD– No SMP support No processor affinity Multi processor systems can be much slower– No scheduling classes– Priority derived by estimation of recent CPU usage andnice: estcpu decay function primary cause of poor high load performance must iterate over every process in the system– No fixed time slices every 100ms a different, equal priority, thread is run

Old FreeBSD Scheduler Extended 4.3BSD scheduler– Basic SMP support– Added scheduling classes real-time and idle classes added interrupt class added with SMP support– same as real-time with lower priorities (lower better)– Tuned parameters for good interactive performanceunder heavy load user base of mainly servers– nice refined processes with values 20 higher than the least nice will not run

Priority Calculation based on number of ticks while running decayed when processes sleep/wakeup decayed once a second– iterates over every process in the system! O(n)– based on current load average of system– without regular decay processes may starve sleeping process’s priority would otherwise neverdecay decay improves priority

New Linux Scheduler O(1) scheduling algorithmsCPU affinityper CPU run queuesTwo priority queues to achieve fairness– run lower priority processes after high priorityprocesses exhaust their time slices Dynamic slices– larger slices given to higher priority processes probably interactive processes negative nice positively affect allocated CPU time

New FreeBSD Scheduler Event driven– no periodic timer Several queuesTwo CPU load-balancing algorithmsInteractivity scorerSlice calculatorPriority calculator

Queues Each CPU has one kse queue struct– kse: kernel schedulable entity– three arrays of run queues indexed by bucketed priorities two used for interrupt, real-time, timesharing classes– current– next one for idle class– load statistics– nice window

Queues: Non-Idle Threads run from the current queue until empty––––pick threads in priority order to runswitch current and next queueseach thread runs for a time slice every two q switchesensures fairness Interactive threads assigned to current queue– low latency response– also runtime and interrupt threads Thread assigned to a queue until it sleeps or untilits slice expires– recalculate base priority, slice size, interactivity scoreeach time a slice expires

Interactivity Scoring Determined via voluntary sleep/run time Interactive threads have high sleep times– waiting for user input Voluntary sleep time– number of ticks betweensleep() and wakeup()– does not grow unbounded Quickly change status Use threshold to assign interactive or not

Priority Calculation Priority used only for run ordering– not fairness Timesharing class has calculated priority– others assigned statically Interactivity score used nice value used– can positively affect CPU time

Nice Impact Must be able to prevent threads fromrunning kseq tracks minimum nice value– only threads within 20 of this value get a slicevalue greater than zero– immediately put in next queue Threads given slices inversely proportionalto the difference of their nice and min nice

Slices Minimum slice value of 10ms Maximum slice value of 140ms Nice interaction– differences of less than 3 don’t matter– 40 nice values/14 slice sizes Interactive tasks: 10ms slice– quickly react to becoming non-interactive

SMP Important goal of ULE is better SMP usage CPU affinity– schedule a thread on the last CPU it was run on– improve cache usage Two load balancing schemes– Pull Migration get threads when the CPU is idle– Push Migration put thread to less loaded CPU from heavily loaded CPU

Pull Migration Prevent any CPU from idling Lock run queue of other processors andlook for runnable threads– highest priority thread in most loaded kseqstolen Less expensive than idling for smallnumbers of processors Effective for short running, high turnover

Push Migration Move thread from loaded kseq to lessloaded kseq– twice a second Only affects two kseqs total– load is irregular– dual processor is the common case– balancing 4 processors takes slightly longer Too much balancing may ruin cache

SMT Support Symmetric Multi-Threading– OS sees many logical CPUs on a physical CPU– logical CPUs share cache, etc. No migration penalty between logical CPUson same physical CPU Must be careful not to overload a physicalCPU Load balancing algorithms work on groupsof CPUs

Yeah, but does it work? Benchmarks run to test responsiveness ofthe system Single CPU system Compared to– 4BSD (old FreeBSD)– Solaris– Linux

Priority

Priority Shows priority for a process over time withconstant runtime Cyclic priority– Solaris– 4BSD Increasing priority– Linux

Nice

Nice All platforms but Linux can starve a highnice process– ULE line never hits zero because of how thetest was run Older schedulers show bias towards -20

Interactivity

Interactivity Two rounds of 30 parallel simulatedcompiles Shows response time of interactive apps Each scheduler is ok after a while– takes a while for Linux and 4BSD Spike at start of second round– Solaris and 4BSD

Interactivity

Interactivity Interactive response during previous nicetest ULE does very well 4BSD and Linux are pretty bad “good enough to use a shell to kill an app”

Pathological Case for Linux

Pathological Case for Linux Measure interactive response Five nice –5 processes– each attempt to use 25% of the processor Priority in Linux tends toward the minimum Early ULE implementations had thisproblem Interactivity scoring algorithm in ULEsolved this problem

Performance ULE development focused on interactivity andnice Now performance tuning Parallel compile test––––4X faster on ULE than 4BSDCPU affinitytest case is memory boundlittle kernel interaction ULE 30% better on apache throughput– dual Xeon– more realistic

Conclusions Interactivity pretty good– interactivity, priority, slice slice are all separateconcepts– adjust each independently nice has less power– livelock under nice load eliminated Currently using SMT leads to worseperformance Default in FreeBSD-current (5.X)

Load balancing algorithms work on groups of CPUs. Yeah, but does it work? Benchmarks run to test responsiveness of the system Single CPU system Compared to –4BSD (old FreeBSD) –Solaris –Linux. Pri