Primal-Dual Interior-Point Methods - Carnegie Mellon University

Transcription

Primal-Dual Interior-Point MethodsLecturer: Aarti SinghCo-instructor: Pradeep RavikumarConvex Optimization 10-725/36-725

OutlineToday: Primal-dual interior-point method Special case: linear programming2

Barrier method versus primal-dual methodLike the barrier method, primal-dual interior-point methods aim tocompute (approximately) points on the central path.Main differences between primal-dual and barrier methods: Both can be motivated by perturbed KKT conditions, but asthe name suggests primal-dual methods update both primaland dual variables Primal-dual interior-point methods usually take one Newtonstep per iteration (no additional loop for the centering step). Primal-dual interior-point methods are not necessarily feasible. Primal-dual interior-point methods are typically more efficient.Under suitable conditions they have better than linearconvergence.3

Constrained OptimizationConsider the problemmin f (x)xsubject to Ax bg(x) 0where the equality constraints are linear.LagrangianL(x, u, v) f (x) uT g(x) v T (Ax b)4

KKT conditionsKKT conditions f (x) g(x)u AT v 0U g(x) 0Ax bu, g(x) 0. Here U Diag(u), g(x) g1 (x) · · · gr (x)5

KKT conditions for Barrier problemBarrier problemmin f (x) φ(x)xAx bwhereφ(x) rXlog( gj (x)).j 1KKT conditions for barrier problem f (x) g(x)u AT v 0U g(x) 1Ax bu, g(x) 0.Same as before, except complementary slackness condition isperturbed.6

We didn’t cover this, but Newton updates for log barrier problemcan be seen as Newton step for solving these nonlinear equations,after eliminating u (i.e., taking uj /gj (x), j 1, . . . , r).Primal-dual interior-point updates are also motivated by a Newtonstep for solving these nonlinear equations, but without eliminatingu. Write the KKT conditions as a set of nonlinear equationsr(x; u; v) 0,where f (x) g(x)u AT v U g(x) 1r(x, u, v) : Ax b7

This is a nonlinear equation is (x; u; v), and hard to solve; so let’slinearize, and approximately solveLet y (x; u; v) be the current iterate, and y ( x; u; v)be the update direction. Definerdual f (x) g(x)u A vrcent U g(x) 1rprim Ax bthe dual, central, and primal residuals at current y (x; u; v).Now we make our first-order approximation0 r(y y) r(y) r(y) yand we want to solve for y in the above.8

I.e., we solveP 2 f (x) rj 1 uj 2 gj (x) g(x)TAT x U g(x)diag(g(x)) 0 u vA00 rdual rcent rprimSolution y ( x, u, v) is our primal-dual update directionNote that the update directions for the primal and dual variablesare inexorably linked together(Also, these are different updates than those from barrier method)9

Primal-dual interior-point methodPutting it all together, we now have our primal-dual interior-pointmethod. Start with a strictly feasible point x(0) and u(0) 0, v (0) .Define η (0) g(x(0) )T u(0) and let σ (0, 1), then we repeat fork 1, 2, 3 . . .10

Primal-dual interior-point methodPutting it all together, we now have our primal-dual interior-pointmethod. Start with a strictly feasible point x(0) and u(0) 0, v (0) .Define η (0) g(x(0) )T u(0) and let σ (0, 1), then we repeat fork 1, 2, 3 . . . Define ση (k 1) /m Compute primal-dual update direction y Determine step size s Update y (k) y (k 1) s · y Compute η (k) g(x(k) )T u(k) Stop if η (k) δ and (krprim k22 krdual k22 )1/2 δ10

Primal-dual interior-point methodPutting it all together, we now have our primal-dual interior-pointmethod. Start with a strictly feasible point x(0) and u(0) 0, v (0) .Define η (0) g(x(0) )T u(0) and let σ (0, 1), then we repeat fork 1, 2, 3 . . . Define ση (k 1) /m Compute primal-dual update direction y Determine step size s Update y (k) y (k 1) s · y Compute η (k) g(x(k) )T u(k) Stop if η (k) δ and (krprim k22 krdual k22 )1/2 δNote the stopping criterion checks both the central residual via η,and (approximate) primal and dual feasibility10

Backtracking line searchAt each step, we need to find s and setx x s x,u u s u,v v s v.Two main goals: Maintain g(x) 0, u 0 Reduce kr(x, u, v)kUse a multi-stage backtracking line search for this purpose: startwith largest step size smax 1 that makes u s u 0:nosmax min 1, min{ ui / ui : ui 0}Then, with parameters α, β (0, 1), we set s 0.99smax , and Update s βs, until gj (x ) 0, j 1, . . . r Update s βs, until kr(x , u , v )k (1 αs)kr(x, u, v)k11

Special case: linear programmingConsidermincT xsubject toAx bxx 0for c Rn , A Rm n , b Rm .Some history: Dantzig (1940s): the simplex method, still today is one of themost well-known/well-studied algorithms for LPs Karmarkar (1984): interior-point polynomial-time method forLPs. Fairly efficient (US Patent 4,744,026, expired in 2006) Modern state-of-the-art LP solvers typically use both simplexand interior-point methods12

KKT conditions for standard form LPThe points x? and (u? , v ? ) are respectively primal and dual optimalLP solutions if and only if they solve:AT v u cxi ui 0, i 1, . . . , nAx bx, u 0(Neat fact: the simplex method maintains the first three conditionsand aims for the fourth one . interior-point methods maintain thefirst and last two, and aim for the second)13

The perturbed KKT conditions for standard form LP are hence:AT v u cxi ui , i 1, . . . , nAx bx, u 0Let’s work through the barrier method, and the primal-dual interiorpoint method, to get a sense of these twoBarrier method (after elim u):Primal-dual method:0 rbr (x, v) T A v diag(x) 1 c Ax b0 rpd (x, u, v) T A v u c diag(x)u Ax b14

Barrier method: 0 rbr (y y) rbr (y) rbr (y) y, i.e., wesolve diag(x) 2 AT x rbr (x, v)A0 vand take a step y y s y (with line search for s 0), anditerate until convergence. Then update σ Primal-dual method: 0 rpd (y y) rpd (y) rpd (y) y,i.e., we solve 0IAT x diag(u) diag(x) 0 u rpd (x, u, v)A00 vand take a step y y s y (with line search for s 0), butonly once. Then update σ 15

Example: barrier versus primal-dualExample from B & V 11.3.2 and 11.7.4: standard LP with n 50variables and m 100 equality ointmethodsBarrier method uses variousvalues of σ, primal-dual11 methodusesσ 0.1. Both use α 0.01, β 0.511Interior-point methods10010 210 210 210 410 40100 100 6 610 1010 410 6105 105rfeas100 100η̂η̂duality gap102 102102µ 50 µ 150206040Newton iterationsµ 28010 1010 1010 810 810 1010 100 0 5Figure 11.4 Progress of barrier method for a small LP, showing dualitygap versus cumulative number of Newton steps. Three plots are shown,corresponding to three values of the parameter µ: 2, 50, and 150. In eachcase, we have approximately linear convergence of duality gap.Barrier central residual(µ 1/σ)10 510 5rfeas7210 1510 150 055 10 1015 1520 2025 2530 30iterationnumberiterationnumberPrimal-dual centralresidualNewton’s method is λ(x)2 /2 10 5 , where λ(x) is the Newton decrement of thefunction tcT x φ(x).The progress of the barrier method, for three values of the parameter µ, isshown in figure 11.4. The vertical axis shows the duality gap on a log scale. Thehorizontal axis shows the cumulative total number of inner iterations, i.e., Newtonsteps, which is the natural measure of computational effort. Each of the plots hasa staircase shape, with each stair associated with one outer iteration. The width ofeach stair tread (i.e., horizontal portion) is the number of Newton steps requiredfor that outer iteration. The height of each stair riser (i.e., the vertical portion) isexactly equal to (a factor of) µ, since the duality gap is reduced by the factor µ atthe end of each outer iteration.102 102The plots illustrate several typical features of the barrier method. First of all,the method works very well, with approximately linear convergence of the duality510 1015 1520 2025 gressof theinterior-pointmethodfor anFigure11.21Progressof primal-dualthe primal-dualinterior-pointmethodforLP,an LP,showingsurrogatedualitygap gapη̂ andthe normof theand dualresid-residshowingsurrogatedualityη̂ andthe normofprimalthe primaland dualuals,uals,versusiterationnumber.The Theresidualconvergesrapidlyto to withinzero within24 iterations;the thesurrogategap gapalso alsoconvergesto a toverysmallsmallnumberin in24 iterations;surrogateconvergesa verynumberabout28 iterations.The asterabout28 sfeasthanthanthe thebarriermethod,especiallyif highaccuracyis required.barriermethod,especiallyif highaccuracyis required.Primal-dual feasibilitygap, r (krprim k22 krdual k22 )1/2Can see that primal-dual is faster to converge to high accuracy105 10516

10 40102030Newton iterations4050Figure 11.7 Progress of barrier method for three randomly generated standard form LPs of different dimensions, showing duality gap versus cumulative number of Newton steps. The number of variables in each problem isn 2m. Here too we see approximately linear convergence of the dualitygap, with a slight increase in the4 number of Newton steps required for thelarger problems.35503040iterationsNewton iterationsNow a sequence of problems with n 2m, and n growing. Barriermethod uses µ 100, runs just two outer loops (decreases centralresidual by 10 ); primal-dual method uses σ 0.1, stops whencentral residual and feasibilitygap are at most 10 811.8 Implementation25202015 11030102m10310 110102m103Figure 11.8 Average number of Newton steps required to solveFigure100 randomly11.23 Number of iterations required to solve randomly generatedgenerated LPs of different dimensions, with n 2m. Error barsshow stanstandardLPs of different dimensions, with n 2m. Error bars show standard deviation, around the average value, for each value of m.Thedeviation,growth around the average value, for 100 instances of each dimendardin the number of Newton steps required, as the problem dimensionsrangesion. Thegrowth in the number of iterations required, as the problem diover a 100 : 1 ratio, is very small.mensions range over a 100 : 1 ratio, is approximately logarithmic.Barrier methodPrimal-dual methodPrimal-dual method require only slightly more iterations, despitethe fact that it is producinghigher accuracy solutionsA family of LPsHere we examine the performance of the primal-dual method as a function17the problem dimensions, for the same family of standard form LPs conside

Barrier method versus primal-dual method Like the barrier method, primal-dual interior-point methods aim to compute (approximately) points on the central path. Main di erences between primal-dual and barrier methods: Both can be motivated by perturbed KKT conditions, but as the name suggests primal-dual methods update both primal and dual variables