5.3 KKT Conditions
In mathematical optimisation, the Karush–Kuhn–Tucker (KKT) conditions, also known as the Kuhn–Tucker conditions, are first derivative tests (sometimes called first-order necessary conditions) for a solution in nonlinear programming to be optimal, provided that some regularity conditions are satisfied.
Allowing inequality constraints, the KKT approach to nonlinear programming generalises the method of Lagrange multipliers, which allows only equality constraints. Similar to the Lagrange approach, the constrained maximisation (minimisation) problem is rewritten as a Lagrange function whose optimal point is a saddle point.
The KKT conditions are necessary conditions for optimality in a general constrained problem.
For a given nonlinear programming problem: \[ \begin{align} \max \quad & f(\mathbf{x}) \\ \text{s.t.} \quad & g_{i}(\mathbf{x}) \leq b_{i} \quad \text{for } i = 1, 2, \ldots, m \\ & \mathbf{x} \geq \mathbf{0} \end{align} \] where \(\mathbf{x} = \left(x_{1}, x_{2}, \ldots, x_{n}\right)\), a necessary condition for \(\mathbf{x}^{*}\) to be an optimal solution is that there exist multipliers \(\boldsymbol{\lambda} = (\lambda_{1}, \ldots, \lambda_{m})\) such that \(\mathbf{x}^{*}\) and \(\boldsymbol{\lambda}\) satisfy all the following KKT conditions: \[ \begin{align} \frac{\partial h}{\partial x_{j}} &\leq 0 \quad \text{for } j = 1, 2, \ldots, n \\ x^*_j \frac{\partial h}{\partial x_{j}} &= 0 \quad \text{for } j = 1, 2, \ldots, n \\ \frac{\partial h}{\partial \lambda_{i}} &\geq 0 \quad \text{for } i = 1, 2, \ldots, m \\ \lambda_{i} \frac{\partial h}{\partial \lambda_{i}} &= 0 \quad \text{for } i = 1, 2, \ldots, m \\ \mathbf{x}^{*} &\geq \mathbf{0} \\ \boldsymbol{\lambda} &\geq \mathbf{0} \end{align} \] where all derivatives are evaluated at \(\mathbf{x}^{*}\), and the Lagrangian function \(h(\mathbf{x}, \boldsymbol{\lambda})\) and its derivatives are: \[ \begin{align} h(\mathbf{x}, \boldsymbol{\lambda}) &= f(\mathbf{x})-\sum_{i=1}^{m} \lambda_{i}\left[g_{i}(\mathbf{x})-b_{i}\right] \\ \frac{\partial h}{\partial x_{j}} &= \frac{\partial f}{\partial x_{j}} - \sum_{i=1}^{m} \lambda_{i} \frac{\partial g_{i}}{\partial x_{j}} \quad \text{for } j = 1, 2, \ldots, n \\ \frac{\partial h}{\partial \lambda_{i}} &= -g_{i}(\mathbf{x})+b_{i} \quad \text{for } i = 1, 2, \ldots, m \end{align} \]
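The second and fourth conditions are complementary slackness conditions, and they are what make the case analysis in the example below possible. Written out as a standard restatement (not taken from the cited texts): \[ \begin{align} x^*_j \frac{\partial h}{\partial x_{j}} = 0 \quad &\Longleftrightarrow \quad x^*_j = 0 \ \text{ or } \ \frac{\partial h}{\partial x_{j}} = 0 \\ \lambda_{i}\left[g_{i}(\mathbf{x}^{*})-b_{i}\right] = 0 \quad &\Longleftrightarrow \quad \lambda_{i} = 0 \ \text{ or } \ g_{i}(\mathbf{x}^{*}) = b_{i} \end{align} \] That is, at an optimum either a variable is zero or its reduced gradient vanishes, and either a constraint is active or its multiplier is zero.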
5.3.1 Critical Point
A critical point for a real-valued and differentiable function \(f\left(x_{1}, \ldots, x_{n}\right)\) is a point at which the function’s slope is zero in all of the \(x_{1}, \ldots, x_{n}\) directions. (Morgan 2015)
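For example, for the unconstrained function \(f(x_{1}, x_{2}) = x_{1}^{2} + x_{2}^{2}\): \[ \begin{align} \nabla f = (2 x_{1}, 2 x_{2}) = \mathbf{0} \quad \Longleftrightarrow \quad (x_{1}, x_{2}) = (0, 0) \end{align} \] so the origin is the only critical point of \(f\) (here, its global minimum).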
5.3.2 KKT Conditions for Constrained Optimisation with Inequality Constraints
Satisfying these conditions does not guarantee that the solution is optimal. Certain additional convexity assumptions are needed to obtain this guarantee. (Hillier 2012)
Thus, the following corollary can be used instead: if the objective function \(f(\mathbf{x})\) is concave and every constraint function \(g_{i}(\mathbf{x})\) is convex (a convex programming problem), then any point satisfying the KKT conditions is an optimal solution. (Hillier 2012)
To illustrate the point, the following two-variable problem is analysed.
The decision variables are defined as follows: \[ \begin{array}{clc} \hline \text { Variables } & \text { Definition } & \text{ Type } \\ \hline \mathrm{x}_{1} & \text { hours spent producing IPA } & \text{continuous} \\ \mathrm{x}_{2} & \text { hours spent producing Lager } & \text{continuous} \\ \hline \end{array} \]
The problem can be formulated as: \[ \begin{align} \begin{split} \max \quad & 15 \sqrt{x_{1}} + 16 \sqrt{x_{2}} \\ \text{s.t.} \quad & x_{1} + x_{2} \leq 120 \\ & x_{1}, x_{2} \in \mathbb{R}^+ \end{split} \end{align} \]
Each of the two terms of the objective function \(15 \sqrt{x_{1}} + 16 \sqrt{x_{2}}\) is a concave function of a single variable, so the objective is strictly concave for \(x_{1}, x_{2} > 0\) (its Hessian matrix is negative definite there). The feasible set defined by \(x_{1} + x_{2} \leq 120\) and \(x_{1}, x_{2} \geq 0\) is convex, so this is a convex programming problem and the corollary above applies.
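Concretely, the Hessian of the objective is diagonal: \[ \begin{align} \nabla^{2} f(\mathbf{x}) = \begin{pmatrix} -\dfrac{15}{4} x_{1}^{-3/2} & 0 \\ 0 & -4 x_{2}^{-3/2} \end{pmatrix} \end{align} \] Both diagonal entries are negative whenever \(x_{1}, x_{2} > 0\), which confirms the strict concavity claimed above.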
The point (1, 1) is strictly feasible (\(1 + 1 < 120\)), i.e. a Slater point, so the problem satisfies Slater's condition and strong duality holds.
The Lagrangian function is: \[ \begin{align} \begin{split} L(x_{1}, x_{2}, \lambda) = 15 \sqrt{x_{1}} + 16 \sqrt{x_{2}} - \lambda (x_{1} + x_{2} - 120) \end{split} \end{align} \] whose derivatives are: \[ \begin{align} \begin{split} \frac{\partial L}{\partial x_{1}} &= \frac{15}{2 \sqrt{x_{1}}} - \lambda \\ \frac{\partial L}{\partial x_{2}} &= \frac{8}{\sqrt{x_{2}}} - \lambda \\ \frac{\partial L}{\partial \lambda} &= -(x_{1} + x_{2} - 120) = 120 - x_{1} - x_{2} \end{split} \end{align} \]
together with the nonnegativity conditions: \[ \begin{align} x_1, x_2 \geq 0 \\ \lambda \geq 0 \end{align} \]
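Before turning to software, the interior candidate point, at which both variables are positive and the hours constraint is binding, can be derived by hand from the stationarity conditions: \[ \begin{align} \frac{15}{2 \sqrt{x_{1}}} = \frac{8}{\sqrt{x_{2}}} = \lambda \quad \Rightarrow \quad \sqrt{\frac{x_{2}}{x_{1}}} = \frac{16}{15} \quad \Rightarrow \quad x_{2} = \frac{256}{225} x_{1} \end{align} \] Substituting into the binding constraint \(x_{1} + x_{2} = 120\) gives \(x_{1} = 120 \cdot \frac{225}{481} \approx 56.133\), \(x_{2} = 120 \cdot \frac{256}{481} \approx 63.867\), and \(\lambda = \frac{15}{2 \sqrt{x_{1}}} \approx 1.001\).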
All of the critical points, including the boundary cases, can also be calculated with the Symbolic Math Toolbox in MATLAB:
syms x1 x2 lbd
eq(1) = lbd * (x1 + x2 - 120) == 0;       % complementary slackness: lambda * (g(x) - b) = 0
eq(2) = x1 * (15/2/sqrt(x1) - lbd) == 0;  % x1 * dL/dx1 = 0
eq(3) = x2 * (8/sqrt(x2) - lbd) == 0;     % x2 * dL/dx2 = 0
sol = solve(eq, [x1 x2 lbd])
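The symbolic solutions can then be converted to numeric values and the objective evaluated at each candidate; a minimal sketch, assuming (as above) that solve returns a structure with fields x1, x2, and lbd:
cand = double([sol.x1, sol.x2, sol.lbd]);       % candidate points as rows (x1, x2, lambda)
obj  = 15*sqrt(cand(:,1)) + 16*sqrt(cand(:,2))  % objective value at each candidate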
The resulting candidate points \((x_{1}, x_{2}, \lambda)\) are (0, 0, 0), (120, 0, 0.6847), (0, 120, 0.7303), and (56.133, 63.867, 1.0010), with corresponding objective values 0, 164.3168, 175.2712, and 240.2499. Only the last point satisfies all of the KKT conditions (at the other three, \(\partial L / \partial x_{j}\) is positive for any variable held at zero), and it attains the largest objective value, so (56.133, 63.867, 1.0010) is the optimal solution. \[ \begin{align} x_{1} &= 56.133 \quad \text{so that } 3 \sqrt{x_{1}} \approx 22.5 \\ x_{2} &= 63.867 \quad \text{so that } 4 \sqrt{x_{2}} \approx 32 \\ \lambda &\approx 1 \\ 15 \sqrt{x_{1}} + 16 \sqrt{x_{2}} &\approx 240.250 \\ \end{align} \]
So I would allocate 56.133 hours to produce about 22.5 bottles of IPA, and 63.867 hours to produce about 32 bottles of Lager. The maximum revenue obtained is 240.250.
5.3.3 Validation
The results obtained from modern optimisation algorithms can be validated using the duality gap and KKT conditions.
For complicated problems, it may be difficult, if not essentially impossible, to derive an optimal solution directly from the KKT conditions. Nevertheless, these conditions still provide valuable clues as to the identity of an optimal solution, and they also permit us to check whether a proposed solution may be optimal. (Hillier 2012)
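As a minimal sketch of such a check for the brewing example above (the proposed point and multiplier are the values obtained earlier; the tolerance is an illustrative assumption; stationarity is tested as an equality because the proposed point is interior):
x    = [56.133; 63.867];  lbd = 1.0010;  tol = 1e-3;   % proposed solution and tolerance
grad = [15/(2*sqrt(x(1))); 8/sqrt(x(2))] - lbd;        % dL/dx_j at the proposed point
slack = 120 - sum(x);                                  % b - g(x), must be nonnegative
ok = all(abs(grad) < tol) && slack >= -tol && ...      % stationarity and feasibility
     abs(lbd*slack) < tol && all(x >= 0) && lbd >= 0   % complementary slackness and signs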
References
Hillier, Frederick S. 2012. Introduction to Operations Research. Tata McGraw-Hill Education.
Morgan, Peter B. 2015. An Explanation of Constrained Optimization for Economists. University of Toronto Press.