Article

How good is good enough?

August 1, 2023

min read

Introduction

In this post we discuss what it means to solve mathematical optimization problems and that having a solution not necessarily means we have the best possible solution. We highlight why it is however crucial to obtain the best possible solution or at least to know how good a given solution actually is. We demonstrate how this can be achieved by the branch-and-bound paradigm and walk in detail through the various steps of branch-and-bound. We also indicate implications for and differences to quantum optimization when it comes to a vs. the solution.

‍

On mathematical optimization and optimal solutions

Arguably one of the most promising use cases for quantum computing is mathematical optimization (also known as decision making). In mathematical optimization, we want to choose decision variables $x\in\mathbb{R}^n$ in a way to maximize an objective function $f(x)$ while satisfying a set of constraints $c(x)\leq 0$, i.e.:

$$ \begin{aligned} \max_{x\in\mathbb{R}^n} \ &f(x)\\ \text{s.t. } \ &c(x)\leq 0\\ &x_i\in\mathbb{Z}, i\in I \end{aligned}$$

In most practical problems, some (or even all) decision variables are constrained to be integer (the variables $x_i$ with $i\in I$), while the remaining variables are continuous. In the general case of nonlinear and nonconvex functions $f$ and $c$, the optimization problem above is a nonconvex mixed-integer nonlinear problem (MINLP). Such problems are NP-hard due to their combinatorial and nonlinear nature, and we cannot expect to solve such problems to optimality efficiently on classic computers. Will this change with quantum computers?

Well, this new computing paradigm is supposed to be a disruptive game changer for many computational tasks and mathematical optimization is no exception to that. Quantum optimization approaches like QAOA or Quantum Annealing promise to provide solutions to QUBOs, a subclass of MINLPs, very fast. But what does solution and optimality actually mean?

This is a general discussion and not specifically linked to quantum computing. Understanding the difference between a solution (a point that fulfills all constraints) and the solution (a solution with the best possible objective value) is not tied to the hardware and is equally important for both classical and quantum computing. We discuss implications to quantum computing in more detail towards the end of this post.

‍

A solution vs. the solution

Many optimization algorithms (e.g., metaheuristics like simulated annealing) only return a solution to an optimization problem without any guarantees that this is also the globally optimal (i.e., best possible) solution. There might or might not be a better solution to the problem, but we never know when using such approaches. We have no clue by how much the solution could potentially still be improved. For many applications however it is crucial to obtain a globally optimal solution. Think about high-stakes scenarios, e.g., the planning and scheduling of ambulances. Sending the right ambulance to the right place at the right time is crucial to save lives. A solution to this optimization task is just not good enough. We need the optimal solution to save as many lives as possible. But how can we actually tell that indeed no better solution exists without looking at all possible solutions?

Branch-and-bound, a well-known general-purpose algorithm suitable for many mixed-integer optimization problems, provides a remedy to this dilemma.

‍

A primer on classic Branch-and-Bound

Branch-and-bound was first introduced by Ailsa Land and Alison Doig in 1960 for solving discrete linear optimization problems, see [1]. This pioneering approach is general enough to be applied to mixed-integer (non-)linear optimization problems and paved the way to today’s large-scale computational optimization. In the following, we give a brief introduction to branch-and-bound. Therefore, reconsider the general MINLP from above and assume that the functions $f$ and $c$ are convex. This renders the problem a convex MINLP.

For the sake of simplicity, we further assume that all variables $x_i$ for $i\in I$ are binary, i.e., we have $x_i \in \{0,1\}$. However, everything that we explain in the following also carries over to general integer variables. In addition, we express the feasible set (the points fulfilling the constraints $c(x)\leq 0$ and the binary conditions $x_i\in\{0,1\}$) of the MINLP by $\Omega$.

As the name implies, branch-and-bound consists of two main ingredients: branching in order to split the optimization task into smaller chunks and bounding the optimal objective value of the optimization problem.

‍

Splitting the problem

The branching step of branch-and-bound derives two subproblems from the original problem by picking a binary variable and fixing it to 0 in the first subproblem and to 1 in the second subproblem. It is straightforward that the optimal solution of the original problem is the better solution of the two subproblems (with reduced dimension).

Instead of solving the subproblems, we could split them again into two subproblems each. This builds a so-called branch-and-bound tree. At each level of the tree, the dimension of the variable vector is reduced by 1. In the extreme case, one would branch on every variable, which would result in an inefficient enumeration scheme.

‍

Relax!

In order to prevent full enumeration, we want to derive bounds on the objective function value of the subproblems by considering easier-to-solve relaxations.

The problem on the right is a relaxation of the MINLP on the left, if:

Every point feasible for the original problem is also feasible for the relaxation, i.e., if it holds $$ x\in\Omega \Rightarrow x\in\bar{\Omega},$$ and
for every feasible point of the MINLP the objective function value of the relaxation bounds the MINLP objective value: $$ x\in\Omega \Rightarrow f(x)\leq\bar{f}(x).$$

A straight-forward relaxation is the so-called continuous relaxation, which is derived by relaxing the binary conditions $x_i\in\{0,1\}$ to $x_i\in [0,1]$ for $i\in I$, and by keeping the objective function unchanged ($\bar{f}=f$). In this way, both implications are fulfilled. Solving the continuous relaxation is somewhat easy because with the lack of nonconvex integrality constraints, the relaxation is convex such that local optimality implies global optimality.

The whole idea of the bounding step is to get rid of subtrees of the branch-and-bound tree, which are not of interest. This step is also called pruning and is understood best with a toy example.

‍

Growing a tree

Iteration 1:

Every branch-and-bound algorithm starts with the root node problem, which is exactly the original MINLP. Heuristics are used to derive a feasible solution quickly. We store the objective value of the best-known feasible solution as $z^*$ (which is 10 in our example). In addition, solving a relaxation of the root node problem gives an initial bound $\bar{z}=21$ on the objective value.

Now, we derive two subproblems by picking a binary variable and fixing it to 0 and 1, respectively.

Iteration 2:

We pick subproblem SP2 from the pool of open problems (SP1 and SP2) and essentially repeat everything we did in Iteration 1. Heuristics applied to the subproblem give an objective value of $\underline{z}=9$, which does not improve the best-known objective value $z^*=10$ known from the root node. Solving the relaxation of the subproblem improves the bound known from the parent problem from 21 to 18. Note that this bound is only valid for the subtree induced by SP2.

Again, we pick a variable to branch on and to derive two new subproblems. We have now three open subproblems (SP1, SP3, and SP4).

‍

‍Iteration 3:

We pick SP3 from the pool of open subproblems, apply heuristics and solve a relaxation. In this case the heuristics provide an objective value of $\underline{z}=12$, which indeed improves the best-known (also called incumbent) value. We hence update $z^*=12$.

Of course, we pick a variable, branch on it and add the resulting subproblems to the pool of open problems (SP1, SP4, SP5, SP6).

Iteration 4:

We pick SP4, apply heuristics and solve a relaxation. They heuristics do not improve the incumbent value $z^*=12$, but the relaxation provides a bound of $\bar{z}=11$ valid for the subtree induced by SP4. This allows to prune the subtree. The rationale is as follows: Any feasible solution that fulfills the variable fixings present in the branch to SP4 cannot have a better objective value than the bound at SP4, which is $\bar{z}=11$. However, we have already found a solution with objective value $z^*=12$. Thus, the subtree with root SP4 cannot contain an optimal solution and we can prune it.

‍

When do we stop?

Simply put, we continue this process until there are no open subproblems left. In the worst case, this happens when we fully enumerated the tree. However, for most optimization problems we only need to consider a fraction of the full tree due to pruning. Keeping the search tree small is an essential part of branch-and-bound and a rich theory and literature exist on how to achieve this (useful buzzwords: branching rules, node selection, cutting planes, symmetry detection, conflict analysis).

Whenever the pool of open subproblems is empty, we have found the proven global optimal solution (simply by design of the algorithm). Proving global optimality is one core strength of branch-and-bound and as discussed in the beginning, sets it apart from heuristic methods (like quantum annealing or QAOA in the context of QUBOs).

In practice, there might also be other criteria to terminate the algorithm early. For example, one can terminate after a certain amount of runtime, because sometimes the best solution after a reasonable time is more valuable than the proven optimal solution after a much longer runtime. Having this flexibility is another major advantage of branch-and-bound.

‍

How good is good enough?

This is the central question of this post. The most important feature of branch-and-bound is that at any point in time, a quality assessment of the best-known solution is available via the global bound. The latter is the weakest (i.e., smallest) bound across all open subproblems and the rationale behind this is obvious: There cannot be a solution with a larger objective value (in our maximization example) than the largest bound available at any of the subproblems still open. We label this globally valid bound as $\hat{z}$ and introduce the following relative solution quality measure, which is also called the relative gap: $$\frac{\hat{z}-z^*}{|z^*|}\geq 0.$$ It tells us by how much the objective value could potentially still be improved.

The relative gap can also be used to terminate the search process early. For example, for some applications it might be enough to find a solution which is at most 5% away from the true global optimal solution, i.e., to stop once the relative gap drops below 0.05. This ability is fundamentally different to heuristic approaches. While heuristics very likely provide solutions of similar quality, we will never know how good the solution really is. Hence, branch-and-bound adds much more credibility and flexibility to solving optimization problems.

‍

How about quantum computing?

In this post we learned about mathematical optimization and the important distinction between a solution and the solution. Many optimization tasks require to find the global optimal solution, and even if a close-to-optimal solution is good enough, one needs to be able to judge how good a given solution actually is.

Obviously, this also applies to quantum optimization, because the need for provably optimal solutions or bounds and gaps is tied to the application and not to the hardware. In quantum optimization, we face two obstacles with regard to proven global optimality. First, almost all known quantum optimization approaches (like QAOA or quantum annealing) are of inherent heuristic nature. Second, current noisy intermediate-scale quantum (NISQ) machines are prone to errors resulting from, e.g., infidelity of gates, crosstalk, state preparation, or measurement errors. These errors are amplified with larger quantum circuits, low connectivity between qubits, etc. While strategies exist to minimize errors, they will not be resolved entirely until the advent of fully fault-tolerant quantum computers. Hence, tackling optimization problems with quantum approaches always only gives you a solution without quality assessment and never the solution.

In this post, we also scratched the surface of branch-and-bound, a classic algorithm that returns the provably optimal solution when run long enough and provides a quality assessment, the relative gap, of the current best solution at any point in time. In our next blog post, we dive into the details on how our HybridSolver exploits the branch-and-bound paradigm to provide provably optimal solutions of optimization problems in a hybrid classic-quantum way.

‍