The world’s leading publication for data science, AI, and ML professionals.

Linear Programming Optimization: The Simplex Method

Part 3: The algorithm under the hood

Photo by Daniel Cassey Pahati on Pexels.com
Photo by Daniel Cassey Pahati on Pexels.com

Up until now, this series has covered the basics of linear programming. In this article, we are going to move from basic concepts into the details under the hood! This article will cover the simplex method, which is the algorithm that is often used to solve linear programming problems. While we will solve a simple linear programming example by hand with the simplex method, our focus will be on the intuition of the algorithm rather than memorizing the algorithmic steps (we have computers for that kind of stuff!).

Here is what we will cover:

  1. Why the simplex method is needed
  2. Moving from graphical solutions to algebraic
  3. Demonstrating how the simplex method works with a simple example

Here is the link to the list that contains all of the articles I’ve written so far for this series:

Linear Programming

Why the simplex method is needed

In the first article of this series, we went over how the attributes of linear programming allow it to only consider the corner points of constraints as potential optimal solutions. This is a very powerful feature that narrows an infinite solution space to a finite solution space. In the examples we reviewed, we only had a few constraints and a few variables – we even solved some of them by hand! After looking at some simple examples, one might assume that an exhaustive, brute force (looking at every single corner point) approach to searching for the optimal solution is sufficient.

While, for smaller LP problems, the brute force method is fine, as the problems get more complicated, the number of corner points can explode. The number of total corner points (including feasible and infeasible) is a function of (1) the number of variables and (2) the number of constraints in the problem.

Formula for the total number of corners - image by author
Formula for the total number of corners – image by author

Below are some examples of the growth of the number of corner points for multiple variables and constraint combinations:

Quick explosion in number of corner points - image by author
Quick explosion in number of corner points – image by author

The rapid growth of the corner points necessitates a smarter way of finding the optimal solution – enter the simplex method! The simplex method is a clever way to look at only a small subset of all of the possible solutions while still guaranteeing a global optimal result – we’ll get into the details a little further down.


Moving from graphical solutions to algebraic

When learning the basics of linear programming, it is really nice to look at a graphical representation of an Optimization problem with two decision variables. We covered this in part 1. While this works wonders for developing an understanding of the basics, it doesn’t extrapolate well to more complicated problems. Its usefulness is essentially limited to instructional purposes.

To further our understanding of LPs and get to a point where we are ready to start discussing the details of the simplex method, we need to understand how to transfer an LP problem into an algebraic format.

One of the main components of getting an algebraic set up is transforming the constraint inequalities to equalities. Outside of the context of LPs, there isn’t a good way to do this mathematically. But remember how for LPs, we are really only interested in corner points? And all corner points are on the lines? Since we actually only care about the borders of the solution space, we basically ignore the inequality! Again, this is because we wouldn’t worry about the area not on the border lines of the space anyways.

Nice, let’s just switch all of the inequality signs to equality signs and we are good to go, right? Not so fast! When we consider corner points, we aren’t usually on all of the lines at the same time. Because of this, we need to add a way to ‘hop’ on and off of specific constraint lines. This is a little hard to explain in words – ironically, I’m going to explain this concept… graphically.

Switching from inequalities to equalities - image by author
Switching from inequalities to equalities – image by author

Okay, so now see why we can remove the inequality for equality – but we have a big problem. Let’s explore this problem by looking into the math of a specific corner point:

image by author
image by author

Let’s plug these X₁ and X₂ values for all of the linear constraints:

image by author
image by author

We see here that the more strict equality causes these values of X₁ and X₂ to not be viable options for the third constraint. We need a way to make it possible for us to be at the corner while having the variable values work in all of the constraint formulas.

The solution to the problem above is to create slack variables. Slack variables are added to each equality and give us the flexibility to only consider the constraints involved in one corner at a time.

We can ‘hop’ to a corner point by setting the slack variables to 0 for the constraints involved. For the additional constraints that are not involved, we can set to slack variable to whatever the difference (or the slack) is between the decision variable side of the equation and the constant side of the equation.

Adding slack variables gives us the flexibility to 'hop' to corner points - image by author
Adding slack variables gives us the flexibility to ‘hop’ to corner points – image by author

If the slack variable is set to 0, then the constraint is one of the constraints in the corner point we are considering. If it is anything other than zero, we know that the specific constraint is not involved in the corner point under consideration.

Okay, with our inequalities set to equalities and our slack variables created, we are ready to solve linear programing problems algebraically. Let’s now get into the details of how the Simplex method uses this set up to efficiently optimize!

Note: When we were discussing the number of possible corner points in the previous section, n includes slack variables, not just the original decision variables.


Demonstrating how the simplex method works with a simple example

First and foremost, the simplex method is an algorithm that efficiently moves from corner point to corner point, calculating the objective values of the corner points until it finds the globally optimal solution. The simplex method is a greedy algorithm that moves to the corner point that increases (for maximization) or decreases (for minimization) the objective function the most. Greedy algorithms are typically infamous for finding sub-optimal solutions – but, because of the characteristics of linear programming, the simplex method is guaranteed to find the optimal solution.

I’d like to take some time to build intuition on why a greedy algorithm, like the simplex method, can find a global optimal for linear programming. The defining attribute of a greedy algorithm is that it makes optimal decisions given just the immediate information. Often, a series of local optimal decisions do not lead to a globally optimal solution – but, as I mentioned, this is not the case for linear programming. The reason it holds is because of the linearity requirement for the constraints. Let’s look at an example of making optimal decisions with linear functions.

Imagine we are at the intersection of two functions – we want to select the function that will give us the highest y-value for the next x-value.

Demonstration that for linear functions, local optimal decisions are also global - image by author
Demonstration that for linear functions, local optimal decisions are also global – image by author

The simplex method uses a very similar approach to move to different corner points in a greedy fashion. One of the keys to the simplex method is that it terminates when no more better adjacent corners are found. This corner is both the local and global optimum because of the convexity of the solution space.

Intuitive Example

With a general understanding of the set up of the simplex method and a basic understanding of what it does (greedily iterate through corner points to find the globally optimal solution) – I think we are ready to start working through an example.

So far, we’ve discussed three constraints, but we have not established an objective function. To do that, we’ll come up with some coefficients to give X₁ and X₂ and we will add all of the slack variables with coefficients of 0 (this makes the optimization tableau complete).

So here is the problem we will be solving:

image by author
image by author

Let’s get this set of linear equations simplex-ready. We’ll start with the constraints, which we’ve already discussed above. This is what they will look like:

image by author
image by author

We have not discussed how to modify the objective function to fit into the simplex approach. It is very simple, we set the objective value equal to Z and the transform the function algebraically to where 0 is on the right hand side. This looks like this:

image by author
image by author

Note that we added 0’s for the slack variables, they add no value to the objective, but this formulaic set up will be helpful as we start to build out the simplex tableau.

The simplex tableau is a data table that is used in the simplex method to traverse corners and calculate their objective function values. It also has implements to stop exploring once the optimal value has been found.

The tableau has one row for the objective function and one row for each constraint. At any point in the algorithm’s execution, there are basic and non-basic variables. Variables switch between these categories as the algorithm progresses. Basic variables are variables that are not set to 0, you can think of them as variables that are active. Non-basic variables are variables that have a value of 0, you can think of these as deactivated. The simplex method always starts at the origin. At the original, all decision variables are set to 0, which means that only the slack variables are basic, i.e., non-zero or "active." The starting tableau will show the slack variables as non-basic at the beginning.

Below is our example’s starting tableau:

image by author
image by author

Okay, now that we have our initial set up, we are ready to start our first iteration of the simplex method. We need to start moving away from the origin – we do this by moving one of the slack variables to non-basic and pulling one of the decision variables into the basic solution set. But which slack variable do we take out, and which of the X’s do we bring in?

Let’s first figure out which variable we want to make basic. Remember how we said that the simplex algorithm is greedy? This is where this comes into play! We look at the coefficients for the z-row and select the variable that has the most negative coefficient. This is a maximization problem, but remember that we switched the signs of the coefficients to get the 0 on the right hand side, so negative = good, positive = bad. The most negative coefficient is tied to the variable that will increase the objective function the most right now. Our options are -5 for X₁ and -3 for X₂, so we select X₁. This is called the entering variable – thankfully that is a pretty intuitive name!

image by author
image by author

Now, let’s talk about finding the slack variable to make non-basic or de-active/remove. In simplex terms, the variable that is being removed is (also intuitively) called the leaving variable. We want to find the maximum value that we can set the entering variable to, without violating any constraints. This means we need to find the slack variable that is associated with the tightest/most restrictive constraint on X₁. We can do this by taking the ratio of the ‘solution’ column and the constraint’s’ X₁ coefficients. X₁’s constraint coefficients are -2, 1 and 1 – the corresponding ‘solution’ column values are 5, 7 and 10.

S_2 has the lowest, non-negative ratio - therefore it will be selected as the leaving variable - image by author
S_2 has the lowest, non-negative ratio – therefore it will be selected as the leaving variable – image by author

Here’s the intuition behind this selection. We are trying to find the maximum value that X₁ can take while being consistent with the system of linear equations it is a part of. For the first constraint, really anything goes because the X₁ coefficient is negative – the other variables can compensate at any level to make the overall equation equal to 5. Obviously ‘anything goes’ is not a very restrictive constraint 😄 !

image by author
image by author

For the second constraint, X₁’s maximum value will be realized when S₁ and X₂ are set to 0, this maximum value is 7. We see that there are values for other variables that can make the system of linear equations work.

values exist such that the system of linear equations doesn't break - image by author
values exist such that the system of linear equations doesn’t break – image by author

Now let’s take a look at our last option, which is setting X₁ to 10. This causes big problems; remember that all variables have to be ≥ 0. As you can see below, there is no way to make the second constraint work given a value of 10 for X₁. Therefore, the highest we can set X₁ to, without violating a function in a system of linear equations is 7.

There is no way to make the system work with x1 = 10 - image by author
There is no way to make the system work with x1 = 10 – image by author

Wow, I feel like we’ve done a lot of work already! We’ve identified which variable we will be introducing into and which variable will be leaving the simplex tableau. Now we have to update the tableau to reflect these changes.

Before we get into the details of updating the tableau for the entering and leaving variables, let’s get some key terms down quickly. The row that belongs to the leaving variable is called the pivot row. The column that belongs to the entering variable is called the pivot column. The intersection of the pivot row and the pivot column is called the pivot element.

Simplex entering/leaving variable terms - image by author
Simplex entering/leaving variable terms – image by author

Okay, let’s update this tableau for the changes we need to make! The first step is to update the row for S₂ (it gets special treatment because it is the row we are swapping out for X₁), then we need to update all of the other rows.

We update the pivot row by dividing the entire row by the pivot element. This sets the row up so that the ‘solution’ column’s value is equal to the X₁ value. Unfortunately, for this example, since the pivot element is 1, it will appear that nothing has changed (I guess I should’ve picked a better example…). But we will update the ‘Basic’ column to show that X₁ is now in the basic solution and S₂ is not.

image by author
image by author

All of the other rows get modified with this function:

New Row = (Current Row) – (Its pivot column coefficient) X (New Pivot Column)

This function updates the solution columns so that the values of the basic variables make a coherent set of linear equations. Let’s go through that now. We’ll start with the row for the slack variable S₁.

modifying S1 because X1 was introduced to the basic solution - image by author
modifying S1 because X1 was introduced to the basic solution – image by author

The math for S₁’s row now ‘adds up.’ Remember that X₁ is now equal to 7. The constraint that corresponds to the first row is -2X₁ + X₂ + S₁ = 5. After our first move, this equals -14 + 0 + S₁ = 5. Given this formula, S₁ must equal 19, which is what we now have for the ‘solution’ column. We will follow this same step for the other two rows. The resulting tableau is the output of the first iteration of the simplex algorithm.

updated simplex tableau after first iteration - image by author
updated simplex tableau after first iteration – image by author

Now that we’ve completed the first iteration, we start a new iteration and follow the exact same steps as we did for the first. Remember that the first step is to find the most negative number on the z-row. Interesting… All numbers are positive. When all numbers are 0 or positive in the z-row, you are currently at the optimal point – this is the stopping criteria for the simplex method. For this example, our optimal solution is X₁ = 7 and X₂ = 0 with an objective function value of 35. The slack variables only serve an algebraic purpose and are discarded after the optimal solution is found.

For illustrative purposes, let’s image that there were a negative value in the z-row. Something like this:

hypothetical scenario where we have a negative value in z-row - image by author
hypothetical scenario where we have a negative value in z-row – image by author

If this were the case, we would select X₂ to be the entering variable and then proceed to (1) select the leaving variable, (2) recalculate the leaving variable’s row and switch it for X₂, (3) recalculate all other rows and (4) lastly look for negative values in the z-row to see if we have found the optimal solution, or if we need to move to the next iteration.

Conclusion

We’ve gone deep into the details of the simplex method in this article. I think it is important to understand the details, but it is also important to not miss the forest for the trees. Especially since you will never have to manually solve a linear programming problem in practice.

Here are a few key takeaways from this article:

  • The simplex method is an algorithm to solve linear programming problems using algebra
  • The objective and constraint formulas need to be re-written as equalities with slack variables before the simplex method can be used
  • The simplex method optimizes by doing a greedy search for the variable that adds the most to the objective function – then it sets that variable as high as possible
  • The simplex method stops after it can’t find any variables that will improve the objective function – this allows the simplex method to exit the problem early with the optimal solution – without having to explore all of the corner points

Related Articles