Mathematics of Love: Optimizing a Dining-Room Seating Arrangement for Weddings with Python

Solving the Restricted Quadratic Multi-Knapsack Problem (RQMKP) with mathematical programming and Python

Please follow me to The woof-top reception… (Image generated by DALLE-3)

Dr. Hannah Fry's book, The Mathematics of Love [1], is one of those rare finds: smart, funny, and incredibly easy to read. I enjoyed it so much that I ended up giving copies to some of my closest friends. They loved it too, but a few of them hit a snag with Chapter 8, which, ironically, was the chapter they were most excited about.

This chapter goes into the tricky business of organizing wedding seating arrangements, using mathematical programming and optimization to figure out how to seat guests so everyone has the best time possible. Sounds cool, right? But here's the catch: the book doesn't actually walk you through how to set up or solve the problem, which left my friends feeling a bit lost.

Now, I know what you're thinking: wedding seating? Really? But don't be fooled, it's a really difficult problem to solve, and the solution has way more applications than just weddings (although this one is already important). Think organizing tables on cruise ships [3], forming sports teams or workgroups, optimizing portfolios, and, in my world as a business school professor, putting together effective teams for group projects (group conflicts and free riding have dramatically decreased since I started organizing groups with the methodology described in this article). So yeah, it's a problem worth knowing how to solve, at least in my humble opinion.

In this article, I'm going to pick up where Dr. Fry left off and show you how to actually tackle this problem using Python and Gurobi. We'll start by breaking down the problem, then jump into the details of its Mixed Integer Programming (MIP) formulation. After that, we'll bring in a sample dataset and solve it step by step. And to wrap things up, I'll talk about some cool extensions and advanced methods you can use to take this even further.

Now, if you have never used Gurobi before, especially from Google Colab (the environment we will use in this article), I invite you to read one of my previous articles where I explain how to set up your Gurobi web license in your Colab environment. You can find the link to the article below:

Optimizing Project Schedules: Using Gurobipy and Cheche_pm warm-starts for Efficient RCPSP…

If you also need a tutorial or more information about linear programming (LP) or integer programming (IP) techniques, I strongly recommend checking these excellent articles written by Bruno Scalia C. F. Leite; I am leaving the links down below. Another great source is H. P. Williams' book on mathematical programming [2].

Linear programming: Theory and applications

A Comprehensive Guide to Modeling Techniques in Mixed-Integer Linear Programming


The Problem

As Dr. Hannah Fry puts it, aside from choosing a terrible best man to give the speech, one of the biggest mistakes a couple can make at their wedding is seating enemies at the same table (in my home country, not having enough tequeños would also be considered a mortal sin). Make that mistake, and you're in for a night of icy glares, passive-aggressive comments, and, if alcohol is involved, maybe even a brawl. Getting the seating arrangement right can make a world of difference in the overall vibe of the big day.

Every optimization problem has an objective function, and in this case, our goal is to maximize the total happiness of the guests. Guest happiness depends on how they interact with each other. Let’s take an example: Pongo and Cruella. Cruella loves Pongo’s beautiful hair, so the interaction term from Cruella’s side is +1. But it’s not always mutual – Pongo, worried about Cruella’s obsession with his fur, feels uneasy, so his interaction term is -1. If they’re seated together, their happiness cancels out to zero. Now, imagine doing this for 100 guests instead of just two. How much harder does it get? If you try to brute force the Sh-t out of this, you’ll find that there are 100 factorial (100!) ways to arrange the seating – that’s an astronomical number. This is precisely why mathematical optimization is our best bet.
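If you want a feel for just how astronomical, here is a quick sanity check in Python (an illustrative one-liner, nothing more):

import math

# Number of ways to order 100 guests: a 158-digit number
print(math.factorial(100))  # ~9.33 x 10^157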

To make matters more complex, we have constraints to consider. It’s not just about maximizing happiness; we also need to respect the table capacity limits and keep couples together (they tend to be angry if they are separated). So, our problem is to find an assignment of guests to tables that maximizes total happiness while respecting these constraints.

Now, here’s a philosophical twist: Should we aim to maximize the total happiness of the entire venue, or should we try to maximize the happiness of the least happy table? Which approach is fairer to our guests? What if maximizing total happiness makes one table absolutely miserable? Don’t worry, this article will tackle both approaches, but I am leaving the question open as food for thought.

These kinds of problems are known in the literature as quadratic assignment problems. However, the specific problem we’re solving here can be addressed as a Restricted Quadratic Multiknapsack Problem (RQMKP).

You’ve probably heard of the Binary Knapsack problem – there are tons of articles on Medium about how to solve it, I’ll leave a link to a cool Knapsack article written by Maria Mouschoutzi, PhD below.

How Many Pokémon Fit?

But the RQMKP? That’s what the Knapsack problem evolves into after eating its vegetables, getting eight hours of sleep, and hitting the gym five times a week. This isn’t just a Knapsack; it’s "the" Knapsack problem – the Goku Ultra Instinct version of it. It’s been proven to be strongly NP-Hard [6], which makes it a real beast to solve.

Unlike the regular Knapsack problem, the RQMKP involves multiple ("M") knapsacks that need to be filled. The objective function includes both quadratic ("Q") and linear terms, which, in our case, represent happiness – a function that, I should remind you, we need to maximize. The "R" stands for restricted, meaning we must place each item into one of the available knapsacks without exceeding their individual capacities, but every item has to be placed somewhere.

In our wedding dining context, when we focus on maximizing the total happiness of the event, we’re dealing with the Max-Sum variant of the RQMKP. When we aim to maximize the happiness of the least happy table, we’re looking at the Max-Min variation (we are maximizing the minimum of the tables).

Now that we have a broad understanding of the problem, let’s discuss the formal mathematical programming formulations.

Mathematical formulation of the RQMKP

We’ll start out with the simpler version of the problem, the Max-Sum RQMKP. Once we’ve tackled that, we’ll move on to the more challenging Max-Min variation.

The Max-Sum RQMKP

A mathematical formulation of the Max-Sum RQMKP problem is presented in Figure 1 below.

Figure 1. Mathematical formulation of the Max-Sum RQMKP (Image Created by the author)

At first glance, the formulation above might seem daunting, but it’s actually quite straightforward. It uses binary variables x_{i,k}, which equal 1 if guest i is seated at table k, and 0 otherwise. The objective function is simply the sum of the interaction terms c_{i,j} for all pairs of individuals {i,j} who are seated together at the same table.

  1. Constraint (1) ensures that for each individual i, only one binary variable x_{i,k} equals 1, meaning each person can be seated at only one table.
  2. Constraint (2) makes sure that the number of guests at table k does not exceed its maximum capacity B_k. Note that different tables might have different capacities, which is why there’s a specific parameter B_k for each table.
  3. Constraint (3) ensures that all couples {i,j} in the set of couples E are seated together at the same table k.
  4. Finally, Constraint (4) guarantees that the variables x_{i,k} remain binary.
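For readers who cannot see Figure 1, here is a LaTeX transcription of the same formulation, reconstructed from the description above (the figure's exact layout may differ slightly):

\begin{align}
\max \quad & \sum_{k=1}^{K} \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij}\, x_{ik}\, x_{jk} & & \\
\text{s.t.} \quad & \sum_{k=1}^{K} x_{ik} = 1 & \forall i & \quad (1) \\
& \sum_{i=1}^{n} x_{ik} \leq B_k & \forall k & \quad (2) \\
& x_{ik} = x_{jk} & \forall (i,j) \in E,\; \forall k & \quad (3) \\
& x_{ik} \in \{0,1\} & \forall i, k & \quad (4)
\end{align}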

The Max-Min RQMKP

A mathematical formulation of the Max-Min RQMKP problem is presented in Figure 2 below. This formulation is slightly more difficult, but nonetheless easy to grasp.

Figure 2. Mathematical formulation of the Max-Min RQMKP (Image Created by the author)

This formulation is very similar to the Max-Sum-RQMKP, but here, the objective function is a new variable ζ that we are trying to maximize. Constraint (1) forces ζ to be less than or equal to the total happiness of each individual table. This means that, in the best-case scenario, ζ represents the happiness of the least happy table.

The rest of the constraints (2–5) are exactly the same as in the Max-Sum case, so there’s no need to repeat them.
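As before, here is a LaTeX transcription consistent with the description above, for readers without the figure:

\begin{align}
\max \quad & \zeta & & \\
\text{s.t.} \quad & \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij}\, x_{ik}\, x_{jk} \geq \zeta & \forall k & \quad (1) \\
& \sum_{k=1}^{K} x_{ik} = 1 & \forall i & \quad (2) \\
& \sum_{i=1}^{n} x_{ik} \leq B_k & \forall k & \quad (3) \\
& x_{ik} = x_{jk} & \forall (i,j) \in E,\; \forall k & \quad (4) \\
& x_{ik} \in \{0,1\} & \forall i, k & \quad (5)
\end{align}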

With the formulations in place, let’s proceed to set up the environment to solve these problems.

Installing the libraries and setting up the Colab environment

To use Gurobi in Google Colab, we first need to install it using the following code:

!pip install gurobipy

Once installed we can proceed to import the required libraries for this project.

import pandas as pd
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
import gurobipy as gp
from gurobipy import GRB
from google.colab import userdata

We're going to use the standard "mirepoix" (🥬🥕🧅) of almost any Python project: numpy, pandas, and matplotlib. Additionally, we'll import networkx to visualize the relationships between our guests, and gurobipy to formulate and solve the problem. With all the libraries in place, the only thing left to do is initialize the Gurobi environment. You can easily do this with the code below. Please note that you'll need your Gurobi API credentials to proceed. I'm storing mine in the "Secrets" tab of Google Colab, but if you prefer, you can directly copy the API access values into the variables.

WLSACCESSID = userdata.get('gurobi_WLSACCESSID')
WLSSECRET = userdata.get('gurobi_WLSSECRET')
LICENSEID = int(userdata.get('gurobi_LICENSEID'))

params = {
"WLSACCESSID": WLSACCESSID,
"WLSSECRET": WLSSECRET,
"LICENSEID": LICENSEID,
}
env = gp.Env(params=params) #this line initializes the gurobi environment

Now that we’re familiar with the problem formulations and have the environment set up, we can move forward with solving the problem. But first, let’s take a look at the example data we’ll be working with.

Problem data

For this article, I'm going to use a small example with 21 guests for our wedding event. Each guest has previously expressed their preference for dining with each of the other guests: a +1 if they want to share a table, a 0 if they are neutral, or a -1 if they want to avoid that person. The data for this example is shown in Table 1 below.

For this example, we’re going to consider only three couples: Pikachu and Meditite, Machoke and Tyrantrum, and Aipom and Pelipper. Remember, we need to seat these couples together.

To keep things simple, we’ll have just three tables, each with a maximum capacity of 7 guests. The data for this problem is available in my GitHub repo. Let’s go ahead and import it using the code snippet below.

url = 'https://raw.githubusercontent.com/ceche1212/math_love_RQMKP_medium/main/input_data.csv'
df = pd.read_csv(url,index_col=0)
idx_to_col = {i:v for i,v in enumerate(df.columns)}
col_to_idx = {v:i for i,v in enumerate(df.columns)}

data = df.to_numpy() # transform the data into a numpy array

The data is correctly imported, so let’s go ahead and generate a cool network graph to visualize the relationships between our guests.

G = nx.DiGraph()

nodes = df.columns.to_list()
G.add_nodes_from(nodes)

for i,n1 in enumerate(nodes):
  for j,n2 in enumerate(nodes):
    if i == j:
      continue
    else:
      c_i_j = data[i,j]

      if  c_i_j > 0:
        G.add_edge(n1,n2,weight=c_i_j,color = 'b') # positive interactions drawn in blue
      else:
        G.add_edge(n1,n2,weight=c_i_j,color = 'r') # neutral and negative interactions drawn in red

def nudge(pos, x_shift, y_shift):
    return {n:(x + x_shift, y + y_shift) for n,(x,y) in pos.items()}

plt.figure(figsize=(16,16))
pos = nx.circular_layout(G)
pos_nodes = nudge(pos, 1, 1)
edges = G.edges()
colors = [G[u][v]['color'] for u,v in edges]
nx.draw(G, pos_nodes, edge_color=colors,node_color='grey', alpha=0.4, node_size=3600)
nx.draw_networkx_labels(G, pos=pos_nodes)
plt.title('Wedding Guests')
plt.show()

If everything goes according to plan, you should be looking at something similar to Figure 3 just below.

Figure 3. Network graph of the interactions c_{i,j} of the wedding guests (Image created by the author)

The graph above looks pretty (and complicated). Each red line corresponds to a negative interaction – for instance, very few people want to eat with Pikachu. I mean, he can be such a diva, and his conversations are just one note, so it's understandable. In contrast, blue lines represent positive interactions. For instance, despite his past work experience with Team Rocket and his intransigence regarding pollution and emissions, Koffing seems to be quite popular with the other guests.

To finalize the input data setup, we also need the variables associated with table capacities and couples. The table capacities B_k are stored as a list with the capacity of each table. Couples are stored as tuples inside a list E.

n_tables = 3
B_k = [7,7,7]
E = [(col_to_idx['Pikachu'],col_to_idx['Meditite']),
           (col_to_idx['Machoke'],col_to_idx['Tyrantrum']),
           (col_to_idx['Aipom'],col_to_idx['Pelipper'])]

I,K = range(len(col_to_idx)),range(n_tables)

Solving the RQMKP

Let's start with the Max-Sum-RQMKP problem. Typically, to solve such models in commercial mathematical programming solvers, we would first need to linearize the quadratic terms x_{i,k}x_{j,k}. Normally this requires creating new binary variables w_{i,j,k} = x_{i,k}x_{j,k} to represent the quadratic terms of our formulation; the linearization also requires adding three new sets of constraints, one set for each i, j, and k, which would make the problem substantially harder for the solver. But luckily for us, Gurobi has a quadratic solver in place (that is very likely doing some linearization behind the scenes), so we can go ahead and directly create the model, as shown in Figure 1.
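For reference, the standard linearization of a product of two binary variables replaces each w_{i,j,k} = x_{i,k} x_{j,k} with three linear constraints per triple (i, j, k):

\begin{align}
w_{ijk} &\leq x_{ik} \\
w_{ijk} &\leq x_{jk} \\
w_{ijk} &\geq x_{ik} + x_{jk} - 1
\end{align}

Together with w_{ijk} ∈ {0,1}, these force w_{ijk} to equal the product of the two binaries.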

x_vars = [(i,k) for i in I for k in K] # create indices for variables

model = gp.Model('Max_Sum_RQMKP',env=env) #initialize model

X = model.addVars(x_vars,vtype=GRB.BINARY, name="X") # create variables

#objective function
model.setObjective( gp.quicksum( data[(i,j)]*X[i,k]*X[j,k] for i in I 
                                for j in I 
                                for k in K), GRB.MAXIMIZE ) 

# Constraint (1)
model.addConstrs( gp.quicksum( X[i,k] for k in K ) == 1 for i in I)

# Constraint (2)
model.addConstrs( gp.quicksum( X[i,k] for i in I ) <= B_k[k]  for k in K)

# Constraint (3)
for i,j in E:
  for k in K:
    model.addConstr(X[i,k] == X[j,k])

To solve the model, we just need to set up the MIP gap (a stopping criterion based on the relative gap between the best solution found and the best proven bound) and the time limit we'll allow the solver to run. For this example, I want to run the solver to optimality if possible (so the MIP gap is equal to 0.00), and we'll allow Gurobi to run for a maximum of 600 seconds.

tl = 600
mip_gap = 0.00

model.setParam('TimeLimit', tl)
model.setParam('MIPGap', mip_gap)
model.optimize()

After less than 1 second, the solver should stop and provide a final solution with a total happiness of 59. Now, let’s proceed to extract a detailed solution.

max_sum_tables = [ []  for k in K]
for x in X.keys():
  if X[x].x > 0.95:
    guest_idx = x[0]
    table = x[1]
    guest_name = idx_to_col[guest_idx]
    max_sum_tables[table].append(guest_idx)

The table organization (which guest is seated where) is now stored in the list max_sum_tables. Now, what is the happiness of each individual table? Let's find out.

for i,table in enumerate(max_sum_tables):

  Z = 0
  table_comp = [idx_to_col[x] for x in table]
  for n1 in table:
    for n2 in table:
      Z += data[n1,n2]
  print(f" Table {i+1} ({table_comp}), Hapiness  = {Z}")

The output generated by the code above is the following one:


Table 1 Happiness  = 3
Table 2 Happiness  = 23
Table 3 Happiness  = 33

If you’re seated at Table 3, you’re in for a great time, but if you’re at Table 1, you’d better hope the food is good because the conversations might be pretty dull. This configuration maximizes the overall happiness of the event, but it does so at the expense of Table 1’s enjoyment, creating a significant imbalance.

Is it possible to find a more equitable arrangement that still maximizes happiness? The answer is yes, but it will likely come at the cost of the total overall happiness. We can achieve this by solving the Max-Min variation of the RQMKP.

x_vars = [(i,k) for i in I for k in K]

model = gp.Model('Max_Min_RQMKP',env=env) #initialize model

X = model.addVars(x_vars,vtype=GRB.BINARY, name="X") # create variables
Zeta = model.addVar(vtype=GRB.INTEGER, lb=-GRB.INFINITY, name="zeta") # objective variable; relax the default lower bound of 0, since a table's happiness could be negative

# Objective function
model.setObjective( Zeta, GRB.MAXIMIZE )

# Constraint (1)
model.addConstrs(gp.quicksum(X[i,k]*X[j,k]*data[(i,j)] for i in I for j in I) >= Zeta for k in K)

# Constraint (2)
model.addConstrs( gp.quicksum( X[i,k] for k in K ) == 1 for i in I)

# Constraint (3)
model.addConstrs( gp.quicksum( X[i,k] for i in I ) <= B_k[k]  for k in K)

# Constraint (4)
for i,j in E:
  for k in K:
    model.addConstr(X[i,k] == X[j,k])

Again, we set the MIP gap and time limit as in the previous model, and then we proceed to solve.

tl = 600
mip_gap = 0.00

model.setParam('TimeLimit', tl)
model.setParam('MIPGap', mip_gap)
model.optimize()

In less than 3 seconds (this variation is a little bit harder to solve), the solver will stop and give you a final solution of 14. But remember, this isn’t the overall happiness of the entire event – it’s just the happiness level of the saddest 😢 table. Now, let’s check out the total happiness of the event and see how each table is feeling individually.

max_min_tables = [ []  for k in K]
for x in X.keys():
  if X[x].x > 0.95:
    guest_idx = x[0]
    table = x[1]
    guest_name = idx_to_col[guest_idx]
    max_min_tables[table].append(guest_idx)

Total_Z = 0
for i,table in enumerate(max_min_tables):

  Z = 0
  table_comp = [idx_to_col[x] for x in table]
  for n1 in table:
    for n2 in table:
      Z += data[n1,n2]
  Total_Z += Z
  print(f" Table {i+1} ({table_comp}), Happiness  = {Z}")
print(f"Total Happiness = {Total_Z}")

The output generated by the code above is the following one:

Table 1 Happiness  = 14
Table 2 Happiness  = 18
Table 3 Happiness  = 14
Total Happiness = 46

The total happiness of this solution is 46, which is a 22% drop compared to the 59 happiness value from the Max-Sum solution. However, the Max-Min solution is significantly more equitable. Just look at Table 1 – their happiness jumped from a dismal 3 to a solid 14. In my opinion, this is a better outcome for everyone involved (assuming you actually like all your guests and want them to have an equally good time. If you’re not too fond of the folks at Table 1, feel free to let them stew in their misery with the Max-Sum configuration – but that begs the question, why invite them at all?).

Visualizing the RQMKP solution

Now that we have our solution, let’s create a table plan to visualize the dining arrangement before wrapping up this article. We’ll use the plotting capabilities of networkx once again. For this example, we’ll plot the solution from the Max-Min variation of the RQMKP, but the process is exactly the same for the Max-Sum problem.

# Create and populate the Table-Guest network graph
Tables_G = nx.Graph()

for i,table in enumerate(max_min_tables):
  table_comp = [idx_to_col[x] for x in table]
  table_name = f'Table {i+1}'
  Tables_G.add_node(table_name,type='table')

  for guest in table_comp:
    Tables_G.add_node(guest,type = 'guest')
    e = (table_name,guest)
    Tables_G.add_edge(*e)

# Find table nodes and guest nodes
table_nodes = [node for node, data in Tables_G.nodes(data=True) if data['type'] == 'table']
guest_nodes = [node for node, data in Tables_G.nodes(data=True) if data['type'] == 'guest']

# Adjust layout for the position of the tables and guests
pos = nx.circular_layout(Tables_G)

# For each table node, create a circular layout for its guests
for table in table_nodes:
    # Get the neighbors (guests) of the table
    guests = list(Tables_G.neighbors(table))

    # Calculate circular layout for guests around the table node
    num_guests = len(guests)
    angle_step = 2 * np.pi / num_guests
    radius = 0.5  # Adjust the radius to space guests further from the table

    for i, guest in enumerate(guests):
        angle = i * angle_step
        pos[guest] = pos[table] + np.array([radius * np.cos(angle), radius * np.sin(angle)])

# Draw the graph with differentiated node sizes
plt.figure(figsize=(10, 10))

# Draw the table nodes
nx.draw_networkx_nodes(Tables_G, pos, nodelist=table_nodes, node_color='lightgrey', node_size=3600, label='Table',)

# Draw the guest nodes
nx.draw_networkx_nodes(Tables_G, pos, nodelist=guest_nodes, node_color='lightblue', node_size=1000, label='Guest')

# Draw the edges and labels
nx.draw_networkx_edges(Tables_G, pos)
nx.draw_networkx_labels(Tables_G, pos, font_size=10)
plt.title('Table Configuration Max-Min')
plt.show()

The resulting plot is displayed in Figure 4 below.

Figure 4. Dining configuration resulting from the Max-Min solution of the RQMKP (Image created by the author)

Conclusions

In conclusion, this article has guided you through solving a dining-room seating arrangement problem by tackling it as a Restricted Quadratic Multi-Knapsack Problem (RQMKP). We explored two different formulations: the Max-Sum version and the Max-Min version.

Depending on the situation, one of these formulations may be more appropriate or even more ethical. For example, when forming student project groups, prioritizing overall happiness at the expense of a single miserable group doesn’t seem fair to me. That’s why, in such cases, the Max-Min approach is likely the better choice. However, in other contexts, the Max-Sum approach might be more suitable. Ultimately, the decision is yours – but now, you have the tools to formulate and solve both using Python.

Quadratic problems like these are challenging, and as the problem size increases, you might notice the solver struggling to find a solution. This is why such problems are often tackled using metaheuristics. I’ve included a link to a state-of-the-art metaheuristic approach for these quadratic assignment problems:

https://www.sciencedirect.com/science/article/abs/pii/S0377221720309814

Additionally, here’s a link to one of my research papers, where I explore a similar variant of the group formation problem using a hybrid method that combines quantum annealing and classical decomposition methods.

Restricted Min-Norm Semi-Assignment in Group Formation: Quantum and Binary Algorithmic Integration

You can find the entire notebook with the code developed for this article in the link to my GitHub repository just below.

math_love_RQMKP_medium/RQMKP_medium_com.ipynb at main · ceche1212/math_love_RQMKP_medium

I sincerely hope you found this article useful and entertaining. If so, I’d love to hear your thoughts! Please feel free to leave a comment or show your appreciation with a clap 👏 . And if you’re interested in staying updated on my latest articles, consider following me on Medium. Your support and feedback are what drive me to keep exploring and sharing in this fascinating field. Thank you for taking the time to read, and stay tuned for more insights in my next article!


References

  • [1] Fry, Hannah. The mathematics of love: Patterns, proofs, and the search for the ultimate equation. Simon and Schuster, 2015.
  • [2] Williams, H. P. Model Building in Mathematical Programming. 4th ed. Chichester: Wiley, 1999.
  • [3] Bergman, David. "An exact algorithm for the quadratic multiknapsack problem with an application to event seating." INFORMS Journal on Computing 31.3 (2019): 477–492.
  • [4] Nasini, Stefano, and Luis Fernando Pérez Armas. "Restricted Min-Norm Semi-Assignment in Group Formation: Quantum and Binary Algorithmic Integration." Available at SSRN 4684889 (2024).
  • [5] Silva, Allyson, Leandro C. Coelho, and Maryam Darvish. "Quadratic assignment problem variants: A survey and an effective parallel memetic iterated tabu search." European Journal of Operational Research 292.3 (2021): 1066–1084.
  • [6] Cacchiani, Valentina, et al. "Knapsack problems – An overview of recent advances. Part II: Multiple, multidimensional, and quadratic knapsack problems." Computers & Operations Research 143 (2022): 105693.
  • https://www.perfecttableplan.com/help/52/windows/html/seat_assignment.htm
  • https://towardsdatascience.com/how-many-pok%C3%A9mon-fit-84f812c0387e

Solving a Constrained Project Scheduling Problem with Quantum Annealing

Solving the resource constrained project scheduling problem (RCPSP) with D-Wave's hybrid constrained quadratic model (CQM)

Why did the dog fail quantum mechanics class? He couldn't grasp the concept of super-paws-ition. Quantum superposition is the principle where a quantum system can exist in multiple states simultaneously until it is measured, at which point it collapses into one of the possible states. (Image generated by DALLE-3)

I'm really excited to share this article with you because it's closely tied to my current research: optimizing project schedules with quantum computing. My passion for this topic comes from my background as a project manager and my ongoing research into how quantum computing can help solve complex optimization problems. During my PhD, I specifically looked into how today's quantum technology could be used to tackle complex scheduling challenges specific to the field of project management.

We’re living in an incredible era where quantum computers are no longer just a concept – they’re real and being used. Imagine what Richard Feynman would say about them! However, these machines aren’t fully ready yet. They’re often referred to as NISQ (Noisy Intermediate-Scale Quantum) devices, a term coined by quantum computing pioneer John Preskill. These machines are still considered early versions of the more advanced quantum computers we hope to see in the future.

Currently, they're small, with only a few qubits (quantum bits of information) available. For comparison, it's not uncommon today to have a laptop with 16GB of RAM, which equates to around 128 billion bits of information that we can work with. In contrast, the largest quantum computers have fewer than 6,000 qubits. Moreover, qubits are highly sensitive to noise, and the limited number of qubits makes it challenging to correct errors and perform complex calculations.

Despite these challenges, the quantum computers we have today are powerful enough to test out algorithms and see their potential for solving tough problems. In broad terms, there are two main types of quantum computers: universal quantum machines, like the ones from IBM and Google, which work on a circuit-gate model of computation, and quantum annealers, which use adiabatic evolution principles (technically speaking, there is a third type – measurement-based quantum machines – but these are still in early development). In quantum computing, "universal" means that these machines can, in theory, run any quantum algorithm by implementing any quantum mechanical gate (including classical gates). While quantum annealers might not be universal, they're still super promising for optimization tasks.

In this article, I'll walk you through how you can use a quantum annealer provided by a company called D-Wave to optimize project schedules. This article is based on one of my recently published research papers [1]. If you're interested in a deeper dive, I invite you to read my full research article, which you can access using the link down below.

Solving the resource constrained project scheduling problem with quantum annealing – Scientific…

Before we get into the coding, I’ll give you a quick intro to quantum annealing and how it’s used in optimization. Let’s get started!


What is quantum annealing?

Quantum Annealing is a quantum metaheuristic designed to solve combinatorial optimization problems by leveraging the principles of quantum mechanics [2]. This approach is inspired by simulated annealing [4], a classical metaheuristic where a system is gradually cooled to reach a state of minimum energy. In quantum annealing, quantum phenomena such as superposition (where a quantum system can exist in multiple states simultaneously until measured, at which point it collapses into one of the possible states) and quantum tunneling are utilized to efficiently navigate through local minima, with the goal of finding the global minimum of a cost function.

Quantum annealing devices are primarily employed for optimization purposes, such as finding the minimum or maximum value of a problem. Imagine trying to find the lowest point in a rugged, hilly landscape – aka finding the "best" solution to a problem. Traditional methods might get stuck in small dips, mistaking them for the lowest point. In contrast, a quantum annealer leverages the properties of quantum mechanics, attempting to tunnel through these local minima and guide the system to the true lowest point, or optimal solution. The effectiveness of quantum annealing, compared to its classical counterpart, can vary depending on the specific problem and the hardware used. The literature presents both theoretical proofs [3] that highlight the advantages of quantum annealing and contrasting perspectives, particularly when compared to simulated annealing.

At the core of quantum annealing is the profound concept that the universe naturally tends to seek states of minimum energy, known as ground states. Quantum annealing also relies on the "adiabatic theorem" of quantum mechanics and the evolution of quantum systems as described by the time-dependent Schrödinger equation.

Optimization problems can be represented as physical systems and formulated as energy Hamiltonian problems, which are mathematical representations of a system’s total energy. A common approach is to use the ISING model to encode the optimization problem into a physical system. In this model, the optimization problem is mapped onto a representation of the magnetic spin of each qubit, interacting in a two-dimensional lattice grid, as shown in Figure 1 below.

Figure 1. ISING Two Dimensional Lattice (Image created by the author)

The mathematical representation of the energy Hamiltonian H₁​ in the ISING model is given by the formula below:
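In standard notation (a reconstruction consistent with the definitions that follow; sign conventions vary across sources), this Hamiltonian takes the form:

H_1 = -\sum_{i,j} J_{ij}\, \sigma^{z}_{i}\, \sigma^{z}_{j} \;-\; \sum_{i} H_{i}\, \sigma^{z}_{i}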

Here, σ​ is the Z-Pauli gate applied to qubit i, which takes the value +1 if the qubit is pointing upwards and -1 if it is oriented downwards. Jᵢ ⱼ​ represents the interaction coefficient between qubits i and j, while Hᵢ​ represents the coefficient of an external magnetic field interacting with qubit i. Fortunately, most common optimization problems can be encoded using this model, as the ISING problem is NP-Complete, as demonstrated by Istrail in 2000 [5]. In this model, each variable is binary, acting like a magnet that can only point up or down. These magnets will naturally try to orient themselves to minimize the energy of the entire grid, resulting in the system’s minimum energy. The solution to the problem is simply the configuration of the magnets after they have evolved and properly aligned themselves.

You’ve likely already encountered some of the principles of this model without even realizing it. If you’ve ever played with magnets, you know that when you have two or more, they naturally try to align their poles in opposite directions to minimize the total energy of the system, as illustrated in Figure 2.

Figure 2. Magnets minimizing energy (Image created by the Author)

Mother Nature is a giant optimizer that tries to solve this problem on her own. As illustrated in Figure 3, if we start with a random assortment of spin particles, each oriented in a different direction, they will naturally attempt to realign themselves, seeking the minimum energy state. Quantum annealing leverages this principle. However, this process doesn't always guarantee finding the absolute global minimum, as nature can still get stuck in local minima. That's why we also rely on the power of the adiabatic theorem.

Figure 3. Quench of an ISING system on a two-dimensional square lattice (500 × 500) starting from a random configuration (Image downloaded from Wikipedia https://en.wikipedia.org/wiki/Ising_model)

The adiabatic theorem, first formally stated by Max Born in 1928 (although Einstein had also explored the concept earlier), tells us that:

If a system starts in its lowest energy state and changes "slowly enough", it will remain in that minimum energy state as it evolves.

This theorem and its impact on quantum annealing can be challenging to grasp, so I’m going to use a metaphor that has worked well for me in the past. Keep in mind that, like all metaphors, it may not be entirely accurate from a formal physics perspective (it is not), but it serves as a helpful way to illustrate the concept.

Imagine we have a toddler who, in this example, represents a quantum system. Let’s get creative, like Elon Musk, and call this child Psi Ψ. Now, Psi Ψ can interact with various environments within the house where they live. These environments represent what we’ll call energy Hamiltonians. For instance, one Hamiltonian could be the living room of the house, with a comfy couch in front of the TV, as shown in Figure 4.

Figure 4. Energy Hamiltonian metaphor example (Image created by the author using Midjourney)

Another Hamiltonian could be, for example, the bedroom of Psi Ψ, as depicted in Figure 5.

Figure 5. Energy Hamiltonian metaphor example (Image created by the author using Midjourney)

Now, Psi Ψ can exist in various energetic states. For example, Psi Ψ could be in an excited state, let’s say they’re in the middle of a sugar rush, as shown in Figure 6.

Figure 6. Very Excited State Ψ (Image created by DALLE-3)

Psi Ψ can also be in a ground state, a state of minimum energy. For example, Psi Ψ might be in deep sleep, as shown in Figure 7.

Figure 7. Ground State Ψ (Image created by DALLE-3)

Now, there are certain Hamiltonians, which we’ll call H₀​, where Psi Ψ will naturally evolve toward a minimum energy ground state. For instance, if we let Psi Ψ play in the living room, even if they start in an excited state, they’ll eventually tire out and fall asleep on that comfy sofa, as shown in Figure 8. (I remember a childhood friend’s house with a terrace overlooking a beautiful valley, where you could feel the fresh mountain breeze. The terrace had a sofa and was filled with chimes that sang with the wind. It was one of the most relaxing environments I’ve ever experienced, and I’m sure anyone would naturally settle into a ground state in that kind of "Hamiltonian".)

Figure 8. Easy Hamiltonians H₀ (Images created by DALLE-3)

In contrast, there are other Hamiltonians – typically the ones we’re most interested in – called H₁​, where it’s much more difficult for Psi Ψ to reach the ground state. For example, Psi Ψ really struggles to fall asleep when they are in their bedroom, as shown in Figure 9.

Figure 9. Hamiltonians of interest H₁ and Ψ in excited state (Image created by DALLE-3)

Unfortunately, what we really want is for Psi Ψ to be in the ground state of H₁ – in other words, we want Psi Ψ to be fully asleep in their bedroom. But that's tough to achieve, so what do we do?

Quantum annealing advises us not to fight against the current. Instead of directly seeking the ground state of H₁​, it suggests leveraging the power of the adiabatic theorem. Quantum annealing says, "Look, the starting point of this process should be an easy Hamiltonian, H₀​". We first allow Psi Ψ to naturally evolve into a ground state because it’s easy to achieve this in H₀​. So, we let Psi Ψ play in the living room, and once they are fully asleep on that comfy sofa, then – and only then – do we very gently, very softly, without disturbing their sweet dreams, begin to slowly change the Hamiltonian from H₀​ to H₁​. We gently pick them up from the sofa, slowly climb the stairs, and softly place them in their bed. If we do this process correctly and "slowly enough" (adiabatically), by the end of it, we should find Psi Ψ in the ground state – not of H₀​, but of H₁, as shown in Figure 10​. And voila 🥐, this means that we could use this technique to find the ground states of Hamiltonians of interest H₁.

Figure 10. Adiabatic evolution of Ψ (Image created by DALLE-3)

Great! So, this means that we can indeed use this technique to solve complex optimization problems. All we need to do is find a way to encode the problem into an H₁ (using the ISING model)​ and identify a suitable H₀​. Then, we gradually transition from H₀​ to H₁​ by following the annealing schedule outlined below.

Figure 11. Annealing schedule (Image created by the author)

At this point, you might be wondering, "What about H₀​? How do we find it?" Fortunately, the initial Hamiltonian H₀ is easy to identify. We typically use a Hamiltonian known as the "tunneling Hamiltonian," which is described by Equation 4 below. The ground state for this Hamiltonian is simply each qubit in a "uniform" superposition (equal probability of being 0 or 1).
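In its usual transverse-field form (a standard expression; sign conventions and scaling factors vary across sources), the tunneling Hamiltonian reads:

H_0 = -\sum_{i} \sigma^{x}_{i}

Its ground state places every qubit in the |+⟩ state – exactly the uniform superposition described above.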

Now, when it comes to optimization, we often prefer not to use the +1,−1 basis of the ISING model. Instead, we like to encode optimization problems into Hamiltonians H₁​​ using an equivalent isomorphic formulation of the ISING model called QUBO (Quadratic Unconstrained Binary Optimization). QUBO expresses the problem as minimizing a quadratic polynomial over binary variables that can be either 0 or 1. When we formulate a problem as a QUBO, we must represent all variables of our optimization problem as binary and include each constraint as a penalty term added to the objective function. An excellent guide on how to do this was published by Fred Glover [6].
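As a toy illustration of the penalty idea (my own example, not one taken from Glover's guide): suppose we want to minimize f(x) = -x_1 - 2x_2 over binary variables, subject to x_1 + x_2 = 1. The QUBO version folds the constraint into the objective with a penalty weight P > 0:

\min_{x \in \{0,1\}^2} \; -x_1 - 2x_2 + P\,(x_1 + x_2 - 1)^2

Expanding the square (and using x^2 = x for binaries) leaves a quadratic polynomial in binary variables – exactly the form an annealer accepts – and, for a large enough P, any constraint violation costs more than it could ever gain.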

Once the optimization problem is in QUBO or ISING form, it must be mapped onto the quantum chip. The chip has a fixed architecture for the qubits – a topology, if you will (the current generation of D-Wave quantum annealers uses a topology known as Pegasus). Since the qubits are not fully connected, we need to be creative when mapping the problem onto the quantum chip. If, for example, a qubit i needs to interact with qubit j, but they aren't directly connected in the machine's topology, we need to find other qubits (or a group of them) that form a connecting path between i and j. This path is known as a "chain", and the process of mapping a problem onto the quantum chip is called minor embedding; this process consumes extra physical qubits, which are chained together to act as single logical qubits. Minor embedding is not trivial – it is itself an NP-hard problem – but we can rely on heuristics to help find these embeddings. An example of the mapping produced from my research can be seen in Figure 12 below.

Figure 12. Minor embeddings for an RCPSP instance used in the research article of the author (Image created by the author)

There's one small detail about the current generation of quantum annealers: it's been nearly impossible to recreate the ideal conditions required by the adiabatic theorem. Even minor disturbances, like those caused by cosmic rays [9], can disrupt the ground state and interfere with the annealing process. That's why, in practice, we run the annealing process multiple times and consider all the output solutions as samples. We then collect these samples and select the one with the lowest energy, hoping that it is the optimal solution.

Now that we have a solid understanding of how quantum annealing works, let’s move on to the next section, where we’ll apply it to solve one of the most challenging scheduling problems – the Resource-Constrained Project Scheduling Problem (RCPSP).

The RCPSP and its binary MILP formulation

As a reminder, we’re tackling a project scheduling problem where we need to manage limited resources, such as workers or equipment. This problem is known as the Resource-Constrained Project Scheduling Problem (RCPSP). It’s a challenging combinatorial optimization problem, classified as NP-hard by Blazewicz et al. (1983) [7]. The project consists of many tasks, each with its own duration and resource requirements, and there’s a maximum capacity for each resource. Additionally, tasks have "precedence constraints," meaning some tasks must be completed before others can begin. (Note that there are different types of precedence constraints; the "Finish-to-Start" (FS) type is the most common.)

The main goal is to create a schedule with the minimum makespan – a plan that determines the timing of each task to complete the entire project as quickly as possible. Our schedule must respect all constraints, which means we must adhere to precedence requirements (not breaking the order of tasks) and ensure that resource usage doesn’t exceed the available capacities.

There are many different Mixed Integer Linear Programming (MILP) formulations for the RCPSP, but we need to choose one that best adapts to quantum annealing. In my research, we explored 12 of the most commonly used formulations for the RCPSP. We selected the one that consumes the fewest qubits, as qubits are a precious resource in the current stage of quantum computing development, and their availability is a limiting factor. After an extensive analysis, we determined that the most suitable formulation is the time-indexed binary formulation of Pritsker et al. (1969) [8]. For more details on how we reached this conclusion, please refer to the research article.

This is a binary formulation where we have one binary variable x_{i,t} for each activity i and starting date t. This variable equals 1 if activity i starts on day t, and 0 otherwise. The total number of variables is equal to the number of project activities (plus two extra dummy activities – one for the project ‘Start’ and one for the project ‘End’) multiplied by the number of possible start dates t. The number of possible start dates is determined by the project time horizon T, which represents the maximum possible project duration. Each task has a duration p_i and a resource consumption u_{i,k}​ for each resource k. Precedence constraints are provided as a list of tuples E in the form (i,j), indicating that task j cannot start until task i is finished.

Figure 13. Pritsker et al. 1969 mathematical programming formulation for the RCPSP (Image created by the author)

The formulation above might look daunting at first sight, but it is actually quite simple.

  1. The objective function (1) represents the sum of all possible start dates for the dummy variable x_{n+1,t} (where these dummy variables represent the project's end date). We know that only one of these variables will equal 1 for a specific t. Therefore, by minimizing the sum of the product t*x_{n+1,t}, we are effectively minimizing the project duration.
  2. Constraint (2) ensures that each activity i has exactly one start date.
  3. Constraint (3) ensures that if an activity j follows another activity i, then activity j must start after the finish date of activity i, which is equal to the start date of i plus its duration p_i​.
  4. Finally, Constraint (4) guarantees that for any time period t, the schedule does not exceed the capacity R_k for any resource k.
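For readers following along without the figure, here is a LaTeX transcription of the formulation as described above (and as implemented in the CQM code later in this article):

\begin{align}
\min \quad & \sum_{t=0}^{T} t \cdot x_{n+1,t} & & \quad (1) \\
\text{s.t.} \quad & \sum_{t=0}^{T} x_{i,t} = 1 & \forall i & \quad (2) \\
& \sum_{t=0}^{T} t \cdot x_{j,t} - \sum_{t=0}^{T} t \cdot x_{i,t} \geq p_i & \forall (i,j) \in E & \quad (3) \\
& \sum_{i} u_{i,k} \sum_{\tau=\max(0,\, t-p_i+1)}^{t} x_{i,\tau} \leq R_k & \forall k,\; \forall t & \quad (4) \\
& x_{i,t} \in \{0,1\} & \forall i,\; \forall t &
\end{align}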

In this article, we will solve a project with 21 activities (n=21) (including the two dummy ones) and two resources (k=2). The project details are provided in Table 1 below.

You can also download the instance as a text file in the .rcp format using the GitHub link provided below.

medium_rcpsp_dwave_cqm/RCPSP_19_instance.rcp at main · ceche1212/medium_rcpsp_dwave_cqm

Our project instance in the .rcp format is shown in the text file below.

  21    2 
  10   10

   0   0   0  11 2 3 4 5 6 10 16 17 18 19 20 
   3   1   1   3 15 14 7 
   3   2   8   3 13 12 11 
   3   5   5   2 12 11 
   2   7   3   1 9 
   2   5   6   1 8 
   2   0   0   1 12 
   3   4   4   1 11 
   9   4   6   1 11 
   4   5   4   1 11 
   3   5   6   1 21
   3   7   5   1 21
   6   9   6   1 21
   4   7   0   1 21
   7   6   3   1 21
   2   5   3   1 21
   1   3   7   1 21
   1   0   6   1 21
   2   6   4   1 21
   5   4   8   1 21
   0   0   0   0

The first line of the .rcp instance is always empty. The next line contains the number of project tasks/activities (including the dummy start and dummy end activities), followed by the number of resources k; in our project, we have two resources. The third line lists the maximum capacity of each resource (10 and 10 in our case). Starting with the fourth line of the text file, you'll find data specific to each project task/activity. The first column contains the duration p_i of the activity. The next k columns provide the amount of resources required u_{i,k} for activity i and resource k. The column after that gives the number of successors, and the remaining columns list those successors, i.e., the activities that have activity i as a predecessor. For example, in the first activity row of the instance above, the "11" indicates eleven successors, so activities 2, 3, 4, 5, 6, 10, 16, 17, 18, 19, and 20 require activity 1 to be completed before they can start.
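To make the layout concrete, here is a minimal hand-rolled parser for this format (an illustrative sketch under the column layout just described; in the article itself we rely on cheche_pm's built-in reader instead):

def parse_rcp(path):
    """Minimal parser for the .rcp layout described above (illustrative only)."""
    with open(path) as f:
        rows = [line.split() for line in f if line.strip()]  # skip the empty first line

    n_tasks, n_resources = map(int, rows[0])   # e.g. 21 tasks, 2 resources
    capacities = [int(c) for c in rows[1]]     # e.g. [10, 10]

    durations, usage, successors = [], [], []
    for row in rows[2:2 + n_tasks]:
        values = [int(v) for v in row]
        durations.append(values[0])                          # duration p_i
        usage.append(values[1:1 + n_resources])              # u_{i,k} per resource
        n_succ = values[1 + n_resources]                     # successor-count column
        successors.append(values[2 + n_resources:2 + n_resources + n_succ])

    return n_tasks, capacities, durations, usage, successors

Calling parse_rcp('instance.rcp') on the file above would return 21 tasks, capacities [10, 10], and the duration, resource usage, and successor lists for each activity.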

Installing the libraries and setting up the Colab environment

In this section, we'll walk through the steps to install and import the necessary libraries for solving our project scheduling problem, including setting up your D-Wave environment in Google Colab. We'll also install cheche_pm (https://pypi.org/project/cheche-pm/), a Python library I developed to simplify importing project data and constructing the required data structures for our optimization model. If you're unfamiliar with cheche_pm, I recommend checking out another article where I've detailed its functionalities for a deeper understanding of its capabilities.

Efficient Project Scheduling with Python. The Critical Path Method

Let's first get your D-Wave Leap API token.

    1. Go to the D-Wave Leap website and create an account.

Figure 14. Dwave-Leap site (Snapshot taken by the author)

    2. Once you have created your account, proceed to log in; on the lower-left side of your dashboard you will find your personal API token.

Figure 15. Dwave-Leap dashboard (Snapshot taken by the author)

    3. Copy your API token and paste it in the Secrets tab of Google Colab using the key name dwave_leap. We will later import this secret key into the Python session without having to reveal it, similar to the process of importing keys from a .env file.

Figure 16. Google Colab Secrets (Snapshot taken by the author)

Now that we have our D-Wave key, let's proceed to install and import the required libraries and set up the development environment. We can achieve this by following the snippet of code below.

!pip install cheche_pm
!pip install dwave-ocean-sdk

from itertools import product
from dwave.system.samplers import DWaveSampler
from dwave.system.composites import EmbeddingComposite
from dimod import ConstrainedQuadraticModel, CQM, SampleSet
from dwave.system import LeapHybridCQMSampler
from dimod.vartypes import Vartype
from dimod import Binary, quicksum
from cheche_pm import Project 
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import requests
from google.colab import userdata

We will now proceed to set up the D-Wave key using the userdata.get method of Google Colab.

endpoint = 'https://cloud.dwavesys.com/sapi'
token = userdata.get('dwave_leap') 

Now that the environment is set up, let's move on to importing the project data and generating the required data structures. We'll use some of the functionalities of cheche_pm for this. I've coded a method that allows us to read RCPSP instances directly from a .rcp file. We'll use this method to download the instance data from my GitHub repo, import it into our Colab session, and create a Project instance by reading the .rcp file.

url = 'https://raw.githubusercontent.com/ceche1212/medium_rcpsp_dwave_cqm/main/RCPSP_19_instance.rcp'

response = requests.get(url)

# Check if the request was successful
if response.status_code == 200:
    # Write the content to a file
    with open("instance.rcp", "w") as file:
        file.write(response.text)
    print("File downloaded and saved as 'instance.rcp'")
else:
    print("Failed to download the file. Status code:", response.status_code)

project = Project.from_rangen_1_rcp_file('instance.rcp')
project.create_project_dict()

Now that we have the project data, let’s visualize it using a project network diagram. This diagram represents a graph G(V,S) where each node in V is a project activity and each edge in S represents a precedence constraint. We can create this diagram using the .plot_network_diagram function from cheche_pm, which leverages pydot to generate and plot the graph.

project.plot_network_diagram()
Figure 17. Project Network Diagram (Image created by the author)

We can also generate the data structures typically needed to formulate an RCPSP instance as a MILP. The cheche_pm library includes a method called .produce_outputs_for_MILP, which outputs a dictionary containing all the necessary data structures.

n,p,S,U,C,naive_ph = project.produce_outputs_for_MILP().values()

Here, n represents the number of project activities, p is a list of activity durations, S is a list of all precedence constraints, U is a list of resource consumption for each activity and each resource, and C is a list of resource capacities. The naive project duration, or time horizon naive_ph, is the sum of all activity durations, which is equivalent to executing the activities sequentially.

As I mentioned in the introduction of this article, qubits are a precious resource. Our formulation requires a number of variables/qubits equal to the product of the n+2 activities and the number of potential start times t, with t ranging from 0 up to the time horizon naive_ph. Currently, the naive_ph for this project instance is 65 days, meaning we're using a very conservative upper bound. If we can find a better upper bound, we can drastically reduce the number of qubits required and make the quantum annealer's job easier. Fortunately, there are many constructive heuristics for project scheduling that can help us establish a better upper bound for the project time horizon. The cheche_pm library includes a collection of these heuristics, which we can easily call by following the code snippet below.

heuristic = project.run_all_pl_heuristics(max_resources=C)
new_ph = heuristic['makespan']

The method generates a bar graph showing the different makespans for each heuristic tested by cheche_pm.

Figure 18. Bar graph of the makespan of different heuristics tested (Image created by the author)

Using this method, we were able to reduce the project time horizon from 65 to 40 days. This reduction allows us to save 525 qubits ((65−40)×21), representing a significant decrease in qubit usage, thanks to simple heuristics that run in fractions of a second.

Now that we’ve optimized our project time horizon and gathered all the necessary data, let’s proceed to optimize the project schedule using the quantum annealer.

D-wave CQM for the RCPSP

D-Wave offers various methods for solving optimization problems. If your problem is already in QUBO form, you can create a Binary Quadratic Model (BQM) and solve it. That’s exactly what I did in my research article. However, in this article, I wanted to demonstrate a simplified process that allows you to use the quantum annealer with a framework and language similar to those used by other commercial tools like Pyomo, CPLEX or Gurobi.

Here, we'll use D-Wave's Constrained Quadratic Model (CQM), which employs a hybrid quantum-classical approach. This approach simplifies the formulation process and enables us to solve larger problem instances. With the current Advantage 6.4 quantum annealer and the CQM solver, we can tackle problems with up to 1 million binary variables. In contrast, D-Wave's purely quantum BQM method is limited to 5,760 available qubits, making it unsuitable for the project size we're addressing in this article. The advantage of using a BQM is that you have total control of the annealing parameters and the schedule; moreover, you can use more advanced techniques such as reverse quantum annealing, which is similar to starting the annealing evolution from a warm start.

If you’re interested in learning how to transform the RCPSP into a QUBO and solve smaller instances of the problem using the purely quantum BQM method, I cover that in detail in my research article. A link to all the BQM code used during the research is provided there, and the article is open access, so feel free to read it and download it.

To create a CQM model of our project, we simply follow the code below:

(R, J, T) = (range(len(C)), range(len(p)), range(new_ph))

# Create empty model
cqm=ConstrainedQuadraticModel()

# Create binary variables
x = {(i, t): Binary(f'x{i}_{t}') for i in J for t in T}

# Create objective function (1)
objective = quicksum(t*x[(n+1,t)] for t in T)
cqm.set_objective(objective)

# Add constraint (2)
for a in J:
  cqm.add_constraint( quicksum(x[(a,t)] for t in T) == 1 )

# Add constraint (3) Precedence constraints
for (j,s) in S:
  cqm.add_constraint( quicksum( t*x[(s,t)] - t*x[(j,t)] for t in T ) >= p[j])

# Add constraint (4) Resource capacity constraints
for (r, t) in product(R, T):
  r_c = quicksum( U[j][r]*x[(j,t2)] for j in J for t2 in range(max(0, t - p[j] + 1), t + 1))
  cqm.add_constraint( r_c <= C[r] )

print("CQM model created with number of variables = ",len(cqm.variables))

The CQM model has been created with a total of 840 variables. Now, we just need to invoke the hybrid QA sampler to solve it. First, we'll create the sampler, passing in our API token and endpoint connection (defined at the start of the previous section).

from dwave.system import LeapHybridCQMSampler  # if not already imported

cqm_sampler = LeapHybridCQMSampler(endpoint=endpoint, token=token)

With the sampler created, the final step is to perform the QA evolution and retrieve the solution samples.

problem_name = 'RCPSP_Medium_Article'

sampleset = cqm_sampler.sample_cqm(cqm,label=problem_name)

annealing_solutions = len(sampleset) # number of solutions inside sampleset

running_time = sampleset.info['run_time'] # running time reported in microseconds

print(f"Annealing stoped after {running_time/1e6:.2f} seconds. {annealing_solutions} obtained")

Solving the instance and extracting the output

After 5.002 seconds, we obtained 58 samples from the hybrid quantum annealing process. Now, let’s extract the best solution and generate the output visualizations for our project.

We can create a dataframe from the results currently stored in the variable sampleset. This dataframe will have three columns: the solution, the energy (which represents the objective function value), and a boolean variable indicating whether the solution obtained from the quantum annealer is valid (feasible) or not.

df_sampleset = sampleset.to_pandas_dataframe(sample_column=True)
df_sampleset = df_sampleset[['sample','energy','is_feasible']]
df_sampleset = df_sampleset.sort_values(by = ['is_feasible','energy'], ascending = [False,True])

Let’s also proceed to extract the best feasible solution of our problem and see if the quantum annealer was able to obtain the optimal project schedule.

feasible_sampleset = sampleset.filter(lambda row: row.is_feasible)

try:
  best = feasible_sampleset.first # raises a ValueError if no feasible sample exists
except ValueError:
  raise SystemExit('No feasible solution found')

Z = best.energy #objective value of the best solution
X_out = best.sample #best solution

print(f"Solution obtained with a final project duration of {Z} days")

The quantum annealer successfully produced a feasible project schedule with a makespan of 39 days, which corresponds to the optimal solution for this project instance. Now that we have the solution, let’s proceed to extract it into a schedule dictionary.

SCHEDULE = dict()
for act in project.PROJECT:

  i = project.PROJECT[act]['idx']
  for t in range(new_ph):
    key = f'x{i}_{t}'
    if X_out[key] == 1: # activity act starts at time t
      row = {'ES':t,'EF':t+p[i]} # early start and early finish
      SCHEDULE[act] = row
      print(act,row)

With the schedule now stored in a dictionary, we can use another functionality of cheche_pm to produce a Gantt chart by calling the .plot_date_gantt method within the library.

schedule_df = project.generate_datetime_schedule(SCHEDULE)

project.plot_date_gantt(schedule_df,plot_type = 'matplotlib')
Figure 19. Gantt Chart of the solution (Image created by the author)

Conclusions

In conclusion, this article has guided you through a detailed process of using D-Wave’s Constrained Quadratic Model (CQM) to schedule a resource-constrained project. I’ve explained what quantum annealing is and how it can be applied to optimization problems, all based on my own research. It’s important to remember that, unlike classical optimization methods such as branch-and-bound, quantum annealing doesn’t guarantee optimality since it’s a heuristic method. However, it’s an extremely powerful heuristic that can be applied to a wide range of optimization problems.

I also compared Gurobi and quantum annealing on the same instance we have worked with in this article, but this time using the naive time horizon for both methods. As shown in Figure 20, Gurobi took 446.9 seconds to produce the optimal solution of 39 days, while quantum annealing took only 5.1 seconds to generate a near-optimal solution of 40 days.

Figure 20. Benchmark of Gurobi and D-Wave's CQM (Image created by the author)

Even though quantum annealing didn’t achieve the optimal solution in this benchmark, the beauty of heuristics lies in their speed and flexibility. Heuristics can be combined with exact methods to reach the optimal solution more quickly. In fact, most commercial solvers run a series of heuristics before employing more rigorous methods. When the solution from quantum annealing was used as a warm start for Gurobi, the entire process only took 83.5 seconds (including the time for QA) to produce the optimal solution 🎉 .
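For reference, that warm-start step can be as simple as the following minimal sketch, assuming a gurobipy model of the same time-indexed formulation whose binary variables are stored in a dictionary x_grb (the names x_grb and model_grb are hypothetical):

# a minimal sketch, assuming a gurobipy model of the same time-indexed RCPSP
# formulation, with its binaries stored in x_grb (hypothetical names)
for (i, t), var in x_grb.items():
    var.Start = X_out[f'x{i}_{t}'] # seed the MIP start with the QA sample
model_grb.optimize()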

In my opinion, the true beauty of quantum computing isn't just about proving supremacy or clear quantum advantages; it's about how we can combine these quantum capabilities with classical methods to create even more powerful solutions to challenging problems. In summary, I think hybrid is the future, and that is where I am currently steering my research.

You can find the entire notebook with the code developed for this article in the link to my GitHub repository just below.

medium_rcpsp_dwave_cqm/RCPSP_DWAVE_CQM_Medium.ipynb at main · ceche1212/medium_rcpsp_dwave_cqm

I sincerely hope that this article has been enjoyable and serves as a helpful resource for anyone trying to learn more about quantum computing, and more specifically about quantum annealing. If so, I'd love to hear your thoughts! Please feel free to leave a comment or show your appreciation with a clap 👏 . And if you're interested in staying updated on my latest articles, consider following me on Medium. Your support and feedback are what drive me to keep exploring and sharing. Thank you for taking the time to read, and stay tuned for more insights in my next article!

References

  • [1] Pérez Armas, L. F., Creemers, S., & Deleplanque, S. (2024). Solving the resource constrained project scheduling problem with quantum annealing. Scientific Reports, 14(1), 16784. https://doi.org/10.1038/s41598-024-67168-6
  • [2] Albash, T., & Lidar, D. A. (2018). Adiabatic quantum computation. Reviews of Modern Physics, 90(1), 015002.
  • [3] Kadowaki, T., & Nishimori, H. (1998). Quantum annealing in the transverse Ising model. Physical Review E, 58(5), 5355.
  • [4] Kirkpatrick, S., Gelatt Jr, C. D., & Vecchi, M. P. (1983). Optimization by simulated annealing. Science, 220(4598), 671–680.
  • [5] Istrail, S. (2000, May). Statistical mechanics, three-dimensionality and NP-completeness: I. Universality of intractability for the partition function of the Ising model across non-planar surfaces. In Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing (pp. 87–96).
  • [6] Glover, F., Kochenberger, G., & Du, Y. (2019). Quantum Bridge Analytics I: a tutorial on formulating and using QUBO models. 4OR, 17(4), 335–371.
  • [7] Blazewicz, J., Lenstra, J. K., & Kan, A. R. (1983). Scheduling subject to resource constraints: classification and complexity. Discrete Applied Mathematics, 5(1), 11–24.
  • [8] Pritsker, A., Watters, L., & Wolfe, P. (1969). Multi-project scheduling with limited resources: a zero-one programming approach. Management Science, 16, 93–108.
  • [9] Martinis, J. M. (2021). Saving superconducting quantum processors from decay and correlated errors generated by gamma and cosmic rays. npj Quantum Information, 7, 90. https://doi.org/10.1038/s41534-021-00431-0

The post Solving a Constrained Project Scheduling Problem with Quantum Annealing appeared first on Towards Data Science.

]]>
How to Solve an Asset Storage Problem with Mathematical Programming https://towardsdatascience.com/how-to-solve-an-asset-storage-problem-with-mathematical-programming-3b96b7cc22d1/ Thu, 11 Jul 2024 07:39:32 +0000 https://towardsdatascience.com/how-to-solve-an-asset-storage-problem-with-mathematical-programming-3b96b7cc22d1/ Solving a two-dimensional assortment problem with Python and Gurobipy

The post How to Solve an Asset Storage Problem with Mathematical Programming appeared first on Towards Data Science.

]]>
Moving out is hard (Image created by DALLE-3)

Moving out is hard and, to be honest, very annoying – it kind of sucks. It's especially challenging if you don't have a new place to go. In such cases, you'll need to rent a temporary storage place for your belongings while you stay with friends, which can be pricey, since storage places charge by the area used (and before you tell me that these kinds of situations never happen, allow me to point out that this exact situation happened to one of my friends when she moved from France to Poland).

If moving out is already difficult for individuals, imagine the complexity of moving out and temporarily storing an entire factory. You might think, "Wow, Luis, this is crazy," but such situations happen in real life. It happened to me, and unfortunately, I didn’t have the analytical tools to handle it at the time.

I used to work in the oil and gas service industry. During my last years there, business conditions in the country deteriorated rapidly and became volatile. For some services, the situation became untenable, leading management to cut losses and shut them down. One of the significant costs these services incurred was the rent for their operational facilities, so it was logical to stop using them to save on rent. However, shutting down a facility meant closing entire workshops and moving all the assets to a storage location. Remember, this is oil and gas – these assets are highly sophisticated tools worth millions of dollars. Just some of the spare parts of these assets were worth thousands and were easy targets for theft and black market sales. So, the challenge was not only to move the assets but also to store them temporarily and, most importantly, safeguard them.

My manager, a brilliant guy, approached me with this problem. He told me we were going to bring all the assets of an entire service division into the main facilities and store them in a secure, restricted area. Since these assets were so valuable, we wanted to avoid any possibility of theft or damage. For instance, some specialized trucks had parts that cost thousands of dollars, and given the country’s conditions, the temptation for theft was high. Thus, no one should enter the safe area unless authorized by management. To enforce this rule, we planned to enclose the area with an electric fence ⚡ and infrared sensors connected to an alarm 🚨 . The sensors would trigger the alarm if someone breached the perimeter (I am not kidding; this is a true story). The issue with this plan was that all that protection is expensive, and the company that rents it charges based on the total enclosed area. My manager wanted me to figure out the spatial arrangement of these assets to minimize the safe zone area, and thus the cost. The electric fence company required the enclosed area to be rectangular or square. Additionally, we couldn’t stack the assets to avoid damage (you can’t put a truck on top of another truck), so we were dealing with a two-dimensional storage problem, known formally in the literature as the assortment problem.

Figure 1. Protect the assets at all cost. Sharks are just in the image as a joke; to be honest, if we look at the statistics about sharks, it is evident that they would be really bad guardians. Shark attacks are actually very rare: in fact, during all of 2023 there were only 69 confirmed unprovoked attacks worldwide and 22 confirmed provoked shark attacks. Despite these stats, these creatures have a bad reputation and are hunted every year (looking at you, JAWS). Some shark species are in fact endangered, and it is estimated that 73 million sharks are killed annually. So please, if you are reading this caption, help raise awareness about the need to protect sharks worldwide and visit the WWF site for more information. (Image created by DALLE-3)

I wish I could tell you that I solved this problem using my mathematical programming superpowers, but that didn't happen. This was long before I had my PhD, and at that time, I had no clue how to approach this problem with mathematical optimization – in fact, I wouldn't have even known how to start googling it (how on earth was I supposed to know that the problem was called the assortment problem?). Sometimes in life, you just don't know what you don't know.

"There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don’t know. But there are also unknown unknowns. There are things we don’t know we don’t know." – Donald Rumsfeld

In fact, learning how to solve some of these problems was my main motivation to continue my studies. So, in this article, I am going to teach you how to formulate this problem and how to solve it using Python and gurobipy, and at the end I will briefly talk about some heuristic procedures that can be applied to tackle it.

If you have never used Gurobi before, especially from Google Colab, I invite you to read one of my previous articles where I explain how to set up your Gurobi web license in your Colab environment. You can find the link to the article below:

Optimizing Project Schedules: Using Gurobipy and Cheche_pm warm-starts for Efficient RCPSP…

If you also need a tutorial and more information about linear programming (LP) or integer programming (IP) techniques, I strongly recommend you check these excellent articles written by Bruno Scalia C. F. Leite; I am leaving the links down below. Another great source is H.P. Williams' book about Mathematical Programming [1].

Linear programming: Theory and applications

A Comprehensive Guide to Modeling Techniques in Mixed-Integer Linear Programming


The assortment problem

A general version of the assortment problem involves selecting a subset of items from a larger set to maximize a certain objective, often revenue or profit, under constraints such as shelf space or budget. It’s a common problem in retail and operations management, involving optimization techniques and often consumer choice modeling. The specific assortment problem we are dealing with in this article is also known as the 2D rectangle packing problem, which frequently appears in logistics and production contexts [2–4].

In this problem, the goal is to determine the minimum size of the object such that the individual smaller units can be cut out from it. Formally, it can be defined as the problem of placing a given set of rectangles of the same or different sizes, without overlap, within a rectangle of minimum area [5]. This problem has multiple applications in facility layout situations, and there is an entire class of heuristic approaches for tackling it [6–8].

The problem requires the following variables:

Figure 2. Variables required for modeling the problem (Image created by the author)
  • Variables x_i and y_i correspond to the (x,y) coordinates of each one of the assets of our problem, which are represented by "rectangles". We have a set I with all the assets or rectangles.
  • Variables b_i_j_k are auxiliary binary variables that we are going to need for establishing an "OR" condition in the model; their usage will be explained later in more detail.
  • Variables X and Y correspond to the total width and total height resulting from the arrangement of all the assets; with these two variables we will calculate the total area used to store all the assets.
  • Finally, each rectangle i is going to have a variable r_i, which is going to be used to determine whether asset i is rotated or not.

The problem can be modeled using the formulation below:

Figure 3. Mathematical programming formulation 2D assortment problem (Created by the author)

The formulation above might look daunting at first sight, but it is actually quite simple. The objective function corresponds to the total area, which is the product of X and Y, the total width and height, respectively. Note that multiplying two continuous variables makes the objective quadratic rather than linear, so strictly speaking we are dealing with a mixed-integer quadratic program, which Gurobi can also handle.

Constraint (1) ensures that the total width value X is greater than the x-coordinate plus the width of each item. Similarly, constraint (2) ensures that the total height value Y is greater than the y-coordinate plus the height of each item.

Constraints (3–7) ensure there is no overlap between the positions of the different items. These constraints work as an OR condition, using the variables b_i_j_1, b_i_j_2, b_i_j_3, and b_i_j_4 for each pair of items, along with the Big-M method. For example, if b_i_j_1 = 1, constraint (3) forces item i to lie completely to the left of item j, while if b_i_j_1 = 0, the Big-M term renders the constraint inactive. In summary, an item can be to the left, to the right, on top, or below another item, but never overlapping.

Constraints (8–11) ensure that the variables belong to the appropriate domains.
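For readers who cannot make out Figure 3, the formulation can be written out as follows (this is my transcription of the constraints described above, using w_i and h_i for the width and height of item i, and r_i = 1 when item i keeps its original orientation):

min  X · Y
s.t.
(1)   X ≥ x_i + w_i·r_i + h_i·(1 − r_i)                        ∀ i ∈ I
(2)   Y ≥ y_i + h_i·r_i + w_i·(1 − r_i)                        ∀ i ∈ I
(3)   x_i + w_i·r_i + h_i·(1 − r_i) ≤ x_j + M·(1 − b_i_j_1)    ∀ i ≠ j
(4)   x_j + w_j·r_j + h_j·(1 − r_j) ≤ x_i + M·(1 − b_i_j_2)    ∀ i ≠ j
(5)   y_i + h_i·r_i + w_i·(1 − r_i) ≤ y_j + M·(1 − b_i_j_3)    ∀ i ≠ j
(6)   y_j + h_j·r_j + w_j·(1 − r_j) ≤ y_i + M·(1 − b_i_j_4)    ∀ i ≠ j
(7)   b_i_j_1 + b_i_j_2 + b_i_j_3 + b_i_j_4 ≥ 1                ∀ i ≠ j
(8–9)  x_i, y_i ≥ 0 ∀ i ∈ I;  X, Y ≥ 0
(10–11)  r_i ∈ {0,1} ∀ i ∈ I;  b_i_j_k ∈ {0,1}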

Installing the libraries and setting up the Colab environment

To use Gurobi in Google Colab, we first need to install it using the following code:

!pip install gurobipy

Once installed we can proceed to import the required libraries for this project.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
from itertools import chain, combinations, permutations, product
import gurobipy as gp
from gurobipy import GRB
from copy import deepcopy

We also need to initialize a gurobi session, so we need to create a params dictionary with all the relevant information of our gurobi license.

WLSACCESSID = '< Copy your WLSACCESSID here >'
WLSSECRET = '< Copy your WLSSECRET here >'
LICENSEID = '< Copy your LICENSEID here >'

params = {
"WLSACCESSID": WLSACCESSID,
"WLSSECRET": WLSSECRET,
"LICENSEID": LICENSEID,
}
env = gp.Env(params=params) #this line initializes the gurobi environment

Input data & pre-processing

For the sake of simplicity, we are going to work with only 34 items of 7 different dimensions, which are kept as small values for illustration purposes. Please note that in my previous experience, I had to deal with hundreds of assets of massive sizes, some of which were literally heavy load trucks. The dimensions of the example items are shown in Table 1.
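| Item type | Width | Height | Quantity |
|-----------|-------|--------|----------|
| 1         | 4     | 3      | 4        |
| 2         | 1     | 1      | 10       |
| 3         | 2     | 1      | 8        |
| 4         | 3     | 2      | 5        |
| 5         | 2     | 4      | 2        |
| 6         | 6     | 2      | 3        |
| 7         | 10    | 4      | 2        |

Table 1. Dimensions and quantities of the 34 example items (values taken directly from the code below)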

To ensure that the data is available for use, we will store the different values in lists and then prepare an extended dataframe with each item as a separate row. We can achieve this with the following code:

widths = [4,1,2,3,2,6,10]
heights =[3,1,1,2,4,2,4]
quants = [4,10,8,5,2,3,2]

WIDTHS = []
HEIGHTS = []
for q,w,h in zip(quants,widths,heights):
  for x in range(q):
    # note that w and h are interchanged here; since the model allows rotation,
    # this swap has no effect on the optimization results
    HEIGHTS.append(w)
    WIDTHS.append(h)

data_df = pd.DataFrame()
data_df['HEIGHTS'] = HEIGHTS
data_df['WIDTHS'] = WIDTHS

N = len(data_df) # number of items
top = max(data_df['HEIGHTS'].sum(),data_df['WIDTHS'].sum()) # maximum value of X or Y
M = top # to be used as big M for the OR constraints

I = range(N) # for the indexes of each asset "i"
K = range(4) # for the index of the OR variables "b"

Solving the problem with Gurobipy

Now that we have all the required libraries and input data, we can proceed to create the mathematical model described in Figure 3. To implement this model using GurobiPy, we just need to follow the code snippet below:

model = gp.Model("Assortment",env=env)

# (x,y) Coordinate variables
x = model.addVars(I,lb = 0,ub = top,vtype=GRB.CONTINUOUS, name="x")
y = model.addVars(I,lb = 0,ub = top,vtype=GRB.CONTINUOUS, name="y")

# Rotation variables 
R = model.addVars(I,vtype=GRB.BINARY,name = 'R')

X = model.addVar(lb=0,ub = top,vtype = GRB.CONTINUOUS,name = "X")
Y = model.addVar(lb=0,ub = top,vtype = GRB.CONTINUOUS,name = "Y")

# b variables for OR condition
b_vars = [(i,j,k) for i in I for j in I if j!=i for k in K]
B = model.addVars(b_vars,vtype = GRB.BINARY,name = "B")

# Objective function
model.setObjective(X*Y,GRB.MINIMIZE);

# constraints (1) and (2)
for i in I:

  model.addConstr(X >= x[i] + WIDTHS[i]*R[i] + (1-R[i])*HEIGHTS[i])
  model.addConstr(Y >= y[i] + HEIGHTS[i]*R[i] + (1-R[i])*WIDTHS[i])

# Constraints (3-7)
for i in I:
  for j in I:
    if i == j:
      continue
    else:
      #constraint (3)
      model.addConstr(x[i] + WIDTHS[i]*R[i] + (1-R[i])*HEIGHTS[i] <= x[j] + M*(1-B[i,j,0]))
      #constraint (4)
      model.addConstr(x[j] + WIDTHS[j]*R[j] + (1-R[j])*HEIGHTS[j] <= x[i] + M*(1-B[i,j,1]))
      #constraint (5)
      model.addConstr(y[i] + HEIGHTS[i]*R[i] + (1-R[i])*WIDTHS[i] <= y[j] + M*(1-B[i,j,2]))
      #constraint (6)
      model.addConstr(y[j] + HEIGHTS[j]*R[j] + (1-R[j])*WIDTHS[j] <= y[i] + M*(1-B[i,j,3]))
      #constraint (7)
      model.addConstr(B[i,j,0] + B[i,j,1] + B[i,j,2] + B[i,j,3] >= 1)

Please note that constraints (8–11) are established during the variable creation lines, so we don't need to explicitly add them via the model.addConstr method. Now that the model is complete, we can proceed to solve it. We will let this model run for 600 seconds (10 minutes), since this is a complicated problem to solve. We will also set a MIP gap of 5%, meaning that if the incumbent solution is proven to be within 5% of the optimum, the process terminates and returns the best solution obtained so far. To solve the problem, follow the code below:

tl = 600
mip_gap = 0.05

model.setParam('TimeLimit', tl)
model.setParam('MIPGap', mip_gap)
model.optimize()

After 600 seconds, we are still far from the optimal solution, but the used area has decreased substantially. It has gone from an initial value of 620 (please remember this value, as it will be relevant in the next section) to a final value of 238, roughly a third of the initial solution, which is not bad.

Figure 4. Output of the optimization process from gurobipy (Image created by the author)

With the problem solved, we just need to extract the solution and plot it. We can easily accomplish this using the Rectangle object from matplotlib by following the code below:

all_vars = model.getVars()
values = model.getAttr("X", all_vars)
names = model.getAttr("VarName", all_vars)

obj = round(model.getObjective().getValue(),0)

total_X = int(round((X.x),0))
total_Y = int(round((Y.x),0))

fig, ax = plt.subplots()

for item in I:

  coords = (x[item].x,y[item].x)

  if R[item].x <= 0.01: # R = 0 means the item is rotated
    wid = HEIGHTS[item]
    hig = WIDTHS[item]
  else:
    wid = WIDTHS[item]
    hig = HEIGHTS[item]

  ax.add_patch(Rectangle(coords, wid, hig,
            edgecolor = 'black',
            facecolor = "Grey",
            fill = True,
            alpha = 0.5,
            lw=2))
ax.set_xlim(0, total_X)
ax.set_ylim(0, total_Y)

ax.set_xticks(range(total_X+1))
ax.set_yticks(range(total_Y+1))
ax.grid()
ax.set_title(f" Total area {total_X} x {total_Y} = {int(obj)}")

plt.show()

We can visualize the obtained solution of the assortment from Figure 5 below.

Figure 5. Plot of the solution obtained from gurobipy (Image created by the author)

Finally, we can save the solution into a dataframe, which can later be downloaded as a CSV or Excel file if we want to share it with a colleague. We can easily do this by following the code snippet below:

output_list = []
for i in I:

  print(f"item {i} x:{x[i].x}, y:{y[i].x}, Rotated:{R[i].x <= 0.01}")
  row = {'item':i,'x':round(x[i].x,2),'y':round(y[i].x,2),'Rotated':R[i].x <= 0.01}
  output_list.append(row)

output_df = pd.DataFrame(output_list)
output_df.to_csv("output_solution.csv") # to download the solution as a .csv file

Constructive Heuristics solutions

As I mentioned in the introduction, although I would have loved to solve this problem optimally using the mathematical programming formulation from the previous section, I wasn’t aware of these techniques when I first faced this challenge. However, that doesn’t mean I didn’t provide a solution. In fact, I did arrange all the assets, and they were enclosed in a safe area. The solution wasn’t optimal, but it was a solution nonetheless. How did I do it? Most likely, I used a heuristic or a combination of various heuristics.

Formally, a heuristic is "any approach to problem solving that employs a pragmatic method that is not fully optimized, perfected, or rationalized, but is nevertheless good enough as an approximation or attribute substitution." We might not realize it, but we constantly apply heuristics to problem-solving, and this assortment problem is no exception. We can build various constructive heuristics. In the next section, we will discuss two of them.

FFDH: First Fit Decreasing Height Heuristic

With this approach, we simply sort the rectangles by height and then place each asset in the first available position that fits, moving from left to right. To implement this heuristic, we will create a new class in Python and then develop the heuristic as a function that works with instances of that class. We can easily create the new class and instantiate each asset using the code below:

# definition of the new class
class Rectangle_class:
    # constructor
    def __init__(self, width, height,index):
        self.width = width
        self.height = height
        self.x = 0
        self.y = 0
        self.index = index

# create a list of all the assets
rectangles = [] 

# initialize a rectangle instances for each asset
for i in range(len(data_df)): 

  h,w = data_df.iloc[i,0], data_df.iloc[i,1]
  REC = Rectangle_class(w,h,i)
  rectangles.append(REC)

The heuristic function can be implemented using the code below:

def ffdh(rectangles):
    # Sort rectangles by height in descending order
    rectangles.sort(key=lambda rect: rect.height, reverse=True)

    # Initialize variables to keep track of the positions
    current_y = 0
    current_x = 0
    row_height = 0
    total_width = 0

    for rect in rectangles:
        # Grow the total width whenever the rectangle does not fit; note that
        # because no maximum width is imposed, the check below then always
        # succeeds, so every rectangle ends up placed in a single row
        if current_x + rect.width > total_width:
            total_width = current_x + rect.width
        # If rectangle fits in the current row
        if current_x + rect.width <= total_width:
            rect.x = current_x
            rect.y = current_y
            current_x += rect.width
            row_height = max(row_height, rect.height)
        else:
            # Move to the next row
            current_y += row_height
            rect.x = 0
            rect.y = current_y
            current_x = rect.width
            row_height = rect.height

    total_height = current_y + row_height
    return rectangles, total_width, total_height

This function returns the modified list of the assets with their new x and y coordinates, but it also returns the total width and height used to store the assets. Let’s see what we get when we use this heuristic with our problem of 34 assets.

# call the function and store the outputs
packed_rectangles, total_width, total_height = ffdh(rectangles)

total_X = total_width
total_Y = total_height

# calculate the final area used
obj = total_X*total_Y

# plot the solution
fig, ax = plt.subplots(figsize=(16, 6))

for rect in packed_rectangles:

  coords = (rect.x,rect.y)
  wid,hig = rect.width,rect.height

  ax.add_patch(Rectangle(coords, wid, hig,
            edgecolor = 'black',
            facecolor = "Grey",
            fill = True,
            alpha = 0.5,
            lw=2))
ax.set_xlim(0, total_X)
ax.set_ylim(0, total_Y)

ax.set_xticks(range(total_X+1))
ax.set_yticks(range(total_Y+1))
ax.grid()
ax.set_title(f" Total area {total_X} x {total_Y} = {int(obj)}")
plt.xticks(rotation=30)
plt.show()

The obtained output solution is shown in Figure 6 below:

Figure 6. Plot of the solution obtained from the FFDH heuristic (Image created by the author)

The solution provided by this heuristic has a total area of 620, which coincidentally was also the starting point for Gurobi. Before the solver starts the branch-and-bound algorithm, it applies various heuristics to refine the initial solution, and most likely this heuristic was one of them.

Shelf Heuristic

This heuristic arranges the assets on a plane with a maximum width constraint (that in this case is provided by us), sorting them by height in descending order. It places the assets on a current shelf until the width limit is reached, then moves to a new shelf, updating the total width and height accordingly. The heuristic is implemented using the function below:

def shelf_heuristic(rectangles, max_width):
    # Sort rectangles by height in descending order
    rectangles.sort(key=lambda rect: rect.height, reverse=True)

    current_x = 0
    current_y = 0
    shelf_height = 0

    for rect in rectangles:
        if current_x + rect.width > max_width:
            # Move to a new shelf
            current_x = 0
            current_y += shelf_height
            shelf_height = 0

        # Place the rectangle
        rect.x = current_x
        rect.y = current_y
        current_x += rect.width
        shelf_height = max(shelf_height, rect.height)

    # total_width = max_width
    # total_height = current_y + shelf_height

    total_width = max([rec.x + rec.width for rec in rectangles])
    total_height = max([rec.y + rec.height for rec in rectangles])

    return rectangles, total_width, total_height

Let’s see what we get when we use this heuristic with our problem of 34 assets, and using a shelf width of 20.

container_width = 20
packed_rectangles,total_width, total_height = shelf_heuristic(rectangles, container_width)

total_X = total_width
total_Y = total_height

obj = total_X*total_Y

# generating the plot of the solution
fig, ax = plt.subplots(figsize=(16, 6))

for rect in packed_rectangles:

  coords = (rect.x,rect.y)
  wid,hig = rect.width,rect.height

  ax.add_patch(Rectangle(coords, wid, hig,
            edgecolor = 'black',
            facecolor = "Grey",
            fill = True,
            alpha = 0.5,
            lw=2))
ax.set_xlim(0, total_X)
ax.set_ylim(0, total_Y)

ax.set_xticks(range(total_X+1))
ax.set_yticks(range(total_Y+1))
ax.grid()
ax.set_title(f" Total area {total_X} x {total_Y} = {int(obj)}")
plt.xticks(rotation=30)
plt.show()

The obtained output solution is shown in Figure 7 below:

Figure 7. Plot of the solution obtained from the Shelf heuristic (Image created by the author)

The solution provided by this heuristic has a total area of 340, which is almost half of the area obtained from FFDH. This is impressive, especially considering that heuristics like this one run very quickly and that we are not allowing for rotation of the assets, which would likely improve the quality of the heuristic significantly. Please note that there are literally dozens of heuristics for this problem; we just covered two popular ones. For more information, please check references [6–8].
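As a quick illustration of how cheaply rotation could be added, one simple preprocessing tweak (my own suggestion, not part of the original solution) is to lay every asset on its longer side before packing:

# lay every asset on its longer side before running the shelf heuristic
# (a simple preprocessing tweak; not part of the original article's solution)
for rect in rectangles:
    if rect.height > rect.width:
        rect.width, rect.height = rect.height, rect.width

packed_rectangles, total_width, total_height = shelf_heuristic(rectangles, container_width)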

You might be thinking, wow Luis, it would be nice if the solver could start from this 340 area instead of the previous 620, and you would be totally right. That’s why we will cover this in the next section.

Solving the problem with a warm start from a heuristic

Modern solvers like Gurobi allow users to provide a "warm start" solution when setting up their model. This initial solution helps make the optimization process more efficient. From the solution obtained from our shelf heuristic, we can extract the values and transform the solution into the format expected by the model. Remember, this includes not only the coordinates but also the total width, height, rotation variables, and the overlapping variables b_i_j_k. We can easily transform a solution from a heuristic by using the code snippet below:

heuristic_dict = dict()
for rect in packed_rectangles: 
  index = rect.index
  heuristic_dict[index] = rect

b_values = dict()
for i in I:
  for j in I:
    if i == j:
      continue
    else:
      rect_i = heuristic_dict[i]
      rect_j = heuristic_dict[j]

      # initialize values of b at 0
      b_values[(i,j,0)] = 0
      b_values[(i,j,1)] = 0
      b_values[(i,j,2)] = 0
      b_values[(i,j,3)] = 0

      # check the different conditions and stablish the real values of b
      if rect_i.x + rect_i.width <= rect_j.x:
        b_values[(i,j,0)] = 1

      if rect_j.x + rect_j.width <= rect_i.x:
        b_values[(i,j,1)] = 1

      if rect_i.y + rect_i.height <= rect_j.y:

        b_values[(i,j,2)] = 1

      if rect_j.y + rect_j.height <= rect_i.y:

        b_values[(i,j,3)] = 1   

# store the solution values of x,y and r
x_dict = dict()
y_dict = dict()
r_dict = dict()

for i in I:

  REC = heuristic_dict[i]

  x_dict[i] = REC.x
  y_dict[i] = REC.y
  r_dict[i] = 1 # all assets are not rotated

Now that we have extracted the heuristic solution and transformed it into the proper form expected by our model, we can proceed to re-run the model, adding this solution as a starting point. The code below shows how to create the new model and add the warm start solution.

# Same model as before
model = gp.Model("Assortment_warm_start",env=env)

x = model.addVars(I,lb = 0,ub = top,vtype=GRB.CONTINUOUS, name="x")
y = model.addVars(I,lb = 0,ub = top,vtype=GRB.CONTINUOUS, name="y")

R = model.addVars(I,vtype=GRB.BINARY,name = 'R')

X = model.addVar(lb=0,ub = top,vtype = GRB.CONTINUOUS,name = "X")
Y = model.addVar(lb=0,ub = top,vtype = GRB.CONTINUOUS,name = "Y")

b_vars = [(i,j,k) for i in I for j in I if j!=i for k in K]
B = model.addVars(b_vars,vtype = GRB.BINARY,name = "B")

model.setObjective(X*Y,GRB.MINIMIZE);

for i in I:

  model.addConstr(X >= x[i] + WIDTHS[i]*R[i] + (1-R[i])*HEIGHTS[i])
  model.addConstr(Y >= y[i] + HEIGHTS[i]*R[i] + (1-R[i])*WIDTHS[i])

for i in I:
  for j in I:
    if i == j:
      continue
    else:
      model.addConstr(x[i] + WIDTHS[i]*R[i] + (1-R[i])*HEIGHTS[i] <= x[j] + M*(1-B[i,j,0]))
      model.addConstr(x[j] + WIDTHS[j]*R[j] + (1-R[j])*HEIGHTS[j] <= x[i] + M*(1-B[i,j,1]))

      model.addConstr(y[i] + HEIGHTS[i]*R[i] + (1-R[i])*WIDTHS[i] <= y[j] + M*(1-B[i,j,2]))
      model.addConstr(y[j] + HEIGHTS[j]*R[j] + (1-R[j])*WIDTHS[j] <= y[i] + M*(1-B[i,j,3]))

      model.addConstr(B[i,j,0] + B[i,j,1] + B[i,j,2] + B[i,j,3] >= 1)

# adding the warm start solution from the heuristic
for var in x:
  x[var].Start = x_dict[var]
  y[var].Start = y_dict[var]

  R[var].Start = r_dict[var]

X.Start = total_X
Y.Start = total_Y

for var in B:
  B[var].Start = b_values[var] # set the warm-start value; assigning B[var] directly would overwrite the Gurobi variable

Now that the model was initialized with the warm start, we can proceed to solve it.

tl = 600
mip_gap = 0.05

model.setParam('TimeLimit', tl)
model.setParam('MIPGap', mip_gap)
model.optimize()

If everything goes according to plan, you should see a message from the solver stating that it is going to start the optimization process from the solution provided by the user, as in Figure 8 below.

Figure 8. Output from gurobi, stating that the optimization process is starting from the initial solution provided by the user (Image created by the author)

Conclusions

In conclusion, this article has taken you through a detailed journey on how to solve a two-dimensional assortment problem using mathematical programming. We discussed a mixed-integer programming formulation for this problem, as well as how to tackle it using some classical constructive heuristics. Moreover, we covered how to include the solutions obtained from these heuristics as warm starts for our solver. I hope this article serves as a helpful resource for anyone facing similar challenges in their current jobs. I would have loved 💙 to have had this knowledge back then, but in any case it's never too late to learn 😄 . You can find the entire notebook with the code developed for this article in the link to my GitHub repository just below.

Youtube_optimization/2D_Assortment_Problem_Medium_Article_for_github.ipynb at main ·…

I sincerely hope you found this article useful and entertaining. If so, I’d love to hear your thoughts! Please feel free to leave a comment or show your appreciation with a clap 👏 . And if you’re interested in staying updated on my latest articles, consider following me on Medium. Your support and feedback are what drive me to keep exploring and sharing in this fascinating field. Thank you for taking the time to read, and stay tuned for more insights in my next article!


References

  • [1] Williams, H. P. (1999). Model Building in Mathematical Programming (4th ed.). Chichester: Wiley.
  • [2] Huang, Y. H., Lu, H. C., Wang, Y. C., Chang, Y. F., & Gao, C. K. (2020). A global method for a two-dimensional cutting stock problem in the manufacturing industry. Application of Decision Science in Business and Management, p. 167.
  • [3] Li, H. L., & Chang, C. T. (1998). An approximately global optimization method for assortment problems. European Journal of Operational Research, 105, 604–612.
  • [4] Martello, S., & Vigo, D. (1998). Exact solution of the two-dimensional finite bin packing problem. Management Science, 44, 388–399.
  • [5] Chen, C. S., Sarin, S., & Balasubramanian, R. (1993). A mixed-integer programming model for a class of assortment problems. European Journal of Operational Research, 65, 362–367.
  • [6] Apple, J. M., & Deisenroth, M. P. (1973). A computerized plant layout and evaluation technique – PLANET. Proceedings of AIIE Annual Conference, May 1973, 121.
  • [7] Buffa, E. S., Armour, G. C., & Vollman, T. E. (1964). Allocating facilities with CRAFT. Harvard Business Review, March–April 1964, 130.
  • [8] Seehof, J. M., & Evans, W. O. (1967). Automated layout design program (ALDEP). Journal of Industrial Engineering, December 1967, 690.

The post How to Solve an Asset Storage Problem with Mathematical Programming appeared first on Towards Data Science.

]]>
How I Created a Kaggle-Like Platform for My Students Using Streamlit and How You Can Do It as Well https://towardsdatascience.com/how-i-created-a-kaggle-like-platform-for-my-students-using-streamlit-and-how-you-can-do-it-as-well-5fd10671f559/ Thu, 20 Jun 2024 21:27:33 +0000 https://towardsdatascience.com/how-i-created-a-kaggle-like-platform-for-my-students-using-streamlit-and-how-you-can-do-it-as-well-5fd10671f559/ Gamify machine learning student projects with Streamlit and Google Sheets

The post How I Created a Kaggle-Like Platform for My Students Using Streamlit and How You Can Do It as Well appeared first on Towards Data Science.

]]>
Rankings and points can be a great source of motivation (Image created by DALLE-3)

I love Kaggle and believe its contribution to disseminating Data Science and machine learning has been invaluable. As a fan of video games and gamification, I appreciate how Kaggle's ranking and points system encourages participants to improve their models through healthy, constructive competition. Kaggle has become so popular that many instructors have incorporated it as part of their preferred tools to teach machine learning.

However, as a business school instructor who teaches some machine learning courses, I've found using Kaggle as a tool to evaluate my students' final machine learning projects a little bit cumbersome. For starters, tracking student submissions is tedious and manual, and my students (please note that most of them are beginners in data science and programming in general) often find it discouraging to see the result of their efforts ranked at the bottom of the Kaggle leaderboard. Having said this, I think it is important to acknowledge that Kaggle wasn't designed as a teaching tool, which explains its limitations in this context.

I've always dreamed of creating a mini version of Kaggle tailored for my students. This platform would allow me to replicate Kaggle's gamification success and serve as a template for various subjects, including mathematical programming and combinatorial optimization. Initially, I felt discouraged by the effort required to build such a platform using the typical web development frameworks available in Python, like Django or Flask.

Thankfully, I recently discovered Streamlit and its ability to interact with Google Sheets. In this article, I’ll show you how to use Python, Streamlit, and Google Sheets to build a Kaggle-like web application to gamify your lessons. This app will let students log in with individual accounts, submit their solutions via CSV file uploads, score their solutions based on various machine learning metrics, and dynamically rank the submissions. Best of all, I’ll explain how you can deploy this app for free.

Are you ready for some hands-on learning? Let’s have a glimpse at our final app results…

Figure 1. App deployed and in use (Figure created by the author)

Please note that this article might be lengthy. I aim to provide as much detail as possible, as I believe it could be beneficial for many teachers and professors who may not be as fluent in Python as a typical data science expert. If you are already a Python expert, you might prefer to skip the article and jump directly to the project’s GitHub repository below:

GitHub – ceche1212/project_medium_kaggle_app: Repository for code for a kaggle like app that can be…

In the original project that I implemented with my students, the app featured three different machine learning sections: one for a regression problem, one for a binary classification problem, and one for a time series forecasting problem. For simplicity, this tutorial will focus on just one: a binary classification problem using the famous Pima diabetes dataset from the UC Irvine Machine Learning Repository. This dataset can also be downloaded from Kaggle.


Streamlit and Google Sheets

Since version 1.28 in 2023, Streamlit allows users to connect to Google Sheets using their st.connection method. This method enables users to perform CRUD (create, read, update, and delete) operations using Google Sheets as a database. I learned about these features from a YouTube video made by Sven | Coding Is Fun. If you want to watch it, I’m leaving the link below.

I know what you’re thinking, but before you get scared, hear me out. I understand that Excel (Google sheets) is not a database, and I agree with you. I also have nightmares about dealing with companies that use it as one. However, for the app we want to create, it’s more than enough. It allows us to do everything we need, it’s available online, it’s private (since only we and the app can access it – unless someone is skilled enough to hack my Google account), and best of all, it’s free. I do recognize this is an area for improvement, so I’m exploring the possibility of replacing Google Sheets with a connection to Supabase.

Figure 2. Excel is not a database, unless… (Image created by the author)
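To give you an idea of how simple the connection is to use, here is a minimal sketch of reading a worksheet with st-gsheets-connection (it assumes the secrets.toml credentials that we will set up later in this article):

import streamlit as st
from streamlit_gsheets import GSheetsConnection

conn = st.connection("gsheets", type=GSheetsConnection)
users_df = conn.read(worksheet="users")  # returns a pandas DataFrame
st.dataframe(users_df)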

Before jumping straight into the implementation, it’s worth examining the different modules the app must have, as well as the strategy we will follow for the implementation.

App design

The app we are going to create requires several processes. Firstly, we want to ensure that only our students have access to it, so we need a login system. Once users are logged in, they will need a module to submit their results. This requires a process to upload .csv files, verify that the files comply with the expected number of rows and requested columns, and then calculate an evaluation metric comparing the model predictions against the real test data. After this, the student should be able to see the classroom ranking and how their result compares to their peers. Finally, students should be able to see a log of all their submissions and their teammates’ submissions, as this is a group project. Figure 3 showcases an overall view of the User Flow Diagram for our app.

Figure 3. User flow diagram of the app (Image created by the author)
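To make the submission-and-scoring module described above more concrete, here is a minimal sketch of what it could look like in Streamlit (the column name y_pred and the F1 metric are my illustrative choices, not fixed requirements, and y_true is assumed to already hold the 78 test labels from the "test_data" tab):

import pandas as pd
import streamlit as st
from sklearn import metrics

uploaded = st.file_uploader('Upload your predictions (.csv)', type='csv')

if uploaded is not None:
    preds = pd.read_csv(uploaded)
    # verify that the file complies with the expected rows and columns
    if 'y_pred' not in preds.columns or len(preds) != len(y_true):
        st.error('The file must contain a y_pred column with exactly 78 rows.')
    else:
        score = metrics.f1_score(y_true, preds['y_pred'])
        st.success(f'Your F1 score is {score:.4f}')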

App implementation

For this app, we will use Visual Studio Code. I strongly recommend creating a new project folder on your machine and opening Visual Studio Code from that folder. In my case, I decided to name the folder project_app_medium.

Setting up the project environment

I strongly recommend creating a virtual environment for each Streamlit app you make to avoid dependency conflicts with your other Python projects. After creating and activating your virtual environment, we need to install the following libraries.

pandas == 1.5.3
numpy == 1.26.4
matplotlib
streamlit
streamlit_option_menu
streamlit-extras
st-gsheets-connection
scikit-learn

To install the libraries, create a new text file and name it requirements.txt. Inside this blank text file, copy the libraries listed above and save it; we will need this file for the deployment of our app. Then, in the terminal, type the following command.

pip install -r requirements.txt

This command will install all the libraries listed in the requirements.txt file. Regarding the libraries we are using, we'll start with the classic "Mirepoix" of all data science projects: numpy, pandas, and matplotlib. We also need streamlit, as this is the base library that contains our framework. To extend the functionality of Streamlit, we will import some community-developed extensions such as streamlit_option_menu, which allows us to create simple sidebar menus, and streamlit-extras, which contains many customizable features. Additionally, we will use st-gsheets-connection to help us connect with Google Sheets. Please note that besides the libraries above, we are also going to use hashlib for data security and protection. We will talk more about this when defining the details of the database.

Throughout this tutorial, the following folder structure will be used, containing the following elements:

  • .streamlit: This folder is where general Streamlit settings will be placed. Inside this folder we will store the secrets.toml file, which will contain the Google Sheets API credentials needed to establish a connection.
  • app.py: The main Streamlit script.
  • .gitignore: As its name suggests, the files ignored by Git when making the project commit.
  • logo.png (optional): An image that contains the logo of your company, to be displayed at the top of the sidebar menu of our app. This is completely optional; in my case, I am showing the logo of my company SAVILA GAMES.
  • requirements.txt: Python dependencies to run the application.
  • README.md: Project description.
project_app_medium/
│
├── .streamlit/                # General streamlit configuration
│   └── secrets.toml           # credentials for google sheets connection
├── app.py                     # App code
├── .gitignore
├── logo.png               
├── requirements.txt           # Python dependencies
└── README.md 

Setting up the Google Sheets database

Now that the environment is set up, we need to create the structure of the Google Sheets database. Open your Google Sheets app and create a new file; I named mine project_database. Then, proceed to create four tabs in the file. First, the "users" tab will contain all the user login credentials as well as the user group configuration. We will use the information from this tab to create the login module of the app. The tab should have the following column structure:

| email                  | name    | last name | password  | group    |
|------------------------|---------|-----------|-----------|----------|
| john.doe@example.com   | John    | Doe       | Pass1234  | G1       |
| jane.smith@example.com | Jane    | Smith     | SecurePwd | G2       |
| alex.jones@example.com | Alex    | Jones     | MyPass789 | G1       |
| emma.brown@example.com | Emma    | Brown     | Emma12345 | G3       |

The next tab is the "log" tab. This tab will store the historical information of the submissions made by the users. It will also be used for the logic behind the ranking and the history submissions module of the app. The tab should have the following column structure:

| user                  | group     | time             | score |
|-----------------------|-----------|------------------|-------|
| john                  | G1        | 2024-06-17 10:00 | 0.85  |
| jane                  | G2        | 2024-06-17 11:00 | 0.72  |
| alex                  | G1        | 2024-06-17 12:00 | 0.90  |
| emma                  | G3        | 2024-06-17 13:00 | 0.65  |

The next tab will be the "test_data" tab. This tab will contain the real test y data used to evaluate the quality of the outputs submitted by the students. For this tutorial, we will split the Pima dataset and select the last 78 rows as the test dataset. The tab will have only one column containing the binary outcome data, with the following column structure:

| y          |
|------------|
| 0          |
| 1          |
| 1          |
| 0          |
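If you want to reproduce this split, a minimal sketch could look like the following (assuming the Kaggle version of the dataset, where the CSV is named diabetes.csv and the label column is Outcome):

import pandas as pd

pima_df = pd.read_csv('diabetes.csv')

# keep the last 78 rows of the outcome column as the hidden test labels
test_y = pima_df['Outcome'].tail(78).rename('y')
test_y.to_csv('test_data.csv', index=False) # paste this column into the "test_data" tab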

The final tab we will create is the ‘configuration’ tab. This tab will contain customizable parameters for the project, such as the deadline and the number of tries allowed per team per day (for reference, Kaggle allows five tries per team per day). This tab will enable us to dynamically change the project characteristics so that it can be easily adapted for use in different semesters. The "configuration" tab should have only one row and the following column structure:

| deadline              | max_per_day     |
|-----------------------|-----------------|
| 2024-07-01 23:59      | 5               |
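As a sketch of how the app can enforce these parameters (config_df, log_df, and group are hypothetical names for the "configuration" tab, the "log" tab, and the current user's group, respectively):

import datetime
import pandas as pd
import streamlit as st

deadline = pd.to_datetime(config_df.loc[0, 'deadline'])
max_per_day = int(config_df.loc[0, 'max_per_day'])

now = datetime.datetime.now()
todays_tries = log_df[(log_df['group'] == group) &
                      (pd.to_datetime(log_df['time']).dt.date == now.date())]

if now > deadline:
    st.error('The submission deadline has passed.')
elif len(todays_tries) >= max_per_day:
    st.warning('Your team has already used its submissions for today.')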

You can download and view an example of the google sheets used in this tutorial project by clicking on the link below:

example google sheet project medium

Data Privacy and Security

This is a small project that should be under our control and available only to our students. However, we must take precautions regarding data security, especially concerning the private information of our students. Imagine if, for some reason, the database gets leaked and falls into the hands of scammers or malicious entities. Having access to the names, emails, and passwords of our students could put them at risk. Therefore, we must ensure that, even if this happens, the data remains meaningless to these malevolent agents.

You might wonder if we can implement such a system at minimal cost. This is where hashing and the hashlib library come to the rescue.

Hashing is the process of converting input data into a fixed-size string of characters, typically a hash code, using a mathematical algorithm. It ensures data integrity, facilitates quick data retrieval, and securely stores sensitive information. What hashing algorithms are available? Fortunately, Python comes with hashlib, which includes several hashing algorithms:

  • MD5 (md5)
  • SHA-1 (sha1)
  • SHA-224 (sha224)
  • SHA-256 (sha256)
  • SHA-384 (sha384)
  • SHA-512 (sha512)
  • SHA-3 family (sha3_224, sha3_256, sha3_384, sha3_512)
  • BLAKE2 family (blake2b, blake2s)

In our tutorial, we will use one of the available hashing algorithms in hashlib, specifically SHA-256, to transform the emails and passwords of our students into hashed codes. These hashed codes will be stored in the database. This way, even if the database is leaked, their data remains protected. The beauty of hashing lies in the fact that it is impractically hard to reverse the hashed code back to the original information through brute force. For instance, when the emails from the previous section are hashed using SHA-256, they become secure and unreadable.

# original emails

['john.doe@example.com',
 'jane.smith@example.com',
 'alex.jones@example.com',
 'emma.brown@example.com']
# Hashed emails

['836f82db99121b3481011f16b49dfa5fbc714a0d1b1b9f784a1ebbbf5b39577f',
 'f2d1f1c853fd1f4be1eb5060eaae93066c877d069473795e31db5e70c4880859',
 '134318bc6349ad35d7e6b95123898eecdd437ad9b0c49cc4bdd66a811afc6909',
 'd41d9b2f5671358bc6faf79b7435b4a9805a72d012f06d4804815328f39aed1e']

Pretty difficult to decipher, don't you think? Below, you can find a function that, given a dataframe and a column, returns the hashed items of the specified column. This way, you can hash your database information and then store these values in the online Google Sheets database.

import hashlib

def hashit(df,column):

  return_list = []
  for data in df[column].tolist():
    hash_object = hashlib.sha256()
    hash_object.update(data.encode())
    return_list.append(hash_object.hexdigest())

  return return_list
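For example, assuming users_df is a dataframe with the structure of the "users" tab, you could hash the sensitive columns before uploading them:

# example usage: hash the plain-text columns before uploading them to Google Sheets
users_df['email'] = hashit(users_df, 'email')
users_df['password'] = hashit(users_df, 'password')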

You might be thinking, "But Luis, if they have the username and password, they can log in and impersonate one of the students." However, that situation is already covered. The students will log into the app using the emails and passwords provided by us, but they will input their real email and password on the login screen. The app will then take their login credentials, hash them using SHA-256, and verify the hashed output against the database.

So, if the database gets leaked and someone tries to use the information to log in, it won’t work because their input will be hashed again, and it won’t match the stored hash in the database. Let’s look at the code and outputs below as an example.

import hashlib

password = "password"

hash_object = hashlib.sha256() # create a fresh SHA-256 hash object
hash_object.update(password.encode())
hash_password = hash_object.hexdigest()

print(hash_password)

print output:

"5377a16433598554e4a73a61195dbddea9d9956a22df04c3127c698b0dcdee48"

Now if we rehash the already hashed password, as in the code below.

hash_object = hashlib.sha256() # a new object; reusing the old one would accumulate data across update() calls
hash_object.update(hash_password.encode())
double_hash_password = hash_object.hexdigest()
print(double_hash_password)

We obtain the following:

"dfd4bb46c954f3802c7c2385b1a6b625b3cf0b4ce6adf59d3eec711c293994bb"

You can easily verify that these two hashes definitely do not match. So, you see, hashing the previously hashed password produces something completely new.

Establishing the Google Sheets connection

All the instructions required to establish the connection can be found in the GitHub repository of the st-gsheets-connection package. Let’s follow the instructions together:

  • Go to the Google Developers Console and create a new project. Right next to the Google Cloud icon, you will find a drop-down menu. Click on it and then click "Create New Project." Name your project as you wish; in my case, I am calling it project_app_medium.
Figure 4. Google developer console creating project (Image created by the author)
  • Now, with the project selected, we need to activate two different APIs: Google Drive and Google Sheets. In the search bar at the top of the page, type "Google Drive," select the API, and then click on "Enable." Repeat the same steps for Google Sheets.
Figure 5. Google developer console enabling APIs (Image created by the author)
  • With the project APIs enabled, we now need to create a technical user with access to them. Click on "Credentials," then click on "Create Credentials," and select the "Service Account" option. Assign a name to the technical user; in my case, I named it "medium-project-google-sheets." Assign the "Editor" role to the technical user and finally click on "Done."
Figure 6. Google developer console creating technical user (Image created by the author)
  • With the technical user created, we now need to generate the credentials for this user. Click on the user you just created, then click on "Keys," then click on "Add Key," and select "Create New Key." Choose the JSON option and then click on "Done." This will automatically download a JSON file with all the credentials needed to use Google Sheets from our app.
Figure 7. Google developer console creating technical user JSON credentials (Image created by the author)
  • The last step is to store the credentials we just downloaded into a secrets.toml file. If you haven’t done so yet, create a new folder named .streamlit inside your project folder. Inside this folder, create a new file named secrets.toml. Open the file with the text editor of your choice (such as VS Code) and paste the information below into it.
# .streamlit/secrets.toml

[connections.gsheets]
spreadsheet = "<spreadsheet-name-or-url>"
worksheet = "<worksheet-gid-or-folder-id>"  # the worksheet GID is used with a public spreadsheet URL; when using a service_account it is picked up as the folder_id
type = ""  # leave empty when using a public spreadsheet URL; when using a service_account -> type = "service_account"
project_id = ""
private_key_id = ""
private_key = ""
client_email = ""
client_id = ""
auth_uri = ""
token_uri = ""
auth_provider_x509_cert_url = ""
client_x509_cert_url = ""
  • Substitute each element in the secrets.toml file with the data from the JSON credentials file downloaded from Google. For the "spreadsheet" field, copy the URL of the Google Sheets database we created for the project. Next, copy the "client_email" data from the JSON file, and then go to the Google Sheets database. In the spreadsheet, click on "Share," then paste the "client_email" into the text input field, make sure the "Editor" permission is selected, and click "Send."

With all these preparations complete, we are now ready to code the app.

Libraries, state session variables and app config

We are now going to proceed to import the required libraries for the app and create the session state variables that the app will use. Most of the session state variables will be associated with the login module. In Streamlit, session state variables store information across different interactions within a session. They help maintain state, such as user inputs or selections, between reruns of the app. Initially, these variables will be set to empty strings and will be updated as the app runs. For our specific app, we will create state variables for the username (using the student email as the username), the password, and the group to which the user belongs. We will also set the page title and the page icon (favicon) using the st.set_page_config() method from Streamlit.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import datetime
from pathlib import Path
import streamlit as st
from streamlit_option_menu import option_menu
from streamlit_extras.add_vertical_space import add_vertical_space
from streamlit_extras.stylable_container import stylable_container 
from streamlit_gsheets import GSheetsConnection
from sklearn import metrics
import hashlib

if 'user_name' not in st.session_state:
    st.session_state['user_name'] = ''

if 'student_name' not in st.session_state:
    st.session_state['student_name'] = ''

if 'password' not in st.session_state:
    st.session_state['password'] = ''

if 'group' not in st.session_state:
    st.session_state['group'] = ''

st.set_page_config(
        page_title='Medium Project', 
        page_icon='📈 ' 
    )

To test that everything is working correctly, you can run the application from the root of the directory by executing the following command in the terminal. This will run the app on your localhost.

streamlit run app.py

You should see an empty page named "Medium Project" with the favicon that we selected 📈 , as shown in Figure 8.

Figure 8. Empty Streamlit page (image created by the author)

Login module

Now that the session state variables are created, we can proceed to code the first module of our app: the login module. This module will contain the logic illustrated in Figure 9.

Figure 9. Login module process flow diagram (image created by the author)

First, we will establish a connection with our Google Sheets database. Next, we will create the sidebar navigation menu using the st.sidebar method in combination with the option_menu method, allowing us to easily create different pages for our app. Once this is set up, we will establish the logic for the login module.

If a user is not logged in, they will be prevented from accessing the next modules. In the login module, we will ask for the username (email) and password. These credentials will be verified against the database. If they match, the user credentials will be stored in the session state variables, and the user will receive a message confirming a successful login. If not, the user will receive a warning message stating that the login failed. We will also store the project configuration data from the database (project deadline and maximum number of submissions per day). The logic above is implemented in the code below.

login = None
log_df_n_cols = 4
login_df_ncols = 5
config_df_ncols = 2

conn = st.connection("gsheets", type=GSheetsConnection)
users_existing_data =  conn.read(worksheet="users", usecols=list(range(login_df_ncols)), ttl=1)
users_existing_data = users_existing_data.dropna(how="all")
users_existing_data.index = users_existing_data['email']

gs_user_db = users_existing_data.T.to_dict()

configuration_parameters = conn.read(worksheet="configuration", usecols=list(range(config_df_ncols)), ttl=1)

deadline = configuration_parameters['deadline']
deadline = pd.to_datetime(deadline)

max_per_day = int(configuration_parameters['max_per_day'].iloc[0])  # take the single configured value

today = pd.Timestamp.today()

difference = deadline - today

days = difference.dt.days.iloc[0]
seconds = difference.dt.seconds.iloc[0]
hours = seconds // 3600
minutes = (seconds % 3600) // 60

# emojis are obtained from bootstrap icons https://icons.getbootstrap.com/
with st.sidebar:
    st.image("./savila_games_logo.png")
    selected = option_menu(
        menu_title='MEDIUM APP',
        options= ['Login','Rankings','My group Submissions','Submit Results'],
        icons=['bi bi-person-fill-lock', '123','bi bi-database','bi bi-graph-up-arrow'], menu_icon="cast"
    )

    add_vertical_space(1)

    if selected == 'Login':

        user_name = st.text_input('User email', placeholder = 'username@email.com')
        user_password =  st.text_input('Password', placeholder = '12345678',type="password")
        login = st.button("Login", type="primary")
    add_vertical_space(1)

if selected == 'Login':

   st.header('Welcome to your Final group project Challenge')
   st.divider()
   st.subheader(f" Time remaining: {days} days, {hours} hours and {minutes} minutes")
   st.divider()
   st.write('Please proceed to login using the left side panel')
   st.write('⚠ Please note that access to subsequent sections is restricted to logged-in users.')
   st.divider()
   st.subheader('Rules:')
   with st.expander('Rules',expanded=True):
      st.markdown("- This app lets you submit your project calculations.")
      st.markdown("- Everyone on your team can submit their solutions")
      st.markdown(f"- The site let each team submit {max_per_day} submissions per day max ")
      st.markdown("- For each one of the parts whichever team comes up with the best result for a given part gets the top grade for that part.")
      st.markdown("- In case there is a tie. The team that submitted first the solution will get the higher grade.")  

   st.subheader('How to use the app:')
   with st.expander('How to use the app',expanded=True):
      st.markdown("- Check out the **Ranking** 🥇 tab to see where your team stands for each one of the parts.")   
      st.markdown("- Click on **My Group Submissions** 🗃  to see the history of all the solutions that your team has submitted.")
      st.markdown("- Hit **Submit Results** 📊  to drop in a new solution.")
      st.markdown("- Every time you submit, the app checks if it's all good and pops out some feedback in case you had a bug.") 
      st.markdown("- Keep trying new things and submitting - there's no limit")          
   add_vertical_space(1)

if login:

    hash_object_user_name = hashlib.sha256() #hashing
    hash_object_user_name.update(user_name.encode())
    hashed_username = hash_object_user_name.hexdigest()

    if hashed_username not in gs_user_db.keys():
        with st.sidebar:
            st.error('Username not registered')
    else:

        hash_object_password = hashlib.sha256() #hashing
        real_password = gs_user_db[hashed_username]['password']

        hash_object_password.update(user_password.encode())
        hashed_user_password = hash_object_password.hexdigest()

        if hashed_user_password != real_password:
            with st.sidebar:
                st.error('Sorry wrong password')
        else:
            user_first_name = gs_user_db[hashed_username]['name']
            group = gs_user_db[hashed_username]['group']
            st.session_state['user_name'] = user_name
            st.session_state['student_name'] = user_first_name
            st.session_state['password'] = real_password
            st.session_state['group'] =  group
            with st.sidebar:
                st.success(f'{user_first_name} from group ({group}) successfully logged in', icon="✅")

with st.sidebar:
    if st.session_state['user_name'] != '':
        st.write(f"User: {st.session_state['user_name']} ")
        st.write(f"Group: {st.session_state['group']} ")
        logout = st.button('Logout')
        if logout:
            st.session_state['user_name'] = ''
            st.session_state['password'] = ''
            st.session_state['group'] = ''
            st.session_state['student_name'] = ''
            st.rerun()
    else:
        st.write(f"User: Not logged in ")

Submit results module

Now, with the login module in place, we can proceed to create the submission module. In this module, students will be able to upload a CSV file with the prediction outputs of their models. This module will contain the logic illustrated in Figure 10.

Figure 10. Submit results module process flow diagram (image created by the author)

We will only allow users to access this module if the project deadline has not passed and if their team has not exceeded the maximum number of submissions per day (to avoid spamming the app and the risk of overfitting to the test dataset). If either of these conditions fails, the student will receive a warning stating the issue. If everything is correct, the student can submit their results by uploading a CSV file. If the shape of the file matches the requirements (number of rows equal to the test dataset size, and the file contains a column called "predictions"), the submission will be accepted. If the shape does not match, the student will receive a warning message with details about the issue.

If everything matches, the module will calculate the model evaluation metric. In this specific case, we are using the accuracy score, but you could use the F1 score if preferred – the code is easy to modify, as sketched right below. After this is done, the module will store the student’s submission in our Google Sheets database, in the "log" tab.
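For instance, switching the ranking metric to F1 only requires changing the value stored in the log (a minimal sketch, assuming the same variable names used in the full module code below):

# inside the submission handler, after computing the metrics:
F1 = metrics.f1_score(test_data_y, user_predictions)
solution_dict['score'] = F1  # store the F1 score instead of the accuracy

The full logic described above is implemented in the code below.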

if selected == 'Submit Results':

  st.markdown("""
      <style>
      div[data-testid="stMetric"] {
          background-color: #EEEEEE;
          border: 2px solid #CCCCCC;
          padding: 5% 5% 5% 10%;
          border-radius: 5px;
          overflow-wrap: break-word;
      }
      </style>
      """
      , unsafe_allow_html=True)

  st.header('Submit Predictions')
  st.subheader("Machine Learning Classification")
  st.divider()
  st.subheader(f" Time remaining: {days} days, {hours} hours and {minutes} minutes")
  st.divider()

  if st.session_state['user_name'] == '':
      st.warning('Please log in to be able to submit your project solution')
  else:

      group_log_df = conn.read(worksheet="log", usecols=list(range(log_df_n_cols)), ttl=1).dropna(how="all")
      group_log_df = group_log_df[group_log_df['group'] == st.session_state['group']]
      group_log_df['time'] = pd.to_datetime(group_log_df['time'])

      test_data = conn.read(worksheet="test_data", usecols=list(range(1)), ttl=30).dropna(how="all")
      test_data_y = test_data['y']

      n_test = len(test_data)

      current_date = pd.Timestamp.today()

      submissions_count = group_log_df[(group_log_df['time'].dt.date == current_date.date())].shape[0]

      time_diff = deadline - current_date
      time_diff = time_diff.dt.total_seconds().iloc[0]

      if time_diff <= 0:
          st.warning('Sorry, you can no longer submit; the project deadline has passed')
      else:
          if submissions_count >= max_per_day:
              st.warning(f'Sorry, your team has already submitted {submissions_count} times today, which exceeds the maximum number of submissions per day')
          else:

              user_file = st.file_uploader("Upload your predictions file",type=['csv'])
              st.caption(f"Your file must have {n_test} rows and at least one column named 'predictions' with your model predictions")

              if user_file is not None:

                  submit_pred = st.button('submit',type="primary",key="submit_pred")

                  if submit_pred:

                      pred_df = pd.read_csv(user_file) 

                      if 'predictions' not in pred_df.columns.to_list():
                          st.error('Sorry there is no "predictions" column in your file', icon="🚨  ")
                      elif len(pred_df) != n_test:
                          st.error(f'Sorry, the number of rows of your file ({len(pred_df)}) does not match the expected length ({n_test})', icon="🚨  ")
                      else:
                          with st.spinner('Uploading solution to database'):
                              user_predictions = pred_df['predictions']

                              timestamp = datetime.datetime.now()
                              timestamp = timestamp.strftime("%d/%m/%Y, %H:%M:%S")
                              st.write(f'Submitted on: {timestamp}')                

                              ACC = metrics.accuracy_score(test_data_y,user_predictions)

                              F1 = metrics.f1_score(test_data_y,user_predictions)

                              cm = pd.DataFrame(metrics.confusion_matrix(test_data_y,user_predictions),
                                              columns = ["T Pred","F Pred"],index=["T Real","F Real"])

                              columns_part_2 = st.columns(3)

                              with columns_part_2[0]:
                                  st.metric("ACCURACY",f"{100*ACC:.1f} %")
                              with columns_part_2[1]:
                                  st.metric("F1-Score",f'{F1:.3f}')

                              with columns_part_2[2]:
                                  st.dataframe(cm,use_container_width=True)

                              solution_dict = dict()
                              solution_dict['user'] = st.session_state['student_name']
                              solution_dict['group'] = st.session_state['group']
                              solution_dict['time'] = timestamp
                              solution_dict['score'] = ACC

                              logs_df_2 = conn.read(worksheet="log", usecols=list(range(log_df_n_cols)), ttl=1).dropna(how="all")
                              solution_2 = pd.DataFrame([solution_dict])
                              updated_log_2 = pd.concat([logs_df_2,solution_2],ignore_index=True)
                              conn.update(worksheet="log",data = updated_log_2)
                              st.success(f'Your solution was uploaded on: {timestamp}',icon="✅")
                              st.balloons()

Dynamic ranking module

Our app already allows logged-in users to submit their solutions. Now, we need to create the gamification 🎮 aspect of the app by adding a dynamic ranking that shows students how their solutions compare to those submitted by other teams. The logic of this module is straightforward and does not require a diagram. Essentially, we need to gather all the data from the "log" tab of our Google Sheets database, find the best score for each team, and present these scores in a table, with the team with the best score at the top position, and so on. In case of a tie, where two or more teams have the same score, the app will give a higher rank to the team that submitted their solution first. The logic above will be implemented by following the code below.

if selected == "Rankings":
    st.header('Rankings')

    if st.session_state['user_name'] == '':
        st.warning('Please log in to be able to see The rankings')
    else:
        st.write('The table below shows the rankings for the project')

        rank_df = conn.read(worksheet="log", usecols=list(range(log_df_n_cols)), ttl=1).dropna(how="all")
        GROUPS = list(rank_df['group'].unique())
        default_time = pd.to_datetime('01/01/1901, 00:00:00')

        st.header("Part 4: Machine Learning Classification")
        st.divider()

        ranking_list_2 = []
        for gr in GROUPS:

            mini_df_2 = rank_df[rank_df['group'] == gr]
            if len(mini_df_2) == 0:
                row = {'group':gr,'Accuracy':0,'time':default_time}
                ranking_list_2.append(row)
                continue
            else:
                best_idx_2 = np.argmax(mini_df_2['score'])
                best_value_2 = mini_df_2.iat[best_idx_2,-1]
                best_time_2 = pd.to_datetime(mini_df_2.iat[best_idx_2,2])
                row = {'group':gr,'Accuracy':best_value_2,'time':best_time_2}
                ranking_list_2.append(row)
        ranking_df_2 = pd.DataFrame(ranking_list_2).sort_values(by = ['Accuracy','time'],ascending=[False, True])
        ranking_df_2 = ranking_df_2.reset_index(drop=True)
        # medal emojis for the podium; assumes at least three teams appear in the log
        ranking_df_2.iat[0,0] = ranking_df_2.iat[0,0] + "   🥇"
        ranking_df_2.iat[1,0] = ranking_df_2.iat[1,0] + "   🥈"
        ranking_df_2.iat[2,0] = ranking_df_2.iat[2,0] + "   🥉"
        st.dataframe(ranking_df_2,use_container_width=True,hide_index=True)

Submissions log module

The last module we will implement is the Submissions Log module. This module will allow each student to access a historical log of all submissions made by themselves and their teammates throughout the project duration. The logic of this module is straightforward and does not require a diagram. We need to gather all the data from the "log" tab of our Google Sheets database, filter it for the current user’s group, and present the information in a table format. The logic above will be implemented by following the code below.

if selected == 'My group Submissions':
    st.header('My Group Submissions')

    if st.session_state['user_name'] == '':
        st.warning('Please log in to be able to see your submission history')
    else:
        st.write(f'The table below shows you the submission history of your group: **{st.session_state["group"]}**')
        group_log_df = conn.read(worksheet="log", usecols=list(range(log_df_n_cols)), ttl=1).dropna(how="all")
        group_log_df = group_log_df[group_log_df['group'] == st.session_state['group']]
        group_log_df = group_log_df[['user','time','score']]

        st.subheader('Submissions History:')
        st.dataframe(group_log_df,use_container_width=True,hide_index=True)    

With this last module coded, the app is complete. The entire code can be found at the link below.

project_medium_kaggle_app/app.py at main · ceche1212/project_medium_kaggle_app

App deployment

So far, our app runs smoothly on local execution. However, the main point of building it is to share it with your students and use it for their final project. This is where deployment becomes crucial. For this project, I chose to deploy the app using the Streamlit Community Cloud. It is free and easy to use. The service uses GitHub to deploy the app, with the only drawback being that the GitHub repository must be public. Therefore, it is essential not to upload any sensitive information, such as the technical user credentials we downloaded from Google. This information will be managed directly within the Streamlit Community Cloud service, so don’t worry about it. If you want to explore other deployment options, I encourage you to check out the excellent article below written by Damian Boh.

3 Easy Ways to Deploy your Streamlit Web App Online

To deploy our app, follow the steps below:

  1. Create a new public GitHub repository. The README.md file can be created directly when you create the repo.
  2. Upload or commit the following files, but please do not, and I repeat, DO NOT upload the secrets.toml file:
  • app.py
  • requirements.txt (a minimal sketch is shown right after this list)
  • logo.png (optional, in case you want to customize your app with your university logo or your company logo)
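As a reference, here is a minimal requirements.txt sketch based on the imports used earlier in this article (an assumption on my part; you may want to pin exact versions):

streamlit
streamlit-option-menu
streamlit-extras
st-gsheets-connection
pandas
numpy
matplotlib
scikit-learn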

Your repo should look something like Figure 11.

Figure 11. Example snapshot of the repository for deployment. Please note that there is no secrets.toml file in the repo. Please do not share your private credentials with the world (image created by the author)
  3. Go to the Streamlit Community Cloud site https://streamlit.io/cloud and sign in. If you don’t have an account, create one.
  4. Once signed in, click on the "Create App" button located in the top right corner of the site, and then select the option to get the app code from GitHub.
  5. Input the name of the repo that contains your app files, select the main branch, and input the name of the Python file (in our case, app.py). You can also customize the app URL name; I chose medium-kaggle-like-app. The form should look something like Figure 12.
Figure 12. Creating a new app in the Streamlit community cloud service (Image created by the author)
  6. Click the "Deploy" button. It might take a few minutes.
  7. Once the app is deployed, you will immediately receive an error from Streamlit, as shown in Figure 13. This is completely normal since the deployed app does not have access to the technical user credentials because we never uploaded the secrets.toml file to GitHub. But don’t worry, we will fix this error in the next step.
Figure 13. Streamlit deployment error (Image created by the author)
  8. In the lower right corner of the site, there is a menu called "Manage App." Click on it, then open the hamburger menu, select "Settings," then "Secrets," and paste the content of the secrets.toml file there. The app will automatically reboot and should work just fine.
Figure 14. Fixing the Streamlit deployment error (Image created by the author)

You can check the final deployed app by clicking the link below. Log in using the username: john.doe@example.com and the password: pass1234. In the GitHub repository for this project, I’ve included a .csv file that you can submit as your results.

Real life implementation results

Using the framework of this tutorial, I gamified the final project of my Python for Finance course that I teach to my master’s students. Honestly, my expectations for this project were modest. I hoped each team would interact with the platform at least twice during the project, so having 50–60 interactions across 7 teams and 3 project sections by the project deadline would have been considered a success.

But my students gave me one of the best gifts an instructor can receive. After one month, the app received over 690 submissions, nearly 12 times my original expectations. For me, these levels of engagement are unprecedented. Each group submitted an average of more than 30 submissions per project section, equating to roughly one submission per day per section. Comparing the first submission to the best submission for each section and team, there was an average improvement of 21%, with some teams improving their submissions by more than 60%. If I had implemented a typical version of this project instead of the gamified version, this level of improvement likely would not have materialized. This demonstrates the power of gamification. Now, you have an app that you can easily implement in your classes, and everything is free. Awesome, don’t you think?

If you want to read more about gamification, please make sure to read my previous article, where I discuss the logic behind it and how it can help foster engagement and learning.

Transforming Group Projects: Enhancing Learning with Gamification Techniques

Conclusions

This article has taken you through the entire process of creating a CRUD app using Streamlit, integrated with Google Sheets, that can be used to gamify machine learning projects for students. I have also shown you how to deploy your app using the Streamlit Community Cloud service. Please note that this code is extremely flexible and is not limited to only machine learning projects. I also adapted it for one of my project scheduling classes and achieved excellent results.

In the article below, Bruno Scalia C. F. Leite uses Streamlit to deploy a logistics app. If you want to learn how you can use Streamlit to create operations research applications, I recommend checking out his article by following the link below.

Designing Operations Research Solutions: A User-friendly Routing Application with Streamlit

I sincerely hope you found this article useful and entertaining. If so, I’d love to hear your thoughts! Please feel free to leave a comment or show your appreciation with a clap 👏 . And if you’re interested in staying updated on my latest articles, consider following me on Medium. Your support and feedback are what drive me to keep exploring and sharing. Thank you for taking the time to read, and stay tuned for more insights in my next article!

The post How I Created a Kaggle-Like Platform for My Students Using Streamlit and How You Can Do It as Well appeared first on Towards Data Science.

]]>
Solving a Resource Planning Problem with Mathematical Programming and Column Generation https://towardsdatascience.com/solving-a-resource-planning-problem-with-mathematical-programming-and-column-generation-07fc7dd21ca6/ Wed, 05 Jun 2024 07:24:08 +0000 https://towardsdatascience.com/solving-a-resource-planning-problem-with-mathematical-programming-and-column-generation-07fc7dd21ca6/ Solving the minimum vertex coloring problem via column generation

The post Solving a Resource Planning Problem with Mathematical Programming and Column Generation appeared first on Towards Data Science.

]]>
Solving a Resource-Planning Problem with Mathematical Programming and Column Generation

Column generation for solving the graph coloring problem with Python and Gurobipy

Undoubtedly the best possible resource planner (Image created by DALLE-3)

I used to work in the oil and gas industry and one of the toughest problems we faced was resource planning and forecasting. My team and I constantly struggled with this aspect, often improvising and putting out fires 🚒 , merely reacting to the situation and hoping a solution would appear. But as Mark Wahlberg says in his movies, "Hope ain’t a tactic."

"But improvisation will only bring you as far as the next crisis, and is never a substitute for thinking several steps ahead and planning to the end."

Robert Greene, The 48 Laws of Power

I always felt there had to be a better way to solve this problem, but I didn’t know it at the time 🤔. In fact, acquiring the skills to solve such problems was one of my main motivations for continuing my studies. This article explores a specific application of mathematical optimization and how it could have made my job substantially easier back then. I guess I am writing this article as a gift 🎁 to my past self.

I worked for a service company in the drilling and measurements division. The company provided tools and personnel to drill oil wells for various clients. Once a wellbore was drilled, the drilling tools were pulled out, sent to a workshop for maintenance and repairs, and then reused for another job. The start date for jobs was mainly determined by the client, so the crew and tools had to be onsite by that date. Based on past data, we could predict a rough estimate for the duration of different jobs. In the country where I worked, the oil and gas service industry was highly competitive. When a client gave us a job, we jumped into it without hesitation, accepting all jobs and figuring out the tool logistics later. Fortunately, most jobs were assigned at least a few weeks in advance. For example, a simplified month’s schedule for planned jobs might look like Figure 1 below (In reality we were dealing not just with 10 jobs per month but more like ~ 50).

Figure 1. Example of jobs timetable (Image created by the author)

Every single month we faced the same problem and always struggled to find the answer. Today, I am not surprised that it was so difficult to solve; it turns out to be a hard combinatorial optimization problem. The problem is as follows: given a set of jobs with their planned start dates and durations (and thus end dates), what is the minimum number of toolsets needed to cover them all? A toolset cannot move directly from one job to the next without a minimum service time, which, in my case, was three days. The goal is to find a lower bound on the number of toolsets needed in a given month so the maintenance team can plan ahead. Please note that a toolset can only be on a single job at a time.

Now it turns out that this problem is known in the literature as the "minimum vertex coloring problem," more popularly known as the "graph coloring problem," and it has its origins in an older problem known as the map coloring problem [1].

The graph coloring problem has its roots in the famous four color theorem, first conjectured in 1852 by Francis Guthrie [2] , which posits that any map can be colored using only four colors such that no two adjacent regions share the same color.

Figure 2. Example of the four color theorem (Image from Wikipedia)

More formally, the graph coloring problem is defined as follows: Given an undirected graph G(V,E) with nodes V and edges E, the objective is to assign colors to each node (vertex) such that no two adjacent nodes share the same color, and the total number of colors used is minimized. For example, the graph below in Figure 3, with six nodes, can be colored using only two colors.

Figure 3. Graph coloring example (image created by the author)
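As a side note, if you just want a quick feasible coloring (an upper bound on the number of colors) before doing any exact optimization, NetworkX ships a greedy heuristic. Below is a minimal sketch; since the exact graph of Figure 3 is not listed, it uses an assumed 6-node even cycle, which is 2-colorable:

import networkx as nx

# an assumed 6-node even cycle as a stand-in for the example of Figure 3
G_demo = nx.Graph([(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)])

# greedy_color returns {node: color_index}; it is a fast heuristic,
# so it yields an upper bound on the number of colors, not necessarily the optimum
coloring = nx.greedy_color(G_demo, strategy="largest_first")
print(coloring)
print("colors used:", len(set(coloring.values())))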

At this point, you might be wondering what this guy is talking about ❓ What is the relationship between coloring a map or a graph and solving the resource planning problem described above 🤔? It might not be evident at first glance, but both problems are essentially the same. My specific planning problem can be represented as a graph: each job is a node, and any two jobs that overlap in time, or that start within three days of the end of another job, are connected by an edge. Therefore, finding the minimum number of toolsets required is equivalent to finding the minimum number of colors needed to color this graph [3].

Moreover, there are numerous problems in various fields and industries that are equivalent or nearly equivalent to my planning problem. For instance:

  • Exam planning: Given a set of courses such as Mathematics, Latin, Biology, etc., and students enrolled in some of these courses, the goal is to find the minimum number of timeslots needed for scheduling the final exams. This ensures that no student has conflicting exams (i.e., more than one exam in the same timeslot).
  • Nurse and operating room planning: Given a number of operations that need to be scheduled on a given day, what is the minimum number of operating rooms or nurses required?
  • Number of airplanes planning: Given a number of flights ✈ within a day, each with a scheduled start time and duration, what is the minimum number of airplanes needed, considering that there is a required minimum service downtime after each flight?
  • Number of train platform planning: Given a number of trains 🚆 arriving at a station at specific time slots, what is the minimum number of platforms required to handle the trains?
  • Frequency assignment in radio stations: Given a set of radio stations 📻 within a certain distance of each other, what is the minimum number of radio frequencies needed to avoid signal conflicts between them?
  • Computer register allocation problem: This problem involves assigning a limited number of CPU registers to a program’s variables during execution [4]. This problem is linked to graph coloring because it can be modeled as a graph where each node represents a variable and an edge connects two nodes if the corresponding variables are needed at the same time. Solving this involves coloring the graph with the minimum number of colors (registers) so that no two adjacent nodes (simultaneously needed variables) share the same color (register).
  • Sudoku: Yes, you are reading this correctly. Solving Sudoku is essentially solving a graph coloring problem because each Sudoku puzzle can be represented as a graph where each cell is a vertex, and edges connect vertices that share a row, column, or 3×3 subgrid. The goal is to assign a color (number 1–9) to each vertex (cell) such that no two adjacent vertices (cells sharing a row, column, or subgrid) have the same color, analogous to ensuring no repeated numbers in those regions. For this specific setting, we are not trying to minimize the number of colors used; we want to find a feasible solution with 9 colors.

Essentially, knowing how to solve the graph coloring problem enables you to solve many equivalent problems. Once you recognize one optimization problem, you begin to see them everywhere. It is like looking at the world with a brand new pair of glasses.

"Optimization problems are everywhere" if you keep repeating this phrase, people will awkwardly shy away from you in parties... (Image created by DALLE-3)
"Optimization problems are everywhere" if you keep repeating this phrase, people will awkwardly shy away from you in parties… (Image created by DALLE-3)

Before celebrating too early, I want to remind you that this is a very difficult combinatorial optimization problem. In fact, it is so challenging that it falls into the computational complexity category of NP-Hard. In this article, I will start by explaining the standard integer linear programming formulation for the graph coloring problem. However, due to its combinatorial complexity, this formulation might be prohibitive for solving large instances, even with the best available solvers. Therefore, I will introduce a powerful technique called column generation and how it can be applied to this problem. We will tackle this problem using Python 🐍 , Google Colab as our IDE, and Gurobi as our solver of choice. If you have never used Gurobi before, especially from Google Colab, I invite you to read one of my previous articles where I explain how to set up your Gurobi web license in your Colab environment. You can find the link to the article below:

Optimizing Project Schedules: Using Gurobipy and Cheche_pm warm-starts for Efficient RCPSP…

If you also need a tutorial and more information about linear programming (LP) or integer programming (IP) techniques I strongly recommend you to check these excellent articles written by Bruno Scalia C. F. Leite.

Linear programming: Theory and applications

A Comprehensive Guide to Modeling Techniques in Mixed-Integer Linear Programming


Installing the libraries and setting up the Colab environment

To use Gurobi in Google Colab, we first need to install it using the following code:

!pip install gurobipy

Once installed we can proceed to import the required libraries for this project.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from itertools import chain, combinations, permutations, product
import time
from tqdm import tqdm
import gurobipy as gp
from gurobipy import GRB
import networkx as nx
from pprint import pprint

We also need to initialize a Gurobi session, so we create a params dictionary with the relevant information from our Gurobi license.

WLSACCESSID = '< Copy your WLSACCESSID here >'
WLSSECRET = '< Copy your WLSSECRET here >'
LICENSEID = '< Copy your LICENSEID here >'

params = {
"WLSACCESSID": WLSACCESSID,
"WLSSECRET": WLSSECRET,
"LICENSEID": LICENSEID,
}
env = gp.Env(params=params) #this line initializes the gurobi environment

Resource planning input data

For the remainder of the article, we will continue working with the example described in Figure 1. The data that generated this figure is presented in Table 1.
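Table 1. Planned jobs with their start and end dates (reconstructed here from the data dictionary used in the code below).

Job    | Start      | End
-------|------------|-----------
Job-1  | 2024-06-01 | 2024-06-10
Job-2  | 2024-06-05 | 2024-06-12
Job-3  | 2024-06-08 | 2024-06-15
Job-4  | 2024-06-10 | 2024-06-20
Job-5  | 2024-06-15 | 2024-06-25
Job-6  | 2024-06-18 | 2024-06-28
Job-7  | 2024-06-20 | 2024-07-05
Job-8  | 2024-06-25 | 2024-07-08
Job-9  | 2024-06-28 | 2024-07-10
Job-10 | 2024-07-01 | 2024-07-15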

In this section, we will transform this data into a graph representation G(V,E), with nodes V representing the jobs and edges E representing the links between overlapping jobs or jobs with less than three days between them. To accomplish this, we will create an undirected graph using NetworkX with the following code:

data = {
    'Job': ['Job-1', 'Job-2', 'Job-3', 'Job-4', 'Job-5', 'Job-6', 'Job-7', 'Job-8', 'Job-9', 'Job-10'],
    'Start': ['2024-06-01', '2024-06-05', '2024-06-08', '2024-06-10', '2024-06-15', '2024-06-18', '2024-06-20', '2024-06-25', '2024-06-28', '2024-07-01'],
    'End': ['2024-06-10', '2024-06-12', '2024-06-15', '2024-06-20', '2024-06-25', '2024-06-28', '2024-07-05', '2024-07-08', '2024-07-10', '2024-07-15']
}

# Create DataFrame
df = pd.DataFrame(data)

# Convert Start and End to datetime
df['Start'] = pd.to_datetime(df['Start'])
df['End'] = pd.to_datetime(df['End'])

# Create the graph
G = nx.Graph()

# Add nodes
for job in df['Job']:
    G.add_node(job)

# Add edges based on overlapping dates or start date being less than 3 days after the end date of another job
for i in range(len(df)):
    for j in range(i + 1, len(df)):
        job1, job2 = df.iloc[i], df.iloc[j]
        if (job1['Start'] <= job2['End'] and job2['Start'] <= job1['End']) or \
           (abs((job1['End'] - job2['Start']).days) < 3 or abs((job2['End'] - job1['Start']).days) < 3):
            G.add_edge(job1['Job'], job2['Job'])

# Print the edges of the graph to check
print("Edges of the graph:")
print(G.edges())

We can print the edges to verify that everything is correct. To generate a better view of the edges, let’s use pprint() instead of the regular print() function.

# Print the edges of the graph to check
print("Edges of the graph:")
pprint(list(G.edges()),width=60,compact=True)
>>>>>

Edges of the graph:
[('Job-1', 'Job-2'), ('Job-1', 'Job-3'), ('Job-1', 'Job-4'),
 ('Job-2', 'Job-3'), ('Job-2', 'Job-4'), ('Job-3', 'Job-4'),
 ('Job-3', 'Job-5'), ('Job-4', 'Job-5'), ('Job-4', 'Job-6'),
 ('Job-4', 'Job-7'), ('Job-5', 'Job-6'), ('Job-5', 'Job-7'),
 ('Job-5', 'Job-8'), ('Job-6', 'Job-7'), ('Job-6', 'Job-8'),
 ('Job-6', 'Job-9'), ('Job-7', 'Job-8'), ('Job-7', 'Job-9'),
 ('Job-7', 'Job-10'), ('Job-8', 'Job-9'),
 ('Job-8', 'Job-10'), ('Job-9', 'Job-10')]

After verifying that all edges comply with the data in Table 1 and the established business rules (jobs with fewer than three days between them are linked by an edge), let’s proceed to plot our graph G. After all, what’s the point of doing all this if we can’t generate some fancy visualizations, right?

plt.figure(figsize=(12, 8))
pos = nx.circular_layout(G)
nx.draw(G,pos= pos, with_labels=True,
               node_size=3000, font_size=12,node_color='lightgrey')
plt.show()
Figure 4. Graph representation of the data contained in Table 1 (Image created by the author)

Standard ILP formulation for the graph coloring problem

Now that our data is ready, we can discuss integer programming formulations for the graph coloring problem. The literature offers a myriad of different integer programming formulations for this problem, including various families such as assignment formulations [5], representative formulations [6], partial ordering-based formulations [7], and set covering-based formulations [8]. For simplicity, in this section, we will cover only the assignment formulation by Mendez-Diaz and Zabala (2006) [5].

This is a purely binary programming formulation, with binary variables w_i and x_{v,i}. Here, i is the index of the potential colors in the set H. You might be wondering how many colors we assume initially; the simple answer is that we naively assume we have as many colors as there are nodes. The index v runs over the set of all nodes V.
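Written out (a LaTeX reconstruction based on the variable definitions above and on the constraints implemented in the code below), the formulation reads:

$$
\begin{aligned}
\min \quad & \sum_{i \in H} w_i \\
\text{s.t.} \quad & \sum_{i \in H} x_{v,i} = 1 && \forall v \in V && (1) \\
& x_{u,i} + x_{v,i} \le w_i && \forall (u,v) \in E,\ \forall i \in H && (2) \\
& \sum_{v \in V} x_{v,i} \ge w_i && \forall i \in H && (3) \\
& w_i \le w_{i-1} && \forall i \in H,\ i \ge 1 && (4) \\
& w_i,\ x_{v,i} \in \{0,1\}
\end{aligned}
$$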

  • Constraint (1) ensures that each node is assigned exactly one color.
  • Constraint (2) makes sure that two adjacent nodes do not share the same color.
  • Constraints (3) and (4) are there for symmetry breaking and to increase the efficiency of the branch and bound algorithm.

To apply this formulation to our problem, you can use the code below. It creates the binary programming model and solves it.

E = list(G.edges())
V = list(G.nodes())

# work with integer indices 0..n-1 for both the colors (H) and the nodes (V)
H, V = range(len(V)), range(len(V))

x_var = [(v,i) for v in V for i in H ]

model = gp.Model("Binary_ILP",env=env)

W = model.addVars(H,vtype=GRB.BINARY, name="w")
X = model.addVars(x_var ,vtype=GRB.BINARY, name="x")

model.setObjective(gp.quicksum(W[i] for i in H),GRB.MINIMIZE); 

for v in V:
  model.addConstr(gp.quicksum(X[(v,i)] for i in H) == 1 )

for i in H:
  for u,v in E:

    u = int(u.split("-")[1])-1
    v = int(v.split("-")[1])-1

    model.addConstr(X[(v,i)] + X[(u,i)] <= W[i] )

for i in H:
  model.addConstr(gp.quicksum(X[(v,i)] for v in V) >= W[i])

for i in range(1,len(V)):
  model.addConstr(W[i] <= W[i-1])

timelimit = 600
mipgap = 0.0

model.setParam('TimeLimit', timelimit)
model.setParam('MIPGap',mipgap)
model.optimize()
Figure 5. Log output from Gurobi (Image created by the author)

Now that we have found the optimal solution, we can extract the color of each node using the code below.

colors = ['lightskyblue','tomato','lime','lemonchiffon']
print(f"Total number of colors needed = {model.getObjective().getValue()}")
node_colors = []
for key in X.keys():

  if X[key].x >= 0.99:

    print(f" job-{key[0] +1} is colored with color: {colors[key[1]]}")
    node_colors.append(colors[key[1]])

This will produce the following output.

>>>>

Total number of colors needed = 4.0
 job-1 is colored with color: tomato
 job-2 is colored with color: lemonchiffon
 job-3 is colored with color: lime
 job-4 is colored with color: lightskyblue
 job-5 is colored with color: lemonchiffon
 job-6 is colored with color: lime
 job-7 is colored with color: tomato
 job-8 is colored with color: lightskyblue
 job-9 is colored with color: lemonchiffon
 job-10 is colored with color: lime

We can also plot the problem graph G with the coloring solution.

plt.figure(figsize=(12, 8))
pos = nx.circular_layout(G)
nx.draw(G,pos= pos, with_labels=True, node_size=3000, 
                    font_size=12,node_color=node_colors)
plt.show()
Figure 6. Graph representation of the data contained in Table 1 with the coloring solution (Image created by the author)

This is a great formulation, and for this small toy problem, Gurobi solved it incredibly fast 🔥 . However, if the number of nodes becomes too large, even the best commercial solvers might struggle to find a decent solution. For example, while the simple example of 10 jobs we’re using in this article is manageable, in my previous job, we often handled more than 40 jobs (nodes) in a single month. So, how challenging is a graph coloring problem with 40 nodes and 40 potential colors? We would have 40 binary variables for w_i and 1,600 binary variables for x_{v,i} (40 nodes times 40 colors). Now, let’s consider the solution space for this problem. How many possibilities do we have? To simplify, let’s focus on x_{v,i}; each of these variables can be either 0 or 1. So the total number of combinations is 2¹⁶⁰⁰. Imagine trying to brute-force a solution over that search space, exploring every single possibility.

2¹⁶⁰⁰ — that number is just astronomically massive. It is roughly 10⁴⁸¹, which dwarfs the number of atoms in the observable universe (around 10⁸⁰). Crazy, don’t you think? 😵
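A quick back-of-the-envelope check of that order of magnitude in Python:

import math

# 2^1600 = 10^(1600 * log10(2)), i.e. roughly 10^481
print(1600 * math.log10(2))  # ~ 481.65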

Luckily for us, modern solvers do not use a brute force strategy to evaluate each possibility. However, the problem can still be challenging. That is why we are going to explore a powerful technique called column generation and apply it to solve this problem.


Column Generation

Column generation is a powerful mathematical optimization technique used primarily in large-scale linear programming problems, particularly in cases where the number of potential variables (columns) is prohibitively large (like in our resource planning problem). Think of column generation as starting with a small set of promising variables and gradually adding more, by identifying which new variables would improve the solution of the problem the most. The initial idea of column generation was introduced by Gilmore and Gomory [9] in 1961.

So instead of solving the problem in its entirety (as in the previous section), which would be computationally challenging, we start with a manageable subset of variables and solve a so-called "restricted master problem" (RMP).

Then, using the solution of the RMP, we generate additional variables that have the potential to improve the objective function by solving a subproblem known as the pricing problem. This iterative process continues, dynamically adding the most promising variables, until no further improvements can be made, thus converging to the optimal solution.

In general, column generation involves the following steps (a compact sketch of the loop is given right after the list):

  • (1) Solve the Restricted Master Problem (RMP): Begin by relaxing the original large-scale problem, which means solving a simpler, more manageable version of the problem. This initial step reduces computational complexity by not considering all variables at once, and sometimes by considering a different domain for the variables (for example, instead of binary or integer variables, we relax the problem by allowing continuous variables). Solving the RMP gives us an initial solution and the associated dual values for the constraints. Dual values (or shadow prices) represent the change in the objective function value per unit increase in the right-hand side of the constraints. They provide insight into how much we can improve our solution by relaxing or tightening each constraint. Think of duality as looking at the same problem from two different perspectives; the dual problem looks at the primal (original) problem from the perspective of the constraints.
  • (2) Formulate the Pricing Subproblem: The pricing subproblem is formulated to identify new columns (variables) that can potentially improve the objective function of the RMP. The goal here is to find columns that have a negative reduced cost in a minimization problem (or a positive reduced cost in a maximization problem).
  • (3) Solve the Pricing Subproblem: Solve the pricing subproblem using the duals obtained from the RMP and search for the best new column.
  • (4) Calculate Reduced Costs: Reduced costs help us determine the potential benefit of adding a new column to the RMP. The reduced cost of a column is the difference between its original cost and the value of the dual variables corresponding to the constraints it satisfies. Mathematically, the reduced cost of a new column s is $\bar{c}_s = c_s - \sum_{v \in V} d_v\, b_{v,s}$ (in our problem, every column has cost $c_s = 1$).
  • (5) Generate New Columns: If the pricing subproblem finds a column with a negative reduced cost, this column is added to the RMP. If no columns with negative reduced costs are found, this indicates that the current solution (with the current set of columns) is optimal.
  • (6) Iterate: Repeat the process by solving the updated RMP with the new columns, recalculating dual values, and resolving the pricing subproblem. This iterative process continues until no further columns with negative reduced costs are identified, signaling convergence to the optimal solution.
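Putting the steps together, the loop looks roughly like the sketch below (a hypothetical structure that mirrors the MasterProblem and SubProblem classes we will implement later in this article):

improvable = True
while improvable:
  master.solveRelaxModel()                 # step 1: solve the relaxed RMP
  duals = master.getDuals()                # dual values d_v
  sub = SubProblem(IS_DF, nodes, edges, env, duals)
  sub.buildModel()
  sub.solveModel()                         # steps 2-3: solve the pricing problem
  if sub.getObjectiveValue() > 1 + 1e-6:   # steps 4-5: reduced cost 1 - obj < 0
    master.addColumn(1, sub.getNewPattern())
  else:
    improvable = False                     # step 6: no improving column left
master.solveFinalModel()                   # solve the integer RMP with all columns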

Figure 7 showcases a full diagram of the entire process.

Figure 7. Column generation diagram (Image created by the author)

Column generation algorithm for the graph coloring problem

Now, armed with this powerful technique, let’s proceed to implement it for the graph coloring problem. We will follow the work of Mehrotra et al. (1996) [8]. Instead of using the assignment formulation by Mendez-Diaz and Zabala [5], which employs binary variables to determine whether a node is colored with a given color, we will shift to a set covering formulation.

In this approach, we consider independent sets S, with each set associated with a unique color. These sets can contain different nodes, but each node can only belong to one set. The objective is to find the minimum number of independent sets required to color a given graph G(V,E). With this formulation, the ILP for our master problem (MP) becomes:
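A LaTeX reconstruction of the MP, consistent with the covering constraint coded in the MasterProblem class below:

$$
\begin{aligned}
\min \quad & \sum_{s \in S} y_s \\
\text{s.t.} \quad & \sum_{s \in S} b_{v,s}\, y_s \ge 1 && \forall v \in V && (1) \\
& y_s \in \{0,1\} && \forall s \in S
\end{aligned}
$$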

Where y_s is a binary variable that is equal to 1 if the set s is selected and 0 otherwise. The parameter b_{v,s} is binary and equal to 1 if node v belongs to set s. Constraint (1) guarantees that every node is covered by at least one selected set. If we relax the MP, the variables y_s become continuous instead of binary. After defining the MP, the pricing subproblem can be defined as:
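A LaTeX reconstruction of the pricing subproblem, matching the SubProblem class below; a new column is worth adding whenever its reduced cost $1 - \sum_{v \in V} d_v x_v$ is negative, i.e., whenever the optimal pricing objective exceeds 1:

$$
\begin{aligned}
\max \quad & \sum_{v \in V} d_v\, x_v \\
\text{s.t.} \quad & x_u + x_v \le 1 && \forall (u,v) \in E \\
& x_v \in \{0,1\} && \forall v \in V
\end{aligned}
$$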

Where x_v is a binary variable that is equal to 1 if node v is in the independent set and 0 otherwise, and d_v represents the dual values obtained from the solution of the RMP.

To simplify the starting point of this column generation procedure, we will initialize it with some easy-to-build independent sets called singleton sets. A singleton set contains only one node, so we start with as many singleton sets as there are nodes in our problem graph. We will create a class that initializes the data structures needed for our master and pricing problems.

class InitialPatternGenerator:

  def __init__(self,Graph):
    columns = ['independentSetCost','independentSet']
    self.G = Graph
    self.nodes = list(self.G.nodes())
    self.edges = list(self.G.edges())
    self.n_Nodes = len(self.nodes)
    IndependentSet = pd.DataFrame(index = range(self.n_Nodes),columns = columns)
    self.IS_DF = IndependentSet

  def generateBasicSingletons(self):

    self.IS_DF['independentSetCost'] = 1
    self.IS_DF['independentSet'] = [np.where(self.IS_DF.index == j,1,0) for j in range(self.n_Nodes)]
    return self.IS_DF

  def generate_dictionaries(self):

    name_to_number = dict()
    number_to_name = dict()

    for i,v in enumerate(self.nodes):
      name_to_number[v] = i
      number_to_name[i] = v

    self.name_to_number_dict = name_to_number
    self.number_to_name_dict = number_to_name

  def get_numeric_v_and_e(self):

    edges = []

    for u,v in self.edges:

      u = self.name_to_number_dict[u]
      v = self.name_to_number_dict[v]

      edges.append((u,v))

    nodes = [self.name_to_number_dict[x] for x in self.nodes]

    return nodes, edges

The class above, InitialPatternGenerator, generates the different data structures required from the problem graph G. Specifically, the method generateBasicSingletons produces a dataframe with the singleton independent sets, one for each node v. The code below prepares all of our initial problem inputs and parameters.

IS_generator = InitialPatternGenerator(G)
IS_generator.generate_dictionaries()

nodes,edges = IS_generator.get_numeric_v_and_e()

IS_DF = IS_generator.generateBasicSingletons()
Figure 8. Structure of the singleton pattern set (Image created by the author)

Now that we have all the required inputs and parameters, we can implement the master problem class using the code below.

class MasterProblem:

  def __init__(self,IS_DF,nodes,edges,env):

    self.independentSetCost = IS_DF['independentSetCost'].values
    self.independentSet = IS_DF['independentSet'].values
    self.nodes = nodes
    self.edges = edges
    self.n_Nodes = len(nodes)
    self.model = gp.Model("MasterProblem",env=env)
    self.independentSetIndex = IS_DF.index.values

  def build_model(self):
    self.GenerateVariables()
    self.GenerateConstraints()
    self.GenerateObjective()
    self.model.update()

  def GenerateVariables(self):
    self.independentSetUseVar = self.model.addVars(self.independentSetIndex,
                                            vtype = GRB.BINARY,name = "independentSetUseVar")
  def GenerateConstraints(self):
    for i in range(self.n_Nodes):
      self.model.addConstr(gp.quicksum(self.independentSetUseVar[p]*self.independentSet[p][i] for p in self.independentSetIndex) >= 1, f'C_{i}')

  def GenerateObjective(self):
    self.model.setObjective(gp.quicksum(self.independentSetUseVar[p] for p in self.independentSetIndex),sense = GRB.MINIMIZE)

  def solveRelaxModel(self,silence=True):

    self.relaxedModel = self.model.relax()
    if silence:
      self.relaxedModel.setParam('LogToConsole',0)
    self.relaxedModel.optimize()

  def getDuals(self):

    return self.relaxedModel.getAttr("Pi",self.relaxedModel.getConstrs())

  def addColumn(self,objective,new_IS):

    ctName = f'independentSetUseVar[{len(self.model.getVars())}]'
    new_column = gp.Column(new_IS,self.model.getConstrs())
    self.model.addVar(vtype = GRB.INTEGER,lb = 0, obj = objective, 
                              column = new_column,name = ctName)
    self.model.update()

  def solveFinalModel(self,time_limit = 3600,mipgap = 0.05,silence = True):

    self.model.setParam('TimeLimit',time_limit)
    self.model.setParam('MIPGap',mipgap)
    if silence:
       self.model.setParam('LogToConsole',0)

    self.model.optimize()

  def extractSolution(self,silence = True):

    all_vars = self.model.getVars()
    values = self.model.getAttr("X", all_vars)
    names = self.model.getAttr("VarName", all_vars)

    solution = list()

    if not silence:
      print(f"Total number of colors needed = {self.model.getObjective().getValue()}")

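    # NEW_independent_sets is a module-level list of all generated columns,
    # populated in the main loop below; each pattern aligns with a model variable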
    for name,x,pattern in zip(names,values,NEW_independent_sets):
      if x >= 0.99:
        indices = [index for index, value in enumerate(pattern) if value == 1.0]
        solution.append(indices)
    output = list()
    for c,i_S in enumerate(solution):
      for x in i_S:
        output.append((x,c))
        if not silence:
          print(f"job-{x+1} assigned to color {c+1}")

    return output

The MasterProblem class defined above contains the following methods:

  • The constructor: Initializes the class with the input data: the list of nodes, the edges, our Gurobi environment, and the singleton pattern dataframe.
  • build_model: This method builds the entire model by calling the other methods that create the variables, objective function, and constraints of the MP formulation.
  • GenerateVariables: This method adds the variables to the model.
  • GenerateConstraints: This method adds the constraints to the model.
  • GenerateObjective: This method adds the objective function to the model.
  • solveRelaxModel: This method relaxes the master problem (allows the variable y_s to become continuous instead of binary.
  • getDuals: This method obtains the dual values from the solution of the relaxed problem.
  • addColumn: This method adds a new column to the master problem.
  • solveFinalModel: This method is reserved to be used at the end of the optimization process once all the relevant columns have been found.
  • extractSolution: This method extracts the solution from the independent set columns, after solving the master problem without relaxing it.
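
For reference, here is a compact sketch of the restricted master problem the class encodes, written in the notation of the set covering formulation discussed earlier, where S' is the current pool of independent set columns and a_{is} = 1 if node i belongs to set s:

\begin{aligned}
\min \quad & \sum_{s \in S'} y_s \\
\text{s.t.} \quad & \sum_{s \in S'} a_{is}\, y_s \geq 1 \quad \forall i \in V \\
& y_s \in \{0,1\} \quad \forall s \in S'
\end{aligned}

solveRelaxModel replaces y_s ∈ {0,1} with y_s ≥ 0, and the dual value π_i of each covering constraint C_i is exactly what getDuals passes on to the pricing subproblem.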

With the master problem successfully implemented, we can now implement the pricing subproblem class by following the snippet of code below.

class SubProblem:

  def __init__(self,IS_DF,nodes,edges,env,duals):

    self.independentSetCost = IS_DF['independentSetCost'].values
    self.independentSet = IS_DF['independentSet'].values
    self.nodes = nodes
    self.edges = edges
    self.n_Nodes = len(nodes)
    self.duals = duals
    self.model = gp.Model("SubProblem",env=env)
    self.nodesIndex = IS_DF.index.values

  def buildModel(self):
    self.GenerateVariables()
    self.GenerateConstraints()
    self.GenerateObjective()
    self.model.update()

  def GenerateVariables(self):
    # One binary variable per node: 1 if the node belongs to the new independent set
    self.nodePatternVar = self.model.addVars(self.nodesIndex,
                                                 vtype = GRB.BINARY,name = "nodePatternVar")

  def GenerateConstraints(self):
    # Independence constraints: at most one endpoint of each edge may be selected
    for u,v in self.edges:
      self.model.addConstr(self.nodePatternVar[u] + self.nodePatternVar[v] <= 1)

  def GenerateObjective(self):
    # Maximize the total dual weight of the selected nodes
    self.model.setObjective(gp.quicksum(self.nodePatternVar[p]*self.duals[p] for p in self.nodesIndex),sense = GRB.MAXIMIZE)

  def getNewPattern(self):
    # Return the incidence vector of the newly priced independent set
    return self.model.getAttr("X",self.model.getVars())

  def solveModel(self,time_limit = 3600,mipgap = 0.05,silence = True):
    self.model.setParam('TimeLimit',time_limit)
    self.model.setParam('MIPGap',mipgap)
    if silence:
      self.model.setParam('LogToConsole',0)
    self.model.optimize()

  def getObjectiveValue(self):
    # Objective value z* of the pricing subproblem (total dual weight)
    return self.model.getObjective().getValue()

Much as with the master problem, the pricing SubProblem class defined above has the following methods:

  • The constructor: Initializes the class with the input data: the list of nodes, the list of edges, our Gurobi environment, the singleton pattern DataFrame, and the list of dual values coming from the master problem.
  • The methods buildModel, GenerateVariables, GenerateConstraints and GenerateObjective play the same roles as in the master problem class.
  • solveModel: This method solves the pricing subproblem, a maximum weight independent set problem (see the sketch just after this list).
  • getNewPattern: This method retrieves the new column (the node incidence vector) from the solution of the pricing subproblem.
  • getObjectiveValue: This method retrieves the objective value from the solution of the pricing subproblem.
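
Again as a sketch, with π_i denoting the duals from the relaxed master, the problem that GenerateConstraints and GenerateObjective encode is the maximum weight independent set problem:

\begin{aligned}
\max \quad & z = \sum_{i \in V} \pi_i\, x_i \\
\text{s.t.} \quad & x_u + x_v \leq 1 \quad \forall (u,v) \in E \\
& x_i \in \{0,1\} \quad \forall i \in V
\end{aligned}

Because every column has cost 1 in the master objective, the reduced cost of the best candidate column is 1 - z*, where z* is the optimal value of this subproblem; the main loop below keeps adding columns while this quantity is negative.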

Now that all the classes are implemented, we can finally proceed to build the main loop for solving our problem. We will follow the exact procedure described by Figure 7, using the code below.

master = MasterProblem(IS_DF,nodes,edges,env)
master.build_model()

# Plain-list copy of the column pool, kept in sync with the model's variables
# (extractSolution reads this global list)
NEW_independent_sets = [x.tolist() for x in IS_DF['independentSet'].values]

modelImprovable = True
counter = 0

for i in range(20):  # hard cap on the number of column generation iterations

  print(f"Iteration: {counter}")

  # 1. Solve the LP relaxation of the restricted master and collect the duals
  master.solveRelaxModel()
  duals = master.getDuals()

  # 2. Price a new column: maximum weight independent set under the current duals
  subproblem = SubProblem(IS_DF,nodes,edges,env,duals)
  subproblem.buildModel()

  subproblem.solveModel(120,0.0,True)

  # 3. Reduced cost of the candidate column is 1 - z*; negative means it improves the master
  modelImprovable = 1 - subproblem.getObjectiveValue() < 0

  print("gain",1 - subproblem.getObjectiveValue())

  if modelImprovable:

    new_independent_set_cost = 1  # every independent set costs exactly one color

    new_independent_set = subproblem.getNewPattern()
    print(new_independent_set)

    NEW_independent_sets.append(new_independent_set)

    # 4. Add the improving column to the restricted master and iterate
    master.addColumn(new_independent_set_cost,new_independent_set)

    counter+=1
    print("-"*120)
  else:
    break  # no improving column exists; the relaxation is optimal

# Final solve: the master problem as an integer program over all generated columns
master.solveFinalModel(3600,0.00)

After a few iterations, the column generation algorithm halts and we obtain the optimal solution, which requires only 4 colors (independent sets) to color the problem graph.

problem_solution = master.extractSolution(silence = False)
Total number of colors needed = 4.0
job-3 assigned to color 1
job-7 assigned to color 1
job-4 assigned to color 2
job-8 assigned to color 2
job-2 assigned to color 3
job-6 assigned to color 3
job-10 assigned to color 3
job-1 assigned to color 4
job-5 assigned to color 4
job-9 assigned to color 4

[(2, 0), (6, 0), (3, 1), (7, 1),
 (1, 2), (5, 2), (9, 2), (0, 3),
 (4, 3), (8, 3)]

extractSolution returns the assignment as a list of (node, color) index pairs, which we store in problem_solution. We can then plot the solution on the problem graph G.

# Map each color index to a display color and paint the nodes accordingly
colors = ['lightskyblue','tomato','lime','lemonchiffon']
node_colors = [None for x in range(len(nodes))]
for x,c in problem_solution:
  node_colors[x] = colors[c]

plt.figure(figsize=(12, 8))
pos = nx.circular_layout(G)  # circular layout for the problem graph G
nx.draw(G,pos= pos, with_labels=True, node_size=3000, font_size=12,node_color=node_colors)
plt.show()
Figure 9: Graph representation of the data contained in Table 1 with the coloring solution obtained by column generation (Image created by the author)
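
As a quick sanity check, here is a minimal sketch (reusing the problem_solution and edges objects defined above) that verifies the coloring is proper, i.e., that no edge joins two nodes of the same color:

# problem_solution holds (node, color) pairs, so a dict gives node -> color
color_of = dict(problem_solution)

# In a proper coloring, no edge has both endpoints in the same color class
conflicts = [(u, v) for u, v in edges if color_of[u] == color_of[v]]
print("Conflicting edges:", conflicts)  # expected: an empty list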

Conclusions

In conclusion, this article covered multiple applications of the graph coloring problem and its relevance to resource planning problems. We discussed the assignment binary programming formulation of Méndez-Díaz and Zabala [5], the set covering formulation of the same problem, and the implementation of a powerful column generation algorithm published by Mehrotra and Trick [8]. This technique is particularly useful for large problem instances (for smaller instances, the assignment formulation of Méndez-Díaz and Zabala is more than enough). I hope this article serves as a helpful resource for anyone facing similar challenges in their current jobs. I would have loved 💙 to have had this knowledge back then, but in any case it's never too late to learn 😄. You can find the entire notebook with the code developed for this article in the link to my GitHub repository just below.

Youtube_optimization/Column_Generation_graph_coloring_problem_gurobipy_medium_example_for_sharing.ip…

I sincerely hope you found this article useful and entertaining. If so, I’d love to hear your thoughts! Please feel free to leave a comment or show your appreciation with a clap 👏 . And if you’re interested in staying updated on my latest articles, consider following me on Medium. Your support and feedback are what drive me to keep exploring and sharing in this fascinating field. Thank you for taking the time to read, and stay tuned for more insights in my next article!


References

  • [1] Brooks, R. L. (1941). "On coloring the nodes of a network." _Proceedings of the Cambridge Philosophical Society_, 37(2), 194–197.
  • [2] Guthrie, Francis (June 10, 1854). "Tinting Maps." [The Athenaeum](https://en.wikipedia.org/wiki/Athenaeum_(British_magazine)), 726.
  • [3] Marx, Dániel (2004). "Graph colouring problems and their applications in scheduling." _Periodica Polytechnica, Electrical Engineering_, 48, 11–16.
  • [4] Chaitin, G. J. (1982). "Register allocation & spilling via graph colouring." _Proc. 1982 SIGPLAN Symposium on Compiler Construction_, 98–105.
  • [5] Méndez-Díaz, Isabel, and Zabala, Paula (2006). "A branch-and-cut algorithm for graph coloring." _Discrete Applied Mathematics_, 154(5), 826–847.
  • [6] Campêlo, Manoel, Corrêa, Ricardo, and Frota, Yuri (2004). "Cliques, holes and the vertex coloring polytope." _Information Processing Letters_, 89(4), 159–164.
  • [7] Jabrayilov, A., and Mutzel, P. (2018). "New integer linear programming models for the vertex coloring problem." In _LATIN 2018: Theoretical Informatics_, pp. 640–652. Springer International Publishing.
  • [8] Mehrotra, Anuj, and Trick, Michael A. (1996). "A column generation approach for graph coloring." _INFORMS Journal on Computing_, 8(4), 344–354.
  • [9] Gilmore, Paul C., and Gomory, Ralph E. (1961). "A linear programming approach to the cutting-stock problem." _Operations Research_, 9(6), 849–859.
