Linear programming is one of the most common optimization techniques. It has a wide range of applications and is frequently used in operations research, industrial design, planning, and the list goes on. Alas, it is not as hyped as machine learning is (which is certainly a form of optimization itself), but is the go-to method for problems that can be formulated through decision variables that have linear relationships. This is a fast practical tutorial, I will perhaps cover the Simplex algorithm and the theory in a later post.
Just to get an idea, we are going to solve a simple problem regarding production scheduling. Imagine that you work for a company that builds computers. A computer is a fairly complex product, and there are several factories that assemble them which the company pays a certain amount per unit. The cost of this computer model on the market is fixed at 500$, different factories assemble the computers at different speeds and costs. Factory f0 produces 2000 per day at 450$ per unit, factory f1 1500 per day at 420$ per unit and f2 1000 per day at 400$ per unit. We have 1 month to assemble 80 000 units under the constraint that no factory is to produce more than double the units than any other factory. The question is, what is the optimal production allocation between the factories such that we maximize the profit obtained from selling the computers under those constraints?
In this tutorial we are going to be using Python and a linear programming optimization package PuLP, copy-paste install with pip:
pip install pulp
Now, in order to solve the computer production problem with linear programming, we need the following things:
So, open up your favorite editor and let's get started. Before defining the variables in PuLP, we need to create a problem object with the following code:
from pulp import *problem = LpProblem("problemName", LpMaximize)
We are going to add the constraints and objective function to this object. Notice that the problem constructor receives a problem name and LpMaximize, which means we want to maximize our objective function. In our case, this is the profit from selling a certain number of computers. Additionally, we also define the constants that we received from the problem statement:
# factory cost per daycf0 = 450cf1 = 420cf2 = 400
# factory throughput per dayf0 = 2000f1 = 1500f2 = 1000
# production goalgoal = 80000
# time limitmax_num_days = 30
num_factories = 3
In the next section, we will define the decision variables.
In PuLP, a decision variable is defined the following way:
variable = LpVariable("variableName")
Sometimes we need to provide bounds to the variable (the default is no bounds), in this case, we would write the following:
var = LpVariable("boundedVariableName",lowerBound,upperBound)
Yet another useful way to define variables in PuLP is using the dicts function. This can be very useful in cases where we need to define a large number of variables of the same type and bounds, variableNames would be a list of keys for the dictionary:
varDict = LpVariable.dicts("varDict", variableNames, lowBound, upBound)
So based on the previous definitions, the decision variables for the computer production problem are the days that we spend producing for each factory:
# factoriesnum_factories = 3factory_days = LpVariable.dicts("factoryDays", list(range(num_factories)), 0, 30, cat="Continuous")
Now that we defined our decision variables we can shift to defining the constraints for the problem. Note that the constraints have to be linear in a linear programming setting. The constraints that we care about are that the number of units assembled should be above or equal to the goal amount and the production constraint that no factory should produce more than double as the other factory:
# goal constraintc1 = factory_days[0]*f1 + factory_days[1]*f2 + factory_days[2] * f3 >= goal
# production constraintsc2 = factory_days[0]*f0 <= 2*factory_days[1]*f1c3 = factory_days[0]*f0 <= 2*factory_days[2]*f2c4 = factory_days[1]*f1 <= 2*factory_days[2]*f2c5 = factory_days[1]*f1 <= 2*factory_days[0]*f0c6 = factory_days[2]*f2 <= 2*factory_days[1]*f1c7 = factory_days[2]*f2 <= 2*factory_days[0]*f0
# adding the constraints to the problemproblem += c1problem += c2problem += c3problem += c4problem += c5problem += c6problem += c7
The objective function for the computer assembly problem is basically minimizing the cost of assembling all of those computers. This can be written simply as maximizing the negative cost:
# objective functionproblem += -factory_days[0]*cf0*f0 - factory_days[1]*cf1*f1 - factory_days[2]*cf2*f2
Let's take a look at the problem definition, this can be simply achieved by the call:
print(problem)
This results in the following output which I think is self-explanatory, it lists the objective function, the constraints and the various decision variables in the problem:
computerAssembly:MAXIMIZE-900000*factoryDays_0 + -630000*factoryDays_1 + -400000*factoryDays_2 + 0SUBJECT TO_C1: 1500 factoryDays_0 + 1000 factoryDays_1 + 1000 factoryDays_2 >= 80000
_C2: 2000 factoryDays_0 - 3000 factoryDays_1 <= 0
_C3: 2000 factoryDays_0 - 2000 factoryDays_2 <= 0
_C4: 1500 factoryDays_1 - 2000 factoryDays_2 <= 0
_C5: - 4000 factoryDays_0 + 1500 factoryDays_1 <= 0
_C6: - 3000 factoryDays_1 + 1000 factoryDays_2 <= 0
_C7: - 4000 factoryDays_0 + 1000 factoryDays_2 <= 0
VARIABLESfactoryDays_0 <= 30 ContinuousfactoryDays_1 <= 30 ContinuousfactoryDays_2 <= 30 Continuous
After we defined everything necessary in our linear programming problem, we can call the solve method which outputs 1 if the problem is solved and -1 if it was infeasible, it is a simple one-liner:
# solvingproblem.solve()
The solutions to the problem can be obtained by accessing the varValue attribute of each variable:
for i in range(3):print(f"Factory {i}: {factory_days[i].varValue}")
This results in the following output:
Factory 0: 23.076923Factory 1: 15.384615Factory 2: 30.0
In linear programming, we assume that the relationships between the variables are linear and that the variables themselves are continuous. As a follow up on this tutorial, I will be covering Mixed Integer Programming, where the variables can be integers, which will prove a very useful thing since it can be used to simulate boolean logic.