Tutorial IV.V - Modelling the Transportation Problem with JuMP
Applied Optimization with Julia
Introduction
Welcome to this tutorial on the transportation problem using JuMP! As always, don’t worry if you’re new to optimization - we’ll walk through everything step by step using a real-world example.
Imagine you’re running a solar panel distribution company. You have several warehouses (suppliers) and need to ship solar panels to various solar farms (customers). Your goal is to minimize the total cost of transportation while meeting all customer demands.
By the end of this tutorial, you’ll be able to:
Understand what a transportation problem is
Set up a transportation problem using JuMP
Solve the problem and interpret the results
Let’s start by loading the necessary packages:
usingJuMP, HiGHSusingDataFrames, CSV
Precompiling JuMP...
37991.8 ms ✓ MathOptInterface
10025.8 ms ✓ JuMP
2 dependencies successfully precompiled in 48 seconds. 61 already precompiled.
Precompiling HiGHS...
8354.1 ms ✓ HiGHS
1 dependency successfully precompiled in 8 seconds. 63 already precompiled.
Precompiling DataFrames...
18580.6 ms ✓ DataFrames
1 dependency successfully precompiled in 19 seconds. 32 already precompiled.
Precompiling QuartoNotebookWorkerDataFramesTablesExt...
1343.9 ms ✓ QuartoNotebookWorker → QuartoNotebookWorkerDataFramesTablesExt
1 dependency successfully precompiled in 1 seconds. 54 already precompiled.
Section 1 - Understanding the Transportation Problem
The transportation problem involves:
Suppliers (our warehouses)
Customers (solar farms)
Transportation costs between each supplier and customer
Supply available at each warehouse
Demand required by each solar farm
Our goal is to decide how many solar panels to ship from each warehouse to each solar farm to minimize total cost.
Let’s set up our problem:
The revenue from each truckload of solar panels is 11000
The variable costs from each truckload of solar panels is 6300
The available panels at the supplier are given in the file available-panels.csv
The requested panels at the customer are given in the file panel-demand.csv
The transportation costs between suppliers and customers are given in the file cost.csv
Note
We can use the CSV.read function to load the data from a CSV file into a DataFrame. If we want to access the directory of the current file, we can again use the convinient @__DIR__ macro.
# Fixed parametersrevenue =11000# Revenue per truckload of solar panelsvarCosts =6300# Variable costs per truckload# Load data from CSV filesavailable = CSV.read("$(@__DIR__)/data/available-panels.csv", DataFrame)requested = CSV.read("$(@__DIR__)/data/panel-demand.csv", DataFrame)travelCosts = CSV.read("$(@__DIR__)/data/cost.csv", DataFrame)println("Data loaded successfully!")
Data loaded successfully!
Tip
Make sure, that you donwload the datasets from Github and store them in a folder called data in the same directory as the script you are currently working on. You can find the data sets in the GitHub repository for this tutorial. Note, that you don’t need to preprocess the data in any way. This tutorial we will focus on the modeling part.
Now, we can check out the data by printing the first few rows of each DataFrame. We can use the first function to get the first few rows of a DataFrame.
println("Available panels:")first(available,5)
Available panels:
5×2 DataFrame
Row
supplier
truckloads_available
String7
Int64
1
a_1
478
2
a_2
371
3
a_3
361
4
a_4
415
5
a_5
354
println("Requested panels:")first(requested,5)
Requested panels:
5×2 DataFrame
Row
solar_farm
truckload_demand
String7
Int64
1
b_1
9
2
b_2
31
3
b_3
47
4
b_4
46
5
b_5
12
println("Travel costs:")first(travelCosts,5)
Travel costs:
5×3 DataFrame
Row
supplier
solar_farm
costs
String7
String7
Int64
1
a_1
b_1
8052
2
a_2
b_1
11845
3
a_3
b_1
6103
4
a_4
b_1
11099
5
a_5
b_1
5625
Exercise 1.1 - Understand the Data
Take a moment to look at the data. Can you answer these questions?
How many warehouses do we have? Save the number in a variable called num_warehouses.
How many solar farms are we supplying? Save the number in a variable called num_solar_farms.
# YOUR ANSWERS BELOW# Hint: Use the `nrow()` function to count rows
Code
# Test your understanding@assert num_warehouses ==nrow(available)@assert num_solar_farms ==nrow(requested)println("Great job! Here are the answers:")println("Number of warehouses: ", num_warehouses)println("Number of solar farms: ", num_solar_farms)
Section 2 - Using dictionaries to store the data
Now, DataFrames are not a very convenient format for our purposes. We have several options now on how to deal with these data sets. As our suppliers and customers are given names, it might be useful to convert the data into dictionaries. Dictionaries are a great data structure that allow us to store key-value pairs, where the keys are unique identifiers and the values are the data associated with those keys. By using dictionaries, we can easily access and modify the data associated with a specific key.
You can use the Dict function to create a dictionary from two arrays or DataFrames. For example: Dict(keys .=> values) will create a dictionary where the keys are the elements of the keys array and the values are the elements of the values array.
Now, let us check out the dictionaries. We can use the first function to get the first few key-value pairs of a dictionary.
Remember, we can also access the value associated with a specific key in a dictionary by using the key inside square brackets. For example: available_dict["a_1"] will return the value associated with the key "a_1".
print("Value associated with supplier 'a_1': ")available_dict["a_1"]
Value associated with supplier 'a_1':
478
Our travel costs dictionary is a bit more complex, as it is dictionary with tuples as keys. We can access the value associated with a specific supplier and customer by using two keys inside square brackets. For example: travelCosts_dict[("a_1","b_1")] will return the value associated with the keys ("a_1","b_1").
print("Value associated with supplier 'a_1' and customer 'b_1': ")travelCosts_dict[("a_1","b_1")]
Value associated with supplier 'a_1' and customer 'b_1':
8052
We can also extract the keys and values of a dictionary using the keys and values functions, as shown in the previous tutorial.
println("Keys of the travel costs dictionary:")first(keys(travelCosts_dict),5)
Dictionaries make it a lot easier to access the data later on, as we can use the keys to directly access the desired value in our model. This will be useful when we want to define the constraints later on.
Section 3 - The model instance
After the preprocessing and data loading, we now can create the model instance with the HiGHS optimizer.
Exercise 3.1 - Creating the model instance
From the last tutorial, you should know how to do this. Create a model instance called transport_model and set the optimizer to HiGHS.
# YOUR CODE BELOW
Code
# Test your answer@asserttypeof(transport_model) == JuMP.Modelprintln("Model instance created successfully!")
Section 4 - Defining the model
Define the variables
We can now define the variables of our model. We need to define a variable for each supplier and customer pair. As before, we can use the @variable macro to define the variables. The syntax is @variable(model, varname[index1,index2] >= 0), where model is the model instance, varname is the name of the variable, and index1 and index2 are the indices of the variable. We can use vectors as input for the indices, but we could also use the keys of the dictionaries. In the following code block we mixed both options, to show you that it is possible.
Next, we can define the objective of our model. We want to maximize the profit, which is the revenue minus the variable costs and the transportation costs. As before, we can use the @objective macro to define the objective. The syntax is @objective(model, Max, expression), where model is the model instance, Max indicates that we want to maximize the expression, and expression is the expression we want to maximize.
@objective(transport_model, Max, sum((revenue-varCosts-travelCosts_dict[(i,j)]) * X[i,j] for i inkeys(available_dict), j inkeys(requested_dict)))
Define the constraints
We can now define the constraints of our model. We need to ensure that the supply from each supplier is enough to cover the demand of each customer. We can use the @constraint macro to define the constraints. The syntax is @constraint(model, expression), where model is the model instance and expression is the expression we want to constrain.
To illustrate the use of dictionaries, we will again use the keys of the dictionaries to define the constraints in the following code block.
@constraint(transport_model, restrictAvailable[i inkeys(available_dict)], sum(X[i,j] for j inkeys(requested_dict)) <= available_dict[i])
Naturally, we could also use the vectors with the names from the DataFrames to define the constraints or we could also just work with ranges from the beginning, e.g. 1:length(available.supplier) and 1:length(requested.solar_farm). Working with names is often more convenient, though.
@constraint(transport_model, restrictDemand[j in requested.solar_farm], sum(X[i,j] for i in available.supplier) <= requested_dict[j])
And that’s it! We have now defined the model and can start optimizing.
Section 5 - Solving the model
Exercise 5.1 - Start optimization
Start the optimization as usual by calling the optimize! function on the model instance.
# YOUR CODE BELOW
Code
# Test your answer@asserttermination_status(transport_model) == MOI.OPTIMALprintln("Model optimized successfully!")
Now, we can access the values of the variables at the optimal solution. But remember, we defined the variables with the keys of the dictionaries, so we need to convert the result back to a DataFrame. Calling the variable itself will just show the structure of the variable, not the values.
first(X,5)
Thus, we need to use the value function to extract the values from the variable.
transport_values =value.(X)
The result is a DenseAxisArray{Float64,2,...} with index sets. To convert it to a DataFrame, we just need to iterate over the keys dictionaries and store the values in a new DataFrame. As we are not interested in values which are zero, we can skip those.
First, we need to initialize an empty DataFrame with the correct column names.
Then, we can iterate over the keys of the dictionaries and store the values in the DataFrame if they are greater than zero.
for i inkeys(available_dict)for j inkeys(requested_dict)if transport_values[i,j] >0push!(transport_df, ( supplier = i, solar_farm = j, truckloads = transport_values[i,j] ) )endendend
Finally, we can print the first few rows of the transportation plan to check if it looks correct.
println("Begining of the transportation plan:")first(transport_df,5)
Note
Although the above code looks rather complicated, it is essentially just iterating over the keys of the dictionaries and storing the values in a new DataFrame. This is a common pattern in optimization, as we often want to convert the result of an optimization problem into a more convenient format for reporting or further processing.
Conclusion
In this tutorial, we have learned how to model and solve the transportation problem using JuMP. We have also learned how to use dictionaries to store and access the data, which will be useful for more complex models. If you have any questions, feel free to ask me via email!
Solutions
You will likely find solutions to most exercises online. However, I strongly encourage you to work on these exercises independently without searching explicitly for the exact answers to the exercises. Understanding someone else’s solution is very different from developing your own. Use the lecture notes and try to solve the exercises on your own. This approach will significantly enhance your learning and problem-solving skills.
Remember, the goal is not just to complete the exercises, but to understand the concepts and improve your programming abilities. If you encounter difficulties, review the lecture materials, experiment with different approaches, and don’t hesitate to ask for clarification during class discussions.
Later, you will find the solutions to these exercises online in the associated GitHub repository, but we will also quickly go over them in next week’s tutorial. To access the solutions, click on the Github button on the lower right and search for the folder with today’s lecture and tutorial. Alternatively, you can ask ChatGPT or Claude to explain them to you. But please remember, the goal is not just to complete the exercises, but to understand the concepts and improve your programming abilities.