Connor Owen
cowen33@gatech.edu
Goal programming is a branch of optimization that deals with balancing multiple goals for a given problem. In many applications of optimization, it is common that a decision maker has to consider multiple, often conflicting, objectives. For example, a city planner might want to balance coverage and equity when deciding where to place a new amenity, or a plant might have a production goal they want to meet exactly, but can go over or under based on conflicting constraints. Goal programming provides a framework with which to formulate this type of problems.
The general formulation of goal programming is define a goal, then define deviations from this goal, and finally minimize the deviation in some way. To explore this problem, we'll go over an example problem inspired by a real-life problem I was asked to solve.
The problem goes like this: there are m mentors, and n mentees, such that m < n. The mentoring sessions happen in groups, so the mentors will each have multiple mentees. Each mentee beforehand lists the preference for their top 3 mentors. We want to come up with an assignment of mentees to mentors that balances individual choice with group size, since we want neither extremely large groups nor extremely small groups.
These two objectives (as you'll see in the data) are conflicting, since if everyone were to get their first choice of mentor, the resulting groups would not be equally sized. So, let's start taking a look at the data.
Mentors: Jack, Cindy, Simrun, Christopher, Edward
Mentees: Chiara, Rachel, Irving, Felipe, Tommy, Spencer, Terrell, Alex, Kristen, Elsa, Melody, Eduardo, Riley, Robin, Marcus, Clarke, Frankie, Sarah, David, Darin, Javier
Below, we'll look at the preferences in a pandas table using Python. 1, 2, and 3 represent their top three choices (1 is their first choice), while a null represents no preference.
import random
#-Preference data: we'll read it into pandas using a dict object
mentors = ['Jack', 'Cindy', 'Simrun', 'Christopher', 'Edward']
mentees = ['Chiara', 'Rachel', 'Irving', 'Felipe', 'Tommy', 'Spencer',
'Terrell', 'Alex', 'Kristen', 'Elsa', 'Melody',
'Eduardo', 'Riley', 'Robin', 'Marcus', 'Clarke',
'Frankie', 'Sarah', 'David', 'Darin', 'Javier']
def selectMentor(randNum):
if 0 <= randNum <= 0.5:
mentor = 'Jack'
elif 0.5 < randNum <= 0.6:
mentor = 'Cindy'
elif 0.6 < randNum <= 0.8:
mentor = 'Simrun'
elif 0.8 < randNum <= 0.9:
mentor = 'Christopher'
else:
mentor = 'Edward'
return mentor
def selectMentors():
(mentor1, mentor2, mentor3) = (selectMentor(random.random()), selectMentor(random.random()), selectMentor(random.random()))
while mentor2 == mentor1:
mentor2 = selectMentor(random.random())
while mentor3 == mentor1 or mentor3 == mentor2:
mentor3 = selectMentor(random.random())
return [mentor1, mentor2, mentor3]
#-we create the preferences using random choices from the lists above
#preferences= {mentee: random.sample(mentors, 3) for mentee in mentees }
preferences= {mentee: selectMentors() for mentee in mentees }
def countPreferences(preferences):
"""
This function goes through the mentee preferences,
and adds up how many times each mentor got ranked
as 1, 2 or 3.
"""
mentorCounts = {}
for i in range(1,4):
for mentee in preferences.keys():
mentorList = preferences[mentee]
mentorCounts[i] = mentorCounts.get(i, {mentor:0 for mentor in mentors})
mentorCounts[i][mentorList[i-1]] = 1 + mentorCounts[i][mentorList[i-1]]
return mentorCounts
countPreferences(preferences)
import pandas as pd
preferencesDataFrame = pd.DataFrame(preferences)
#use to create HTML table of data frame
#print(preferencesDataFrame.to_html())
Alex | Chiara | Clarke | Darin | David | Eduardo | Elsa | Felipe | Frankie | Irving | Javier | Kristen | Marcus | Melody | Rachel | Riley | Robin | Sarah | Spencer | Terrell | Tommy | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Edward | Simrun | Simrun | Cindy | Christopher | Simrun | Jack | Jack | Simrun | Edward | Jack | Christopher | Cindy | Christopher | Simrun | Jack | Jack | Edward | Jack | Christopher | Christopher |
1 | Simrun | Cindy | Jack | Simrun | Edward | Jack | Christopher | Christopher | Cindy | Jack | Christopher | Jack | Jack | Jack | Christopher | Simrun | Simrun | Jack | Cindy | Simrun | Jack |
2 | Cindy | Jack | Christopher | Jack | Simrun | Christopher | Cindy | Simrun | Jack | Simrun | Simrun | Simrun | Simrun | Simrun | Jack | Christopher | Christopher | Simrun | Christopher | Jack | Cindy |
preferences
One possible formulation we can do to match mentees to mentors is to assign costs 1, 2, and 3 to their first, second, and third choices, respectively, and minimize the sum of the assignment. Without any restriction, everyone will naturally get their first choice (since there are no group size limits).
Let's take a look at what the assignment is like with no group balancing.
Our formulation will be fairly simple: we create arcs between mentees and mentors to make a complete bipartite graph. Each arc will be given a binary variable, where 1 represents an assignment, and 0 represents no assignment.
$\mathcal{N}$: set of mentees, indexed by n
$\mathcal{M}$: set of mentors, indexed by m
$\mathrm{p_{n,m}}$ preference mentee n has for mentor m, 1 represents most preferred
$a_{n,m}$: arc variable between mentee n and mentor m
$\min{\; \sum_{m \in \mathcal{M}}\sum_{n \in \mathcal{N}}a_{n,m}\mathrm{p_{n,m}}} $
$\sum_{m \in \mathcal{M}}{a_{n,m}} = 1 \; \forall n \in \mathcal{N} \;\;\; (1)$
$\sum_{n \in \mathcal{N}}{a_{n,m}} \geq 1 \; \forall m \in \mathcal{M} \;\;\; (2)$
$ a_{n,m} \in \{0,1\} \; \forall n \in \mathcal{N}, \; \forall m \in \mathcal{M}$
The objective function above minimizes the cost of assignment, where the cost represents the preference mentee n has for mentor m. A low preference is better, and we can define a "no preference" to be a large number so as to avoid assignment. Constraint 1 ensures that each mentee is assigned to exactly 1 mentor, and constraint 2 ensures that each mentor gets at least 1 mentee.
import pulp
def groupOptimization(preferences):
prob = pulp.LpProblem("No Restrictions Mentoring", pulp.LpMinimize)
arcs = [(mentee, mentor) for mentee in mentees for mentor in mentors]
arc_variables = pulp.LpVariable.dicts("Arcs",
[(mentee, mentor) for mentee in mentees \
for mentor in mentors], \
0, 1, pulp.LpBinary)
def mentorPreferences(mentee, mentor):
#if mentor in preferences then return position in list
if mentor in preferences[mentee]:
return preferences[mentee].index(mentor)
#if mentor not in preferences return large number
else:
return 10
prob += sum(mentorPreferences(mentee, mentor)*arc_variables[(mentee, mentor)] for (mentee, mentor) in arcs )
#Every mentee is assigned to 1 mentor
for mentee in mentees:
prob += sum(arc_variables[(mentee, mentor)] for mentor in mentors) == 1
#Every mentor is assigned at least one individual
for mentor in mentors:
prob += sum(arc_variables[(mentee, mentor)] for mentee in mentees) >= 1
prob.solve()
sol = {}
for mentee in mentees:
for mentor in mentors:
if arc_variables[(mentee, mentor)].value() == 1:
sol[mentee] = mentor
return(sol)
Before we get to creating some visualizations of the optimal solution, we can simply print out the assignment, as well as the preference the mentee had for that assignment. In this case, we see that everyone but Clarke received their preferred mentor. This means that the only reason she received Cindy as a mentor was because of the constraint ensuring that each mentor received at least 1 mentee.
sol1 = groupOptimization(preferences)
#prepare dictionary for counting preferences
countDict1 = {0:0, 1:0, 2:0}
groupSizeDict1 = {mentor:0 for mentor in mentors}
for mentee in sol1.keys():
#print out assignments with preferences
print(mentee, "\t", sol1[mentee], "\t", preferences[mentee].index(sol1[mentee]))
countDict1[preferences[mentee].index(sol1[mentee])] += 1
groupSizeDict1[sol1[mentee]] += 1
No we can create some quick charts showing the distribution of assignments and group sizes. Since the data is straightforward to visualize, I choose a simple matplotlib bar chart with a basic color scheme. We wrap up the visualization in a function so that we can use it later.
def barChartCount(dictionary):
x = []
height = []
for i in dictionary.keys():
x.append(i)
height.append(dictionary[i])
return (x,height)
%matplotlib inline
import matplotlib.pyplot as plt
from matplotlib import cm
#Function taken from Gareth on CodeGolf
#https://codegolf.stackexchange.com/questions/4707/outputting-ordinal-numbers-1st-2nd-3rd#answer-4712
ordinal = lambda n: "%d%s" % (n,"tsnrhtdd"[(n/10%10!=1)*(n%10<4)*n%10::4])
x, height = barChartCount(countDict1)
plt.bar(x, height, align='center', color= 'maroon')
plt.title("Count of Mentees by Preference Assigned")
plt.ylabel("Mentee count")
#Add 1 to list of preferences to make 0 the 1'st choice
plt.xticks(x, list(map(lambda x: ordinal(x+1)+" choice", x)))
plt.show()
x2, height2 = barChartCount(groupSizeDict1)
plt.bar(list(range(len(x2))), height2, color='navy')
plt.title("Group Size For Each mentor")
plt.ylabel("Mentee count")
plt.xticks(list(range(len(x2))), x2)
plt.show()
##
While in the last model, nearly every mentee got their preferred mentor, we also see that the group sizes are not equitably distributed. For instance, Jack might not be able to give much attention to his many mentees, while Cindy has the capacity to take on more mentors. This is where goal programming comes in: we can add this trade-off between preferences and group size into the model to be able to evaluate solutions that balance both.
Below, we go over the additional parameters, variables and constraints that we need to add to the model. The main idea is that we define an "ideal" or equitable group size, and then define decision variables that are defined to be deviations from this ideal group size. Then, we add a weighted sum of these deviations to the objective function and minimize it.
$G$: ideal average group size, defined as $\frac{|\mathcal{N}|}{|\mathcal{M}|}$
$\pi$: weight for the deviations to be used in the objective function
$\delta^+_{m}$: positive deviation from average group size for mentor m
$\delta^-_{m}$: negative deviation from average group size for mentor m
$ \delta^+_{m} - \delta^-_{m} = \sum_{n \in \mathcal{M}}{a_{n,m}} - G \;\; \forall m \in \mathcal{M}$
$\min{\; \sum_{m \in \mathcal{M}}\sum_{n \in \mathcal{N}}a_{n,m}\mathrm{p_{n,m}} + \pi \sum_{m \in \mathcal{M}}({\delta^+_{m} + \delta^-_{m}}) } $
The additional decision variables are used to quantify the deviations that each mentor has from the average "ideal" group size. This ideal is the number of mentees divided by the number of mentors. While not necessarily an integer, it will still serve in the optimization model. The additional constraint defines the positive and negative deviations. The new objective function minimizes a weighted sum of the assigned preferences and the sum of deviations of the ideal group size.
import pulp
def groupGoalOptimization(preferences):
prob = pulp.LpProblem("No Restrictions Mentoring", pulp.LpMinimize)
arcs = [(mentee, mentor) for mentee in mentees for mentor in mentors]
arc_variables = pulp.LpVariable.dicts("Arcs",
[(mentee, mentor) for mentee in mentees \
for mentor in mentors], \
0, 1, pulp.LpBinary)
#define the deviation variables
devPos = pulp.LpVariable.dicts("Dev_Pos", mentors, 0, len(mentees))
devNeg = pulp.LpVariable.dicts("Dev_Neg", mentors, 0, len(mentees))
#find the average size of a group if groups were equal size
avg_group_size = len(mentees)/len(mentors)
print(avg_group_size)
#function to return preference for a given assignment
def mentorPreferences(mentee, mentor):
#if mentor in preferences then return position in list
if mentor in preferences[mentee]:
return preferences[mentee].index(mentor)
#if mentor not in preferences return large number
else:
#return 10 to make difficult for a no preference
return 10
#the weight used in the objective function for the deviations
devWeight = 5
prob += sum(mentorPreferences(mentee, mentor)*arc_variables[(mentee, mentor)] for (mentee, mentor) in arcs) \
+ devWeight * sum(devPos[mentor] + devNeg[mentor] for mentor in mentors)
#Every mentee is assigned to 1 mentor
for mentee in mentees:
prob += sum(arc_variables[(mentee, mentor)] for mentor in mentors) == 1
#Every mentor is assigned at least one individual
for mentor in mentors:
prob += sum(arc_variables[(mentee, mentor)] for mentee in mentees) >= 1
prob += devPos[mentor] - devNeg[mentor] == sum(arc_variables[(mentee, mentor)] for mentee in mentees) \
- avg_group_size
prob.solve()
#collect the assignment in a dictionary
#key is mentee, value is assigned mentor
sol = {}
for mentee in mentees:
for mentor in mentors:
if arc_variables[(mentee, mentor)].value() == 1:
sol[mentee] = mentor
return(sol)
Now we will solve the problem, and write out the assignment that was stored in the "sol" dictionary. The optimization program will also write out the ideal group size, which in this case is 4.2. Note that while this is not a possible group size, it will serve the intended purpose for goal programming.
sol = groupGoalOptimization(preferences)
countDict = {0:0, 1:0, 2:0}
groupSize = {mentor:0 for mentor in mentors}
for mentee in sol.keys():
print(mentee, "\t", sol[mentee], "\t", preferences[mentee].index(sol[mentee]))
countDict[preferences[mentee].index(sol[mentee])] += 1
groupSize[sol[mentee]] += 1
Now we can visualize the assignment in the simple bar charts that we defined earlier with a function. You can see the trade-off between preferences and group size in this chart. We note that the size of Jack's group has gone down, while Cindy's has gone up. While the group sizes are more equally distributed, there are now more individuals that did not get their first choice.
It's worthwile noting that as a decision maker, you would have the weight $\pi$ allows to be able to fine-tune that trade-off and create different scenarios, and ultimately select the best option that suits business needs.
%matplotlib inline
import matplotlib.pyplot as plt
from matplotlib import cm
x1, height1 = barChartCount(countDict)
plt.bar(x1, height1, align='center', color= 'maroon')
plt.title("Count of Mentees by Preference Assigned")
plt.ylabel("Mentee count")
#Add 1 to list of preferences to make 0 the 1'st choice
plt.xticks(x1, list(map(lambda x: ordinal(x+1) + " choice", x1)))
plt.show()
x2, height2 = barChartCount(groupSize)
plt.bar(list(range(len(x2))), height2, color='navy')
plt.title("Group Size For Each Mentor")
plt.ylabel("Mentee count")
plt.xticks(list(range(len(x2))), x2)
plt.show()