James Gregson's Website

Feb 01, 2019

Assignment Problems

Assignment problems occur in many fields that deal with resource allocation. There are many variants, however the simplest involves \(N\) agents indexed by \(i\) which must perform \(N\) tasks indexed by \(j\). The cost for agent \(i\) to perform task \(j\) is encoded in a cost matrix \(C\) that is weighted component-wise by assignments from an assignment matrix \(W_{ij}\) to form the objective.

It's common to have the requirements that each task must be completed and that no agent can perform more than one task. Often it is required that each agent perform a single task to completion but I'm not considering that case here.

\begin{align*} \newcommand{\argmin}{{\bf\mbox{argmin}}} \newcommand{\st}{{\bf\mbox{subject to:}}} \newcommand{\gap}{\hspace{0.25cm}} W = \argmin\gap& &\sum_{ij} C_{ij} W_{ij} \\ \st\gap& & \sum_i W_{ij} = 1\gap\forall j\\ & & \sum_j W_{ij} = 1\gap\forall i\\ & & W_{ij} \geq 0 \gap\forall i,j \end{align*}

The problem can be solved using the Hungarian algorithm but I thought I'd give it a try with ADMM to see how well it works.

The probability simplex constraints are a pain but it's pretty easy to solve this by introducing splitting variables \(A, B\) each equal to \(W\). This gives the following modified problem:

\begin{align*} W = \argmin\gap& &\sum_{ij} C_{ij} W_{ij} + \sum_i \mathcal{I}_\Delta(A_{i*}) + \sum_j \mathcal{I}_\Delta(B_{*j}) \\ \st\gap& & W - A = 0 \\ & & W - B = 0 \end{align*}

The function \(\mathcal{I}_\Delta(v)\) is the indicator function for probability simplex constraints on the rows of \(V\):

\begin{equation*} \mathcal{I}_\Delta(v) = \begin{cases} 0 & \gap\mbox{if } v_i \geq 0 \mbox{ and } \sum_j v_i = 1 \gap\forall i \\ \infty &\gap\mbox{otherwise} \end{cases} \end{equation*}

The proximal operator for this functions requires Euclidean projection onto the non-negativity and sum contraints. It's possible to alternately project on the non-negativity and sum contraints but this converges slowly. However, an \(O(N \log N)\) algorithm exists and can be implemented in Python as below:

def prox_probability_simplex( v ):
    """Algorithm of https://eng.ucmerced.edu/people/wwang5/papers/SimplexProj.pdf"""
    j = numpy.arange(1,len(v)+1)
    u = numpy.sort(v)[::-1]
    y = numpy.cumsum( u )
    rho = numpy.argmax((u+(1.0-y)/j>0)*j)+1
    off = (1-y[rho-1])/rho
    ret = numpy.maximum( v+off, 0.0 )
    return ret

Neglecting constants w.r.t \(W, A, B\), the augmented Lagrangian of this problem is:

\begin{equation*} \mathcal{L}(W,A,U_A,B,U_B) = \sum_{ij} C_{ij} W_{ij} + \sum_i \mathcal{I}_\Delta(A_{i*}) + \sum_j \mathcal{I}_\Delta(B_{*i}) + \frac{\rho}{2}\| W - A + U_A \|_2^2 + \frac{\rho}{2}\| W - A + U_A \|_2^2 \end{equation*}

ADMM consists of alternating minimizations for the solution \(W\) and the splitting variables \(A\) and \(B\). After each set of updates, the (scaled) Lagrange multipliers \(U_A, U_B\) are updated. The objective terms parallelize really cleanly, the first and last two terms parallelize per-component of \(W\). The \(\mathcal{I}_\Delta(.)\) terms each parallelize independently over rows and columns of the splitting variables.

The resulting algorithm is as follows. I have not parallelized the solves although the \(W\) solve is vectorized at least:

def assignment_problem( C, W=None, rho=0.1, thresh=0.55, early_stop=True, num_iterations=10000 ):
    """Applies an ADMM algorithm to the assignment problem

    This algorithm will only solve the problem at the limit of iterations. Early
    stopping or running too few iterations could give a non-optimal or invalid
    matching (i.e. not a permutation of agents to tasks)

    That said, it often does okay!

    Inputs:
        C (NxN matrix): Input cost matric mapping agents (row indices) to tasks
            (column indices). Invalid assignments should be indicated with very
            large positive entries

        W (NxN matrix): Initial assignments to warm-start the process or None
            to start from all zeros

        rho (scalar): Lagrangian augmentation weight

        thresh (scalar): Threshold above which assignments are treated as hard
            for the purposes of returned assignments

        early_stop (boolean): enable early stopping in the case that thresholding
            the assignment matrix by thresh produces a valid assignment. This
            does not necessarily produce an optimal solution!

        num_iterations (int): maximum number of iterations to perform, may be
            fewer if early_stop == True

    Returns:

        assignment (N-element integer array): assignments from agents to task
            where each element assignment[i]=v maps agent i to task v

        assignment matrix (NxN matrix): solution variables to warm-start if
            a valid solution was not found

        loss (scalar): the achieved loss/cost of the solution

        iter (integer): number of iterations performed
    """
    if W is None:
        W = numpy.zeros_like(C)

    A = W.copy()
    B = W.copy()
    Ua = W-A
    Ub = W-B

    loss = numpy.sum(C*W)
    for iter in range( num_iterations ):
        # update primary variables, trivially parallelizes per component
        W = numpy.maximum( (rho*(A-Ua)+rho*(B-Ub)-C)/(2.0*rho), 0.0 )

        # update row constraints, trivially parallizes by row
        for i in range(A.shape[0]):
            A[i,:] = prox_probability_simplex( W[i,:] + Ua[i,:] )
        Ua += (W - A)

        # update column constraints, trivially parallelizes by column
        for i in range(A.shape[1]):
            B[:,i] = prox_probability_simplex( W[:,i] + Ub[:,i] )
        Ub += (W - B)

        tloss = numpy.sum(C*W)
        rowsum = numpy.sum(W > 0.55, axis=0)
        colsum = numpy.sum(W > 0.55, axis=1)
        if early_stop and numpy.max(rowsum) == 1 and numpy.min(rowsum) == 1 and numpy.max(colsum) == 1 and numpy.min(colsum) == 1:
            return numpy.argmax( W > thresh, axis=1 ), W, tloss, iter

    return numpy.argmax( W > thresh, axis=1 ), W, tloss, iter

def check_validity( x ):
    """Checks that a solution from the above is a permutation"""
    return numpy.linalg.norm( numpy.sort(x) - numpy.arange(len(x)) ) > 1e-8

It's worth noting a few things:

  • With the splitting and ADMM, this algorithm will only exactly solve the problem in the limit of iteration count.
  • Early stopping can reduce the iterations needed to achieve a valid, low-cost matching by 10-50X depending on the problem but this matching may not be optimal. For some test problems with 10x10 cost matrices having standard normal entries this happened about 1% of the time. The achieved loss values are typically within 1-2% of each other.
  • Early stopping does always generate identical output running the algorithm for many more iterations. Subsets of agents/tasks where multiple matchings produce close results tend to be hard to resolve.