James Gregson's Website

Sep 08, 2018

Bilateral Filtering in Python

Bilateral filtering is a process for removing noise from images. Like most filters, each output pixel is produced as a weighted combination of input pixels. However, unlike typical filters, the weights used to combine input pixels are a function of the input pixels themselves rather than being fixed in advance. This makes it a non-linear filter and is key to its effectiveness.

Given an input grayscale image \({\bf I}(y,x)\) where \(x\) & \(y\) are column and row indices, bilateral filtering defines the output image \({\bf F}\) as:

$$ {\bf F}(y,x) = \frac{ \sum_{r=y-w}^{y+w} \sum_{s=x-w}^{x+w} {\bf I}(r,s) \mathcal{N}_s(r-y,s-x) \mathcal{N}_v \left( {\bf I}(y,x)-{\bf I}(r,s) \right) }{ \sum_{r=y-w}^{y+w} \sum_{s=x-w}^{x+w} \mathcal{N}_s(r-y,s-x) \mathcal{N}_v \left( {\bf I}(y,x)-{\bf I}(r,s) \right)} $$

This looks more involved than it actually is:

  • The numerator and denominator both sum over a window that is \((2w+1) \times (2w+1)\) pixels in size, centered on \(y\),\(x\). The pixel coordinates \(r\),\(s\) reference each pixel within that window.
  • The pixel values of the input are combined via a weighted sum using the spatial proximity in \(\mathcal{N}_s(\delta_y,\delta_x) = e^{ -\frac{\delta_x^2+\delta_y^2}{2\sigma_s^2} }\), as in standard Gaussian filtering.
  • An additional factor based on the value proximity \(\mathcal{N}_v \left( v \right) = e^{ -\frac{v^2}{2 \sigma_v^2} }\) is used to modify the weights. This is the nonlinear part of the filter and causes pixels with similar value to the central pixel to be weighted more strongly.
  • The entire sum is normalized by the sum of the product of the spatial proximity and value proximity weights in order to produce an output with proper range.
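To make the two weighting terms concrete, here is a small sketch (the helper names `spatial_weight`/`value_weight` and the parameter values are my own choices, not from the article) comparing a neighbour with a similar value against one across an edge:

```python
import numpy

# example parameter choices (assumed for illustration)
sigma_s, sigma_v = 2.0, 0.1

def spatial_weight(dy, dx, sigma_s):
    # N_s: Gaussian falloff with spatial offset from the window center
    return numpy.exp(-(dx**2 + dy**2) / (2.0 * sigma_s**2))

def value_weight(v, sigma_v):
    # N_v: Gaussian falloff with intensity difference from the center pixel
    return numpy.exp(-(v**2) / (2.0 * sigma_v**2))

# a neighbour one pixel to the right with a small intensity difference
w_near = spatial_weight(0, 1, sigma_s) * value_weight(0.05, sigma_v)

# the same spatial offset, but across an edge (large intensity difference)
w_edge = spatial_weight(0, 1, sigma_s) * value_weight(0.5, sigma_v)
```

Here `w_edge` comes out orders of magnitude smaller than `w_near`, which is exactly why the filter smooths flat regions while leaving edges intact.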

The mathematical definition is good for understanding, but a straightforward implementation results in four nested loops: two for the rows/columns and two for the offsets in each window. In pure Python this leads to considerable overhead. A better approach is to vectorize the operations by:

  • Only looping over the offsets for the window
  • Generating the spatial proximity weights once per offset
  • Shifting the entire input image by these offsets
  • Generating the value proximity weights in bulk using the shifted image
  • Accumulating an unnormalized result image and sum of weights image in bulk
  • Returning the pixel-wise quotient of the result and weight sum images
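For comparison, the naive four-nested-loop version can be sketched directly from the definition (slow, but a useful reference; the circular boundary handling is chosen to match the `numpy.roll` shifts used in the vectorized code):

```python
import numpy

def filter_bilateral_naive(img, sigma_s, sigma_v, w):
    # direct four-loop translation of the definition; O(N * (2w+1)^2) in pure Python
    rows, cols = img.shape
    out = numpy.zeros_like(img)
    for y in range(rows):
        for x in range(cols):
            num, den = 0.0, 0.0
            for dy in range(-w, w + 1):
                for dx in range(-w, w + 1):
                    # circular boundary, matching numpy.roll behaviour
                    r, s = (y + dy) % rows, (x + dx) % cols
                    ws = numpy.exp(-0.5 * (dx**2 + dy**2) / sigma_s**2)
                    wv = numpy.exp(-0.5 * (img[r, s] - img[y, x])**2 / sigma_v**2)
                    num += img[r, s] * ws * wv
                    den += ws * wv
            out[y, x] = num / den
    return out
```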

Code that implements this is below:

import numpy

def filter_bilateral( img_in, sigma_s, sigma_v, reg_constant=1e-8 ):
    """Simple bilateral filtering of an input image

    Performs standard bilateral filtering of an input image. If padding is desired,
    img_in should be padded prior to calling

        img_in       (ndarray) monochrome input image
        sigma_s      (float)   spatial gaussian std. dev.
        sigma_v      (float)   value gaussian std. dev.
        reg_constant (float)   optional regularization constant for pathological cases

        result       (ndarray) output bilateral-filtered image

        ValueError whenever img_in is not a 2D float32 valued numpy.ndarray
    """

    # check the input
    if not isinstance( img_in, numpy.ndarray ) or img_in.dtype != 'float32' or img_in.ndim != 2:
        raise ValueError('Expected a 2D numpy.ndarray with float32 elements')

    # make a simple Gaussian function taking the squared radius
    gaussian = lambda r2, sigma: numpy.exp( -0.5*r2/sigma**2 )

    # define the window width to be 3 times the spatial std. dev. to
    # be sure that most of the spatial kernel is actually captured
    win_width = int( 3*sigma_s+1 )

    # initialize the results and sum of weights to very small values for
    # numerical stability. not strictly necessary but helpful to avoid
    # wild values with pathological choices of parameters
    wgt_sum = numpy.ones( img_in.shape )*reg_constant
    result  = img_in*reg_constant

    # accumulate the result by circularly shifting the image across the
    # window in the horizontal and vertical directions. within the inner
    # loop, calculate the two weights and accumulate the weight sum and 
    # the unnormalized result image
    for shft_x in range(-win_width,win_width+1):
        for shft_y in range(-win_width,win_width+1):
            # compute the spatial weight
            w = gaussian( shft_x**2+shft_y**2, sigma_s )

            # shift by the offsets
            off = numpy.roll( img_in, (shft_y, shft_x), axis=(0,1) )

            # compute the value weight
            tw = w*gaussian( (off-img_in)**2, sigma_v )

            # accumulate the results
            result += off*tw
            wgt_sum += tw

    # normalize the result and return
    return result/wgt_sum

An example of calling the function is below:

# use opencv for image loading
import cv2
import numpy

# function definition here....

# read the lena image, convert to float and scale to [0,1]
I = cv2.imread('images/lena.png', cv2.IMREAD_UNCHANGED ).astype(numpy.float32)/255.0

# bilateral filter the image
B = numpy.stack([ 
        filter_bilateral( I[:,:,0], 10.0, 0.1 ),
        filter_bilateral( I[:,:,1], 10.0, 0.1 ),
        filter_bilateral( I[:,:,2], 10.0, 0.1 )], axis=2 )

# stack the images horizontally
O = numpy.hstack( [I,B] )

# write out the image
cv2.imwrite( 'images/lena_bilateral.png', O*255.0 )

This produces the following; the input image is on the left and the output image is on the right.

Input Lena image and bilateral filtered

The only downside is speed: evaluating the value proximity weights is costly. It can be sped up by discretizing them and interpolating from a lookup table. Since the filter also preserves edges very well, there is often a temptation to use very large spatial filter sizes. In the example above I used a value of 10, which corresponds to a 63x63 pixel kernel. This makes the filter a prime candidate for implementation in C++ or CUDA for efficiency.
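As a sketch of the lookup-table idea (the function names and table resolution here are illustrative assumptions, not from the code above), the value weights can be precomputed on a grid of intensity differences and fetched by index instead of calling exp per pixel:

```python
import numpy

def make_value_lut(sigma_v, n_bins=256, max_diff=1.0):
    # precompute exp(-0.5 * d^2 / sigma_v^2) for d sampled over [0, max_diff]
    d = numpy.linspace(0.0, max_diff, n_bins)
    return numpy.exp(-0.5 * d**2 / sigma_v**2)

def value_weight_lut(diff, lut, max_diff=1.0):
    # nearest-neighbour lookup of the value weight for |diff|
    # (assumes |diff| <= max_diff, which holds for [0,1] images)
    idx = numpy.clip(numpy.abs(diff) / max_diff * (len(lut) - 1),
                     0, len(lut) - 1).astype(int)
    return lut[idx]
```

With enough bins the approximation error is small, and `lut[idx]` with an integer index array is just a vectorized gather, replacing the per-pixel exponential.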