The Forward-Backward Splitting Method (FBSM) Simply Explained
Using simple and understandable terms, along with Python code snippets, to provide a straightforward explanation
The FBSM isn’t just another tool in my Numerical Optimization toolbox, even though, like other methods, it tries to find the optimal value of an objective while taking additional functions or constraints into account.
Why is the FBSM important and what makes it so special?
Personally, this algorithm played an important role in getting me into the domain of Numerical Optimization. What made it so appealing is the simplicity of the underlying idea, which makes it easy to follow along while coding. Although it’s intuitive, it’s still beautifully effective. But now let’s have a look at the idea.
The Simple Idea Behind FBSM
In a nutshell the FBSM divides and conquers.
To give an abstract idea of the algorithm, it breaks down the optimization task into two manageable pieces, which are easier to work with, and then iteratively switches between optimizing these two sub-tasks.
To put it into more concrete terms, the FBSM splits the objective function h(x) = f(x) + g(x) into two pieces, f(x) and g(x), where:
- f(x) is the smooth, differentiable part of the objective; its gradient gives us the step in the direction we want to optimize
- g(x) is the non-smooth part that encodes the constraints or regularization we want to take into account
In doing so, we are able to optimize the smooth and non-smooth parts of the objective function separately.
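A typical example of such a split, and the one I’ll assume for the illustrative sketches below (my own assumption, based on the parameters M, y and tau introduced later), is the LASSO problem: h(x) = ½‖Mx − y‖² + tau·‖x‖₁. Here f(x) = ½‖Mx − y‖² is the smooth least-squares term and g(x) = tau·‖x‖₁ is the non-smooth L1 penalty.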
But now let’s have a look at the implementation.
I decided to go with an object-oriented approach.
import numpy as np
from typing import Callable

class FBSM:
    def __init__(self,
                 x,
                 parameters,
                 gradObjective: Callable,
                 proximalOperator: Callable,
                 stepsize: float):
        self.gradObjective = gradObjective        # gradient of the smooth part f(x)
        self.proximalOperator = proximalOperator  # proximal operator of the non-smooth part g(x), e.g. L1 regularization
        self.stepsize = stepsize                  # size of the steps
        self.x = x                                # starting point
        self.params = parameters                  # parameters for the objective
    ## Rest of the class ...
Class attribute descriptions:
- x — represents the vector starting point
- parameters — the parameters that need to be fed into the objective function h(x). In my case those are: M — a matrix with twice as many columns as rows, y — a vector with the same length as the number of rows of M, and tau — a positive scalar
- gradObjective — gradient function of f(x)
- proximalOperator — the proximal operator associated with the non-smooth part g(x)
- stepsize — defines the length of the step taken in each iteration
Now that we’ve had a closer look at the class structure, let’s start building the necessary methods for the forward step and the backward step.
The Forward Step
    ## ...
    def forward(self):
        """
        The Forward Step moves in the direction of the negative gradient of the smooth part of the objective.
        """
        gradient = self.gradObjective(self.x, *self.params)  # gradient of f at the current position
        x_k1 = self.x - self.stepsize * gradient              # gradient descent step
        return x_k1
    ## Rest of the class ...
During the Forward step we compute the gradient of the smooth part f(x) at the current position, which is represented by the variable `x`, using the parameters `params`.
The resulting vector points in the direction of steepest ascent, so moving against it decreases the objective (i.e. optimizes it). Therefore we multiply this vector by our step size and subtract the result from our current position `x` to obtain our new position.
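To make this concrete, here is a minimal sketch of what such a gradient function could look like, assuming the least-squares data term f(x) = ½‖Mx − y‖² from the example above. The name gradLeastSquares and its signature are my own illustration, not part of the class:

import numpy as np

def gradLeastSquares(x, M, y, tau):
    """
    Gradient of f(x) = 0.5 * ||Mx - y||^2, which is M^T (Mx - y).
    tau is accepted (and ignored) only so the signature matches
    the call self.gradObjective(self.x, *self.params).
    """
    return M.T @ (M @ x - y)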
The Backward Step
    ## ...
    def backward(self, x_k1):
        """
        The Backward Step moves the new x into the direction of the Proximal Operator.
        """
        return self.proximalOperator(x_k1)
    ## Rest of the class ...
For the Backward step we simply plug the current position into the proximal operator. Its idea is to find a point that stays close to the current position while also keeping the non-smooth term g(x) small; for a hard constraint, this amounts to projecting back onto the feasible set.
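As an illustration, here is a minimal sketch of a proximal operator for the L1 term g(x) = tau·‖x‖₁ from the example above, the well-known soft-thresholding operation. Since the class passes only the point to proximalOperator, the threshold (step size times tau) has to be baked in beforehand, e.g. via a small closure; the names below are my own and not part of the class:

import numpy as np

def makeSoftThreshold(stepsize, tau):
    """Builds a proximal operator for g(x) = tau * ||x||_1 with a fixed step size."""
    threshold = stepsize * tau
    def proximalOperator(x):
        # Shrink each entry towards zero by the threshold (soft-thresholding)
        return np.sign(x) * np.maximum(np.abs(x) - threshold, 0.0)
    return proximalOperator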
Everything put together
    def solve(self, n_iterations=1000, tol=1e-6):
        """
        Performs the Forward and Backward steps iteratively until the limit of iterations is reached or the distance moved falls below the tolerance.
        """
        steps = []
        for i in range(n_iterations):
            steps.append(self.x)   # append the start or the last backward step
            x_k1 = self.forward()
            steps.append(x_k1)     # append the last forward step
            x_k1 = self.backward(x_k1)
            distance = np.linalg.norm(self.x - x_k1)  # distance traveled during one forward and backward step
            if distance < tol:
                break
            self.x = x_k1
        return self.x, np.asarray(steps)
Now we have everything we need, namely the Forward Step and the Backward Step.
Let’s put it all together and create a method which iteratively performs a forward and a backward step until the distance traveled in an optimization step is smaller than a certain threshold (tolerance) `tol` or the number of steps exceeds the maximum number of iterations `n_iterations`.
Additionally, because solely returning the final position is boring, we keep track of all performed steps and return them as well. This way we can see how the method performed on the problem space and can better analyze its behavior.
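To round things off, here is a small usage sketch that ties the pieces together on a random LASSO-style problem, reusing the gradLeastSquares and makeSoftThreshold sketches from above. The dimensions, step size and random data are my own choices for illustration and not taken from the repository:

import numpy as np

rng = np.random.default_rng(0)
n = 20
M = rng.standard_normal((n, 2 * n))          # matrix with twice as many columns as rows
y = rng.standard_normal(n)                   # target vector
tau = 0.1                                    # regularization strength
stepsize = 1.0 / np.linalg.norm(M, 2) ** 2   # a safe step size for the least-squares term

solver = FBSM(x=np.zeros(2 * n),
              parameters=(M, y, tau),
              gradObjective=gradLeastSquares,
              proximalOperator=makeSoftThreshold(stepsize, tau),
              stepsize=stepsize)

x_opt, steps = solver.solve(n_iterations=1000, tol=1e-6)
print(x_opt)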
In conclusion
The FBSM is a great starting point to peer into the world of Numerical Optimization and learn how to implement mathematical algorithms.
Despite how powerful the FBSM is, its convergence guarantees rely on both components of the objective function being convex.
When applied to problems that lack this convexity, the FBSM may not converge, which means that it is not guaranteed to find an optimal solution.
I’ve uploaded my complete implementation, together with an exemplary objective and problems, to GitHub. Please feel free to have a look: https://github.com/j-tobias/Forward-Backward-Splitting