Preconditioning saddle-point systems

Introduction

In this demo, we will discuss strategies for solving saddle-point systems using the mixed formulation of the Poisson equation introduced previously as a concrete example. Such systems are somewhat tricky to precondition effectively, modern approaches typically use block-factorisations. We will encounter a number of methods in this tutorial. For many details and background on solution methods for saddle point systems, [BGL05] is a nice review. [ESW14] is an excellent text with a strong focus on applications in fluid dynamics.

We start by repeating the formulation of the problem. Starting from the primal form of the Poisson equation, \(\nabla^2 u = -f\), we introduce a vector-valued flux, \(\sigma = \nabla u\). The problem then becomes to find \(u\) and \(\sigma\) in some domain \(\Omega\) satisfying

\[\begin{split}\sigma - \nabla u &= 0 \quad &\textrm{on}\ \Omega\\ \nabla \cdot \sigma &= -f \quad &\textrm{on}\ \Omega\\ u &= u_0 \quad &\textrm{on}\ \Gamma_D\\ \sigma \cdot n &= g \quad &\textrm{on}\ \Gamma_N\end{split}\]

for some specified function \(f\). We now seek \((u, \sigma) \in V \times \Sigma\) such that

\[\begin{split}\int_\Omega \sigma \cdot \tau + (\nabla \cdot \tau)\, u\,\mathrm{d}x &= \int_\Gamma (\tau \cdot n)\,u\,\mathrm{d}s &\quad \forall\ \tau \in \Sigma, \\ \int_\Omega (\nabla \cdot \sigma)\,v\,\mathrm{d}x &= -\int_\Omega f\,v\,\mathrm{d}x &\quad \forall\ v \in V.\end{split}\]

A stable choice of discrete spaces for this problem is to pick \(\Sigma_h \subset \Sigma\) to be the lowest order Raviart-Thomas space, and \(V_h \subset V\) to be the piecewise constants, although this is not the only choice. For ease of exposition we choose the domain to be the unit square, and enforce homogeneous Dirichlet conditions on all walls. The forcing term is chosen to be random.

Globally coupled elliptic problems, such as the Poisson problem, require effective preconditioning to attain mesh independent convergence. By this we mean that the number of iterations of the linear solver does not grow when the mesh is refined. In this demo, we will study various ways to achieve this in Firedrake.

As ever, we begin by importing the Firedrake module:

from firedrake import *

Bulding the problem

Rather than defining a mesh and function spaces straight away, since we wish to consider the effect that mesh refinement has on the performance of the solver, we instead define a Python function which builds the problem we wish to solve. This takes as arguments the size of the mesh, the solver parameters we wish to apply, an optional parameter specifying a “preconditioning” operator to apply, and a final optional argument specifying whether the block system should be assembled as a single “monolithic” matrix or a \(2 \times 2\) block of smaller matrices.

def build_problem(mesh_size, parameters, aP=None, block_matrix=False):
    mesh = UnitSquareMesh(2 ** mesh_size, 2 ** mesh_size)

    Sigma = FunctionSpace(mesh, "RT", 1)
    V = FunctionSpace(mesh, "DG", 0)
    W = Sigma * V

Having built the function spaces, we can now proceed to defining the problem. We will need some trial and test functions for the spaces:

#
    sigma, u = TrialFunctions(W)
    tau, v = TestFunctions(W)

along with a function to hold the forcing term, living in the discontinuous space.

#
    f = Function(V)

To initialise this function to a random value we access its Vector form and use numpy to set the values:

#
    import numpy as np
    fvector = f.vector()
    fvector.set_local(np.random.uniform(size=fvector.local_size()))

Note that the homogeneous Dirichlet conditions in the primal formulation turn into homogeneous Neumann conditions on the dual variable and we therefore drop the surface integral terms in the variational formulation (they are identically zero). As a result, the specification of the variational problem is particularly simple:

#
    a = dot(sigma, tau)*dx + div(tau)*u*dx + div(sigma)*v*dx
    L = -f*v*dx

Now we treat the mysterious optional aP argument. When solving a linear system, Firedrake allows specifying that the problem should be preconditioned with an operator different to the operator defining the problem to be solved. We will use this functionality in a number of cases later. The aP function will take one argument, the FunctionSpace defining the space, and return a bilinear form suitable for assembling as an operator. Obviously we only do so if aP is provided.

#
    if aP is not None:
        aP = aP(W)

Now we have all the pieces to build our linear system. We will return a LinearSolver object from this function, so we preassemble the operators to build it. It is here that we must specify whether we want a monolithic matrix or not, by setting the matrix type parameter to assemble().

#
    if block_matrix:
        mat_type = 'nest'
    else:
        mat_type = 'aij'
    A = assemble(a, mat_type=mat_type)
    if aP is not None:
        P = assemble(aP, mat_type=mat_type)
    else:
        P = None

    solver = LinearSolver(A, P=P, solver_parameters=parameters)

The solve() method of LinearSolver objects needs both the assembled right hand side and a Function in which to place the result.

#
    w = Function(W)
    b = assemble(L)

Finally, we return all three objects as a tuple.

#
    return solver, w, b

With these preliminaries out of the way, we can now move on to solution strategies, in particular, preconditioner options.

Preconditioner choices

A naive approach

To illustrate the problem, we first attempt to solve the problem on a sequence of finer and finer meshes preconditioning the problem with zero-fill incomplete LU factorisation. Configuration of the solver is carried out by providing appropriate parameters when constructing the LinearSolver object through the solver_parameters keyword argument which should be a dict of parameters. These parameters are passed directly to PETSc, and their form is described in more detail in Solving PDEs. For this problem, we use GMRES with a restart length of 100,

parameters = {
    "ksp_type": "gmres",
    "ksp_gmres_restart": 100,

solve to a relative tolerance of 1e-8,

#
    "ksp_rtol": 1e-8,

and precondition with ILU(0).

#
    "pc_type": "ilu",
    }

We now loop over a range of mesh sizes, assembling the system and solving it

print("Naive preconditioning")
for n in range(8):
    solver, w, b = build_problem(n, parameters, block_matrix=False)
    solver.solve(w, b)

Finally, at each mesh size, we print out the number of cells in the mesh and the number of iterations the solver took to converge

#
    print(w.function_space().mesh().num_cells(), solver.ksp.getIterationNumber())

The resulting convergence is unimpressive:

Mesh elements GMRES iterations
2 2
8 12
32 27
128 54
512 111
2048 255
8192 717
32768 2930

Were this a primal Poisson problem, we would be able to use a standard algebraic multigrid preconditioner, such as hypre. However, this dual formulation is slightly more complicated.

Schur complement approaches

A better approach is to use a Schur complement preconditioner, described in Preconditioning mixed finite element systems. The system we are trying to solve is conceptually a \(2\times 2\) block matrix.

\[\begin{split}\left(\begin{matrix} A & B \\ C & 0 \end{matrix}\right)\end{split}\]

which admits a factorisation

\[\begin{split}\left(\begin{matrix} I & 0 \\ C A^{-1} & I\end{matrix}\right) \left(\begin{matrix}A & 0 \\ 0 & S\end{matrix}\right) \left(\begin{matrix} I & A^{-1} B \\ 0 & I\end{matrix}\right),\end{split}\]

with the Schur complement \(S = -C A^{-1} B\). The inverse of the operator can be therefore be written as

\[\begin{split}P = \left(\begin{matrix} I & -A^{-1}B \\ 0 & I \end{matrix}\right) \left(\begin{matrix} A^{-1} & 0 \\ 0 & S^{-1}\end{matrix}\right) \left(\begin{matrix} I & 0 \\ -CA^{-1} & I\end{matrix}\right).\end{split}\]

An algorithmically optimal solution

If we can find a good way of approximating \(P\) then we can use that to precondition our original problem. This boils down to finding good approximations to \(A^{-1}\) and \(S^{-1}\). For our problem, \(A\) is just a mass matrix and so we can invert it well with a cheap method: either a few iterations of jacobi or ILU(0) are fine. The troublesome term is \(S\) which is spectrally a Laplacian, but dense (since \(A^{-1}\) is dense). However, before we worry too much about this, let us just try using a Schur complement preconditioner. This simple setup can be driven using only solver options.

Note that we will exactly invert the inner blocks for \(A^{-1}\) and \(S^{-1}\) using Krylov methods. We therefore need to use flexible GMRES as our outer solver, since the use of inner Krylov methods in our preconditioner makes the application of the preconditioner nonlinear. This time we use the default restart length of 30, but solve to a relative tolerance of 1e-8:

parameters = {
    "ksp_type": "fgmres",
    "ksp_rtol": 1e-8,

this time we want a fieldsplit preconditioner.

#
    "pc_type": "fieldsplit",
    "pc_fieldsplit_type": "schur",
    "pc_fieldsplit_schur_fact_type": "full",

If we use this preconditioner and invert all the blocks exactly, then the preconditioned operator will have at most three distinct eigenvalues [MGW00] and hence GMRES should converge in at most three iterations. To try this, we start out by exactly inverting \(A\) and \(S\) to check the convergence.

    "fieldsplit_0_ksp_type": "cg",
    "fieldsplit_0_pc_type": "ilu",
    "fieldsplit_0_ksp_rtol": 1e-12,
    "fieldsplit_1_ksp_type": "cg",
    "fieldsplit_1_pc_type": "none",
    "fieldsplit_1_ksp_rtol": 1e-12,
}

Let’s go ahead and run this. Note that for this problem, we’re applying the action of blocks, so we can use a block matrix format.

print("Exact full Schur complement")
for n in range(8):
    solver, w, b = build_problem(n, parameters, block_matrix=True)
    solver.solve(w, b)
    print(w.function_space().mesh().num_cells(), solver.ksp.getIterationNumber())

The resulting convergence is algorithmically good, however, the larger problems still take a long time.

Mesh elements fGMRES iterations
2 1
8 1
32 1
128 1
512 1
2048 1
8192 1
32768 1

We can improve things by building a matrix used to precondition the inversion of the Schur complement. Note how we’re currently not using any preconditioning, and so the inner solver struggles (this can be observed by additionally running with the parameter "fieldsplit_1_ksp_converged_reason": True.

As we increase the number of mesh elements, the solver inverting \(S\) takes more and more iterations, which means that we take longer and longer to solve the problem as the mesh is refined.

Mesh elements CG iterations on S
2 2
8 7
32 32
128 73
512 149
2048 289
8192 553
32768 1143

Approximating the Schur complement

Fortunately, PETSc gives us some options to try here. For our problem a diagonal “mass-lumping” of the velocity mass matrix gives a good approximation to \(A^{-1}\). Under these circumstances \(S_p = -C \mathrm{diag}(A)^{-1} B\) is spectrally close to \(S\), but sparse, and can be used to precondition the solver inverting \(S\). To do this, we need some additional parameters. First we repeat those that remain unchanged

parameters = {
    "ksp_type": "fgmres",
    "ksp_rtol": 1e-8,
    "pc_type": "fieldsplit",
    "pc_fieldsplit_type": "schur",
    "pc_fieldsplit_schur_fact_type": "full",
    "fieldsplit_0_ksp_type": "cg",
    "fieldsplit_0_pc_type": "ilu",
    "fieldsplit_0_ksp_rtol": 1e-12,
    "fieldsplit_1_ksp_type": "cg",
    "fieldsplit_1_ksp_rtol": 1e-12,

Now we tell PETSc to construct \(S_p\) using the diagonal of \(A\), and to precondition the resulting linear system using algebraic multigrid from the hypre suite.

    "pc_fieldsplit_schur_precondition": "selfp",
    "fieldsplit_1_pc_type": "hypre"
}

Note

For this set of options to work, you will have needed to build PETSc with support for hypre (for example, by specifying --download-hypre when configuring).

Let’s see what happens.

print("Schur complement with S_p")
for n in range(8):
    solver, w, b = build_problem(n, parameters, block_matrix=True)
    solver.solve(w, b)
    print(w.function_space().mesh().num_cells(), solver.ksp.getIterationNumber())

This is much better, the problem takes much less time to solve and when observing the iteration counts for inverting \(S\) we can see why.

Mesh elements CG iterations on S
2 2
8 8
32 17
128 18
512 19
2048 19
8192 19
32768 19

We can now think about backing off the accuracy of the inner solves. Effectively computing a worse approximation to \(P\) that we hope is faster, despite taking more GMRES iterations. Additionally we can try dropping some terms in the factorisation of \(P\), by adjusting pc_fieldsplit_schur_fact_type from full to one of upper, lower, or diag we make the preconditioner slightly worse, but gain because we require fewer applications of \(A^{-1}\). For our problem where computing \(A^{-1}\) is cheap, this is not a great problem, however for many fluids problems \(A^{-1}\) is expensive and it pays to experiment.

For example, we might wish to try a full factorisation, but approximate \(A^{-1}\) by a single application of ILU(0) and \(S^{-1}\) by a single multigrid V-cycle on \(S_p\). To do this, we use the following set of parameters.

parameters = {
    "ksp_type": "gmres",
    "ksp_rtol": 1e-8,
    "pc_type": "fieldsplit",
    "pc_fieldsplit_type": "schur",
    "pc_fieldsplit_schur_fact_type": "full",
    "fieldsplit_0_ksp_type": "preonly",
    "fieldsplit_0_pc_type": "ilu",
    "fieldsplit_1_ksp_type": "preonly",
    "pc_fieldsplit_schur_precondition": "selfp",
    "fieldsplit_1_pc_type": "hypre"
}

Note how we can switch back to GMRES here, our inner solves are linear and so we no longer need a flexible Krylov method.

print("Schur complement with S_p and inexact inner inverses")
for n in range(8):
    solver, w, b = build_problem(n, parameters, block_matrix=True)
    solver.solve(w, b)
    print(w.function_space().mesh().num_cells(), solver.ksp.getIterationNumber())

This results in the following GMRES iteration counts

Mesh elements GMRES iterations
2 2
8 9
32 11
128 13
512 13
2048 12
8192 12
32768 12

and the solves take only a few seconds.

Providing the Schur complement approximation

Instead of asking PETSc to build an approximation to \(S\) which we then use to solve the problem, we can provide one ourselves. Recall that \(S\) is spectrally a Laplacian only in a discontinuous space. A natural choice is therefore to use an interior penalty DG formulation for the Laplacian term and provide it as \(Sp\). Since this preconditioning matrix is block-diagonal, we then need to tell PETSc that it should use the original operator, rather than the preconditioning matrix, to compute the action of the off-diagonal blocks. This is done using pc_fieldsplit_off_diag_use_amat.

parameters = {
    "ksp_type": "gmres",
    "ksp_rtol": 1e-8,
    "pc_type": "fieldsplit",
    "pc_fieldsplit_type": "schur",
    "pc_fieldsplit_schur_fact_type": "full",
    "pc_fieldsplit_off_diag_use_amat": True,
    "fieldsplit_0_ksp_type": "preonly",
    "fieldsplit_0_pc_type": "ilu",
    "fieldsplit_1_ksp_type": "preonly",
    "fieldsplit_1_pc_type": "hypre"
}

Additionally, we need to provide our build_problem function with the ability to construct this operator. To do so, we define a function to pass as the aP argument.

def dg_laplacian(W):
    sigma, u = TrialFunctions(W)
    tau, v = TestFunctions(W)
    n = FacetNormal(W.mesh())
    alpha = Constant(4.0)
    gamma = Constant(8.0)
    h = CellSize(W.mesh())
    h_avg = (h('+') + h('-'))/2
    a_dg = dot(sigma, tau)*dx \
           + dot(grad(v), grad(u))*dx \
           - dot(avg(grad(v)), jump(u, n))*dS \
           - dot(jump(v, n), avg(grad(u)))*dS \
           + alpha/h_avg * dot(jump(v, n), jump(u, n))*dS \
           - dot(grad(v), u*n)*ds \
           - dot(v*n, grad(u))*ds \
           + (gamma/h)*dot(v, u)*ds

    return a_dg

Now we just need to pass this extra argument to the build_problem function

print("DG approximation for S_p")
for n in range(8):
    solver, w, b = build_problem(n, parameters, aP=dg_laplacian, block_matrix=True)
    solver.solve(w, b)
    print(w.function_space().mesh().num_cells(), solver.ksp.getIterationNumber())

This actually results in slightly worse convergence than the diagonal approximation we used above.

Mesh elements GMRES iterations
2 2
8 10
32 17
128 20
512 19
2048 19
8192 18
32768 18

Block diagonal preconditioners

An alternate approach to using a Schur complement is to use a block-diagonal preconditioner. To do this, we note that the mesh-dependent ill conditioning of linear operators comes from working in the wrong norm. To convert into working in the correct norm, we can precondition our problem using the Riesz map for the spaces. For details on the mathematics behind this approach see for example [Kir10].

We are working in a space \(W \subset H(\text{div}) \times L^2\), and as such, the appropriate Riesz map is just \(H(\text{div})\) inner product in \(\Sigma\) and the \(L^2\) inner product in \(V\). As was the case for the DG Laplacian, we do this by providing a function that constructs this operator to our build_problem function.

def riesz(W):
    sigma, u = TrialFunctions(W)
    tau, v = TestFunctions(W)

    return (dot(sigma, tau) + div(sigma)*div(tau) + u*v)*dx

Now we set up the solver parameters. We will still use a fieldsplit preconditioner, but this time it will be additive, rather than a Schur complement.

parameters = {
    "ksp_type": "gmres",
    "ksp_rtol": 1e-8,
    "pc_type": "fieldsplit",
    "pc_fieldsplit_type": "additive",

Now we choose how to invert the two blocks. The second block is easy, it is just a mass matrix in a discontinuous space and is therefore inverted exactly using a single application of zero-fill ILU.

#
    "fieldsplit_1_ksp_type": "preonly",
    "fieldsplit_1_pc_type": "ilu",

The \(H(\text{div})\) inner product is the tricky part. In fact, we currently do not have a good way of inverting this in Firedrake. For now we will invert it with a direct solver. This is a reasonable option up to a few tens of thousands of degrees of freedom.

#
    "fieldsplit_0_ksp_type": "preonly",
    "fieldsplit_0_pc_type": "lu",
}

Note

For larger problems, you will probably need to use a sparse direct solver such as MUMPS, which may be selected by additionally specifying "fieldsplit_0_pc_factor_mat_solver_package": "mumps".

To use MUMPS you will need to have configured PETSc appropriately (using at the very least --download-mumps).

Let’s see what the iteration count looks like now.

print("Riesz-map preconditioner")
for n in range(8):
    solver, w, b = build_problem(n, parameters, aP=riesz, block_matrix=True)
    solver.solve(w, b)
    print(w.function_space().mesh().num_cells(), solver.ksp.getIterationNumber())
Mesh elements GMRES iterations
2 3
8 5
32 5
128 5
512 5
2048 5
8192 5
32768 5

Providing access to scalable preconditioners for these kinds of problems is currently a wishlist item for Firedrake. There are two options, either geometric multigrid with strong, Schwarz-based, smoothers [AFW00]. Or else algebraic multigrid approaches using the auxiliary-space preconditioning method of [HX07]. Support for the algebraic approach is available in hypre (the AMS and AMR preconditioners), and an interface exists in PETSc. If you’re interested in adding the missing pieces to support this in Firedrake, we would love to hear from you.

A runnable python script version of this demo is available here.

References

[AFW00]D. N. Arnold, R. S. Falk, and R. Winther. Multigrid in H(div) and H(curl). Numerische Mathematik, 85:197–217, 2000. doi:10.1007/PL00005386.
[BGL05]Michele Benzi, Gene H. Golub, and Jörg Liesen. Numerical solution of saddle point problems. Acta Numerica, 14:1–137, 5 2005. doi:10.1017/S0962492904000212.
[ESW14]Howard Elman, David Silvester, and Andy Wathen. Finite elements and fast iterative solvers. Oxford University Press, second edition edition, 2014.
[HX07]Ralf Hiptmair and Jinchao Xu. Nodal auxiliary space preconditioning in H(curl) and H(div) spaces. SIAM Journal on Numerical Analysis, 45(6):2483–2509, 2007. doi:10.1137/060660588.
[Kir10]Robert C. Kirby. From functional analysis to iterative methods. SIAM Review, 52(2):269–293, 2010. doi:10.1137/070706914.
[MGW00]Malcolm F. Murphy, Gene H. Golub, and Andrew J. Wathen. A note on preconditioning for indefinite linear systems. SIAM Journal on Scientific Computing, 21(6):1969–1972, 2000. doi:10.1137/S1064827599355153.