falkon.preconditioner

Preconditioner

class falkon.preconditioner.preconditioner.Preconditioner

Generic preconditioner class, used to accelerate solutions to linear systems.

Given a system of equations \(H\beta = Y\), where \(H\) typically contains in some form our data matrix X and Y contains the targets. We can use matrix \(B\) to create an equivalent linear system which will have lower condition number:

\[BB^\top H \beta = Y\]

where \(BB^\top \approx H^{-1}\) in order to make the preconditioner effective, but not too expensive to compute. Then, in order to use the preconditioner in an algorithm based on matrix-vector products (such as conjugate gradient descent), we must be able to “apply” the matrix \(B\) and its transpose \(B^ op\) to any vector.

For this reason, this class exposes abstract methods apply and apply_t which should be overridden in concrete preconditioner implementations

See also

falkon.preconditioner.FalkonPreconditioner

for an actual preconditioner implementation

Cholesky preconditioners

FalkonPreconditioner

class falkon.preconditioner.FalkonPreconditioner(penalty: float, kernel, opt: FalkonOptions)

Approximated Cholesky Preconditioner for FALKON.

The preconditioner is based on the \(K_{MM}\) kernel between the inducing points. A two step approximation of the inverse matrix via two Cholesky decompositions is performed.

Starting with \(K_{MM}\) we obtain \(T = \mathrm{chol}(K_{MM})\). Then we can obtain \(A = \mathrm{chol}(\frac{1}{M} T T^\top + \lambda)\) via another Cholesky decomposition. Both T and A are upper triangular: the first gets stored in the upper triangle of the \(K_{MM}\) matrix (called fC in the code), while the second is stored in the lower triangle.

Whenever we want to use one of the two triangles we must reset the matrix diagonal, since it is shared between the two matrices.

Parameters:
  • penalty (float) – The regularization parameter for KRR. Must be greater than 0.

  • kernel (falkon.kernels.kernel.Kernel) – The kernel object. This is used to compute the M*M kernel between inducing points. The kernel matrix is then overwritten by the preconditioner itself.

  • opt (FalkonOptions) –

    Additional options to be used in computing the preconditioner. Relevant options are:

    • pc_epsilonthe jitter to add to the kernel matrix to make

      it positive-definite and allow Cholesky decomposition. This can be either a float, or a dictionary mapping from torch datatypes (e.g. float32, float64) to an appropriate float. Typically float32 requires more jitter than float64.

    • cpu_preconditionera boolean value which overrides CPU/GPU

      settings and forces the function to compute the whole preconditioner on the CPU. If set to False, we fall back to the usual CPU/GPU settings (i.e. ‘use_cpu’ option and the availability of a GPU).

apply(v: Tensor) Tensor

Solve two systems of equations \(ATx = v\) for unknown vector \(x\).

Multiple right-hand sides are supported (by simply passing a 2D tensor for v)

Parameters:

v – The right-hand side of the triangular system of equations

Returns:

x – The solution, computed with the trsm function.

See also

falkon.preconditioner.pc_utils.trsm()

the function used to solve the system of equations

apply_t(v: Tensor) Tensor

Solve two systems of equations \(A^\top T^\top x = v\) for unknown vector \(x\).

Multiple right-hand sides are supported (by simply passing a 2D tensor for v)

Parameters:

v – The right-hand side of the triangular system of equations

Returns:

x – The solution, computed with the trsm function.

See also

falkon.preconditioner.pc_utils.trsm()

the function used to solve the system of equations

init(X: Tensor | SparseTensor, weight_vec: Tensor | None = None)

Initialize the preconditioner matrix.

This method must be called before the preconditioner can be used.

Parameters:
  • X (torch.Tensor) – The (M x D) matrix of Nystroem centers

  • weight_vec – An optional vector of size (M x 1) which is used for reweighted least-squares. This vector should contain the weights corresponding to the Nystrom centers.

invA(v: Tensor) Tensor

Solve the system of equations \(Ax = v\) for unknown vector \(x\).

Multiple right-hand sides are supported (by simply passing a 2D tensor for v)

Parameters:

v – The right-hand side of the triangular system of equations

Returns:

x – The solution, computed with the trsm function.

See also

trsm()

the function used to solve the system of equations

invAt(v: Tensor) Tensor

Solve the system of equations \(A^\top x = v\) for unknown vector \(x\).

Multiple right-hand sides are supported (by simply passing a 2D tensor for v)

Parameters:

v – The right-hand side of the triangular system of equations

Returns:

x – The solution, computed with the trsm function.

See also

falkon.preconditioner.pc_utils.trsm()

the function used to solve the system of equations

invT(v: Tensor) Tensor

Solve the system of equations \(Tx = v\) for unknown vector \(x\).

Multiple right-hand sides are supported (by simply passing a 2D tensor for v)

Parameters:

v – The right-hand side of the triangular system of equations

Returns:

x – The solution, computed with the trsm function.

See also

falkon.preconditioner.pc_utils.trsm()

the function used to solve the system of equations

invTt(v: Tensor) Tensor

Solve the system of equations \(T^\top x = v\) for unknown vector \(x\).

Multiple right-hand sides are supported (by simply passing a 2D tensor for v)

Parameters:

v – The right-hand side of the triangular system of equations

Returns:

x – The solution, computed with the trsm function.

See also

falkon.preconditioner.pc_utils.trsm()

the function used to solve the system of equations

LogisticPreconditioner

class falkon.preconditioner.LogisticPreconditioner(kernel, loss, opt: FalkonOptions)

Approximate Cholesky Preconditioner for Logistic-FALKON.

The preconditioner is based on the K_MM kernel between the inducing points. A two step approximation of the inverse matrix via two cholesky decompositions is performed.

T = chol(K_MM)    => T.T @ T = K_MM
A = chol(1/M * (T @ (T.T @ W)) + lambda)

So T and A are both upper triangular. W is a diagonal matrix of weights derived from the 2nd derivative of the loss function.

Here we store T in the upper triangular part of the fC matrix, and A in the upper triangular part of the matrix. Whenever we need to use one or the other we need to reset the diagonal of fC since it is shared between the two matrices. W is of size M and is the only difference with respect to the normal FALKON preconditioner (falkon.preconditioner.FalkonPreconditioner).

Parameters:
  • kernel (falkon.kernels.kernel.Kernel) – The kernel object. This is used to compute the M*M kernel between inducing points. This kernel is then overwritten by the preconditioner itself.

  • loss (falkon.gsc_losses.Loss) – The loss-function used for defining kernel weights.

  • opt (FalkonOptions) –

    Additional options to be used in computing the preconditioner. Relevant options are:

    • pc_epsilonthe jitter to add to the kernel matrix to make

      it positive-definite and allow Cholesky decomposition. This can be either a float, or a dictionary mapping from torch datatypes (e.g. float32, float64) to an appropriate float. Typically float32 requires more jitter than float64.

    • cpu_preconditionera boolean value which overrides CPU/GPU

      settings and forces the function to compute the whole preconditioner on the CPU. If set to False, we fall back to the usual CPU/GPU settings (i.e. ‘use_cpu’ option and the availability of a GPU).

See also

falkon.gsc_losses.LogisticLoss

for an example of loss function used for kernel reweighting.

falkon.models.LogisticFalkon

for the logistic kernel estimator which uses this preconditioner.

apply(v)

Solve two systems of equations \(ATx = v\) for unknown vector \(x\).

Multiple right-hand sides are supported (by simply passing a 2D tensor for v)

Parameters:

v – The right-hand side of the triangular system of equations

Returns:

x – The solution, computed with the trsm function.

See also

falkon.preconditioner.pc_utils.trsm()

the function used to solve the system of equations

apply_t(v)

Solve two systems of equations \(A^\top T^\top x = v\) for unknown vector \(x\).

Multiple right-hand sides are supported (by simply passing a 2D tensor for v)

Parameters:

v – The right-hand side of the triangular system of equations

Returns:

x – The solution, computed with the trsm function.

See also

falkon.preconditioner.pc_utils.trsm()

the function used to solve the system of equations

init(X: Tensor | SparseTensor, Y: Tensor, alpha: Tensor, penalty: float, N: int) None

Initialize the preconditioner matrix.

This method must be called before the preconditioner becomes usable.

Parameters:
  • X (torch.Tensor) – (M x D) matrix of Nystroem centers

  • Y (torch.Tensor) – (M x 1) vector of targets corresponding to the Nystroem centers X

  • alpha (torch.Tensor) – (M x 1) parameter vector (of the same dimension as Y) which gives the current solution to the optimization problem.

  • penalty (float) – Regularization amount

  • N (int) – Number of points in the full data-set.

Notes

If debug=True is present in the options, this method will print a lot of extra information pertaining timings of the various preconditioner operations. This can be useful to help understand how the preconditioner works.

invA(v)

Solve the system of equations \(Ax = v\) for unknown vector \(x\).

Multiple right-hand sides are supported (by simply passing a 2D tensor for v)

Parameters:

v – The right-hand side of the triangular system of equations

Returns:

x – The solution, computed with the trsm function.

See also

falkon.preconditioner.pc_utils.trsm()

the function used to solve the system of equations

invAt(v)

Solve the system of equations \(A^\top x = v\) for unknown vector \(x\).

Multiple right-hand sides are supported (by simply passing a 2D tensor for v)

Parameters:

v – The right-hand side of the triangular system of equations

Returns:

x – The solution, computed with the trsm function.

See also

falkon.preconditioner.pc_utils.trsm()

the function used to solve the system of equations

invT(v)

Solve the system of equations \(Tx = v\) for unknown vector \(x\).

Multiple right-hand sides are supported (by simply passing a 2D tensor for v)

Parameters:

v – The right-hand side of the triangular system of equations

Returns:

x – The solution, computed with the trsm function.

See also

falkon.preconditioner.pc_utils.trsm()

the function used to solve the system of equations

invTt(v)

Solve the system of equations \(T^\top x = v\) for unknown vector \(x\).

Multiple right-hand sides are supported (by simply passing a 2D tensor for v)

Parameters:

v – The right-hand side of the triangular system of equations

Returns:

x – The solution, computed with the trsm function.

See also

falkon.preconditioner.pc_utils.trsm()

the function used to solve the system of equations