falkon.optim

Optimizer

class falkon.optim.Optimizer

Base class for optimizers. This is an empty shell at the moment.

Conjugate gradient methods

ConjugateGradient

class falkon.optim.ConjugateGradient(opt: ConjugateGradientOptions | None = None)
solve(X0: Tensor | None, B: Tensor, mmv: Callable[[Tensor], Tensor], max_iter: int, callback: Callable[[int, Tensor, float], None] | None = None) Tensor

Conjugate-gradient solver with optional support for preconditioning via generic MMV.

This solver iteratively solves linear systems of the form \(AX = B\) for the unknown X. The matrix A is never needed explicitly: it is accessed only through matrix-vector multiplications with intermediate solutions, which the caller supplies through the mmv function.

Preconditioning can be achieved by incorporating the preconditioner matrix in the mmv function.

Parameters:
  • X0 (Optional[torch.Tensor]) – Initial solution for the solver. If not provided it will be a zero-tensor.

  • B (torch.Tensor) – Right-hand-side of the linear system to be solved.

  • mmv – User-provided function to perform matrix-vector multiplications with the design matrix A. The function must accept a single argument (the vector to be multiplied), and return the result of the matrix-vector multiplication.

  • max_iter (int) – Maximum number of iterations the solver will perform. Early stopping is controlled by the options passed to the constructor of this class (see in particular the cg_tolerance option).

  • callback – An optional, user-provided function that is called at the end of each iteration with the current solution. Its arguments are the iteration number, a tensor containing the current solution, and the total time elapsed since the beginning of training (this time explicitly excludes any time spent in the callback itself).

Returns:

The solution X to the linear system.
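To make the role of mmv and callback concrete, here is a minimal sketch of a textbook conjugate-gradient loop in pure Python. It is not Falkon's implementation: the function name cg_solve, the tol argument (a stand-in for the tolerance configured through the options object), and the use of plain lists instead of tensors are all assumptions made for illustration. The matrix is only ever touched through the mmv callable.

```python
import math
import time
from typing import Callable, List, Optional

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def cg_solve(
    x0: Optional[Vector],
    b: Vector,
    mmv: Callable[[Vector], Vector],
    max_iter: int,
    tol: float = 1e-10,  # hypothetical stand-in for a tolerance option
    callback: Optional[Callable[[int, Vector, float], None]] = None,
) -> Vector:
    # Start from zero when no initial solution is given.
    x = [0.0] * len(b) if x0 is None else list(x0)
    r = [bi - ai for bi, ai in zip(b, mmv(x))]  # residual b - A x
    p = list(r)                                  # first search direction
    rs_old = dot(r, r)
    elapsed = 0.0
    for i in range(max_iter):
        if math.sqrt(rs_old) < tol:  # early stopping on the residual norm
            break
        t_start = time.perf_counter()
        ap = mmv(p)                  # the only place A is used
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
        # Accumulate solver time only, so callback time is excluded.
        elapsed += time.perf_counter() - t_start
        if callback is not None:
            callback(i + 1, x, elapsed)
    return x


# Small symmetric positive-definite system used only as a demonstration.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]


def mmv(v: Vector) -> Vector:
    return [dot(row, v) for row in A]


history: List[int] = []
x = cg_solve(None, b, mmv, max_iter=10,
             callback=lambda it, sol, t: history.append(it))
```

The exact solution of this system is (1/11, 7/11). A preconditioned variant would follow the same pattern, with the preconditioner folded into the function passed as mmv.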

FalkonConjugateGradient

class falkon.optim.FalkonConjugateGradient(kernel: Kernel, preconditioner: Preconditioner, opt: FalkonOptions, weight_fn=None)

Preconditioned conjugate gradient solver, optimized for the Falkon algorithm.

The linear system solved is

\[\widetilde{B}^\top H \widetilde{B} \beta = \widetilde{B}^\top K_{nm}^\top Y\]

where \(\widetilde{B}\) is the approximate preconditioner

\[\widetilde{B} = \frac{1}{\sqrt{n}} T^{-1} A^{-1}\]

\(\beta\) is the preconditioned solution vector (from which we can get \(\alpha = \widetilde{B}\beta\)), and \(H\) is the \(m\times m\) sketched matrix

\[H = K_{nm}^\top K_{nm} + \lambda n K_{mm}\]

Parameters:
  • kernel – The kernel class used for the CG algorithm

  • preconditioner – The approximate Falkon preconditioner. The class should allow triangular solves with both \(T\) and \(A\) and multiple right-hand sides. The preconditioner should already have been initialized with a set of Nystrom centers. If the Nystrom centers used for CG are different from the ones used for the preconditioner, the CG method could converge very slowly.

  • opt – Options passed to the CG solver and to the kernel for computations.
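As a short worked step that follows from the definitions above (it is not a description of the implementation), substituting \(\alpha = \widetilde{B}\beta\) shows that the preconditioned system has the same solution as the unpreconditioned one. Since \(\widetilde{B}\) is a scaled product of inverses, it is invertible, and \(\widetilde{B}^\top\) can be cancelled on both sides:

\[\widetilde{B}^\top H \widetilde{B} \beta = \widetilde{B}^\top K_{nm}^\top Y \;\Longleftrightarrow\; \widetilde{B}^\top H \alpha = \widetilde{B}^\top K_{nm}^\top Y \;\Longleftrightarrow\; H \alpha = K_{nm}^\top Y\]

The solver therefore iterates on \(\beta\), and \(\alpha\) is recovered afterwards by a single application of \(\widetilde{B}\).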

See also

falkon.preconditioner.FalkonPreconditioner

for the preconditioner class which is responsible for computing matrices T and A.
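The following sketch illustrates how \(\widetilde{B}\) can be applied to a vector using only triangular solves with \(A\) and \(T\), without forming any inverse explicitly. It is a pure-Python illustration under stated assumptions, not the FalkonPreconditioner API: the helper names solve_upper and apply_precond are hypothetical, both factors are assumed to be upper triangular, and the small matrices are made-up values.

```python
import math
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def solve_upper(U: Matrix, rhs: Vector) -> Vector:
    """Solve U z = rhs for an upper-triangular U by back-substitution."""
    n = len(rhs)
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(U[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (rhs[i] - s) / U[i][i]
    return z


def apply_precond(T: Matrix, A: Matrix, v: Vector, n_samples: int) -> Vector:
    """Compute (1 / sqrt(n)) * T^{-1} A^{-1} v with two triangular solves."""
    y = solve_upper(A, v)  # y = A^{-1} v
    z = solve_upper(T, y)  # z = T^{-1} y
    scale = 1.0 / math.sqrt(n_samples)
    return [scale * zi for zi in z]


# Hypothetical upper-triangular factors, chosen only for the demonstration.
T = [[2.0, 1.0], [0.0, 1.0]]
A = [[1.0, 2.0], [0.0, 4.0]]
v = [4.0, 8.0]

out = apply_precond(T, A, v, n_samples=4)
```

With these values the result is (-0.5, 1.0), which can be checked by multiplying back: \(\sqrt{n}\, A T\) applied to the result returns v.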