falkon.optim
Optimizer
- class falkon.optim.Optimizer
Base class for optimizers. This is an empty shell at the moment.
Conjugate gradient methods
ConjugateGradient
- class falkon.optim.ConjugateGradient(opt: ConjugateGradientOptions | None = None)
- solve(X0: Tensor | None, B: Tensor, mmv: Callable[[Tensor], Tensor], max_iter: int, callback: Callable[[int, Tensor, float], None] | None = None) Tensor
Conjugate-gradient solver with optional support for preconditioning via generic MMV.
This solver iteratively solves linear systems of the form \(AX = B\) for X. The matrix A is never needed explicitly: it is accessed only through matrix-vector products with the current iterate, which must be supplied through the mmv function.
Preconditioning can be achieved by incorporating the preconditioner matrix in the mmv function.
- Parameters:
X0 (Optional[torch.Tensor]) – Initial solution for the solver. If not provided it will be a zero-tensor.
B (torch.Tensor) – Right-hand-side of the linear system to be solved.
mmv – User-provided function to perform matrix-vector multiplications with the design matrix A. The function must accept a single argument (the vector to be multiplied), and return the result of the matrix-vector multiplication.
max_iter (int) – Maximum number of iterations the solver will perform. Early stopping is controlled by the options passed to the constructor of this class (see in particular the cg_tolerance option).
callback – An optional, user-provided function which shall be called at the end of each iteration with the current solution. The arguments to the function are the iteration number, a tensor containing the current solution, and the total time elapsed from the beginning of training (note that this time explicitly excludes any time taken by the callback itself).
- Returns:
The solution X to the linear system.
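The calling convention described above (an mmv function, an optional initial solution, and a per-iteration callback) can be illustrated with a minimal, library-free sketch of conjugate gradient. This is not the Falkon implementation: `cg_solve`, `dot`, and the `tol` argument are hypothetical names introduced here, plain Python lists stand in for torch tensors, and only a single right-hand-side vector is handled.

```python
import time
from typing import Callable, List, Optional

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equally sized vectors."""
    return sum(a * b for a, b in zip(u, v))


def cg_solve(
    x0: Optional[Vector],
    b: Vector,
    mmv: Callable[[Vector], Vector],
    max_iter: int,
    tol: float = 1e-10,  # hypothetical stand-in for the solver's tolerance options
    callback: Optional[Callable[[int, Vector, float], None]] = None,
) -> Vector:
    # Start from the zero vector when no initial solution is given.
    x = list(x0) if x0 is not None else [0.0] * len(b)
    r = [bi - axi for bi, axi in zip(b, mmv(x))]  # initial residual b - A x
    p = list(r)                                   # initial search direction
    rs_old = dot(r, r)
    t_start = time.perf_counter()
    t_callback = 0.0  # time spent inside callbacks, excluded from the report
    for i in range(max_iter):
        if rs_old ** 0.5 < tol:  # early stopping on the residual norm
            break
        ap = mmv(p)  # the only place where A is touched
        step = rs_old / dot(p, ap)
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        if callback is not None:
            elapsed = time.perf_counter() - t_start - t_callback
            c0 = time.perf_counter()
            callback(i + 1, x, elapsed)
            t_callback += time.perf_counter() - c0
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Small symmetric positive-definite system, exposed only through mmv.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]


def mmv(v: Vector) -> Vector:
    return [dot(row, v) for row in A]


iterations: List[int] = []
x = cg_solve(None, b, mmv, max_iter=10,
             callback=lambda it, sol, t: iterations.append(it))
```

A preconditioner would enter the same way: the mmv function would apply the preconditioned operator instead of A, and the solver loop itself would stay unchanged.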
FalkonConjugateGradient
- class falkon.optim.FalkonConjugateGradient(kernel: Kernel, preconditioner: Preconditioner, opt: FalkonOptions, weight_fn=None)
Preconditioned conjugate gradient solver, optimized for the Falkon algorithm.
The linear system solved is
\[\widetilde{B}^\top H \widetilde{B} \beta = \widetilde{B}^\top K_{nm}^\top Y\]
where \(\widetilde{B}\) is the approximate preconditioner
\[\widetilde{B} = \frac{1}{\sqrt{n}} T^{-1} A^{-1}\]
\(\beta\) is the preconditioned solution vector (from which we can get \(\alpha = \widetilde{B}\beta\)), and \(H\) is the \(m\times m\) sketched matrix
\[H = K_{nm}^\top K_{nm} + \lambda n K_{mm}\]
- Parameters:
kernel – The kernel class used for the CG algorithm
preconditioner – The approximate Falkon preconditioner. The class should allow triangular solves with both \(T\) and \(A\) and multiple right-hand sides. The preconditioner should already have been initialized with a set of Nystrom centers. If the Nystrom centers used for CG are different from the ones used for the preconditioner, the CG method could converge very slowly.
opt – Options passed to the CG solver and to the kernel for computations.
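As a short check of the relation \(\alpha = \widetilde{B}\beta\) given in the class description, and assuming \(\widetilde{B}\) is invertible (an assumption made here for illustration), multiplying the preconditioned system on the left by \(\widetilde{B}^{-\top}\) gives
\[H \widetilde{B} \beta = K_{nm}^\top Y \quad\Longrightarrow\quad H \alpha = K_{nm}^\top Y, \qquad \alpha = \widetilde{B}\beta\]
so \(\alpha\) solves the unpreconditioned system with the same sketched matrix \(H\).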
See also
falkon.preconditioner.FalkonPreconditioner
for the preconditioner class which is responsible for computing matrices T and A.