falkon.mmv_ops

Algorithms to compute kernels and kernel-vector products blockwise, on both CPU and GPU. The algorithms in this module are kernel-agnostic; refer to falkon.kernels for the actual kernel implementations.

The KeOps wrapper only supports the mmv operation (kernel-vector products). The matrix-multiplication implementations instead support three different operations:

  • mm which calculates the full kernel

  • mmv which calculates kernel-vector products

  • dmmv which calculates double kernel-vector products: operations of the form \(K^\top (K v + w)\) where \(K\) is a kernel matrix and \(v\), \(w\) are vectors (one of which may be omitted).
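
For reference, the three operations can be written naively (without any blocking) in plain PyTorch. The Gaussian kernel here is only an illustration; any kernel from falkon.kernels can play the role of `K`:

```python
import torch

def gaussian_kernel(X1, X2, sigma=1.0):
    # [N, M] kernel matrix: k(x, y) = exp(-||x - y||^2 / (2 sigma^2))
    sq_dists = torch.cdist(X1, X2) ** 2
    return torch.exp(-sq_dists / (2 * sigma ** 2))

N, M, D, T = 100, 50, 3, 2
X1, X2 = torch.randn(N, D), torch.randn(M, D)
v, w = torch.randn(M, T), torch.randn(N, T)

K = gaussian_kernel(X1, X2)   # mm:   full [N, M] kernel matrix
mmv = K @ v                   # mmv:  [N, T] kernel-vector product
dmmv = K.T @ (K @ v + w)      # dmmv: [M, T] double kernel-vector product
```

The blocked implementations below compute the same quantities without ever materializing the full [N, M] kernel matrix (except for mm, where the full matrix is itself the output).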

run_keops_mmv

A thin wrapper to KeOps is provided to allow for block-splitting and multiple GPU usage. This only supports kernel-vector products.

falkon.mmv_ops.keops.run_keops_mmv(X1: torch.Tensor, X2: torch.Tensor, v: torch.Tensor, other_vars: List[torch.Tensor], out: Optional[torch.Tensor], formula: str, aliases: List[str], axis: int, reduction: str = 'Sum', opt: Optional[falkon.options.FalkonOptions] = None) → torch.Tensor
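
To illustrate the `formula` and `aliases` arguments, here is what they could look like for a Gaussian kernel-vector product. The strings follow the general KeOps Genred syntax (Vi/Vj for i- and j-indexed variables, Pm for parameters); the exact formula falkon uses internally may differ, so treat this as a hedged sketch rather than the library's own strings:

```python
# Hypothetical formula/aliases for a Gaussian kernel-vector product,
# written in KeOps Genred syntax (an assumption, not taken from falkon).
D, T = 3, 1
formula = "Exp(g * SqDist(x1, x2)) * v"
aliases = [
    f"x1 = Vi({D})",  # i-indexed variable: rows of X1
    f"x2 = Vj({D})",  # j-indexed variable: rows of X2
    f"v = Vj({T})",   # j-indexed vector to multiply by
    "g = Pm(1)",      # scalar parameter, e.g. -1 / (2 * sigma^2)
]
# The call would then reduce over the j axis (axis=1):
# out = run_keops_mmv(X1, X2, v, other_vars=[g], out=None,
#                     formula=formula, aliases=aliases, axis=1)
```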

fmm_cpu

Blocked kernel calculation on the CPU.

falkon.mmv_ops.fmm_cpu.fmm_cpu(X1: torch.Tensor, X2: torch.Tensor, kernel: falkon.kernels.kernel.Kernel, out: Optional[torch.Tensor], opt: falkon.options.BaseOptions) → torch.Tensor

Compute kernel value on matrices X1 and X2: out = kernel(X1, X2)

Parameters
  • X1 – [N, D] array

  • X2 – [M, D] array

  • kernel – Class representing the desired kernel function

  • out – Array for storing the kernel output. If None, will be allocated within the function.

  • opt – Basic options dictionary, used for determining available memory. Additionally, the no_single_kernel option is used to determine the accumulator data type.

Returns

out – [N, M] array. The kernel between X1 and X2.
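
A minimal sketch of the blocking idea, in plain torch rather than falkon's actual implementation: the [N, M] output is filled one row-block at a time, so each kernel block is computed independently (which is also what allows the work to be split across devices):

```python
import torch

def blocked_kernel(X1, X2, kernel_fn, block_size=32):
    # Fill the [N, M] output one row-block of X1 at a time.
    N, M = X1.shape[0], X2.shape[0]
    out = torch.empty(N, M, dtype=X1.dtype)
    for i in range(0, N, block_size):
        out[i:i + block_size] = kernel_fn(X1[i:i + block_size], X2)
    return out

# Example with a Gaussian-like kernel (illustrative choice)
gauss = lambda A, B: torch.exp(-torch.cdist(A, B) ** 2)
X1, X2 = torch.randn(100, 3), torch.randn(40, 3)
K = blocked_kernel(X1, X2, gauss)
```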

fmm_cpu_sparse

Blocked kernel calculation on the CPU for sparse datasets.

falkon.mmv_ops.fmm_cpu.fmm_cpu_sparse(X1: falkon.sparse.sparse_tensor.SparseTensor, X2: falkon.sparse.sparse_tensor.SparseTensor, kernel: falkon.kernels.kernel.Kernel, out: Optional[torch.Tensor], opt: falkon.options.BaseOptions)torch.Tensor

fmm_cuda

Blocked kernel calculation on GPUs.

falkon.mmv_ops.fmm_cuda.fmm_cuda(X1: torch.Tensor, X2: torch.Tensor, kernel: falkon.kernels.kernel.Kernel, out: Optional[torch.Tensor] = None, opt: Optional[falkon.options.BaseOptions] = None) → torch.Tensor

Performs fnc(X1 @ X2.T, X1, X2) in blocks on multiple GPUs.

fmm_cuda_sparse

Blocked kernel calculation on GPUs for sparse datasets.

falkon.mmv_ops.fmm_cuda.fmm_cuda_sparse(X1: falkon.sparse.sparse_tensor.SparseTensor, X2: falkon.sparse.sparse_tensor.SparseTensor, kernel: falkon.kernels.kernel.Kernel, out: Optional[torch.Tensor] = None, opt: Optional[falkon.options.BaseOptions] = None) → torch.Tensor

fmmv_cpu

Blocked kernel-vector products on CPU.

falkon.mmv_ops.fmmv_cpu.fmmv_cpu(X1, X2, v, kernel, out, opt)

Blockwise kernel-vector product

This function computes kernel(X1, X2) @ v in a blockwise fashion, to avoid having the whole N*M kernel matrix in memory at once. Note that while the principle is that of matrix-vector product, v can have more than one column.

Parameters
  • X1 – [N, D] array

  • X2 – [M, D] array

  • v – [M, T] array

  • kernel – Class representing the desired kernel function

  • out (torch.Tensor or None) – [N, T] array for storing the kernel-vector product output. If None, will be allocated within the function.

  • opt – Basic options dictionary, used for determining available memory.
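
The blockwise strategy can be sketched in plain torch (not falkon's implementation) as follows: only a [b, M] slab of the kernel matrix exists in memory at any time, while the output accumulates row-block by row-block:

```python
import torch

def blocked_mmv(X1, X2, v, kernel_fn, block_size=32):
    # Compute kernel(X1, X2) @ v without forming the full [N, M] kernel.
    N, T = X1.shape[0], v.shape[1]
    out = torch.empty(N, T, dtype=v.dtype)
    for i in range(0, N, block_size):
        Kb = kernel_fn(X1[i:i + block_size], X2)  # [b, M] block only
        out[i:i + block_size] = Kb @ v            # [b, T] output rows
    return out

gauss = lambda A, B: torch.exp(-torch.cdist(A, B) ** 2)
X1, X2 = torch.randn(100, 3), torch.randn(40, 3)
v = torch.randn(40, 2)
res = blocked_mmv(X1, X2, v, gauss)
```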

fmmv_cpu_sparse

Blocked kernel-vector products on CPU for sparse datasets.

falkon.mmv_ops.fmmv_cpu.fmmv_cpu_sparse(X1: falkon.sparse.sparse_tensor.SparseTensor, X2: falkon.sparse.sparse_tensor.SparseTensor, v: torch.Tensor, kernel: falkon.kernels.kernel.Kernel, out: Optional[torch.Tensor], opt: falkon.options.BaseOptions)

fdmmv_cpu

Blocked double kernel-vector products on CPU.

falkon.mmv_ops.fmmv_cpu.fdmmv_cpu(X1, X2, v, w, kernel, out, opt)

Calculate a double kernel-vector product.

This function computes the following quantity: kernel(X1, X2).T @ (kernel(X1, X2) @ v + w), where one of v or w may be omitted. All arrays passed to this function must be 2-dimensional, although the second dimension can be unitary.

The expression is not computed directly. We separate the computation into smaller blocks so as to reduce the total memory consumption (the large N*M kernel matrix is never stored wholly in RAM).

Parameters
  • X1 – [N, D] array

  • X2 – [M, D] array

  • v (torch.Tensor or None) – [M, T] array. But note that at least one of v or w must be specified.

  • w (torch.Tensor or None) – [N, T] array. But note that at least one of v or w must be specified.

  • kernel – Class representing the desired kernel function

  • out (torch.Tensor or None) – [M, T] array for storing the double kernel-vector product output. If None, will be allocated within the function.

  • opt – Basic options dictionary, used for determining available memory.
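
The double product decomposes over row-blocks of X1, since kernel(X1, X2).T @ (kernel(X1, X2) @ v + w) is a sum of per-block contributions. A minimal sketch in plain torch (not falkon's implementation):

```python
import torch

def blocked_dmmv(X1, X2, v, w, kernel_fn, block_size=32):
    # Computes kernel(X1, X2).T @ (kernel(X1, X2) @ v + w) over
    # row-blocks of X1; the full [N, M] kernel is never materialized.
    M = X2.shape[0]
    T = v.shape[1] if v is not None else w.shape[1]
    out = torch.zeros(M, T)
    for i in range(0, X1.shape[0], block_size):
        Kb = kernel_fn(X1[i:i + block_size], X2)   # [b, M] block
        tmp = w[i:i + block_size] if w is not None else 0
        if v is not None:
            tmp = tmp + Kb @ v                     # [b, T]
        out += Kb.T @ tmp                          # accumulate [M, T]
    return out

gauss = lambda A, B: torch.exp(-torch.cdist(A, B) ** 2)
X1, X2 = torch.randn(100, 3), torch.randn(40, 3)
v, w = torch.randn(40, 2), torch.randn(100, 2)
res = blocked_dmmv(X1, X2, v, w, gauss)
```

Note that the block results are summed into the output (rather than written to disjoint slices as in the mmv case), because every row-block of the kernel contributes to all M output rows.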

fdmmv_cpu_sparse

Blocked double kernel-vector products on CPU for sparse datasets.

falkon.mmv_ops.fmmv_cpu.fdmmv_cpu_sparse(X1: falkon.sparse.sparse_tensor.SparseTensor, X2: falkon.sparse.sparse_tensor.SparseTensor, v: Optional[torch.Tensor], w: Optional[torch.Tensor], kernel, out: Optional[torch.Tensor] = None, opt: Optional[falkon.options.BaseOptions] = None)

fmmv_cuda

Blocked kernel-vector products on GPUs.

falkon.mmv_ops.fmmv_cuda.fmmv_cuda(X1: torch.Tensor, X2: torch.Tensor, v: torch.Tensor, kernel, out: Optional[torch.Tensor] = None, opt: Optional[falkon.options.BaseOptions] = None) → torch.Tensor

Parameters
  • X1 – [N, D] array

  • X2 – [M, D] array

  • v – [M, T] array

Performs fnc(X1 @ X2.T, X1, X2) @ v (an [N, T] array) in blocks on multiple GPUs.

fmmv_cuda_sparse

Blocked kernel-vector products on GPUs for sparse datasets.

falkon.mmv_ops.fmmv_cuda.fmmv_cuda_sparse(X1: falkon.sparse.sparse_tensor.SparseTensor, X2: falkon.sparse.sparse_tensor.SparseTensor, v: torch.Tensor, kernel, out: Optional[torch.Tensor] = None, opt: Optional[falkon.options.BaseOptions] = None) → torch.Tensor

fdmmv_cuda

Blocked double kernel-vector products on GPUs.

falkon.mmv_ops.fmmv_cuda.fdmmv_cuda(X1: torch.Tensor, X2: torch.Tensor, v: Optional[torch.Tensor], w: Optional[torch.Tensor], kernel, out: Optional[torch.Tensor] = None, opt: Optional[falkon.options.BaseOptions] = None) → torch.Tensor

Parameters
  • X1 – [N, D] array

  • X2 – [M, D] array

  • v – [M, T] array

  • w – [N, T] array

Performs fnc(X1 @ X2.T, X1, X2).T @ (fnc(X1 @ X2.T, X1, X2) @ v + w) (an [M, T] array) in blocks on multiple GPUs. All inputs are assumed to share the same data type.

fdmmv_cuda_sparse

Blocked double kernel-vector products on GPUs for sparse datasets.

falkon.mmv_ops.fmmv_cuda.fdmmv_cuda_sparse(X1: falkon.sparse.sparse_tensor.SparseTensor, X2: falkon.sparse.sparse_tensor.SparseTensor, v: Optional[torch.Tensor], w: Optional[torch.Tensor], kernel, out: Optional[torch.Tensor] = None, opt: Optional[falkon.options.BaseOptions] = None) → torch.Tensor

incore_fmmv

falkon.mmv_ops.fmmv_incore.incore_fmmv(mat: torch.Tensor, vec: torch.Tensor, out: Optional[torch.Tensor] = None, transpose: bool = False, opt: Optional[falkon.options.FalkonOptions] = None) → torch.Tensor

incore_fdmmv

falkon.mmv_ops.fmmv_incore.incore_fdmmv(mat: torch.Tensor, vec: torch.Tensor, out: Optional[torch.Tensor] = None, opt: Optional[falkon.options.FalkonOptions] = None) → torch.Tensor
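
No descriptions are given for the in-core variants. Judging from their names and signatures, they appear to compute the same products for a kernel matrix that is already fully materialized in memory; this reading is an assumption, sketched below in plain torch:

```python
import torch

# Assumed semantics of the in-core variants, with the kernel
# matrix `mat` already fully materialized (an [N, M] tensor):
mat = torch.randn(100, 50)
vec = torch.randn(50, 2)

fmmv = mat @ vec               # incore_fmmv(mat, vec): [N, T]
                               # (with transpose=True, presumably
                               #  mat.T @ vec for an [N, T] vec)
fdmmv = mat.T @ (mat @ vec)    # incore_fdmmv(mat, vec): [M, T]
```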