falkon.kernels

Kernel

class falkon.kernels.kernel.Kernel(name: str, kernel_type: str, opt: Optional[falkon.options.FalkonOptions])

Abstract kernel class. Kernels should inherit from this class, overriding appropriate methods.

To extend Falkon with new kernels, you should read the documentation of this class carefully. In particular, you will need to implement _prepare(), _apply() and _finalize() methods.

Other methods which may optionally be implemented are the sparse versions _prepare_sparse() and _apply_sparse(). Note that there is no _finalize_sparse: _finalize() takes a partial kernel matrix as input, and kernel matrices are assumed to be dense even when the data is sparse. Therefore, the _finalize() method is used for sparse data as well.

To provide a KeOps implementation, you must also inherit from the KeopsKernelMixin class and implement its abstract methods. If a KeOps implementation is provided, make sure to override _decide_mmv_impl() and _decide_dmmv_impl() so that the KeOps implementation is actually used. See the falkon.kernels.PolynomialKernel class for an example of how to integrate KeOps in a kernel.

Parameters
• name – A short name for the kernel (e.g. “Gaussian”)

• kernel_type – A short string describing the type of kernel. This may be used to create specialized functions in falkon.mmv_ops which optimize for a specific kernel type.

• opt – Base set of options to be used for operations involving this kernel.

__call__(X1, X2, out=None, opt: Optional[falkon.options.FalkonOptions] = None)

Compute the kernel matrix between X1 and X2

Parameters
• X1 (torch.Tensor) – The first data-matrix for computing the kernel. Of shape (N x D): N samples in D dimensions.

• X2 (torch.Tensor) – The second data-matrix for computing the kernel. Of shape (M x D): M samples in D dimensions. Set X2 == X1 to compute a symmetric kernel.

• out (torch.Tensor or None) – Optional tensor of shape (N x M) to hold the output. If not provided it will be created.

• opt (Optional[FalkonOptions]) – Options to be used for computing the operation. Useful are the memory size options and CUDA options.

Returns

out (torch.Tensor) – The kernel between X1 and X2.

abstract _apply(X1, X2, out) → None

Main kernel operation, usually matrix multiplication.

This function will be called with two blocks of data which may be subsampled on the first and second dimension (i.e. X1 may be of size n x d where n << N and d << D). The output shall be stored in the out argument, and not be returned.

Parameters
• X1 (torch.Tensor) – (n x d) tensor. It is a block of the X1 input matrix, possibly subsampled in the first dimension.

• X2 (torch.Tensor) – (m x d) tensor. It is a block of the X2 input matrix, possibly subsampled in the first dimension.

• out (torch.Tensor) – (n x m) tensor. A tensor in which the output of the operation shall be accumulated. This tensor is initialized to 0 before calling _apply, but in case of subsampling of the data along the second dimension, multiple calls will be needed to compute a single (n x m) output block. In such case, the first call to this method will have a zeroed tensor, while subsequent calls will simply reuse the same object.
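
The accumulation behaviour described for out can be illustrated in plain PyTorch, without Falkon. This is a minimal sketch, assuming a linear (dot-product) kernel and a fixed feature-block width; the function name apply_linear_block and the blocking loop are illustrative and are not Falkon's own code.

```python
import torch

def apply_linear_block(X1_blk, X2_blk, out):
    # Accumulate the partial inner products of this feature block into `out`.
    # Nothing is returned: the result lives in `out`.
    out.addmm_(X1_blk, X2_blk.T)

torch.manual_seed(0)
X1 = torch.randn(6, 10, dtype=torch.float64)   # n = 6, D = 10
X2 = torch.randn(4, 10, dtype=torch.float64)   # m = 4, D = 10

out = torch.zeros(6, 4, dtype=torch.float64)   # zeroed once, before the first call
for start in range(0, X1.shape[1], 3):         # feature blocks of width 3 (the last is narrower)
    cols = slice(start, start + 3)
    apply_linear_block(X1[:, cols], X2[:, cols], out)

full = X1 @ X2.T                               # reference computed in one shot
```

After the loop, out holds the same values as the one-shot product, which is why later calls must reuse the tensor rather than overwrite it.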

abstract _apply_sparse(X1, X2, out) → None

Main kernel computation for sparse tensors.

Unlike the _apply() method, the X1 and X2 tensors are only subsampled along the first dimension. Note that the out tensor is not sparse.

Parameters
_decide_dmmv_impl(X1, X2, v, w, opt: falkon.options.FalkonOptions)

Choose which dmmv function to use for this data.

Note that dmmv functions compute double kernel-vector products (see dmmv() for an explanation of what they are).

Parameters
• X1 (torch.Tensor) – First data matrix, of shape (N x D)

• X2 (torch.Tensor) – Second data matrix, of shape (M x D)

• v (torch.Tensor or None) – Vector for the matrix-vector multiplication (M x T)

• w (torch.Tensor or None) – Vector for the matrix-vector multiplication (N x T)

• opt (FalkonOptions) – Falkon options. Options may be specified to force GPU or CPU usage.

Returns

dmmv_fn – A function which performs the dmmv operation.

Notes

This function decides based on the inputs: if the inputs are sparse, it will choose the sparse implementations; if CUDA is detected, it will choose the CUDA implementation; otherwise it will simply choose the basic CPU implementation.

_decide_mm_impl(X1, X2, opt: falkon.options.FalkonOptions)

Choose which mm function to use for this data.

Note that mm functions compute the kernel itself so KeOps may not be used.

Parameters
• X1 (torch.Tensor) – First data matrix, of shape (N x D)

• X2 (torch.Tensor) – Second data matrix, of shape (M x D)

• opt (FalkonOptions) – Falkon options. Options may be specified to force GPU or CPU usage.

Returns

mm_fn – A function which performs the mm operation.

Notes

This function decides based on the inputs: if the inputs are sparse, it will choose the sparse implementations; if CUDA is detected, it will choose the CUDA implementation; otherwise it will simply choose the basic CPU implementation.

_decide_mmv_impl(X1, X2, v, opt: falkon.options.FalkonOptions)

Choose which mmv function to use for this data.

Note that mmv functions compute the kernel-vector product.

Parameters
• X1 (torch.Tensor) – First data matrix, of shape (N x D)

• X2 (torch.Tensor) – Second data matrix, of shape (M x D)

• v (torch.Tensor) – Vector for the matrix-vector multiplication (M x T)

• opt (FalkonOptions) – Falkon options. Options may be specified to force GPU or CPU usage.

Returns

mmv_fn – A function which performs the mmv operation.

Notes

This function decides based on the inputs: if the inputs are sparse, it will choose the sparse implementations; if CUDA is detected, it will choose the CUDA implementation; otherwise it will simply choose the basic CPU implementation.
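
The decision order in the notes can be sketched as a small dispatcher. This is a hypothetical illustration only: the function name choose_impl, its boolean arguments and the returned labels are not part of Falkon, and it assumes that the sparse check takes precedence over the CUDA check.

```python
def choose_impl(is_sparse: bool, cuda_detected: bool) -> str:
    """Hypothetical dispatcher mirroring the order given in the notes."""
    if is_sparse:
        return "sparse"      # sparse inputs: use the sparse implementation
    if cuda_detected:
        return "cuda"        # dense inputs with CUDA detected
    return "cpu"             # fallback: basic CPU implementation

print(choose_impl(is_sparse=False, cuda_detected=True))   # cuda
print(choose_impl(is_sparse=False, cuda_detected=False))  # cpu
```

A kernel with a KeOps implementation would add its own branch ahead of these checks when overriding the corresponding method.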

abstract _finalize(A, d)

Final actions to be performed on a partial kernel matrix.

All elementwise operations on the kernel matrix should be performed in this method. Operations should be performed inplace by modifying the matrix A, to improve memory efficiency. If operations are not in-place, out-of-memory errors are possible when using the GPU.

Parameters
Returns

A – The same tensor as the input, if operations are performed in-place. Otherwise another tensor of the same shape.

abstract _prepare(X1, X2) → Any

Pre-processing operations necessary to compute the kernel.

This function will be called with two blocks of data which may be subsampled on the first dimension (i.e. X1 may be of size n x D where n << N). The function should not modify X1 and X2. If necessary, it may return some data which is then made available to the _finalize() method.

For example, in the Gaussian kernel, this method is used to compute the squared norms of the datasets.
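
The division of labour between the three methods can be sketched in plain PyTorch for the Gaussian case. This is a minimal sketch under stated assumptions, not Falkon's implementation: the function names, the extra sigma argument of the finalize step and the clamp against round-off are all illustrative choices.

```python
import torch

def prepare_gaussian(X1, X2):
    # Squared row norms, computed once per block and handed to the finalize step.
    return (X1 * X1).sum(dim=1, keepdim=True), (X2 * X2).sum(dim=1, keepdim=True)

def apply_gaussian(X1, X2, out):
    # Main operation: accumulate the inner products into the pre-zeroed tile.
    out.addmm_(X1, X2.T)

def finalize_gaussian(A, norms, sigma):
    sq1, sq2 = norms
    # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 <x, y>, built in place on A.
    A.mul_(-2.0).add_(sq1).add_(sq2.T)
    A.clamp_(min=0.0)                      # guard against tiny negative round-off
    A.mul_(-0.5 / sigma**2).exp_()         # element-wise operations, still in place
    return A

torch.manual_seed(0)
X1 = torch.randn(5, 3, dtype=torch.float64)
X2 = torch.randn(7, 3, dtype=torch.float64)
sigma = 2.0

tile = torch.zeros(5, 7, dtype=torch.float64)
norms = prepare_gaussian(X1, X2)
apply_gaussian(X1, X2, tile)
K = finalize_gaussian(tile, norms, sigma)

# Direct evaluation of exp(-||x - y||^2 / (2 sigma^2)) for comparison.
reference = torch.exp(-torch.cdist(X1, X2) ** 2 / (2 * sigma**2))
```

Here K is the same tensor object as tile, so no second (n x m) buffer is allocated.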

Parameters
• X1 (torch.Tensor) – (n x D) tensor. It is a block of the X1 input matrix, possibly subsampled in the first dimension.

• X2 (torch.Tensor) – (m x D) tensor. It is a block of the X2 input matrix, possibly subsampled in the first dimension.

Returns

abstract _prepare_sparse(X1, X2)

Data preprocessing for sparse tensors.

This is the equivalent of the _prepare() method for sparse tensors.

Parameters
Returns

Data derived from X1 and X2 which is needed by the _finalize() method when finishing the computation of a kernel tile.

dmmv(X1, X2, v, w, out=None, opt: Optional[falkon.options.FalkonOptions] = None)

Compute double matrix-vector multiplications where the matrix is the current kernel.

The general form of dmmv operations is: Kernel(X2, X1) @ (Kernel(X1, X2) @ v + w) where if v is None, then we simply have Kernel(X2, X1) @ w and if w is None we remove the additive factor. At least one of w and v must be provided.

Parameters
• X1 (torch.Tensor) – The first data-matrix for computing the kernel. Of shape (N x D): N samples in D dimensions.

• X2 (torch.Tensor) – The second data-matrix for computing the kernel. Of shape (M x D): M samples in D dimensions. Set X2 == X1 to compute a symmetric kernel.

• v (torch.Tensor or None) – A vector to compute the matrix-vector product. This may also be a matrix of shape (M x T), but if T is very large the operations will be much slower.

• w (torch.Tensor or None) – A vector to compute matrix-vector products. This may also be a matrix of shape (N x T) but if T is very large the operations will be much slower.

• out (torch.Tensor or None) – Optional tensor of shape (M x T) to hold the output. If not provided it will be created.

• opt (Optional[FalkonOptions]) – Options to be used for computing the operation. Useful are the memory size options and CUDA options.

Returns

out (torch.Tensor) – The (M x T) output.

Examples

>>> import falkon, torch
>>> k = falkon.kernels.GaussianKernel(sigma=2)  # You can substitute the Gaussian kernel by any other.
>>> X1 = torch.randn(100, 3)  # N is 100, D is 3
>>> X2 = torch.randn(150, 3)  # M is 150
>>> v = torch.randn(150, 1)
>>> w = torch.randn(100, 1)
>>> out = k.dmmv(X1, X2, v, w, out=None)
>>> out.shape
torch.Size([150, 1])
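
The general dmmv formula and its two special cases can be checked against a dense reference written in plain PyTorch. This is a sketch for small inputs only, assuming a Gaussian kernel evaluated directly; gaussian_kernel and dense_dmmv are illustrative names, not Falkon functions, and the full kernel matrix is materialised here, which the library routines are designed to avoid.

```python
import torch

def gaussian_kernel(A, B, sigma):
    # Dense Gaussian kernel matrix between the rows of A and the rows of B.
    return torch.exp(-torch.cdist(A, B) ** 2 / (2 * sigma**2))

def dense_dmmv(X1, X2, v, w, sigma):
    if v is None and w is None:
        raise ValueError("At least one of v and w must be provided")
    K12 = gaussian_kernel(X1, X2, sigma)        # Kernel(X1, X2), shape (N x M)
    inner = w if v is None else K12 @ v         # (N x T)
    if v is not None and w is not None:
        inner = inner + w                       # additive term, only when both are given
    return K12.T @ inner                        # Kernel(X2, X1) @ inner, shape (M x T)

torch.manual_seed(0)
X1 = torch.randn(6, 3, dtype=torch.float64)     # N = 6
X2 = torch.randn(4, 3, dtype=torch.float64)     # M = 4
v = torch.randn(4, 2, dtype=torch.float64)      # (M x T)
w = torch.randn(6, 2, dtype=torch.float64)      # (N x T)

both = dense_dmmv(X1, X2, v, w, sigma=2.0)
only_w = dense_dmmv(X1, X2, None, w, sigma=2.0)
only_v = dense_dmmv(X1, X2, v, None, sigma=2.0)
```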

extra_mem() → Dict[str, float]

Compute the amount of extra memory which will be needed when computing this kernel.

Often kernel computation needs some extra memory allocations. To avoid using too large block-sizes which may lead to OOM errors, you should declare any such extra allocations for your kernel here.

Indicate extra allocations as coefficients on the required dimensions. For example, if computing a kernel needs to re-allocate the data-matrix (which is of size n * d), the return dictionary will be: {'nd': 1}. Other possible coefficients are on d, n, m which are respectively the data-dimension, the number of data-points in the first data matrix and the number of data-points in the second matrix. Pairwise combinations of the three dimensions are possible (i.e. nd, nm, md). Make sure to specify the dictionary keys exactly as written here, since they will not be recognized otherwise.

Returns

extra_allocs (dictionary) – A dictionary from strings indicating on which dimensions the extra-allocation is needed (allowed strings: 'n', 'm', 'd', 'nm', 'nd', 'md') to floating-point numbers indicating how many extra-allocations are needed.
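
To make the meaning of the coefficients concrete, the sketch below turns such a dictionary into a count of extra elements for given block sizes. The helper extra_elements is hypothetical and only illustrates the convention described here; how Falkon itself consumes the dictionary is not shown in this section.

```python
from math import prod

ALLOWED_KEYS = {"n", "m", "d", "nm", "nd", "md"}

def extra_elements(extra_allocs, n, m, d):
    """Hypothetical helper: extra elements implied by the coefficients."""
    sizes = {"n": n, "m": m, "d": d}
    total = 0.0
    for key, coeff in extra_allocs.items():
        if key not in ALLOWED_KEYS:
            raise KeyError(f"unrecognised key: {key!r}")
        # Each letter of the key contributes one dimension to the product.
        total += coeff * prod(sizes[dim] for dim in key)
    return total

# One extra copy of an (n x d) data block with n = 1000 and d = 10.
print(extra_elements({"nd": 1}, n=1000, m=500, d=10))  # 10000.0
```

A key such as 'dn' is rejected, which mirrors the warning that keys written differently are not recognized.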

mmv(X1, X2, v, out=None, opt: Optional[falkon.options.FalkonOptions] = None)

Compute matrix-vector multiplications where the matrix is the current kernel.

Parameters
• X1 (torch.Tensor) – The first data-matrix for computing the kernel. Of shape (N x D): N samples in D dimensions.

• X2 (torch.Tensor) – The second data-matrix for computing the kernel. Of shape (M x D): M samples in D dimensions. Set X2 == X1 to compute a symmetric kernel.

• v (torch.Tensor) – A vector to compute the matrix-vector product. This may also be a matrix of shape (M x T), but if T is very large the operations will be much slower.

• out (torch.Tensor or None) – Optional tensor of shape (N x T) to hold the output. If not provided it will be created.

• opt (Optional[FalkonOptions]) – Options to be used for computing the operation. Useful are the memory size options and CUDA options.

Returns

out (torch.Tensor) – The (N x T) output.

Examples

>>> import falkon, torch
>>> k = falkon.kernels.GaussianKernel(sigma=2)  # You can substitute the Gaussian kernel by any other.
>>> X1 = torch.randn(100, 3)
>>> X2 = torch.randn(150, 3)
>>> v = torch.randn(150, 1)
>>> out = k.mmv(X1, X2, v, out=None)
>>> out.shape
torch.Size([100, 1])
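
The reason a dedicated mmv routine is useful is that the product can be formed without ever holding the full (N x M) kernel matrix. The sketch below shows the idea in plain PyTorch by processing X1 in row blocks; it assumes a Gaussian kernel and a fixed block size, and the names gaussian_kernel and blockwise_mmv are illustrative rather than Falkon's.

```python
import torch

def gaussian_kernel(A, B, sigma):
    # Dense Gaussian kernel matrix between the rows of A and the rows of B.
    return torch.exp(-torch.cdist(A, B) ** 2 / (2 * sigma**2))

def blockwise_mmv(X1, X2, v, sigma, block_size):
    out = torch.empty(X1.shape[0], v.shape[1], dtype=v.dtype)
    for start in range(0, X1.shape[0], block_size):
        rows = slice(start, start + block_size)
        # Only a (block_size x M) slice of the kernel exists at any time.
        out[rows] = gaussian_kernel(X1[rows], X2, sigma) @ v
    return out

torch.manual_seed(0)
X1 = torch.randn(100, 3, dtype=torch.float64)
X2 = torch.randn(150, 3, dtype=torch.float64)
v = torch.randn(150, 1, dtype=torch.float64)

out_blocked = blockwise_mmv(X1, X2, v, sigma=2.0, block_size=32)
out_dense = gaussian_kernel(X1, X2, 2.0) @ v
```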

class falkon.kernels.keops_helpers.KeopsKernelMixin

keops_dmmv_helper(X1, X2, v, w, kernel, out, opt, mmv_fn)

performs fnc(X1*X2’, X1, X2)’ * ( fnc(X1*X2’, X1, X2) * v + w )

Parameters
• X1 (torch.Tensor) – N x D tensor

• X2 (torch.Tensor) – M x D tensor

• v (torch.Tensor) – M x T tensor. Often, T = 1 and this is a vector.

• w (torch.Tensor) – N x T tensor. Often, T = 1 and this is a vector.

• kernel (falkon.kernels.kernel.Kernel) – Kernel instance to calculate this kernel. This is only here to preserve API structure.

• out (torch.Tensor or None) – Optional tensor in which to store the output (M x T)

• opt (FalkonOptions) – Options to be passed downstream

• mmv_fn (Callable) – The function which performs the mmv operation. Two mmv operations are (usually) needed for a dmmv operation.
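
How two mmv operations combine into one dmmv can be sketched as follows. This is an illustration under assumptions, not the helper's actual body: dmmv_from_mmv and dense_linear_mmv are hypothetical names, mmv_fn is given a simplified three-argument signature, and a dense linear kernel stands in for a real mmv implementation.

```python
import torch

def dmmv_from_mmv(X1, X2, v, w, mmv_fn):
    # First call: inner part Kernel(X1, X2) @ v, of shape (N x T), plus w.
    inner = mmv_fn(X1, X2, v) + w
    # Second call: swap the data matrices to multiply by Kernel(X2, X1).
    return mmv_fn(X2, X1, inner)

def dense_linear_mmv(A, B, u):
    # Stand-in mmv_fn for the linear kernel A @ B.T (hypothetical).
    return (A @ B.T) @ u

torch.manual_seed(0)
X1 = torch.randn(5, 3, dtype=torch.float64)   # N x D
X2 = torch.randn(4, 3, dtype=torch.float64)   # M x D
v = torch.randn(4, 1, dtype=torch.float64)    # M x T
w = torch.randn(5, 1, dtype=torch.float64)    # N x T

out = dmmv_from_mmv(X1, X2, v, w, dense_linear_mmv)   # M x T
```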

Notes

The double MMV is implemented as two separate calls to the user-supplied mmv_fn. The first one calculates the inner part of the formula (NxT) while the second calculates the outer matrix-vector multiplication which

Gaussian kernel

class falkon.kernels.GaussianKernel(sigma: Union[float, torch.Tensor], opt: Optional[falkon.options.FalkonOptions] = None)

Class for computing the Gaussian kernel and related kernel-vector products

The Gaussian kernel is one of the most common and effective kernel embeddings since it is infinite dimensional, and governed by a single parameter. The kernel length-scale determines the width of the Gaussian distribution which is placed on top of each point. A larger sigma corresponds to a wide Gaussian, so that the relative influence of far away points will be high for computing the kernel at a given datum. On the opposite side of the spectrum, a small sigma means that only nearby points will influence the kernel.

Parameters
• sigma – The length-scale of the kernel. This can be a scalar, and then it corresponds to the standard deviation of the Gaussian distribution from which the kernel is derived. If sigma is a vector of size d (where d is the dimensionality of the data), it is interpreted as the diagonal standard deviation of the Gaussian distribution. It can also be a matrix of size d*d, in which case sigma is interpreted as the precision matrix (inverse covariance).

• opt – Additional options to be forwarded to the matrix-vector multiplication routines.

Examples

Creating a Gaussian kernel with a single length-scale. Operations on this kernel will not use KeOps.

>>> K = GaussianKernel(sigma=3.0, opt=FalkonOptions(keops_active="no"))


Creating a Gaussian kernel with a different length-scale per dimension

>>> K = GaussianKernel(sigma=torch.tensor([1.0, 3.5, 7.0]))


Creating a Gaussian kernel object with full covariance matrix (randomly chosen)

>>> mat = torch.randn(3, 3, dtype=torch.float64)
>>> sym_mat = mat @ mat.T
>>> K = GaussianKernel(sigma=sym_mat)
>>> K
GaussianKernel(sigma=tensor([[ 2.0909,  0.0253, -0.2490],
[ 0.0253,  0.3399, -0.5158],
[-0.2490, -0.5158,  4.4922]], dtype=torch.float64))  #random


Notes

The Gaussian kernel with a single length-scale follows

$k(x, x') = \exp\left(-\dfrac{\lVert x - x' \rVert^2}{2\sigma^2}\right)$

When the length-scales are specified as a matrix, the RBF kernel is determined by

$k(x, x') = \exp\left(-\dfrac{1}{2}(x - x')^\top \Sigma (x - x')\right)$

In both cases, the actual computation follows a different path, working on the expanded norm.
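
The three forms that sigma can take are related, and the relation can be checked numerically in plain PyTorch. This sketch takes the matrix case to mean the quadratic form (x - x')ᵀ Σ (x - x') with Σ the precision matrix, which is an assumption based on the parameter description; the function names are illustrative and none of this is Falkon's own code.

```python
import torch

def gaussian_scalar(X1, X2, sigma):
    # Single length-scale: exp(-||x - x'||^2 / (2 sigma^2)).
    return torch.exp(-torch.cdist(X1, X2) ** 2 / (2 * sigma**2))

def gaussian_diag(X1, X2, sigma_vec):
    # One length-scale per dimension: rescale each column, then use sigma = 1.
    return gaussian_scalar(X1 / sigma_vec, X2 / sigma_vec, 1.0)

def gaussian_precision(X1, X2, P):
    diff = X1[:, None, :] - X2[None, :, :]                   # (N x M x D) pairwise differences
    quad = torch.einsum("nmi,ij,nmj->nm", diff, P, diff)     # (x - x')^T P (x - x')
    return torch.exp(-0.5 * quad)

torch.manual_seed(0)
X1 = torch.randn(4, 3, dtype=torch.float64)
X2 = torch.randn(5, 3, dtype=torch.float64)

K_scalar = gaussian_scalar(X1, X2, 2.0)
K_diag = gaussian_diag(X1, X2, torch.full((3,), 2.0, dtype=torch.float64))
K_prec = gaussian_precision(X1, X2, torch.eye(3, dtype=torch.float64) / 2.0**2)

# Different length-scale per dimension, written both as a vector and as a precision matrix.
sig = torch.tensor([1.0, 3.5, 7.0], dtype=torch.float64)
K_diag2 = gaussian_diag(X1, X2, sig)
K_prec2 = gaussian_precision(X1, X2, torch.diag(1.0 / sig**2))
```

Under this reading, a scalar sigma corresponds to the precision matrix I / sigma², and a vector sigma to a diagonal precision matrix with entries 1 / sigma_i².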

__call__(X1, X2, out=None, opt: Optional[falkon.options.FalkonOptions] = None)

Compute the kernel matrix between X1 and X2

Parameters
• X1 (torch.Tensor) – The first data-matrix for computing the kernel. Of shape (N x D): N samples in D dimensions.

• X2 (torch.Tensor) – The second data-matrix for computing the kernel. Of shape (M x D): M samples in D dimensions. Set X2 == X1 to compute a symmetric kernel.

• out (torch.Tensor or None) – Optional tensor of shape (N x M) to hold the output. If not provided it will be created.

• opt (Optional[FalkonOptions]) – Options to be used for computing the operation. Useful are the memory size options and CUDA options.

Returns

out (torch.Tensor) – The kernel between X1 and X2.

dmmv(X1, X2, v, w, out=None, opt: Optional[falkon.options.FalkonOptions] = None)

Compute double matrix-vector multiplications where the matrix is the current kernel.

The general form of dmmv operations is: Kernel(X2, X1) @ (Kernel(X1, X2) @ v + w) where if v is None, then we simply have Kernel(X2, X1) @ w and if w is None we remove the additive factor. At least one of w and v must be provided.

Parameters
• X1 (torch.Tensor) – The first data-matrix for computing the kernel. Of shape (N x D): N samples in D dimensions.

• X2 (torch.Tensor) – The second data-matrix for computing the kernel. Of shape (M x D): M samples in D dimensions. Set X2 == X1 to compute a symmetric kernel.

• v (torch.Tensor or None) – A vector to compute the matrix-vector product. This may also be a matrix of shape (M x T), but if T is very large the operations will be much slower.

• w (torch.Tensor or None) – A vector to compute matrix-vector products. This may also be a matrix of shape (N x T) but if T is very large the operations will be much slower.

• out (torch.Tensor or None) – Optional tensor of shape (M x T) to hold the output. If not provided it will be created.

• opt (Optional[FalkonOptions]) – Options to be used for computing the operation. Useful are the memory size options and CUDA options.

Returns

out (torch.Tensor) – The (M x T) output.

Examples

>>> import falkon, torch
>>> k = falkon.kernels.GaussianKernel(sigma=2)  # You can substitute the Gaussian kernel by any other.
>>> X1 = torch.randn(100, 3)  # N is 100, D is 3
>>> X2 = torch.randn(150, 3)  # M is 150
>>> v = torch.randn(150, 1)
>>> w = torch.randn(100, 1)
>>> out = k.dmmv(X1, X2, v, w, out=None)
>>> out.shape
torch.Size([150, 1])

mmv(X1, X2, v, out=None, opt: Optional[falkon.options.FalkonOptions] = None)

Compute matrix-vector multiplications where the matrix is the current kernel.

Parameters
• X1 (torch.Tensor) – The first data-matrix for computing the kernel. Of shape (N x D): N samples in D dimensions.

• X2 (torch.Tensor) – The second data-matrix for computing the kernel. Of shape (M x D): M samples in D dimensions. Set X2 == X1 to compute a symmetric kernel.

• v (torch.Tensor) – A vector to compute the matrix-vector product. This may also be a matrix of shape (M x T), but if T is very large the operations will be much slower.

• out (torch.Tensor or None) – Optional tensor of shape (N x T) to hold the output. If not provided it will be created.

• opt (Optional[FalkonOptions]) – Options to be used for computing the operation. Useful are the memory size options and CUDA options.

Returns

out (torch.Tensor) – The (N x T) output.

Examples

>>> import falkon, torch
>>> k = falkon.kernels.GaussianKernel(sigma=2)  # You can substitute the Gaussian kernel by any other.
>>> X1 = torch.randn(100, 3)
>>> X2 = torch.randn(150, 3)
>>> v = torch.randn(150, 1)
>>> out = k.mmv(X1, X2, v, out=None)
>>> out.shape
torch.Size([100, 1])


Laplacian kernel

class falkon.kernels.LaplacianKernel(sigma: float, opt: Optional[falkon.options.BaseOptions] = None)

Class for computing the Laplacian kernel, and related kernel-vector products.

The Laplacian kernel is similar to the Gaussian kernel, but less sensitive to changes in the parameter sigma.

Parameters

sigma – The length-scale of the Laplacian kernel

Notes

The Laplacian kernel is determined by the following formula

$k(x, x') = \exp\left(-\frac{\lVert x - x' \rVert}{\sigma}\right)$

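
As a worked instance of the formula, the kernel can be evaluated densely in plain PyTorch. This is a reference sketch for small inputs, with laplacian_kernel as an illustrative name rather than a Falkon function.

```python
import torch

def laplacian_kernel(X1, X2, sigma):
    # exp(-||x - x'|| / sigma), using the (non-squared) Euclidean distance.
    return torch.exp(-torch.cdist(X1, X2) / sigma)

X = torch.tensor([[0.0, 0.0], [3.0, 4.0]], dtype=torch.float64)
K = laplacian_kernel(X, X, sigma=5.0)
# The two points are at distance 5, so the off-diagonal entries equal exp(-1).
```
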
__call__(X1, X2, out=None, opt: Optional[falkon.options.FalkonOptions] = None)

Compute the kernel matrix between X1 and X2

Parameters
• X1 (torch.Tensor) – The first data-matrix for computing the kernel. Of shape (N x D): N samples in D dimensions.

• X2 (torch.Tensor) – The second data-matrix for computing the kernel. Of shape (M x D): M samples in D dimensions. Set X2 == X1 to compute a symmetric kernel.

• out (torch.Tensor or None) – Optional tensor of shape (N x M) to hold the output. If not provided it will be created.

• opt (Optional[FalkonOptions]) – Options to be used for computing the operation. Useful are the memory size options and CUDA options.

Returns

out (torch.Tensor) – The kernel between X1 and X2.

dmmv(X1, X2, v, w, out=None, opt: Optional[falkon.options.FalkonOptions] = None)

Compute double matrix-vector multiplications where the matrix is the current kernel.

The general form of dmmv operations is: Kernel(X2, X1) @ (Kernel(X1, X2) @ v + w) where if v is None, then we simply have Kernel(X2, X1) @ w and if w is None we remove the additive factor. At least one of w and v must be provided.

Parameters
• X1 (torch.Tensor) – The first data-matrix for computing the kernel. Of shape (N x D): N samples in D dimensions.

• X2 (torch.Tensor) – The second data-matrix for computing the kernel. Of shape (M x D): M samples in D dimensions. Set X2 == X1 to compute a symmetric kernel.

• v (torch.Tensor or None) – A vector to compute the matrix-vector product. This may also be a matrix of shape (M x T), but if T is very large the operations will be much slower.

• w (torch.Tensor or None) – A vector to compute matrix-vector products. This may also be a matrix of shape (N x T) but if T is very large the operations will be much slower.

• out (torch.Tensor or None) – Optional tensor of shape (M x T) to hold the output. If not provided it will be created.

• opt (Optional[FalkonOptions]) – Options to be used for computing the operation. Useful are the memory size options and CUDA options.

Returns

out (torch.Tensor) – The (M x T) output.

Examples

>>> import falkon, torch
>>> k = falkon.kernels.GaussianKernel(sigma=2)  # You can substitute the Gaussian kernel by any other.
>>> X1 = torch.randn(100, 3)  # N is 100, D is 3
>>> X2 = torch.randn(150, 3)  # M is 150
>>> v = torch.randn(150, 1)
>>> w = torch.randn(100, 1)
>>> out = k.dmmv(X1, X2, v, w, out=None)
>>> out.shape
torch.Size([150, 1])

mmv(X1, X2, v, out=None, opt: Optional[falkon.options.FalkonOptions] = None)

Compute matrix-vector multiplications where the matrix is the current kernel.

Parameters
• X1 (torch.Tensor) – The first data-matrix for computing the kernel. Of shape (N x D): N samples in D dimensions.

• X2 (torch.Tensor) – The second data-matrix for computing the kernel. Of shape (M x D): M samples in D dimensions. Set X2 == X1 to compute a symmetric kernel.

• v (torch.Tensor) – A vector to compute the matrix-vector product. This may also be a matrix of shape (M x T), but if T is very large the operations will be much slower.

• out (torch.Tensor or None) – Optional tensor of shape (N x T) to hold the output. If not provided it will be created.

• opt (Optional[FalkonOptions]) – Options to be used for computing the operation. Useful are the memory size options and CUDA options.

Returns

out (torch.Tensor) – The (N x T) output.

Examples

>>> import falkon, torch
>>> k = falkon.kernels.GaussianKernel(sigma=2)  # You can substitute the Gaussian kernel by any other.
>>> X1 = torch.randn(100, 3)
>>> X2 = torch.randn(150, 3)
>>> v = torch.randn(150, 1)
>>> out = k.mmv(X1, X2, v, out=None)
>>> out.shape
torch.Size([100, 1])


Dot-Product kernels

Polynomial kernel

class falkon.kernels.PolynomialKernel(alpha: Union[torch.Tensor, float], beta: Union[torch.Tensor, float], degree: Union[torch.Tensor, float], opt: Optional[falkon.options.FalkonOptions] = None)

Polynomial kernel with multiplicative and additive constants.

Follows the formula

$(\alpha * X_1^\top X_2 + \beta)^{\mathrm{degree}}$

Where all operations apart from the matrix multiplication are taken element-wise.

Parameters
• alpha (float-like) – Multiplicative constant

• beta (float-like) – Additive constant

• degree (float-like) – Power of the polynomial kernel

• opt (Optional[FalkonOptions]) – Options which will be used in downstream kernel operations.
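
A dense evaluation of the formula in plain PyTorch makes the roles of the three constants explicit. This sketch assumes the usual layout with samples stored as rows, so the matrix product is written X1 @ X2.T; polynomial_kernel is an illustrative name, not a Falkon function.

```python
import torch

def polynomial_kernel(X1, X2, alpha, beta, degree):
    # Matrix multiplication first, then element-wise scale, shift and power.
    return (alpha * (X1 @ X2.T) + beta) ** degree

X1 = torch.tensor([[1.0, 2.0]])
X2 = torch.tensor([[3.0, 4.0]])
K = polynomial_kernel(X1, X2, alpha=0.5, beta=1.0, degree=2)
print(K.item())  # (0.5 * 11 + 1) ** 2 = 42.25
```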

__call__(X1, X2, out=None, opt: Optional[falkon.options.FalkonOptions] = None)

Compute the kernel matrix between X1 and X2

Parameters
• X1 (torch.Tensor) – The first data-matrix for computing the kernel. Of shape (N x D): N samples in D dimensions.

• X2 (torch.Tensor) – The second data-matrix for computing the kernel. Of shape (M x D): M samples in D dimensions. Set X2 == X1 to compute a symmetric kernel.

• out (torch.Tensor or None) – Optional tensor of shape (N x M) to hold the output. If not provided it will be created.

• opt (Optional[FalkonOptions]) – Options to be used for computing the operation. Useful are the memory size options and CUDA options.

Returns

out (torch.Tensor) – The kernel between X1 and X2.

dmmv(X1, X2, v, w, out=None, opt: Optional[falkon.options.FalkonOptions] = None)

Compute double matrix-vector multiplications where the matrix is the current kernel.

The general form of dmmv operations is: Kernel(X2, X1) @ (Kernel(X1, X2) @ v + w) where if v is None, then we simply have Kernel(X2, X1) @ w and if w is None we remove the additive factor. At least one of w and v must be provided.

Parameters
• X1 (torch.Tensor) – The first data-matrix for computing the kernel. Of shape (N x D): N samples in D dimensions.

• X2 (torch.Tensor) – The second data-matrix for computing the kernel. Of shape (M x D): M samples in D dimensions. Set X2 == X1 to compute a symmetric kernel.

• v (torch.Tensor or None) – A vector to compute the matrix-vector product. This may also be a matrix of shape (M x T), but if T is very large the operations will be much slower.

• w (torch.Tensor or None) – A vector to compute matrix-vector products. This may also be a matrix of shape (N x T) but if T is very large the operations will be much slower.

• out (torch.Tensor or None) – Optional tensor of shape (M x T) to hold the output. If not provided it will be created.

• opt (Optional[FalkonOptions]) – Options to be used for computing the operation. Useful are the memory size options and CUDA options.

Returns

out (torch.Tensor) – The (M x T) output.

Examples

>>> import falkon, torch
>>> k = falkon.kernels.GaussianKernel(sigma=2)  # You can substitute the Gaussian kernel by any other.
>>> X1 = torch.randn(100, 3)  # N is 100, D is 3
>>> X2 = torch.randn(150, 3)  # M is 150
>>> v = torch.randn(150, 1)
>>> w = torch.randn(100, 1)
>>> out = k.dmmv(X1, X2, v, w, out=None)
>>> out.shape
torch.Size([150, 1])

mmv(X1, X2, v, out=None, opt: Optional[falkon.options.FalkonOptions] = None)

Compute matrix-vector multiplications where the matrix is the current kernel.

Parameters
• X1 (torch.Tensor) – The first data-matrix for computing the kernel. Of shape (N x D): N samples in D dimensions.

• X2 (torch.Tensor) – The second data-matrix for computing the kernel. Of shape (M x D): M samples in D dimensions. Set X2 == X1 to compute a symmetric kernel.

• v (torch.Tensor) – A vector to compute the matrix-vector product. This may also be a matrix of shape (M x T), but if T is very large the operations will be much slower.

• out (torch.Tensor or None) – Optional tensor of shape (N x T) to hold the output. If not provided it will be created.

• opt (Optional[FalkonOptions]) – Options to be used for computing the operation. Useful are the memory size options and CUDA options.

Returns

out (torch.Tensor) – The (N x T) output.

Examples

>>> import falkon, torch
>>> k = falkon.kernels.GaussianKernel(sigma=2)  # You can substitute the Gaussian kernel by any other.
>>> X1 = torch.randn(100, 3)
>>> X2 = torch.randn(150, 3)
>>> v = torch.randn(150, 1)
>>> out = k.mmv(X1, X2, v, out=None)
>>> out.shape
torch.Size([100, 1])


Linear kernel

class falkon.kernels.LinearKernel(beta: Union[torch.Tensor, float] = 0.0, sigma: Union[torch.Tensor, float] = 1.0, opt: Optional[falkon.options.FalkonOptions] = None)

Linear Kernel with optional scaling and translation parameters.

The kernel implemented here is the covariance function in the original input space (i.e. X @ X.T) with optional parameters to translate and scale the kernel: beta + 1/(sigma**2) * X @ X.T

Parameters
• beta (float-like) – Additive constant for the kernel, default: 0.0

• sigma (float-like) – Multiplicative constant for the kernel. The kernel will be multiplied by the inverse of sigma squared. Default: 1.0

• opt (Optional[FalkonOptions]) – Options which will be used in downstream kernel operations.

Examples

>>> k = LinearKernel(beta=0.0, sigma=2.0)
>>> X = torch.randn(100, 3)  # 100 samples in 3 dimensions
>>> kernel_matrix = k(X, X)
>>> torch.testing.assert_allclose(kernel_matrix, X @ X.T * (1/2**2))

__call__(X1, X2, out=None, opt: Optional[falkon.options.FalkonOptions] = None)

Compute the kernel matrix between X1 and X2

Parameters
• X1 (torch.Tensor) – The first data-matrix for computing the kernel. Of shape (N x D): N samples in D dimensions.

• X2 (torch.Tensor) – The second data-matrix for computing the kernel. Of shape (M x D): M samples in D dimensions. Set X2 == X1 to compute a symmetric kernel.

• out (torch.Tensor or None) – Optional tensor of shape (N x M) to hold the output. If not provided it will be created.

• opt (Optional[FalkonOptions]) – Options to be used for computing the operation. Useful are the memory size options and CUDA options.

Returns

out (torch.Tensor) – The kernel between X1 and X2.

dmmv(X1, X2, v, w, out=None, opt: Optional[falkon.options.FalkonOptions] = None)

Compute double matrix-vector multiplications where the matrix is the current kernel.

The general form of dmmv operations is: Kernel(X2, X1) @ (Kernel(X1, X2) @ v + w) where if v is None, then we simply have Kernel(X2, X1) @ w and if w is None we remove the additive factor. At least one of w and v must be provided.

Parameters
• X1 (torch.Tensor) – The first data-matrix for computing the kernel. Of shape (N x D): N samples in D dimensions.

• X2 (torch.Tensor) – The second data-matrix for computing the kernel. Of shape (M x D): M samples in D dimensions. Set X2 == X1 to compute a symmetric kernel.

• v (torch.Tensor or None) – A vector to compute the matrix-vector product. This may also be a matrix of shape (M x T), but if T is very large the operations will be much slower.

• w (torch.Tensor or None) – A vector to compute matrix-vector products. This may also be a matrix of shape (N x T) but if T is very large the operations will be much slower.

• out (torch.Tensor or None) – Optional tensor of shape (M x T) to hold the output. If not provided it will be created.

• opt (Optional[FalkonOptions]) – Options to be used for computing the operation. Useful are the memory size options and CUDA options.

Returns

out (torch.Tensor) – The (M x T) output.

Examples

>>> import falkon, torch
>>> k = falkon.kernels.GaussianKernel(sigma=2)  # You can substitute the Gaussian kernel by any other.
>>> X1 = torch.randn(100, 3)  # N is 100, D is 3
>>> X2 = torch.randn(150, 3)  # M is 150
>>> v = torch.randn(150, 1)
>>> w = torch.randn(100, 1)
>>> out = k.dmmv(X1, X2, v, w, out=None)
>>> out.shape
torch.Size([150, 1])

mmv(X1, X2, v, out=None, opt: Optional[falkon.options.FalkonOptions] = None)

Compute matrix-vector multiplications where the matrix is the current kernel.

Parameters
• X1 (torch.Tensor) – The first data-matrix for computing the kernel. Of shape (N x D): N samples in D dimensions.

• X2 (torch.Tensor) – The second data-matrix for computing the kernel. Of shape (M x D): M samples in D dimensions. Set X2 == X1 to compute a symmetric kernel.

• v (torch.Tensor) – A vector to compute the matrix-vector product. This may also be a matrix of shape (M x T), but if T is very large the operations will be much slower.

• out (torch.Tensor or None) – Optional tensor of shape (N x T) to hold the output. If not provided it will be created.

• opt (Optional[FalkonOptions]) – Options to be used for computing the operation. Useful are the memory size options and CUDA options.

Returns

out (torch.Tensor) – The (N x T) output.

Examples

>>> import falkon, torch
>>> k = falkon.kernels.GaussianKernel(sigma=2)  # You can substitute the Gaussian kernel by any other.
>>> X1 = torch.randn(100, 3)
>>> X2 = torch.randn(150, 3)
>>> v = torch.randn(150, 1)
>>> out = k.mmv(X1, X2, v, out=None)
>>> out.shape
torch.Size([100, 1])


Sigmoid kernel

class falkon.kernels.SigmoidKernel(alpha: Union[torch.Tensor, float], beta: Union[torch.Tensor, float], opt: Optional[falkon.options.FalkonOptions] = None)

Sigmoid (or hyperbolic tangent) kernel function, with additive and multiplicative constants.

Follows the formula

$k(x, y) = \tanh(\alpha x^\top y + \beta)$

Parameters
• alpha (float-like) – Multiplicative constant

• beta (float-like) – Additive constant

• opt (Optional[FalkonOptions]) – Options which will be used in downstream kernel operations.
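
The formula can be evaluated densely in plain PyTorch to see how alpha scales the inner product and beta shifts it before the tanh. This is a reference sketch with samples stored as rows; sigmoid_kernel is an illustrative name, not a Falkon function.

```python
import torch

def sigmoid_kernel(X1, X2, alpha, beta):
    # tanh applied element-wise to the scaled and shifted inner products.
    return torch.tanh(alpha * (X1 @ X2.T) + beta)

X1 = torch.tensor([[1.0, 2.0]], dtype=torch.float64)
X2 = torch.tensor([[3.0, 4.0]], dtype=torch.float64)
# The inner product is 11, so the argument of tanh is 0.1 * 11 - 1 = 0.1.
K = sigmoid_kernel(X1, X2, alpha=0.1, beta=-1.0)
```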

__call__(X1, X2, out=None, opt: Optional[falkon.options.FalkonOptions] = None)

Compute the kernel matrix between X1 and X2

Parameters
• X1 (torch.Tensor) – The first data-matrix for computing the kernel. Of shape (N x D): N samples in D dimensions.

• X2 (torch.Tensor) – The second data-matrix for computing the kernel. Of shape (M x D): M samples in D dimensions. Set X2 == X1 to compute a symmetric kernel.

• out (torch.Tensor or None) – Optional tensor of shape (N x M) to hold the output. If not provided it will be created.

• opt (Optional[FalkonOptions]) – Options to be used for computing the operation. Useful are the memory size options and CUDA options.

Returns

out (torch.Tensor) – The kernel between X1 and X2.

dmmv(X1, X2, v, w, out=None, opt: Optional[falkon.options.FalkonOptions] = None)

Compute double matrix-vector multiplications where the matrix is the current kernel.

The general form of dmmv operations is: Kernel(X2, X1) @ (Kernel(X1, X2) @ v + w) where if v is None, then we simply have Kernel(X2, X1) @ w and if w is None we remove the additive factor. At least one of w and v must be provided.

Parameters
• X1 (torch.Tensor) – The first data-matrix for computing the kernel. Of shape (N x D): N samples in D dimensions.

• X2 (torch.Tensor) – The second data-matrix for computing the kernel. Of shape (M x D): M samples in D dimensions. Set X2 == X1 to compute a symmetric kernel.

• v (torch.Tensor or None) – A vector to compute the matrix-vector product. This may also be a matrix of shape (M x T), but if T is very large the operations will be much slower.

• w (torch.Tensor or None) – A vector to compute matrix-vector products. This may also be a matrix of shape (N x T) but if T is very large the operations will be much slower.

• out (torch.Tensor or None) – Optional tensor of shape (M x T) to hold the output. If not provided it will be created.

• opt (Optional[FalkonOptions]) – Options to be used for computing the operation. Useful are the memory size options and CUDA options.

Returns

out (torch.Tensor) – The (M x T) output.

Examples

>>> import falkon, torch
>>> k = falkon.kernels.GaussianKernel(sigma=2)  # You can substitute the Gaussian kernel by any other.
>>> X1 = torch.randn(100, 3)  # N is 100, D is 3
>>> X2 = torch.randn(150, 3)  # M is 150
>>> v = torch.randn(150, 1)
>>> w = torch.randn(100, 1)
>>> out = k.dmmv(X1, X2, v, w, out=None)
>>> out.shape
torch.Size([150, 1])

mmv(X1, X2, v, out=None, opt: Optional[falkon.options.FalkonOptions] = None)

Compute matrix-vector multiplications where the matrix is the current kernel.

Parameters
• X1 (torch.Tensor) – The first data-matrix for computing the kernel. Of shape (N x D): N samples in D dimensions.

• X2 (torch.Tensor) – The second data-matrix for computing the kernel. Of shape (M x D): M samples in D dimensions. Set X2 == X1 to compute a symmetric kernel.

• v (torch.Tensor) – A vector to compute the matrix-vector product. This may also be a matrix of shape (M x T), but if T is very large the operations will be much slower.

• out (torch.Tensor or None) – Optional tensor of shape (N x T) to hold the output. If not provided it will be created.

• opt (Optional[FalkonOptions]) – Options to be used for computing the operation. Useful are the memory size options and CUDA options.

Returns

out (torch.Tensor) – The (N x T) output.

Examples

>>> import falkon, torch
>>> k = falkon.kernels.GaussianKernel(sigma=2)  # You can substitute the Gaussian kernel by any other.
>>> X1 = torch.randn(100, 3)
>>> X2 = torch.randn(150, 3)
>>> v = torch.randn(150, 1)
>>> out = k.mmv(X1, X2, v, out=None)
>>> out.shape
torch.Size([100, 1])