kornia.feature

non_maxima_suppression2d(input: torch.Tensor, kernel_size: Tuple[int, int]) → torch.Tensor[source]

Applies non-maxima suppression to the input.

See NonMaximaSuppression2d for details.
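
A minimal usage sketch, assuming torch is imported and the function is in scope (the input values are random placeholders):

>>> resp = torch.rand(1, 1, 7, 7)  # a response map, Bx1xHxW
>>> suppressed = non_maxima_suppression2d(resp, (3, 3))  # 1x1x7x7, non-maximal values suppressed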

gftt_response(input: torch.Tensor, grads_mode: str = 'sobel', sigmas: Optional[torch.Tensor] = None) → torch.Tensor[source]

Computes the Shi-Tomasi cornerness function. The function does not do any normalization or NMS. The response map is computed according to the following formulation:

\[R = min(eig(M))\]

where:

\[\begin{split}M = \sum_{(x,y) \in W} \begin{bmatrix} I^{2}_x & I_x I_y \\ I_x I_y & I^{2}_y \\ \end{bmatrix}\end{split}\]
Parameters
  • input (torch.Tensor) – 4d tensor

  • grads_mode (string) – can be ‘sobel’ for standalone use or ‘diff’ for use on a Gaussian pyramid

  • sigmas (optional, torch.Tensor) – coefficients to be multiplied by the multichannel response. Should be of shape \((B)\). It is necessary for performing non-maxima suppression across different scale pyramid levels. See vlfeat.

Returns

the response map per channel.

Return type

torch.Tensor

Shape:
  • Input: \((B, C, H, W)\)

  • Output: \((B, C, H, W)\)

Examples

>>> input = torch.tensor([[[
...     [0., 0., 0., 0., 0., 0., 0.],
...     [0., 1., 1., 1., 1., 1., 0.],
...     [0., 1., 1., 1., 1., 1., 0.],
...     [0., 1., 1., 1., 1., 1., 0.],
...     [0., 1., 1., 1., 1., 1., 0.],
...     [0., 1., 1., 1., 1., 1., 0.],
...     [0., 0., 0., 0., 0., 0., 0.],
... ]]])  # 1x1x7x7
>>> # compute the response map
>>> gftt_response(input)
tensor([[[[0.0155, 0.0334, 0.0194, 0.0000, 0.0194, 0.0334, 0.0155],
          [0.0334, 0.0575, 0.0339, 0.0000, 0.0339, 0.0575, 0.0334],
          [0.0194, 0.0339, 0.0497, 0.0000, 0.0497, 0.0339, 0.0194],
          [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
          [0.0194, 0.0339, 0.0497, 0.0000, 0.0497, 0.0339, 0.0194],
          [0.0334, 0.0575, 0.0339, 0.0000, 0.0339, 0.0575, 0.0334],
          [0.0155, 0.0334, 0.0194, 0.0000, 0.0194, 0.0334, 0.0155]]]])
harris_response(input: torch.Tensor, k: Union[torch.Tensor, float] = 0.04, grads_mode: str = 'sobel', sigmas: Optional[torch.Tensor] = None) → torch.Tensor[source]

Computes the Harris cornerness function. The function does not do any normalization or NMS. The response map is computed according to the following formulation:

\[R = max(0, det(M) - k \cdot trace(M)^2)\]

where:

\[\begin{split}M = \sum_{(x,y) \in W} \begin{bmatrix} I^{2}_x & I_x I_y \\ I_x I_y & I^{2}_y \\ \end{bmatrix}\end{split}\]

and \(k\) is an empirically determined constant, \(k \in [0.04, 0.06]\).

Parameters
  • input (torch.Tensor) – 4d tensor

  • k (torch.Tensor) – the Harris detector free parameter.

  • grads_mode (string) – can be ‘sobel’ for standalone use or ‘diff’ for use on a Gaussian pyramid

  • sigmas (optional, torch.Tensor) – coefficients to be multiplied by the multichannel response. Should be of shape \((B)\). It is necessary for performing non-maxima suppression across different scale pyramid levels. See vlfeat.

Returns

the response map per channel.

Return type

torch.Tensor

Shape:
  • Input: \((B, C, H, W)\)

  • Output: \((B, C, H, W)\)

Examples

>>> input = torch.tensor([[[
...     [0., 0., 0., 0., 0., 0., 0.],
...     [0., 1., 1., 1., 1., 1., 0.],
...     [0., 1., 1., 1., 1., 1., 0.],
...     [0., 1., 1., 1., 1., 1., 0.],
...     [0., 1., 1., 1., 1., 1., 0.],
...     [0., 1., 1., 1., 1., 1., 0.],
...     [0., 0., 0., 0., 0., 0., 0.],
... ]]])  # 1x1x7x7
>>> # compute the response map
>>> harris_response(input, 0.04)
tensor([[[[0.0012, 0.0039, 0.0020, 0.0000, 0.0020, 0.0039, 0.0012],
          [0.0039, 0.0065, 0.0040, 0.0000, 0.0040, 0.0065, 0.0039],
          [0.0020, 0.0040, 0.0029, 0.0000, 0.0029, 0.0040, 0.0020],
          [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
          [0.0020, 0.0040, 0.0029, 0.0000, 0.0029, 0.0040, 0.0020],
          [0.0039, 0.0065, 0.0040, 0.0000, 0.0040, 0.0065, 0.0039],
          [0.0012, 0.0039, 0.0020, 0.0000, 0.0020, 0.0039, 0.0012]]]])
hessian_response(input: torch.Tensor, grads_mode: str = 'sobel', sigmas: Optional[torch.Tensor] = None) → torch.Tensor[source]

Computes the absolute value of the determinant of the Hessian matrix. The function does not do any normalization or NMS. The response map is computed according to the following formulation:

\[R = det(H)\]

where:

\[\begin{split}H = \begin{bmatrix} I_{xx} & I_{xy} \\ I_{xy} & I_{yy} \\ \end{bmatrix}\end{split}\]
Parameters
  • input (torch.Tensor) – 4d tensor

  • grads_mode (string) – can be ‘sobel’ for standalone use or ‘diff’ for use on a Gaussian pyramid

  • sigmas (optional, torch.Tensor) – coefficients to be multiplied by the multichannel response. Should be of shape \((B)\). It is necessary for performing non-maxima suppression across different scale pyramid levels. See vlfeat.

Returns

the response map per channel.

Return type

torch.Tensor

Shape:
  • Input: \((B, C, H, W)\)

  • Output: \((B, C, H, W)\)

Examples

>>> input = torch.tensor([[[
...     [0., 0., 0., 0., 0., 0., 0.],
...     [0., 1., 1., 1., 1., 1., 0.],
...     [0., 1., 1., 1., 1., 1., 0.],
...     [0., 1., 1., 1., 1., 1., 0.],
...     [0., 1., 1., 1., 1., 1., 0.],
...     [0., 1., 1., 1., 1., 1., 0.],
...     [0., 0., 0., 0., 0., 0., 0.],
... ]]])  # 1x1x7x7
>>> # compute the response map
>>> hessian_response(input)
tensor([[[[0.0155, 0.0334, 0.0194, 0.0000, 0.0194, 0.0334, 0.0155],
          [0.0334, 0.0575, 0.0339, 0.0000, 0.0339, 0.0575, 0.0334],
          [0.0194, 0.0339, 0.0497, 0.0000, 0.0497, 0.0339, 0.0194],
          [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
          [0.0194, 0.0339, 0.0497, 0.0000, 0.0497, 0.0339, 0.0194],
          [0.0334, 0.0575, 0.0339, 0.0000, 0.0339, 0.0575, 0.0334],
          [0.0155, 0.0334, 0.0194, 0.0000, 0.0194, 0.0334, 0.0155]]]])
extract_patches_from_pyramid(img: torch.Tensor, laf: torch.Tensor, PS: int = 32, normalize_lafs_before_extraction: bool = True) → torch.Tensor[source]

Extract patches defined by LAFs from an image tensor. Patches are extracted from the appropriate pyramid level.

Parameters
  • img – (torch.Tensor) images in which the LAFs were detected

  • laf – (torch.Tensor) local affine frames

  • PS – (int) patch size, default = 32

  • normalize_lafs_before_extraction (bool) – if True, lafs are normalized to image size, default = True

Returns

(torch.Tensor) \((B, N, CH, PS, PS)\)

Return type

patches
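
A minimal usage sketch, assuming torch is imported and the function is in scope (the LAF values are illustrative):

>>> img = torch.rand(1, 1, 64, 64)  # BxCHxHxW
>>> laf = torch.tensor([[[[8., 0., 32.], [0., 8., 32.]]]])  # 1x1x2x3: scale 8, centered frame
>>> patches = extract_patches_from_pyramid(img, laf, PS=32)  # 1x1x1x32x32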

extract_patches_simple(img: torch.Tensor, laf: torch.Tensor, PS: int = 32, normalize_lafs_before_extraction: bool = True) → torch.Tensor[source]

Extract patches defined by LAFs from an image tensor. No smoothing is applied, so the result suffers from heavy aliasing; prefer extract_patches_from_pyramid.

Parameters
  • img – (torch.Tensor) images in which the LAFs were detected

  • laf – (torch.Tensor) local affine frames

  • PS – (int) patch size, default = 32

  • normalize_lafs_before_extraction (bool) – if True, lafs are normalized to image size, default = True

Returns

(torch.Tensor) \((B, N, CH, PS, PS)\)

Return type

patches
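
A minimal sketch along the same lines (same assumptions and illustrative LAF as above):

>>> img = torch.rand(1, 1, 64, 64)  # BxCHxHxW
>>> laf = torch.tensor([[[[8., 0., 32.], [0., 8., 32.]]]])  # 1x1x2x3
>>> patches = extract_patches_simple(img, laf, PS=32)  # 1x1x1x32x32, may alias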

normalize_laf(LAF: torch.Tensor, images: torch.Tensor) → torch.Tensor[source]
Normalizes LAFs to the [0, 1] scale from the pixel scale. See below:
>>> B, CH, H, W = images.size()
>>> MIN_SIZE = min(H, W)
[a11 a12 x]
[a21 a22 y]
becomes:
[a11/MIN_SIZE a12/MIN_SIZE x/W]
[a21/MIN_SIZE a22/MIN_SIZE y/H]
Parameters
  • LAF – (torch.Tensor) local affine frames

  • images – (torch.Tensor) images in which the LAFs were detected

Returns

(torch.Tensor).

Return type

LAF

Shape:
  • Input: \((B, N, 2, 3)\)

  • Output: \((B, N, 2, 3)\)
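
A minimal sketch of the convention above, assuming torch is imported and the function is in scope (illustrative values):

>>> images = torch.rand(1, 1, 100, 200)  # BxCHxHxW, so MIN_SIZE = 100
>>> laf = torch.tensor([[[[10., 0., 50.], [0., 10., 50.]]]])  # 1x1x2x3, pixel scale
>>> nlaf = normalize_laf(laf, images)  # 1x1x2x3: affine part / 100, x / 200, y / 100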

denormalize_laf(LAF: torch.Tensor, images: torch.Tensor) → torch.Tensor[source]
De-normalizes LAFs from the [0, 1] scale to the image (pixel) scale. See below:
>>> B, CH, H, W = images.size()
>>> MIN_SIZE = min(H, W)
[a11 a12 x]
[a21 a22 y]
becomes:
[a11*MIN_SIZE a12*MIN_SIZE x*W]
[a21*MIN_SIZE a22*MIN_SIZE y*H]
Parameters
  • LAF – (torch.Tensor) local affine frames

  • images – (torch.Tensor) images in which the LAFs were detected

Returns

(torch.Tensor).

Return type

LAF

Shape:
  • Input: \((B, N, 2, 3)\)

  • Output: \((B, N, 2, 3)\)
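
A round-trip sketch with normalize_laf (same assumptions and illustrative values as above):

>>> images = torch.rand(1, 1, 100, 200)  # BxCHxHxW
>>> laf = torch.tensor([[[[10., 0., 50.], [0., 10., 50.]]]])  # 1x1x2x3, pixel scale
>>> round_trip = denormalize_laf(normalize_laf(laf, images), images)  # recovers laf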

laf_to_boundary_points(LAF: torch.Tensor, n_pts: int = 50) → torch.Tensor[source]

Converts LAFs to boundary points of the regions plus the center. Used for local feature visualization; see the visualize_laf function.

Parameters
  • LAF – (torch.Tensor) local affine frames

  • n_pts – number of points to output

Returns

(torch.Tensor) tensor of boundary points

Return type

pts

Shape:
  • Input: \((B, N, 2, 3)\)

  • Output: \((B, N, n_pts, 2)\)
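
A minimal sketch, assuming torch is imported and the function is in scope:

>>> laf = torch.ones(1, 5, 2, 3)  # BxNx2x3
>>> pts = laf_to_boundary_points(laf, 50)  # 1x5x50x2 boundary points (incl. center)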

ellipse_to_laf(ells: torch.Tensor) → torch.Tensor[source]

Converts ellipse regions to LAF format. An ellipse (a, b, c) and an upright covariance matrix [a11 a12; 0 a22] are connected by the inverse matrix square root: A = invsqrt([a b; b c]). See also https://github.com/vlfeat/vlfeat/blob/master/toolbox/sift/vl_frame2oell.m

Parameters

ells – (torch.Tensor): tensor of ellipses in Oxford format [x y a b c].

Returns

(torch.Tensor) tensor of ellipses in LAF format.

Return type

LAF

Shape:
  • Input: \((B, N, 5)\)

  • Output: \((B, N, 2, 3)\)

Example

>>> input = torch.ones(1, 10, 5)  # BxNx5
>>> output = kornia.ellipse_to_laf(input)  #  BxNx2x3
make_upright(laf: torch.Tensor, eps: float = 1e-09) → torch.Tensor[source]

Rectifies the affine matrix so that it becomes upright.

Parameters
  • laf – (torch.Tensor): tensor of LAFs.

  • eps (float) – for safe division, (default 1e-9)

Returns

tensor of same shape.

Return type

torch.Tensor

Shape:
  • Input: \((B, N, 2, 3)\)

  • Output: \((B, N, 2, 3)\)

Example

>>> input = torch.ones(1, 5, 2, 3)  # BxNx2x3
>>> output = kornia.make_upright(input)  #  BxNx2x3
scale_laf(laf: torch.Tensor, scale_coef: Union[float, torch.Tensor]) → torch.Tensor[source]

Multiplies the region part of the LAF ([:, :, :2, :2]) by scale_coef, so the center, shape and orientation of the local feature stay the same, but the region area changes.

Parameters
  • laf – (torch.Tensor): tensor [BxNx2x3] or [BxNx2x2].

  • scale_coef – (torch.Tensor): broadcastable tensor or float.

Returns

tensor BxNx2x3.

Return type

torch.Tensor

Shape:
  • Input (laf): \((B, N, 2, 3)\)

  • Input (scale_coef): \((B, N)\) or \(()\)

  • Output: \((B, N, 2, 3)\)

Example

>>> input = torch.ones(1, 5, 2, 3)  # BxNx2x3
>>> scale = 0.5
>>> output = kornia.scale_laf(input, scale)  # BxNx2x3
get_laf_scale(LAF: torch.Tensor) → torch.Tensor[source]

Returns the scale of the LAFs.

Parameters

LAF – (torch.Tensor): tensor [BxNx2x3] or [BxNx2x2].

Returns

tensor BxNx1x1.

Return type

torch.Tensor

Shape:
  • Input: \((B, N, 2, 3)\)

  • Output: \((B, N, 1, 1)\)

Example

>>> input = torch.ones(1, 5, 2, 3)  # BxNx2x3
>>> output = kornia.get_laf_scale(input)  # BxNx1x1
raise_error_if_laf_is_not_valid(laf: torch.Tensor) → None[source]

Auxiliary function that verifies that the input is a torch.Tensor of shape [BxNx2x3]

Parameters

laf – (torch.Tensor) the LAF tensor to check
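
A quick sanity-check sketch, assuming torch is imported and the function is in scope:

>>> laf = torch.ones(1, 5, 2, 3)  # valid BxNx2x3 shape
>>> raise_error_if_laf_is_not_valid(laf)  # returns None; raises on invalid input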

class NonMaximaSuppression2d(kernel_size: Tuple[int, int])[source]

Applies non-maxima suppression to the input.
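
A minimal sketch, assuming torch is imported and the class is in scope:

>>> nms = NonMaximaSuppression2d((3, 3))
>>> out = nms(torch.rand(1, 1, 7, 7))  # 1x1x7x7, non-maximal values suppressed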

class BlobHessian(grads_mode='sobel')[source]

nn.Module that calculates Hessian blobs. See hessian_response() for details.

class CornerGFTT(grads_mode='sobel')[source]

nn.Module that calculates Shi-Tomasi corners. See gftt_response() for details.

class CornerHarris(k: Union[float, torch.Tensor], grads_mode='sobel')[source]

nn.Module that calculates Harris corners. See harris_response() for details.
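
A minimal usage sketch for these wrappers, assuming torch is imported and the classes are in scope:

>>> img = torch.rand(1, 1, 32, 32)  # BxCxHxW
>>> resp_harris = CornerHarris(k=0.04)(img)  # 1x1x32x32, same as harris_response(img, 0.04)
>>> resp_gftt = CornerGFTT()(img)  # 1x1x32x32
>>> resp_hessian = BlobHessian()(img)  # 1x1x32x32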

class SIFTDescriptor(patch_size: int = 41, num_ang_bins: int = 8, num_spatial_bins: int = 4, rootsift: bool = True, clipval: float = 0.2)[source]

Module, which computes SIFT descriptors of the given patches

Parameters
  • patch_size – (int) input patch size in pixels (41 is default)

  • num_ang_bins – (int) number of angular bins (8 is default)

  • num_spatial_bins – (int) number of spatial bins (4 is default)

  • clipval – (float) clipping value, default 0.2

  • rootsift – (bool) if True, RootSIFT (Arandjelović et al., 2012) is computed

Returns

SIFT descriptor of the patches

Return type

Tensor

Shape:
  • Input: (B, 1, patch_size, patch_size)

  • Output: (B, num_ang_bins * num_spatial_bins ** 2)

Examples
>>> input = torch.rand(23, 1, 32, 32)
>>> SIFT = kornia.SIFTDescriptor(32, 8, 4)
>>> descs = SIFT(input) # 23x128
class ScaleSpaceDetector(num_features: int = 500, mr_size: float = 6.0, scale_pyr_module: torch.nn.modules.module.Module = ScalePyramid(n_levels=3, init_sigma=1.6, min_size=10, border=4, sigma_step=1.2599210498948732), resp_module: torch.nn.modules.module.Module = BlobHessian(grads_mode=sobel), nms_module: torch.nn.modules.module.Module = ConvSoftArgmax3d(kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), temperature=tensor(1.), normalized_coordinates=False, eps=1e-08, output_value=True), ori_module: torch.nn.modules.module.Module = PassLAF(), aff_module: torch.nn.modules.module.Module = PassLAF())[source]

Module for differentiable local feature detection, as close as possible to classical local feature detectors like Harris, Hessian-Affine or SIFT (DoG). It has 5 modules inside: scale pyramid generator, response (“cornerness”) function, soft NMS function, affine shape estimator and patch orientation estimator. Each of these modules can be replaced with a custom learned one, as long as it respects the output shape.

Parameters
  • num_features – (int) number of features to detect, default = 500. In order to keep everything batchable, the output always contains num_features detections, even for completely homogeneous images.

  • mr_size – (float), default 6.0. Multiplier for the local feature scale relative to the detection scale; 6.0 matches the OpenCV 12.0 convention for SIFT.

  • scale_pyr_module – (nn.Module), which generates scale pyramid. See ScalePyramid for details. Default is ScalePyramid(3, 1.6, 10)

  • resp_module – (nn.Module), which calculates ‘cornerness’ of the pixel. Default is BlobHessian().

  • nms_module – (nn.Module), which outputs per-patch coordinates of the response maxima. See ConvSoftArgmax3d for details.

  • ori_module – (nn.Module) for local feature orientation estimation. Default is PassLAF, which does nothing. See LAFOrienter for details.

  • aff_module – (nn.Module) for local feature affine shape estimation. Default is PassLAF, which does nothing. See LAFAffineShapeEstimator for details.

forward(img: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor][source]

Three-stage local feature detection. First, the location and scale of interest points are determined by the detect function. Then the affine shape and orientation are estimated.

Parameters

img – (torch.Tensor), shape [BxCxHxW]

Returns

lafs (torch.Tensor): shape [BxNx2x3], the detected local affine frames; responses (torch.Tensor): shape [BxNx1], response function values for the corresponding lafs

Return type

Tuple[torch.Tensor, torch.Tensor]
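
A minimal end-to-end sketch with the default submodules, assuming torch is imported and the class is in scope:

>>> img = torch.rand(1, 1, 128, 128)  # BxCxHxW
>>> detector = ScaleSpaceDetector(num_features=500)
>>> lafs, responses = detector(img)  # 1x500x2x3, 1x500x1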

class PassLAF[source]

Dummy module to use instead of a local feature orientation or affine shape estimator.

forward(laf: torch.Tensor, img: torch.Tensor) → torch.Tensor[source]
Parameters
  • laf – (torch.Tensor) 4d tensor

  • img (torch.Tensor) – the input image tensor

Returns

unchanged laf from the input.

Return type

torch.Tensor
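
A minimal sketch, assuming torch is imported and the class is in scope:

>>> laf = torch.ones(1, 5, 2, 3)  # BxNx2x3
>>> img = torch.rand(1, 1, 32, 32)  # Bx1xHxW
>>> out = PassLAF()(laf, img)  # identical to the input laf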

class PatchAffineShapeEstimator(patch_size: int = 19, eps: float = 1e-10)[source]

Module, which estimates the second moment matrix of the patch gradients in order to determine the affine shape of the local feature as in [Baumberg00].

Parameters
  • patch_size – int, default = 19

  • eps – float, for safe division, default is 1e-10

forward(patch: torch.Tensor) → torch.Tensor[source]
Parameters

patch – (torch.Tensor) shape [Bx1xHxW]

Returns

3d tensor, shape [Bx1x5]

Return type

ellipse_shape
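
A minimal sketch, assuming torch is imported and the class is in scope:

>>> patch = torch.rand(10, 1, 19, 19)  # Bx1xHxW
>>> ells = PatchAffineShapeEstimator(19)(patch)  # 10x1x5 ellipse parameters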

class LAFAffineShapeEstimator(patch_size: int = 32)[source]

Module, which extracts patches using the input images and local affine frames (LAFs), then runs PatchAffineShapeEstimator on the patches to estimate the LAF shape. The original LAF shape is then replaced with the estimated one. The original LAF orientation is not preserved, so it is recommended to first run LAFAffineShapeEstimator and then LAFOrienter.

Parameters

patch_size – int, default = 32

forward(laf: torch.Tensor, img: torch.Tensor) → torch.Tensor[source]
Parameters
  • laf – (torch.Tensor) shape [BxNx2x3]

  • img – (torch.Tensor) shape [Bx1xHxW]

Returns

(torch.Tensor) shape [BxNx2x3]

Return type

laf_out
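
A minimal sketch, assuming torch is imported and the class is in scope (the LAF is illustrative):

>>> img = torch.rand(1, 1, 64, 64)  # Bx1xHxW
>>> laf = torch.tensor([[[[8., 0., 32.], [0., 8., 32.]]]])  # 1x1x2x3
>>> laf_out = LAFAffineShapeEstimator(32)(laf, img)  # 1x1x2x3, shape re-estimated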

class LAFOrienter(patch_size: int = 32, num_angular_bins: int = 36)[source]

Module, which extracts patches using the input images and local affine frames (LAFs), then runs PatchDominantGradientOrientation on the patches and rotates the LAFs by the estimated angles.

Parameters
  • patch_size – int, default = 32

  • num_angular_bins – int, default is 36

forward(laf: torch.Tensor, img: torch.Tensor) → torch.Tensor[source]
Parameters
  • laf – (torch.Tensor), shape [BxNx2x3]

  • img – (torch.Tensor), shape [Bx1xHxW]

Returns

(torch.Tensor), shape [BxNx2x3]

Return type

laf_out
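
A minimal sketch (same assumptions and illustrative LAF as above):

>>> img = torch.rand(1, 1, 64, 64)  # Bx1xHxW
>>> laf = torch.tensor([[[[8., 0., 32.], [0., 8., 32.]]]])  # 1x1x2x3
>>> laf_out = LAFOrienter(32, 36)(laf, img)  # 1x1x2x3, rotated by the estimated angle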

class PatchDominantGradientOrientation(patch_size: int = 32, num_angular_bins: int = 36, eps: float = 1e-08)[source]

Module, which estimates the dominant gradient orientation of the given patches, in radians. The zero angle points to the right.

Parameters
  • patch_size – int, default = 32

  • num_angular_bins – int, default is 36

  • eps – float, for safe division, and arctan, default is 1e-8

forward(patch: torch.Tensor) → torch.Tensor[source]
Parameters

patch – (torch.Tensor) shape [Bx1xHxW]

Returns

(torch.Tensor) shape [Bx1]

Return type

angle
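
A minimal sketch, assuming torch is imported and the class is in scope:

>>> patch = torch.rand(10, 1, 32, 32)  # Bx1xHxW
>>> angle = PatchDominantGradientOrientation(32, 36)(patch)  # angles in radians, shape per the Returns above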

Zha19

Richard Zhang. Making convolutional networks shift-invariant again. In ICML. 2019.

Baumberg00

A. Baumberg. Reliable feature matching across widely separated views. In CVPR. 2000.