kornia.geometry.subpix

Module with useful functionalities to extract coordinates with sub-pixel accuracy.

spatial_soft_argmax2d(input: torch.Tensor, temperature: torch.Tensor = tensor(1.), normalized_coordinates: bool = True, eps: float = 1e-08) → torch.Tensor

Function that computes the Spatial Soft-Argmax 2D of a given input heatmap.

Returns the 2D coordinates of the maximum of the given map. The output order is x-coord and y-coord.

Parameters
  • temperature (torch.Tensor) – factor to apply to input. Default is 1.

  • normalized_coordinates (bool) – whether to return the coordinates normalized in the range of [-1, 1]. Otherwise, it will return the coordinates in the range of the input shape. Default is True.

  • eps (float) – small value to avoid zero division. Default is 1e-8.

Shape:
  • Input: \((B, N, H, W)\)

  • Output: \((B, N, 2)\)

Examples

>>> input = torch.tensor([[[
... [0., 0., 0.],
... [0., 10., 0.],
... [0., 0., 0.]]]])
>>> spatial_soft_argmax2d(input, normalized_coordinates=False)
tensor([[[1.0000, 1.0000]]])
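To make the sub-pixel behaviour concrete, here is a minimal plain-Python sketch of the idea: a softmax over all pixels followed by the probability-weighted mean of the pixel coordinates. It is not kornia's implementation. The helper name `soft_argmax2d_sketch` is hypothetical, a single H x W heatmap is passed as nested lists, the temperature is fixed at 1, and coordinates are in pixels.

```python
import math


def soft_argmax2d_sketch(heatmap):
    """Return the softmax-weighted mean (x, y) of one H x W heatmap.

    Hypothetical helper for illustration; pixel coordinates, temperature 1.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(max(row) for row in heatmap)
    weights = [[math.exp(v - peak) for v in row] for row in heatmap]
    total = sum(sum(row) for row in weights)

    # Expected column index (x) and row index (y) under the softmax weights.
    x = sum(w * col for row in weights for col, w in enumerate(row)) / total
    y = sum(w * r for r, row in enumerate(weights) for w in row) / total
    return x, y


# A single sharp peak at the centre gives the centre pixel.
single = [[0.0, 0.0, 0.0],
          [0.0, 10.0, 0.0],
          [0.0, 0.0, 0.0]]
x_single, y_single = soft_argmax2d_sketch(single)

# Two equal peaks side by side give a location halfway between them.
double = [[0.0, 0.0, 0.0, 0.0],
          [0.0, 10.0, 10.0, 0.0],
          [0.0, 0.0, 0.0, 0.0]]
x_double, y_double = soft_argmax2d_sketch(double)
```

The second heatmap is the case a hard argmax cannot express: the result falls between two pixel centres, at x = 1.5.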

conv_soft_argmax2d(input: torch.Tensor, kernel_size: Tuple[int, int] = (3, 3), stride: Tuple[int, int] = (1, 1), padding: Tuple[int, int] = (1, 1), temperature: Union[torch.Tensor, float] = tensor(1.), normalized_coordinates: bool = True, eps: float = 1e-08, output_value: bool = False) → Union[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]]

Function that computes the convolutional spatial Soft-Argmax 2D over the windows of a given input heatmap. The function has two outputs: the argmax coordinates and the softmax-pooled heatmap values themselves. On each window X, the function computed is:

\[ij(X) = \frac{\sum_{(i,j) \in X} (i,j) \, \exp(x_{ij} / T)}{\sum_{(i,j) \in X} \exp(x_{ij} / T)}\]
\[val(X) = \frac{\sum_{(i,j) \in X} x_{ij} \, \exp(x_{ij} / T)}{\sum_{(i,j) \in X} \exp(x_{ij} / T)}\]

where T is the temperature and \(x_{ij}\) is the heatmap value at position \((i, j)\) of the window.
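The two per-window quantities can be evaluated by hand for a single window, as in the plain-Python sketch below. It only illustrates the formulas and is not the library's code. The helper name `window_soft_argmax` is hypothetical, and (i, j) are taken to be window-local row and column indices, which is an assumption about the coordinate convention.

```python
import math


def window_soft_argmax(window, temperature=1.0):
    """Evaluate ij(X) and val(X) for one window X given as nested lists.

    Hypothetical helper; (i, j) are window-local row and column indices.
    """
    # Softmax weights exp(x / T); the shift by the maximum cancels in the ratios.
    peak = max(max(row) for row in window)
    weights = [[math.exp((v - peak) / temperature) for v in row] for row in window]
    total = sum(sum(row) for row in weights)

    i = sum(w * r for r, row in enumerate(weights) for w in row) / total
    j = sum(w * c for row in weights for c, w in enumerate(row)) / total
    val = sum(w * v for wrow, vrow in zip(weights, window)
              for w, v in zip(wrow, vrow)) / total
    return (i, j), val


window = [[0.0, 0.0, 0.0],
          [0.0, 1.0, 3.0],
          [0.0, 0.0, 0.0]]

# A low temperature approaches the hard argmax and the hard maximum.
(i_cold, j_cold), val_cold = window_soft_argmax(window, temperature=0.05)

# A high temperature approaches the window centroid and the plain mean.
(i_hot, j_hot), val_hot = window_soft_argmax(window, temperature=100.0)
```

With T = 0.05 the result is essentially the position and value of the largest entry (row 1, column 2, value 3). With T = 100 the weights are nearly uniform, so the coordinates drift towards the window centre and the value towards the mean of the window.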

Parameters
  • kernel_size (Tuple[int,int]) – the size of the window

  • stride (Tuple[int,int]) – the stride of the window.

  • padding (Tuple[int,int]) – input zero padding

  • temperature (torch.Tensor) – factor to apply to input. Default is 1.

  • normalized_coordinates (bool) – whether to return the coordinates normalized in the range of [-1, 1]. Otherwise, it will return the coordinates in the range of the input shape. Default is True.

  • eps (float) – small value to avoid zero division. Default is 1e-8.

  • output_value (bool) – if True, val is returned together with ij; if False, only ij is returned. Default is False.

Shape:
  • Input: \((N, C, H_{in}, W_{in})\)

  • Output: \((N, C, 2, H_{out}, W_{out})\), \((N, C, H_{out}, W_{out})\), where

\[H_{out} = \left\lfloor\frac{H_{in} + 2 \times \text{padding}[0] - (\text{kernel\_size}[0] - 1) - 1}{\text{stride}[0]} + 1\right\rfloor\]
\[W_{out} = \left\lfloor\frac{W_{in} + 2 \times \text{padding}[1] - (\text{kernel\_size}[1] - 1) - 1}{\text{stride}[1]} + 1\right\rfloor\]

Examples

>>> input = torch.randn(20, 16, 50, 32)
>>> nms_coords, nms_val = conv_soft_argmax2d(input, (3,3), (2,2), (1,1), output_value=True)
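
The output sizes for this example can be checked by applying the Shape formula directly. The short sketch below does only that arithmetic; the helper name `conv_output_size` is hypothetical and nothing from kornia is called.

```python
def conv_output_size(size_in, kernel_size, stride, padding):
    """Apply the output-size formula above to one spatial dimension."""
    return (size_in + 2 * padding - (kernel_size - 1) - 1) // stride + 1


# Values taken from the example: input (20, 16, 50, 32),
# kernel_size=(3, 3), stride=(2, 2), padding=(1, 1).
h_out = conv_output_size(50, kernel_size=3, stride=2, padding=1)
w_out = conv_output_size(32, kernel_size=3, stride=2, padding=1)

coords_shape = (20, 16, 2, h_out, w_out)
values_shape = (20, 16, h_out, w_out)
print(coords_shape, values_shape)  # prints (20, 16, 2, 25, 16) (20, 16, 25, 16)
```

According to the formula, `nms_coords` should therefore have shape (20, 16, 2, 25, 16) and `nms_val` shape (20, 16, 25, 16).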

conv_soft_argmax3d(input: torch.Tensor, kernel_size: Tuple[int, int, int] = (3, 3, 3), stride: Tuple[int, int, int] = (1, 1, 1), padding: Tuple[int, int, int] = (1, 1, 1), temperature: Union[torch.Tensor, float] = tensor(1.), normalized_coordinates: bool = False, eps: float = 1e-08, output_value: bool = True, strict_maxima_bonus: float = 0.0) → Union[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]]

Function that computes the convolutional spatial Soft-Argmax 3D over the windows of a given input heatmap. The function has two outputs: the argmax coordinates and the softmax-pooled heatmap values themselves. On each window X, the function computed is:

\[ijk(X) = \frac{\sum_{(i,j,k) \in X} (i,j,k) \, \exp(x_{ijk} / T)}{\sum_{(i,j,k) \in X} \exp(x_{ijk} / T)}\]
\[val(X) = \frac{\sum_{(i,j,k) \in X} x_{ijk} \, \exp(x_{ijk} / T)}{\sum_{(i,j,k) \in X} \exp(x_{ijk} / T)}\]

where T is the temperature and \(x_{ijk}\) is the heatmap value at position \((i, j, k)\) of the window.

Parameters
  • kernel_size (Tuple[int,int,int]) – size of the window

  • stride (Tuple[int,int,int]) – stride of the window.

  • padding (Tuple[int,int,int]) – input zero padding

  • temperature (torch.Tensor) – factor to apply to input. Default is 1.

  • normalized_coordinates (bool) – whether to return the coordinates normalized in the range of [-1, 1]. Otherwise, it will return the coordinates in the range of the input shape. Default is False.

  • eps (float) – small value to avoid zero division. Default is 1e-8.

  • output_value (bool) – if True, val is returned together with ijk; if False, only ijk is returned. Default is True.

  • strict_maxima_bonus (float) – pixels that are strict maxima will score (1 + strict_maxima_bonus) * value. This is needed to mimic the behavior of strict NMS in classic local features. Default is 0.0.
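
The scoring rule behind strict_maxima_bonus can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the helper name `score_with_bonus` is hypothetical, a 2D window is used instead of a 3D one to keep the example short, and a strict maximum is taken to mean a centre value strictly greater than every other value in the window.

```python
def score_with_bonus(window, strict_maxima_bonus):
    """Score the centre of an odd-sized 2D window given as nested lists.

    Hypothetical helper; the centre is treated as a strict maximum when it
    is strictly greater than every other value in the window.
    """
    rows, cols = len(window), len(window[0])
    centre_r, centre_c = rows // 2, cols // 2
    centre = window[centre_r][centre_c]

    is_strict_max = all(
        centre > window[r][c]
        for r in range(rows)
        for c in range(cols)
        if (r, c) != (centre_r, centre_c)
    )
    return (1.0 + strict_maxima_bonus) * centre if is_strict_max else centre


strict_peak = [[0.0, 1.0, 0.0],
               [1.0, 2.0, 1.0],
               [0.0, 1.0, 0.0]]
plateau = [[0.0, 1.0, 0.0],
           [2.0, 2.0, 1.0],
           [0.0, 1.0, 0.0]]

boosted = score_with_bonus(strict_peak, strict_maxima_bonus=10.0)
unchanged = score_with_bonus(plateau, strict_maxima_bonus=10.0)
```

The strict peak is scored (1 + 10) * 2 = 22, while the plateau, whose centre ties with a neighbour, keeps its value of 2. A large bonus therefore lets strict maxima dominate the soft pooling, which is how it approximates strict NMS.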

Shape:
  • Input: \((N, C, D_{in}, H_{in}, W_{in})\)

  • Output: \((N, C, 3, D_{out}, H_{out}, W_{out})\), \((N, C, D_{out}, H_{out}, W_{out})\), where

\[D_{out} = \left\lfloor\frac{D_{in} + 2 \times \text{padding}[0] - (\text{kernel\_size}[0] - 1) - 1}{\text{stride}[0]} + 1\right\rfloor\]
\[H_{out} = \left\lfloor\frac{H_{in} + 2 \times \text{padding}[1] - (\text{kernel\_size}[1] - 1) - 1}{\text{stride}[1]} + 1\right\rfloor\]
\[W_{out} = \left\lfloor\frac{W_{in} + 2 \times \text{padding}[2] - (\text{kernel\_size}[2] - 1) - 1}{\text{stride}[2]} + 1\right\rfloor\]

Examples

>>> input = torch.randn(20, 16, 3, 50, 32)
>>> nms_coords, nms_val = conv_soft_argmax3d(input, (3, 3, 3), (1, 2, 2), (0, 1, 1))

conv_quad_interp3d(input: torch.Tensor, strict_maxima_bonus: float = 10.0, eps: float = 1e-07)

Function that computes a single iteration of quadratic interpolation of the extremum (max or min) location and value for each 3x3x3 window that contains a strict extremum, similar to the one done in SIFT.
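
As a rough illustration of what one quadratic-interpolation step does, the sketch below works in 1D, whereas the function itself operates on 3x3x3 windows with a full gradient and Hessian. The helper name `quad_interp_1d` is hypothetical, and the way `eps` guards against a near-zero second derivative is an assumption made for this sketch rather than a description of the library.

```python
def quad_interp_1d(f_prev, f_curr, f_next, eps=1e-7):
    """One quadratic-interpolation step around a sampled extremum.

    Hypothetical 1D helper. The samples sit at offsets -1, 0 and +1;
    returns (offset of the refined extremum, interpolated value).
    """
    gradient = 0.5 * (f_next - f_prev)        # central first difference
    hessian = f_next - 2.0 * f_curr + f_prev  # second difference

    # A near-zero second derivative means the fit is ill-conditioned,
    # so keep the sampled location and value.
    if abs(hessian) < eps:
        return 0.0, f_curr

    offset = -gradient / hessian
    value = f_curr + 0.5 * gradient * offset
    return offset, value


# Samples of f(x) = 5 - (x - 0.3)^2 at x = -1, 0, 1; the true maximum is at 0.3.
samples = [5.0 - (x - 0.3) ** 2 for x in (-1.0, 0.0, 1.0)]
offset, value = quad_interp_1d(*samples)
```

Because the sampled function is itself a parabola, a single step recovers the true maximum: an offset of about 0.3 from the centre sample and a value of about 5.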

Parameters
  • strict_maxima_bonus (float) – pixels that are strict maxima will score (1 + strict_maxima_bonus) * value. This is needed to mimic the behavior of strict NMS in classic local features. Default is 10.0.

  • eps (float) – parameter to control the ill-condition number of the Hessian matrix. Default is 1e-7.

Shape:
  • Input: \((N, C, D_{in}, H_{in}, W_{in})\)

  • Output: \((N, C, 3, D_{out}, H_{out}, W_{out})\), \((N, C, D_{out}, H_{out}, W_{out})\), where

\[D_{out} = \left\lfloor\frac{D_{in} + 2 \times \text{padding}[0] - (\text{kernel\_size}[0] - 1) - 1}{\text{stride}[0]} + 1\right\rfloor\]
\[H_{out} = \left\lfloor\frac{H_{in} + 2 \times \text{padding}[1] - (\text{kernel\_size}[1] - 1) - 1}{\text{stride}[1]} + 1\right\rfloor\]
\[W_{out} = \left\lfloor\frac{W_{in} + 2 \times \text{padding}[2] - (\text{kernel\_size}[2] - 1) - 1}{\text{stride}[2]} + 1\right\rfloor\]

Examples

>>> input = torch.randn(20, 16, 3, 50, 32)
>>> nms_coords, nms_val = conv_quad_interp3d(input, 1.0)

spatial_softmax2d(input: torch.Tensor, temperature: torch.Tensor = tensor(1.)) → torch.Tensor

Applies the Softmax function over features in each image channel.

Note that this function behaves differently to torch.nn.Softmax2d, which instead applies Softmax over features at each spatial location.

Returns a 2D probability distribution per image channel.

Parameters
  • input (torch.Tensor) – the input tensor.

  • temperature (torch.Tensor) – factor to apply to input, adjusting the “smoothness” of the output distribution. Default is 1.

Shape:
  • Input: \((B, N, H, W)\)

  • Output: \((B, N, H, W)\)

Examples

>>> heatmaps = torch.tensor([[[
... [0., 0., 0.],
... [0., 0., 0.],
... [0., 1., 2.]]]])
>>> spatial_softmax2d(heatmaps)
tensor([[[[0.0585, 0.0585, 0.0585],
          [0.0585, 0.0585, 0.0585],
          [0.0585, 0.1589, 0.4319]]]])
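
The numbers in this example can be reproduced with a few lines of plain Python, which also makes clear that the normalisation runs over all pixels of a channel. This is a sketch rather than the library's code: the helper name `spatial_softmax_sketch` is hypothetical, it handles one H x W channel at a time, and the temperature is left at its default of 1.

```python
import math


def spatial_softmax_sketch(channel):
    """Softmax over every pixel of one H x W channel (temperature 1)."""
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(max(row) for row in channel)
    exps = [[math.exp(v - peak) for v in row] for row in channel]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]


heatmap = [[0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0],
           [0.0, 1.0, 2.0]]
probs = spatial_softmax_sketch(heatmap)

for row in probs:
    print([round(p, 4) for p in row])
```

The nine values sum to 1, and rounded to four decimals they match the tensor shown in the example.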

spatial_expectation2d(input: torch.Tensor, normalized_coordinates: bool = True) → torch.Tensor

Computes the expectation of coordinate values using spatial probabilities.

The input heatmap is assumed to represent a valid spatial probability distribution, which can be achieved using spatial_softmax2d.

Returns the expected value of the 2D coordinates. The output order of the coordinates is (x, y).

Parameters
  • input (torch.Tensor) – the input tensor representing dense spatial probabilities.

  • normalized_coordinates (bool) – whether to return the coordinates normalized in the range of [-1, 1]. Otherwise, it will return the coordinates in the range of the input shape. Default is True.

Shape:
  • Input: \((B, N, H, W)\)

  • Output: \((B, N, 2)\)

Examples

>>> heatmaps = torch.tensor([[[
... [0., 0., 0.],
... [0., 0., 0.],
... [0., 1., 0.]]]])
>>> spatial_expectation2d(heatmaps, False)
tensor([[[1., 2.]]])
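
The expectation itself is a probability-weighted sum of coordinates, sketched below in plain Python for a single H x W map. The helper name `spatial_expectation_sketch` is hypothetical. The normalized mapping used here, which sends the first and last pixel of each axis to -1 and +1, is an assumption about the convention and is not confirmed by the text above.

```python
def spatial_expectation_sketch(probs, normalized_coordinates=True):
    """Expected (x, y) of one H x W probability map given as nested lists.

    Hypothetical helper. Normalized coordinates are assumed to map the
    first and last pixel of each axis to -1 and +1.
    """
    height, width = len(probs), len(probs[0])

    def coord(index, size):
        # Axes with a single pixel have no extent to normalise over.
        if normalized_coordinates and size > 1:
            return 2.0 * index / (size - 1) - 1.0
        return float(index)

    x = sum(p * coord(c, width) for row in probs for c, p in enumerate(row))
    y = sum(p * coord(r, height) for r, row in enumerate(probs) for p in row)
    return x, y


heatmap = [[0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0]]

pixel_xy = spatial_expectation_sketch(heatmap, normalized_coordinates=False)
normalized_xy = spatial_expectation_sketch(heatmap, normalized_coordinates=True)
```

In pixel coordinates this gives (1.0, 2.0), matching the example. Under the assumed normalisation the same point becomes (0.0, 1.0): horizontally centred, on the bottom row.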

render_gaussian2d(mean: torch.Tensor, std: torch.Tensor, size: Tuple[int, int], normalized_coordinates: bool = True)

Renders the PDF of a 2D Gaussian distribution.

Parameters
  • mean (torch.Tensor) – the mean location of the Gaussian to render, \((\mu_x, \mu_y)\).

  • std (torch.Tensor) – the standard deviation of the Gaussian to render, \((\sigma_x, \sigma_y)\).

  • size (Tuple[int, int]) – the (height, width) of the output image.

  • normalized_coordinates (bool) – whether mean and std are assumed to use coordinates normalized in the range of [-1, 1]. Otherwise, coordinates are assumed to be in the range of the output shape. Default is True.

Shape:
  • mean: \((*, 2)\)

  • std: \((*, 2)\). Should be able to be broadcast with mean.

  • Output: \((*, H, W)\)
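
No example is given for this function, so the following plain-Python sketch shows what rendering an axis-aligned 2D Gaussian on a pixel grid can look like. It is not kornia's implementation. The helper name `render_gaussian2d_sketch` is hypothetical, mean and std are given in pixel coordinates (the normalized_coordinates=False case), and rescaling the image so that it sums to 1 is an assumption made here so the result can serve as a discrete probability map.

```python
import math


def render_gaussian2d_sketch(mean, std, size):
    """Render a 2D Gaussian on a (height, width) pixel grid.

    Hypothetical helper: mean = (mu_x, mu_y) and std = (sigma_x, sigma_y)
    are in pixels, and the image is rescaled to sum to 1 (an assumption).
    """
    height, width = size
    mu_x, mu_y = mean
    sigma_x, sigma_y = std

    # The axis-aligned Gaussian factorises into one 1D profile per axis.
    gx = [math.exp(-0.5 * ((x - mu_x) / sigma_x) ** 2) for x in range(width)]
    gy = [math.exp(-0.5 * ((y - mu_y) / sigma_y) ** 2) for y in range(height)]

    total = sum(gx) * sum(gy)
    return [[vy * vx / total for vx in gx] for vy in gy]


image = render_gaussian2d_sketch(mean=(3.0, 1.0), std=(1.0, 0.5), size=(4, 6))

# Locate the brightest pixel as (row, column).
peak_row, peak_col = max(
    ((r, c) for r in range(4) for c in range(6)),
    key=lambda rc: image[rc[0]][rc[1]],
)
```

The result has 4 rows and 6 columns, and its brightest pixel sits at row 1, column 3, which is the requested mean in (x, y) order.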

Module

class SpatialSoftArgmax2d(temperature: torch.Tensor = tensor(1.), normalized_coordinates: bool = True, eps: float = 1e-08)

Module that computes the Spatial Soft-Argmax 2D of a given heatmap.

See spatial_soft_argmax2d() for details.

class ConvSoftArgmax2d(kernel_size: Tuple[int, int] = (3, 3), stride: Tuple[int, int] = (1, 1), padding: Tuple[int, int] = (1, 1), temperature: Union[torch.Tensor, float] = tensor(1.), normalized_coordinates: bool = True, eps: float = 1e-08, output_value: bool = False)

Module that calculates soft argmax 2d per window.

See conv_soft_argmax2d() for details.

class ConvSoftArgmax3d(kernel_size: Tuple[int, int, int] = (3, 3, 3), stride: Tuple[int, int, int] = (1, 1, 1), padding: Tuple[int, int, int] = (1, 1, 1), temperature: Union[torch.Tensor, float] = tensor(1.), normalized_coordinates: bool = False, eps: float = 1e-08, output_value: bool = True, strict_maxima_bonus: float = 0.0)

Module that calculates soft argmax 3d per window.

See conv_soft_argmax3d() for details.

class ConvQuadInterp3d(strict_maxima_bonus: float = 10.0, eps: float = 1e-07)

Module that calculates quadratic interpolation of the extremum per 3x3x3 window.

See conv_quad_interp3d() for details.