kornia.geometry.subpix
Module with utilities for extracting coordinates with sub-pixel accuracy.
-
spatial_soft_argmax2d
(input: torch.Tensor, temperature: torch.Tensor = tensor(1.), normalized_coordinates: bool = True, eps: float = 1e-08) → torch.Tensor Function that computes the Spatial Soft-Argmax 2D of a given input heatmap.
Returns the 2D coordinates of the soft maximum of the given map. The output order is x-coord and y-coord.
- Parameters
temperature (torch.Tensor) – factor to apply to input. Default is 1.
normalized_coordinates (bool) – whether to return the coordinates normalized in the range of [-1, 1]. Otherwise, it will return the coordinates in the range of the input shape. Default is True.
eps (float) – small value to avoid zero division. Default is 1e-8.
- Shape:
Input: \((B, N, H, W)\)
Output: \((B, N, 2)\)
Examples
>>> input = torch.tensor([[[
...     [0., 0., 0.],
...     [0., 10., 0.],
...     [0., 0., 0.]]]])
>>> spatial_soft_argmax2d(input, normalized_coordinates=False)
tensor([[[1.0000, 1.0000]]])
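The soft-argmax can be read as a softmax over the heatmap followed by an expectation over pixel coordinates. The following is a minimal, dependency-free sketch of that arithmetic for a single heatmap in pixel coordinates, not kornia's implementation. The helper name `soft_argmax2d` is hypothetical, and it assumes that `temperature` multiplies the input, following the "factor to apply to input" wording above.

```python
import math


def soft_argmax2d(heatmap, temperature=1.0):
    """Return the (x, y) soft-argmax of a 2D heatmap given as a list of rows.

    Assumption: `temperature` multiplies the input before the softmax.
    """
    scaled = [[v * temperature for v in row] for row in heatmap]
    peak = max(max(row) for row in scaled)  # shift by the max so exp() cannot overflow
    weights = [[math.exp(v - peak) for v in row] for row in scaled]
    total = sum(sum(row) for row in weights)
    # Expected column index (x) and row index (y) under the softmax weights.
    x = sum(w * col for row in weights for col, w in enumerate(row)) / total
    y = sum(w * r for r, row in enumerate(weights) for w in row) / total
    return x, y


centered = [[0.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 0.0]]
corner = [[0.0, 0.0, 10.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

print(soft_argmax2d(centered))      # symmetric map: lands on the centre pixel (1, 1)
print(soft_argmax2d(corner, 0.1))   # small factor: estimate is pulled toward the centre
print(soft_argmax2d(corner, 10.0))  # large factor: estimate approaches the hard argmax (2, 0)
```

With an off-centre peak, a small factor lets the background pixels drag the estimate toward the middle of the map, while a large factor makes the result approach the hard argmax.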
-
conv_soft_argmax2d
(input: torch.Tensor, kernel_size: Tuple[int, int] = (3, 3), stride: Tuple[int, int] = (1, 1), padding: Tuple[int, int] = (1, 1), temperature: Union[torch.Tensor, float] = tensor(1.), normalized_coordinates: bool = True, eps: float = 1e-08, output_value: bool = False) → Union[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]] Function that computes the convolutional spatial Soft-Argmax 2D over the windows of a given input heatmap. The function can produce two outputs: the argmax coordinates and the softmax-pooled heatmap values themselves. On each window, the function computes:
\[ij(X) = \frac{\sum_{(i,j) \in X} (i,j) \, \exp(x_{ij} / T)}{\sum_{(i,j) \in X} \exp(x_{ij} / T)}\]
\[val(X) = \frac{\sum_{(i,j) \in X} x_{ij} \, \exp(x_{ij} / T)}{\sum_{(i,j) \in X} \exp(x_{ij} / T)}\]
where \(T\) is the temperature and the sums run over the entries \(x_{ij}\) of the window \(X\).
- Parameters
temperature (torch.Tensor) – factor to apply to input. Default is 1.
normalized_coordinates (bool) – whether to return the coordinates normalized in the range of [-1, 1]. Otherwise, it will return the coordinates in the range of the input shape. Default is True.
eps (float) – small value to avoid zero division. Default is 1e-8.
output_value (bool) – if True, both ij and val are returned; if False, only ij is returned. Default is False.
- Shape:
Input: \((N, C, H_{in}, W_{in})\)
Output: \((N, C, 2, H_{out}, W_{out})\), \((N, C, H_{out}, W_{out})\), where
\[H_{out} = \left\lfloor\frac{H_{in} + 2 \times \text{padding}[0] - (\text{kernel\_size}[0] - 1) - 1}{\text{stride}[0]} + 1\right\rfloor\]
\[W_{out} = \left\lfloor\frac{W_{in} + 2 \times \text{padding}[1] - (\text{kernel\_size}[1] - 1) - 1}{\text{stride}[1]} + 1\right\rfloor\]
Examples
>>> input = torch.randn(20, 16, 50, 32)
>>> nms_coords, nms_val = conv_soft_argmax2d(input, (3,3), (2,2), (1,1), output_value=True)
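To make the shape formula concrete, here is a small sketch that evaluates it for the example above. The helper name `conv_output_size` is hypothetical and not part of kornia; it simply applies the floor formula to each spatial dimension.

```python
def conv_output_size(size_in, kernel_size, stride, padding):
    """Apply the floor formula independently to each spatial dimension."""
    return tuple(
        (n + 2 * p - (k - 1) - 1) // s + 1
        for n, k, s, p in zip(size_in, kernel_size, stride, padding)
    )


# Spatial size (50, 32) with kernel (3, 3), stride (2, 2) and padding (1, 1).
h_out, w_out = conv_output_size((50, 32), (3, 3), (2, 2), (1, 1))
print(h_out, w_out)  # 25 16
```

Under the stated shape rules, the example input of shape (20, 16, 50, 32) would therefore give coordinates of shape (20, 16, 2, 25, 16) and values of shape (20, 16, 25, 16).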
-
conv_soft_argmax3d
(input: torch.Tensor, kernel_size: Tuple[int, int, int] = (3, 3, 3), stride: Tuple[int, int, int] = (1, 1, 1), padding: Tuple[int, int, int] = (1, 1, 1), temperature: Union[torch.Tensor, float] = tensor(1.), normalized_coordinates: bool = False, eps: float = 1e-08, output_value: bool = True, strict_maxima_bonus: float = 0.0) → Union[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]] Function that computes the convolutional spatial Soft-Argmax 3D over the windows of a given input heatmap. The function can produce two outputs: the argmax coordinates and the softmax-pooled heatmap values themselves. On each window, the function computes:
\[ijk(X) = \frac{\sum_{(i,j,k) \in X} (i,j,k) \, \exp(x_{ijk} / T)}{\sum_{(i,j,k) \in X} \exp(x_{ijk} / T)}\]
\[val(X) = \frac{\sum_{(i,j,k) \in X} x_{ijk} \, \exp(x_{ijk} / T)}{\sum_{(i,j,k) \in X} \exp(x_{ijk} / T)}\]
where \(T\) is the temperature and the sums run over the entries \(x_{ijk}\) of the window \(X\).
- Parameters
temperature (torch.Tensor) – factor to apply to input. Default is 1.
normalized_coordinates (bool) – whether to return the coordinates normalized in the range of [-1, 1]. Otherwise, it will return the coordinates in the range of the input shape. Default is False.
eps (float) – small value to avoid zero division. Default is 1e-8.
output_value (bool) – if True, both ijk and val are returned; if False, only ijk is returned. Default is True.
strict_maxima_bonus (float) – pixels that are strict maxima will score (1 + strict_maxima_bonus) * value. This is needed to mimic the behavior of strict NMS in classic local features. Default is 0.0.
- Shape:
Input: \((N, C, D_{in}, H_{in}, W_{in})\)
Output: \((N, C, 3, D_{out}, H_{out}, W_{out})\), \((N, C, D_{out}, H_{out}, W_{out})\), where
\[D_{out} = \left\lfloor\frac{D_{in} + 2 \times \text{padding}[0] - (\text{kernel\_size}[0] - 1) - 1}{\text{stride}[0]} + 1\right\rfloor\]\[H_{out} = \left\lfloor\frac{H_{in} + 2 \times \text{padding}[1] - (\text{kernel\_size}[1] - 1) - 1}{\text{stride}[1]} + 1\right\rfloor\]\[W_{out} = \left\lfloor\frac{W_{in} + 2 \times \text{padding}[2] - (\text{kernel\_size}[2] - 1) - 1}{\text{stride}[2]} + 1\right\rfloor\]
Examples
>>> input = torch.randn(20, 16, 3, 50, 32)
>>> nms_coords, nms_val = conv_soft_argmax3d(input, (3, 3, 3), (1, 2, 2), (0, 1, 1))
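The per-window quantities ijk(X) and val(X) can be illustrated on a single window. The sketch below is a plain-Python reading of the two formulas, not kornia's code. The name `window_soft_argmax3d` is hypothetical, and the coordinates are window-local indices (an assumption; the library reports coordinates relative to the input).

```python
import math
from itertools import product


def window_soft_argmax3d(window, temperature=1.0):
    """Evaluate ijk(X) and val(X) for one window given as nested lists [i][j][k]."""
    dims = (len(window), len(window[0]), len(window[0][0]))
    cells = [((i, j, k), window[i][j][k]) for i, j, k in product(*map(range, dims))]
    peak = max(x for _, x in cells)  # shifting by the max leaves the ratios unchanged
    weights = [math.exp((x - peak) / temperature) for _, x in cells]
    total = sum(weights)
    # Weighted mean of the indices along each of the three axes.
    coords = tuple(
        sum(w * idx[axis] for w, (idx, _) in zip(weights, cells)) / total
        for axis in range(3)
    )
    # Weighted mean of the heatmap values themselves.
    value = sum(w * x for w, (_, x) in zip(weights, cells)) / total
    return coords, value


# A 3x3x3 window of zeros with a single peak of 5.0 at its centre.
window = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
window[1][1][1] = 5.0

coords, value = window_soft_argmax3d(window)
print(coords, value)
```

Because the background entries still carry some weight, val(X) stays below the true peak of 5.0. Lowering the temperature concentrates the weight on the peak and moves val(X) closer to it.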
-
conv_quad_interp3d
(input: torch.Tensor, strict_maxima_bonus: float = 10.0, eps: float = 1e-07) Function that computes a single iteration of quadratic interpolation of the extremum (max or min) location and value for each 3x3x3 window that contains a strict extremum, similar to the one done in SIFT.
- Parameters
- Shape:
Input: \((N, C, D_{in}, H_{in}, W_{in})\)
Output: \((N, C, 3, D_{out}, H_{out}, W_{out})\), \((N, C, D_{out}, H_{out}, W_{out})\), where
\[D_{out} = \left\lfloor\frac{D_{in} + 2 \times \text{padding}[0] - (\text{kernel\_size}[0] - 1) - 1}{\text{stride}[0]} + 1\right\rfloor\]\[H_{out} = \left\lfloor\frac{H_{in} + 2 \times \text{padding}[1] - (\text{kernel\_size}[1] - 1) - 1}{\text{stride}[1]} + 1\right\rfloor\]\[W_{out} = \left\lfloor\frac{W_{in} + 2 \times \text{padding}[2] - (\text{kernel\_size}[2] - 1) - 1}{\text{stride}[2]} + 1\right\rfloor\]
Examples
>>> input = torch.randn(20, 16, 3, 50, 32)
>>> nms_coords, nms_val = conv_quad_interp3d(input, 1.0)
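The idea behind one quadratic-interpolation step is easiest to see in one dimension. The sketch below is a simplified 1D analogue and an illustration only: the function above works on 3x3x3 windows, where the scalar gradient and curvature become a gradient vector and a Hessian matrix. The name `quad_interp_1d` is hypothetical.

```python
def quad_interp_1d(f_prev, f_mid, f_next):
    """Refine an extremum near offset 0 from samples taken at offsets -1, 0, +1."""
    gradient = (f_next - f_prev) / 2.0         # central first difference
    curvature = f_next - 2.0 * f_mid + f_prev  # central second difference (must be non-zero)
    offset = -gradient / curvature             # stationary point of the fitted parabola
    value = f_mid + 0.5 * gradient * offset    # parabola evaluated at that point
    return offset, value


# Samples of f(x) = 3 - (x - 0.25)^2 at x = -1, 0, 1.
offset, value = quad_interp_1d(1.4375, 2.9375, 2.4375)
print(offset, value)  # 0.25 3.0
```

Since the samples come from an exact parabola, a single step recovers the true maximum location 0.25 and value 3.0.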
-
spatial_softmax2d
(input: torch.Tensor, temperature: torch.Tensor = tensor(1.)) → torch.Tensor Applies the Softmax function over features in each image channel.
Note that this function behaves differently to torch.nn.Softmax2d, which instead applies Softmax over features at each spatial location.
Returns a 2D probability distribution per image channel.
- Parameters
input (torch.Tensor) – the input tensor.
temperature (torch.Tensor) – factor to apply to input, adjusting the “smoothness” of the output distribution. Default is 1.
- Shape:
Input: \((B, N, H, W)\)
Output: \((B, N, H, W)\)
Examples
>>> heatmaps = torch.tensor([[[
...     [0., 0., 0.],
...     [0., 0., 0.],
...     [0., 1., 2.]]]])
>>> spatial_softmax2d(heatmaps)
tensor([[[[0.0585, 0.0585, 0.0585],
          [0.0585, 0.0585, 0.0585],
          [0.0585, 0.1589, 0.4319]]]])
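The numbers in the example can be reproduced by hand: all H x W values of a channel are normalized together, rather than one spatial location across channels. The following sketch does this for a single channel without torch. The name `spatial_softmax` is hypothetical, and it assumes `temperature` multiplies the input.

```python
import math


def spatial_softmax(heatmap, temperature=1.0):
    """Softmax over every pixel of one channel, given as a list of rows."""
    width = len(heatmap[0])
    scaled = [v * temperature for row in heatmap for v in row]  # flatten H x W
    peak = max(scaled)  # shift by the max for numerical stability
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Restore the original row layout.
    return [probs[r * width:(r + 1) * width] for r in range(len(heatmap))]


probs = spatial_softmax([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 1.0, 2.0]])
print([round(p, 4) for p in probs[2]])  # [0.0585, 0.1589, 0.4319]
```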
-
spatial_expectation2d
(input: torch.Tensor, normalized_coordinates: bool = True) → torch.Tensor Computes the expectation of coordinate values using spatial probabilities.
The input heatmap is assumed to represent a valid spatial probability distribution, which can be achieved using spatial_softmax2d.
Returns the expected value of the 2D coordinates. The output order of the coordinates is (x, y).
- Parameters
input (torch.Tensor) – the input tensor representing dense spatial probabilities.
normalized_coordinates (bool) – whether to return the coordinates normalized in the range of [-1, 1]. Otherwise, it will return the coordinates in the range of the input shape. Default is True.
- Shape:
Input: \((B, N, H, W)\)
Output: \((B, N, 2)\)
Examples
>>> heatmaps = torch.tensor([[[
...     [0., 0., 0.],
...     [0., 0., 0.],
...     [0., 1., 0.]]]])
>>> spatial_expectation2d(heatmaps, False)
tensor([[[1., 2.]]])
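The two coordinate conventions can be compared on the same heatmap. This is a minimal sketch for one channel, with the hypothetical name `spatial_expectation`. It assumes that normalized coordinates place the first and last pixel of each axis exactly at -1 and 1, which may differ from the library's grid convention, and it requires more than one pixel per axis.

```python
def spatial_expectation(probs, normalized_coordinates=True):
    """Expected (x, y) under a 2D probability map given as a list of rows."""
    height, width = len(probs), len(probs[0])
    if normalized_coordinates:
        # Assumption: pixels are spread evenly over [-1, 1], endpoints included.
        xs = [-1.0 + 2.0 * c / (width - 1) for c in range(width)]
        ys = [-1.0 + 2.0 * r / (height - 1) for r in range(height)]
    else:
        xs = [float(c) for c in range(width)]
        ys = [float(r) for r in range(height)]
    x = sum(p * xs[c] for row in probs for c, p in enumerate(row))
    y = sum(p * ys[r] for r, row in enumerate(probs) for p in row)
    return x, y


probs = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(spatial_expectation(probs, normalized_coordinates=False))  # (1.0, 2.0)
print(spatial_expectation(probs))                                # (0.0, 1.0)
```

All of the probability mass sits in the bottom row, middle column, so the pixel-space result is (x, y) = (1, 2), and the same location maps to (0, 1) in the normalized range.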
-
render_gaussian2d
(mean: torch.Tensor, std: torch.Tensor, size: Tuple[int, int], normalized_coordinates: bool = True) Renders the PDF of a 2D Gaussian distribution.
- Parameters
mean (torch.Tensor) – the mean location of the Gaussian to render, \((\mu_x, \mu_y)\).
std (torch.Tensor) – the standard deviation of the Gaussian to render, \((\sigma_x, \sigma_y)\).
size (Tuple[int, int]) – the (height, width) of the output image.
normalized_coordinates (bool) – whether mean and std are assumed to use coordinates normalized in the range of [-1, 1]. Otherwise, coordinates are assumed to be in the range of the output shape. Default is True.
- Shape:
mean: \((*, 2)\)
std: \((*, 2)\). Should be able to be broadcast with mean.
Output: \((*, H, W)\)
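As an illustration of what rendering a Gaussian on a pixel grid involves, here is a sketch for a single (mean, std) pair given in pixel coordinates, corresponding to normalized_coordinates=False. The name `render_gaussian2d_sketch` is hypothetical, and rescaling the image so that it sums to 1 is an assumption made so the result can serve as a discrete distribution; it is not a statement about the library's normalization.

```python
import math


def render_gaussian2d_sketch(mean, std, size):
    """Render an axis-aligned Gaussian; mean and std are (x, y) in pixel units."""
    height, width = size
    # The axis-aligned Gaussian factorizes into a product of two 1D profiles.
    gx = [math.exp(-0.5 * ((c - mean[0]) / std[0]) ** 2) for c in range(width)]
    gy = [math.exp(-0.5 * ((r - mean[1]) / std[1]) ** 2) for r in range(height)]
    total = sum(gx) * sum(gy)
    # Assumption: rescale so that the rendered image sums to 1.
    return [[vy * vx / total for vx in gx] for vy in gy]


# A 4 x 6 image (height, width) with the mean at x = 3, y = 1.
image = render_gaussian2d_sketch(mean=(3.0, 1.0), std=(1.0, 1.0), size=(4, 6))
for row in image:
    print(" ".join(f"{v:.3f}" for v in row))
```

The output has shape (H, W) = (4, 6), and its largest value sits at row 1, column 3, matching the (x, y) ordering of mean.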
Module
-
class
SpatialSoftArgmax2d
(temperature: torch.Tensor = tensor(1.), normalized_coordinates: bool = True, eps: float = 1e-08) Module that computes the Spatial Soft-Argmax 2D of a given heatmap.
See
spatial_soft_argmax2d()
for details.
-
class
ConvSoftArgmax2d
(kernel_size: Tuple[int, int] = (3, 3), stride: Tuple[int, int] = (1, 1), padding: Tuple[int, int] = (1, 1), temperature: Union[torch.Tensor, float] = tensor(1.), normalized_coordinates: bool = True, eps: float = 1e-08, output_value: bool = False) Module that calculates soft argmax 2d per window.
See
conv_soft_argmax2d()
for details.
-
class
ConvSoftArgmax3d
(kernel_size: Tuple[int, int, int] = (3, 3, 3), stride: Tuple[int, int, int] = (1, 1, 1), padding: Tuple[int, int, int] = (1, 1, 1), temperature: Union[torch.Tensor, float] = tensor(1.), normalized_coordinates: bool = False, eps: float = 1e-08, output_value: bool = True, strict_maxima_bonus: float = 0.0) Module that calculates soft argmax 3d per window.
See
conv_soft_argmax3d()
for details.
-
class
ConvQuadInterp3d
(strict_maxima_bonus: float = 10.0, eps: float = 1e-07) Module that calculates quadratic interpolation of the extremum location and value per 3x3x3 window.
See
conv_quad_interp3d()
for details.