kornia.contrib

extract_tensor_patches(input: torch.Tensor, window_size: Union[int, Tuple[int, int]], stride: Union[int, Tuple[int, int], None] = 1, padding: Union[int, Tuple[int, int], None] = 0) → torch.Tensor

Function that extracts patches from tensors and stacks them.

See ExtractTensorPatches for details.

max_blur_pool2d(input: torch.Tensor, kernel_size: int) → torch.Tensor

Function that computes max pooling, then blurs and downsamples a given feature map.

See MaxBlurPool2d for details.

class ExtractTensorPatches(window_size: Union[int, Tuple[int, int]], stride: Union[int, Tuple[int, int], None] = 1, padding: Union[int, Tuple[int, int], None] = 0)

Module that extracts patches from tensors and stacks them.

Applies a 2D convolution over an input tensor to extract patches and stacks them along the depth axis of the output tensor. The function applies a depthwise convolution, using the same kernel for all the input planes.

In the simplest case, an input of size \((B, C, H, W)\) produces an output of size \((B, N, C, H_{out}, W_{out})\).

where
  • \(B\) is the batch size.

  • \(N\) denotes the total number of extracted patches, stacked in left-right and top-bottom order.

  • \(C\) denotes the number of input channels.

  • \(H\), \(W\) denote the height and width of the input in pixels.

  • \(H_{out}\), \(W_{out}\) denote the patch size defined in the function signature.

  • window_size is the size of the sliding window; it defines the shape of each output patch and therefore the shape of the output tensor.

  • stride controls the stride to apply to the sliding window and regulates the overlapping between the extracted patches.

  • padding controls the amount of implicit zero-padding added on both sides of each dimension.

The parameters window_size, stride and padding can be either:

  • a single int – in which case the same value is used for the height and width dimensions.

  • a tuple of two ints – in which case, the first int is used for the height dimension, and the second int for the width dimension.
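The int-or-tuple convention and the resulting patch count \(N\) can be sketched in plain Python. This is a minimal illustration, not kornia's implementation: the helper names `to_pair` and `num_patches` are hypothetical, and the count assumes the standard convolution output-size arithmetic with a window that fits inside the padded input.

```python
from typing import Tuple, Union

IntOrPair = Union[int, Tuple[int, int]]


def to_pair(value: IntOrPair) -> Tuple[int, int]:
    """Expand a single int to (height, width); pass a 2-tuple through."""
    if isinstance(value, int):
        return (value, value)
    height, width = value
    return (height, width)


def num_patches(
    height: int,
    width: int,
    window_size: IntOrPair,
    stride: IntOrPair = 1,
    padding: IntOrPair = 0,
) -> Tuple[int, int, int]:
    """Return (rows, cols, N) of window positions.

    Assumption: standard convolution arithmetic,
    positions = floor((size + 2 * pad - window) / stride) + 1.
    """
    win_h, win_w = to_pair(window_size)
    str_h, str_w = to_pair(stride)
    pad_h, pad_w = to_pair(padding)
    rows = (height + 2 * pad_h - win_h) // str_h + 1
    cols = (width + 2 * pad_w - win_w) // str_w + 1
    return rows, cols, rows * cols


# A 3x3 input with a (2, 3) window, stride 1 and no padding:
# two vertical positions, one horizontal position, so N = 2.
print(num_patches(3, 3, (2, 3)))  # (2, 1, 2)
```

Under these assumptions, a larger stride reduces \(N\) by spacing the windows further apart, while padding increases it by enlarging the area the window can slide over.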

Parameters
  • window_size (Union[int, Tuple[int, int]]) – the size of the convolving kernel and the output patch size.

  • stride (Optional[Union[int, Tuple[int, int]]]) – stride of the convolution. Default is 1.

  • padding (Optional[Union[int, Tuple[int, int]]]) – Zero-padding added to both sides of the input. Default is 0.

Shape:
  • Input: \((B, C, H, W)\)

  • Output: \((B, N, C, H_{out}, W_{out})\)

Returns

the tensor with the extracted patches.

Return type

torch.Tensor

Examples

>>> input = torch.arange(9.).view(1, 1, 3, 3)
>>> patches = kornia.contrib.extract_tensor_patches(input, (2, 3))
>>> input
tensor([[[[0., 1., 2.],
          [3., 4., 5.],
          [6., 7., 8.]]]])
>>> patches[:, -1]
tensor([[[[3.0000, 4.0000, 5.0000],
          [6.0000, 7.0000, 8.0000]]]])
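
For readers who want to see the sliding-window behaviour without kornia, the same output layout can be approximated with `torch.nn.functional.unfold`. This is a sketch under stated assumptions, not the library's implementation: the function name `extract_patches_unfold` is hypothetical, and it is only checked here against the small example above.

```python
from typing import Tuple, Union

import torch
import torch.nn.functional as F

IntOrPair = Union[int, Tuple[int, int]]


def extract_patches_unfold(
    x: torch.Tensor,
    window_size: IntOrPair,
    stride: IntOrPair = 1,
    padding: IntOrPair = 0,
) -> torch.Tensor:
    """Hypothetical helper: (B, C, H, W) -> (B, N, C, win_h, win_w)."""
    batch, channels = x.shape[:2]
    if isinstance(window_size, int):
        win_h, win_w = window_size, window_size
    else:
        win_h, win_w = window_size
    # F.unfold returns (B, C * win_h * win_w, N): one column per window
    # position, with positions enumerated row by row.
    columns = F.unfold(x, kernel_size=(win_h, win_w), stride=stride, padding=padding)
    # Move N next to the batch axis and restore the (C, win_h, win_w) layout.
    return columns.transpose(1, 2).reshape(batch, -1, channels, win_h, win_w)


x = torch.arange(9.0).view(1, 1, 3, 3)
patches = extract_patches_unfold(x, (2, 3))
print(patches.shape)   # torch.Size([1, 2, 1, 2, 3])
print(patches[:, -1])  # the last patch: rows 1 and 2 of the input
```

With a (2, 3) window on a 3x3 input there is one horizontal position and two vertical ones, so the last patch covers the bottom two rows, matching the example above.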

class MaxBlurPool2d(kernel_size: int)

Creates a module that computes max pooling, then blurs and downsamples a given feature map.

See [Zha19] for more details.

Parameters

kernel_size (int) – the kernel size for max pooling.

Shape:
  • Input: \((B, C, H, W)\)

  • Output: \((B, C, H / 2, W / 2)\)

Returns

the transformed tensor.

Return type

torch.Tensor

Examples

>>> input = torch.rand(1, 4, 4, 8)
>>> pool = kornia.contrib.MaxBlurPool2d(kernel_size=3)
>>> output = pool(input)  # 1x4x2x4
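
The pool-then-blur idea from [Zha19] can be sketched in plain PyTorch: max pooling is evaluated densely (stride 1), and the downsampling is done afterwards by a low-pass filter applied with stride 2. The sketch below is illustrative only. The function name `max_blur_pool2d_sketch`, the 3x3 binomial blur kernel, the zero padding, and the restriction to odd `kernel_size` are all assumptions, and kornia's own implementation may differ in each of them.

```python
import torch
import torch.nn.functional as F


def max_blur_pool2d_sketch(x: torch.Tensor, kernel_size: int) -> torch.Tensor:
    """Hypothetical sketch: (B, C, H, W) -> roughly (B, C, H / 2, W / 2).

    Assumes an odd kernel_size so that the dense max pooling keeps H and W.
    """
    channels = x.shape[1]
    # 1) Dense max pooling: stride 1, so nothing is downsampled yet.
    pooled = F.max_pool2d(x, kernel_size=kernel_size, stride=1, padding=kernel_size // 2)
    # 2) Build a normalized 3x3 binomial blur kernel (an assumed choice).
    taps = torch.tensor([1.0, 2.0, 1.0], dtype=x.dtype, device=x.device)
    blur = taps[:, None] * taps[None, :]
    blur = blur / blur.sum()
    # One copy of the kernel per channel for a depthwise convolution.
    weight = blur.expand(channels, 1, 3, 3).contiguous()
    # 3) Blur and downsample in a single strided depthwise convolution.
    return F.conv2d(pooled, weight, stride=2, padding=1, groups=channels)


out = max_blur_pool2d_sketch(torch.rand(1, 4, 4, 8), kernel_size=3)
print(out.shape)  # torch.Size([1, 4, 2, 4])
```

Separating the max operation from the subsampling is what distinguishes this from a plain strided max pool: the blur suppresses high frequencies before the stride-2 step discards samples.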

Zha19

Richard Zhang. Making convolutional networks shift-invariant again. In ICML. 2019.