kornia.geometry.transform

The functions in this section perform various geometrical transformations of 2D images.

warp_perspective(src: torch.Tensor, M: torch.Tensor, dsize: Tuple[int, int], flags: str = 'bilinear', border_mode: str = 'zeros', align_corners: bool = False) → torch.Tensor[source]

Applies a perspective transformation to an image.

The function warp_perspective transforms the source image using the specified matrix:

\[\text{dst} (x, y) = \text{src} \left( \frac{M_{11} x + M_{12} y + M_{13}}{M_{31} x + M_{32} y + M_{33}} , \frac{M_{21} x + M_{22} y + M_{23}}{M_{31} x + M_{32} y + M_{33}} \right )\]
Parameters
  • src (torch.Tensor) – input image.

  • M (Tensor) – transformation matrix.

  • dsize (tuple) – size of the output image (height, width).

  • flags (str) – interpolation mode to calculate output values ‘bilinear’ | ‘nearest’. Default: ‘bilinear’.

  • border_mode (str) – padding mode for outside grid values ‘zeros’ | ‘border’ | ‘reflection’. Default: ‘zeros’.

  • align_corners (bool) – interpolation flag. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Returns

the warped input image.

Return type

Tensor

Shape:
  • Input: \((B, C, H, W)\) and \((B, 3, 3)\)

  • Output: \((B, C, H, W)\)

Note

See a working example here.
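
The formula above can be checked by hand for a single coordinate. The following is a minimal plain-Python sketch, not kornia's implementation: the helper name `perspective_map_point` is hypothetical, and the matrix is assumed to be a row-major nested list so that `M[0][0]` corresponds to M_11.

```python
def perspective_map_point(M, x, y):
    """Evaluate the warp_perspective formula for one (x, y) coordinate.

    M is a 3x3 matrix given as nested lists (row-major), so M[0][0] is M_11.
    """
    denom = M[2][0] * x + M[2][1] * y + M[2][2]
    if denom == 0:
        raise ZeroDivisionError("point maps to infinity under this matrix")
    u = (M[0][0] * x + M[0][1] * y + M[0][2]) / denom
    v = (M[1][0] * x + M[1][1] * y + M[1][2]) / denom
    return u, v


# Pure translation: the bottom row is (0, 0, 1), so the denominator is 1.
translation = [[1.0, 0.0, 5.0],
               [0.0, 1.0, -2.0],
               [0.0, 0.0, 1.0]]
print(perspective_map_point(translation, 3.0, 4.0))  # (8.0, 2.0)

# A bottom row other than (0, 0, 1) divides both coordinates by the same factor.
projective = [[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]]
print(perspective_map_point(projective, 3.0, 4.0))  # (1.5, 2.0)
```

The shared denominator is what separates a perspective transform from an affine one, where the bottom row is fixed to (0, 0, 1).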

warp_affine(src: torch.Tensor, M: torch.Tensor, dsize: Tuple[int, int], flags: str = 'bilinear', padding_mode: str = 'zeros', align_corners: bool = False) → torch.Tensor[source]

Applies an affine transformation to a tensor.

The function warp_affine transforms the source tensor using the specified matrix:

\[\text{dst}(x, y) = \text{src} \left( M_{11} x + M_{12} y + M_{13} , M_{21} x + M_{22} y + M_{23} \right )\]
Parameters
  • src (torch.Tensor) – input tensor of shape \((B, C, H, W)\).

  • M (torch.Tensor) – affine transformation of shape \((B, 2, 3)\).

  • dsize (Tuple[int, int]) – size of the output image (height, width).

  • flags (str) – interpolation mode to calculate output values ‘bilinear’ | ‘nearest’. Default: ‘bilinear’.

  • padding_mode (str) – padding mode for outside grid values ‘zeros’ | ‘border’ | ‘reflection’. Default: ‘zeros’.

  • align_corners (bool) – mode for grid_generation. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Returns

the warped tensor.

Return type

torch.Tensor

Shape:
  • Output: \((B, C, H, W)\)

Note

See a working example here.
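
To make the affine formula concrete, here is a small plain-Python sketch that reads it literally: each output pixel (x, y) looks up the source pixel at the transformed coordinate. It is an illustration under stated assumptions, not kornia's implementation: `warp_affine_nearest` is a hypothetical helper, it uses nearest-neighbour lookup with zero padding, and the library may interpret or invert the matrix differently internally.

```python
def warp_affine_nearest(src, M, dsize):
    """Nearest-neighbour sketch of
    dst(x, y) = src(M11*x + M12*y + M13, M21*x + M22*y + M23).

    src is a 2D list indexed as src[row][col]; M is a 2x3 nested list;
    dsize is (height, width). Samples that fall outside src become 0.0.
    """
    out_h, out_w = dsize
    src_h, src_w = len(src), len(src[0])
    dst = [[0.0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            # Source coordinate for this output pixel, rounded to the nearest pixel.
            sx = round(M[0][0] * x + M[0][1] * y + M[0][2])
            sy = round(M[1][0] * x + M[1][1] * y + M[1][2])
            if 0 <= sx < src_w and 0 <= sy < src_h:
                dst[y][x] = src[sy][sx]
    return dst


image = [[1.0, 2.0],
         [3.0, 4.0]]
# Under the literal formula, M13 = 1 makes every output pixel read from one column to the right.
shift = [[1.0, 0.0, 1.0],
         [0.0, 1.0, 0.0]]
shifted = warp_affine_nearest(image, shift, (2, 2))
print(shifted)  # [[2.0, 0.0], [4.0, 0.0]]
```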

get_perspective_transform(src, dst)[source]

Calculates a perspective transform from four pairs of the corresponding points.

The function calculates the matrix of a perspective transform so that:

\[\begin{split}\begin{bmatrix} t_{i}x_{i}^{'} \\ t_{i}y_{i}^{'} \\ t_{i} \\ \end{bmatrix} = \textbf{map_matrix} \cdot \begin{bmatrix} x_{i} \\ y_{i} \\ 1 \\ \end{bmatrix}\end{split}\]

where

\[dst(i) = (x_{i}^{'},y_{i}^{'}), src(i) = (x_{i}, y_{i}), i = 0,1,2,3\]
Parameters
  • src (Tensor) – coordinates of quadrangle vertices in the source image.

  • dst (Tensor) – coordinates of the corresponding quadrangle vertices in the destination image.

Returns

the perspective transformation.

Return type

Tensor

Shape:
  • Input: \((B, 4, 2)\) and \((B, 4, 2)\)

  • Output: \((B, 3, 3)\)
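
The four point pairs give eight linear equations in the eight unknown matrix entries once the bottom-right entry is fixed to 1. The sketch below solves that system in plain Python for a single (unbatched) set of points. It is an illustration of the relation above rather than kornia's implementation; `solve_linear_system` and `perspective_from_points` are hypothetical helper names.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [A | b]; copying keeps the caller's data untouched.
    aug = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def perspective_from_points(src, dst):
    """Return the 3x3 matrix mapping four src points onto four dst points.

    The matrix is normalised so that its bottom-right entry is 1.
    """
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        # x' * (g*x + h*y + 1) = a*x + b*y + c, rearranged to be linear in a..h.
        A.append([x, y, 1.0, 0.0, 0.0, 0.0, -x * xp, -y * xp])
        b.append(xp)
        # y' * (g*x + h*y + 1) = d*x + e*y + f, rearranged the same way.
        A.append([0.0, 0.0, 0.0, x, y, 1.0, -x * yp, -y * yp])
        b.append(yp)
    m = solve_linear_system(A, b)
    return [[m[0], m[1], m[2]], [m[3], m[4], m[5]], [m[6], m[7], 1.0]]


unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
shifted_square = [(x + 2.0, y + 3.0) for x, y in unit_square]
M = perspective_from_points(unit_square, shifted_square)
# M is (up to rounding) the translation [[1, 0, 2], [0, 1, 3], [0, 0, 1]].
```

The system becomes singular when three of the four points are collinear, which is why the solver raises an error on a near-zero pivot.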

get_rotation_matrix2d(center: torch.Tensor, angle: torch.Tensor, scale: torch.Tensor) → torch.Tensor[source]

Calculates an affine matrix of 2D rotation.

The function calculates the following matrix:

\[\begin{split}\begin{bmatrix} \alpha & \beta & (1 - \alpha) \cdot \text{x} - \beta \cdot \text{y} \\ -\beta & \alpha & \beta \cdot \text{x} + (1 - \alpha) \cdot \text{y} \end{bmatrix}\end{split}\]

where

\[\begin{split}\alpha = \text{scale} \cdot cos(\text{angle}) \\ \beta = \text{scale} \cdot sin(\text{angle})\end{split}\]

The transformation maps the rotation center to itself. If this is not the target, adjust the shift.

Parameters
  • center (Tensor) – center of the rotation in the source image.

  • angle (Tensor) – rotation angle in degrees. Positive values mean counter-clockwise rotation (the coordinate origin is assumed to be the top-left corner).

  • scale (Tensor) – isotropic scale factor.

Returns

the affine matrix of 2D rotation.

Return type

Tensor

Shape:
  • Input: \((B, 2)\), \((B)\) and \((B)\)

  • Output: \((B, 2, 3)\)

Example

>>> center = torch.zeros(1, 2)
>>> scale = torch.ones(1)
>>> angle = 45. * torch.ones(1)
>>> kornia.get_rotation_matrix2d(center, angle, scale)
tensor([[[ 0.7071,  0.7071,  0.0000],
         [-0.7071,  0.7071,  0.0000]]])
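
The matrix formula can be evaluated directly for a single center, angle and scale. This plain-Python sketch (the helper name `rotation_matrix_2d` is hypothetical, and it is not the library code) reproduces the values in the example above and checks that the rotation center is a fixed point of the transform.

```python
import math


def rotation_matrix_2d(center, angle_deg, scale):
    """Build the 2x3 rotation matrix from the formula; the angle is in degrees."""
    cx, cy = center
    angle = math.radians(angle_deg)
    alpha = scale * math.cos(angle)
    beta = scale * math.sin(angle)
    return [
        [alpha, beta, (1.0 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1.0 - alpha) * cy],
    ]


M = rotation_matrix_2d((0.0, 0.0), 45.0, 1.0)
print([[round(v, 4) for v in row] for row in M])
# [[0.7071, 0.7071, 0.0], [-0.7071, 0.7071, 0.0]]

# The center maps to itself for any angle and scale.
cx, cy = 3.0, 5.0
R = rotation_matrix_2d((cx, cy), 30.0, 2.0)
mapped = (R[0][0] * cx + R[0][1] * cy + R[0][2],
          R[1][0] * cx + R[1][1] * cy + R[1][2])
```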
remap(tensor: torch.Tensor, map_x: torch.Tensor, map_y: torch.Tensor, align_corners: bool = False) → torch.Tensor[source]

Applies a generic geometrical transformation to a tensor.

The function remap transforms the source tensor using the specified map:

\[\text{dst}(x, y) = \text{src}(map_x(x, y), map_y(x, y))\]
Parameters
  • tensor (torch.Tensor) – the tensor to remap with shape (B, D, H, W), where D is the number of channels.

  • map_x (torch.Tensor) – the flow in the x-direction in pixel coordinates. The tensor must be in the shape of (B, H, W).

  • map_y (torch.Tensor) – the flow in the y-direction in pixel coordinates. The tensor must be in the shape of (B, H, W).

  • align_corners (bool) – interpolation flag. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Returns

the warped tensor.

Return type

torch.Tensor

Example

>>> grid = kornia.utils.create_meshgrid(2, 2, False)  # 1x2x2x2
>>> grid += 1  # apply offset in both directions
>>> input = torch.ones(1, 1, 2, 2)
>>> kornia.remap(input, grid[..., 0], grid[..., 1])   # 1x1x2x2
tensor([[[[1., 0.],
          [0., 0.]]]])
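
The result in the example above follows from the lookup rule dst(x, y) = src(map_x(x, y), map_y(x, y)). The plain-Python sketch below uses a hypothetical helper `remap_nearest` with nearest-neighbour lookup and zero padding. It assumes the offset grid in the example holds x-coordinates [[1, 2], [1, 2]] and y-coordinates [[1, 1], [2, 2]], and it is not kornia's implementation.

```python
def remap_nearest(src, map_x, map_y):
    """dst[y][x] = src[map_y[y][x]][map_x[y][x]], with 0.0 outside src."""
    src_h, src_w = len(src), len(src[0])
    dst = []
    for row_x, row_y in zip(map_x, map_y):
        out_row = []
        for mx, my in zip(row_x, row_y):
            sx, sy = round(mx), round(my)
            inside = 0 <= sx < src_w and 0 <= sy < src_h
            out_row.append(src[sy][sx] if inside else 0.0)
        dst.append(out_row)
    return dst


src = [[1.0, 1.0],
       [1.0, 1.0]]
# Pixel-coordinate maps after shifting a 2x2 grid by one pixel in x and y.
map_x = [[1.0, 2.0],
         [1.0, 2.0]]
map_y = [[1.0, 1.0],
         [2.0, 2.0]]
remapped = remap_nearest(src, map_x, map_y)
print(remapped)  # [[1.0, 0.0], [0.0, 0.0]]
```

Only the top-left output pixel samples a coordinate that still lies inside the 2x2 source, so the other three outputs are zero.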
invert_affine_transform(matrix: torch.Tensor) → torch.Tensor[source]

Inverts an affine transformation.

The function computes an inverse affine transformation represented by 2×3 matrix:

\[\begin{split}\begin{bmatrix} a_{11} & a_{12} & b_{1} \\ a_{21} & a_{22} & b_{2} \\ \end{bmatrix}\end{split}\]

The result is also a 2×3 matrix of the same type as the input matrix.

Parameters

matrix (torch.Tensor) – original affine transform. The tensor must be in the shape of (B, 2, 3).

Returns

the inverse affine transform.

Return type

torch.Tensor
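
For a 2×3 matrix [A | b], the inverse transform has linear part A⁻¹ and translation −A⁻¹b. A minimal plain-Python sketch of that computation for a single matrix follows; `invert_affine_2x3` and `apply_affine` are hypothetical helpers for illustration, not the library code.

```python
def invert_affine_2x3(M):
    """Invert a 2x3 affine matrix [[a11, a12, b1], [a21, a22, b2]]."""
    (a11, a12, b1), (a21, a22, b2) = M
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("matrix is not invertible")
    # Inverse of the 2x2 linear part.
    i11, i12 = a22 / det, -a12 / det
    i21, i22 = -a21 / det, a11 / det
    # The inverse translation is -A^{-1} b.
    return [
        [i11, i12, -(i11 * b1 + i12 * b2)],
        [i21, i22, -(i21 * b1 + i22 * b2)],
    ]


def apply_affine(M, x, y):
    """Apply a 2x3 affine matrix to the point (x, y)."""
    return (M[0][0] * x + M[0][1] * y + M[0][2],
            M[1][0] * x + M[1][1] * y + M[1][2])


forward = [[2.0, 0.0, 1.0],
           [0.0, 4.0, -3.0]]
inverse = invert_affine_2x3(forward)

# Mapping a point forward and then back returns the original point.
fx, fy = apply_affine(forward, 3.0, 5.0)
print(apply_affine(inverse, fx, fy))  # (3.0, 5.0)
```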

center_crop(tensor: torch.Tensor, size: Tuple[int, int], interpolation: str = 'bilinear', align_corners: bool = True) → torch.Tensor[source]

Crops the given tensor at the center.

Parameters
  • tensor (torch.Tensor) – the input tensor with shape (C, H, W) or (B, C, H, W).

  • size (Tuple[int, int]) – a tuple with the expected height and width of the output patch.

  • align_corners (bool) – mode for grid_generation. Default: True. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Returns

the output tensor with patches.

Return type

torch.Tensor

Examples

>>> input = torch.tensor([[
...     [1., 2., 3., 4.],
...     [5., 6., 7., 8.],
...     [9., 10., 11., 12.],
...     [13., 14., 15., 16.],
... ]])
>>> kornia.center_crop(input, (2, 4))
tensor([[[ 5.0000,  6.0000,  7.0000,  8.0000],
         [ 9.0000, 10.0000, 11.0000, 12.0000]]])
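
The crop window in the example can be derived by centering the requested size inside the input. This plain-Python sketch uses integer slicing on a single 2D image. The helper `center_crop_2d` is hypothetical, and it only agrees with an interpolating implementation when the offsets are whole pixels, as they are here.

```python
def center_crop_2d(image, size):
    """Return the centered (height, width) window of a 2D list image."""
    out_h, out_w = size
    in_h, in_w = len(image), len(image[0])
    if out_h > in_h or out_w > in_w:
        raise ValueError("crop size exceeds the image size")
    top = (in_h - out_h) // 2    # rows skipped above the window
    left = (in_w - out_w) // 2   # columns skipped left of the window
    return [row[left:left + out_w] for row in image[top:top + out_h]]


image = [[1.0, 2.0, 3.0, 4.0],
         [5.0, 6.0, 7.0, 8.0],
         [9.0, 10.0, 11.0, 12.0],
         [13.0, 14.0, 15.0, 16.0]]
cropped = center_crop_2d(image, (2, 4))
print(cropped)  # [[5.0, 6.0, 7.0, 8.0], [9.0, 10.0, 11.0, 12.0]]
```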
crop_and_resize(tensor: torch.Tensor, boxes: torch.Tensor, size: Tuple[int, int], interpolation: str = 'bilinear', align_corners: bool = False) → torch.Tensor[source]

Extracts crops from the input tensor and resizes them.

Parameters
  • tensor (torch.Tensor) – the reference tensor of shape BxCxHxW.

  • boxes (torch.Tensor) – a tensor containing the coordinates of the bounding boxes to be extracted. The tensor must have the shape of Bx4x2, where each box is defined in the following (clockwise) order: top-left, top-right, bottom-right and bottom-left. The coordinates must be in the x, y order.

  • size (Tuple[int, int]) – a tuple with the height and width that will be used to resize the extracted patches.

  • align_corners (bool) – mode for grid_generation. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Returns

tensor containing the patches with shape BxN1xN2

Return type

torch.Tensor

Example

>>> input = torch.tensor([[
...     [1., 2., 3., 4.],
...     [5., 6., 7., 8.],
...     [9., 10., 11., 12.],
...     [13., 14., 15., 16.],
... ]])
>>> boxes = torch.tensor([[
...     [1., 1.],
...     [2., 1.],
...     [2., 2.],
...     [1., 2.],
... ]])  # 1x4x2
>>> kornia.crop_and_resize(input, boxes, (2, 2))
tensor([[[ 6.0000,  7.0000],
         [10.0000, 11.0000]]])
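
Boxes are often stored as (x_min, y_min, x_max, y_max), so they need converting to the four-corner layout described above. The helper below is a hypothetical convenience function, not part of kornia. It returns the corners in clockwise order starting at the top-left, with each point in (x, y) order.

```python
def box_corners(x_min, y_min, x_max, y_max):
    """Convert an axis-aligned box to four corners in clockwise order.

    Order: top-left, top-right, bottom-right, bottom-left; each point is (x, y).
    """
    return [
        [x_min, y_min],  # top-left
        [x_max, y_min],  # top-right
        [x_max, y_max],  # bottom-right
        [x_min, y_max],  # bottom-left
    ]


corners = box_corners(1.0, 1.0, 2.0, 2.0)
print(corners)  # [[1.0, 1.0], [2.0, 1.0], [2.0, 2.0], [1.0, 2.0]]
```

Wrapping the result in one more list gives the 1x4x2 layout used by `boxes` in the example above.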
pyrdown(input: torch.Tensor, border_type: str = 'reflect', align_corners: bool = False) → torch.Tensor[source]

Blurs a tensor and downsamples it.

See PyrDown for details.

pyrup(input: torch.Tensor, border_type: str = 'reflect', align_corners: bool = False) → torch.Tensor[source]

Upsamples a tensor and then blurs it.

See PyrUp for details.

build_pyramid(input: torch.Tensor, max_level: int, border_type: str = 'reflect', align_corners: bool = False) → List[torch.Tensor][source]

Constructs the Gaussian pyramid for an image.

The function constructs a list of images and builds the Gaussian pyramid by recursively applying pyrdown to the previously built pyramid layers.

Parameters
  • input (torch.Tensor) – the tensor to be used to construct the pyramid.

  • max_level (int) – 0-based index of the last (the smallest) pyramid layer. It must be non-negative.

  • border_type (str) – the padding mode to be applied before convolving. The expected modes are: 'constant', 'reflect', 'replicate' or 'circular'. Default: 'reflect'.

  • align_corners (bool) – interpolation flag. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Shape:
  • Input: \((B, C, H, W)\)

  • Output: \([(B, C, H, W), (B, C, H/2, W/2), ...]\)

affine(tensor: torch.Tensor, matrix: torch.Tensor, mode: str = 'bilinear', align_corners: bool = False) → torch.Tensor[source]

Apply an affine transformation to the image.

Parameters
  • tensor (torch.Tensor) – The image tensor to be warped.

  • matrix (torch.Tensor) – The 2x3 affine transformation matrix.

  • mode (str) – ‘bilinear’ | ‘nearest’

  • align_corners (bool) – interpolation flag. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Returns

The warped image.

Return type

torch.Tensor

rotate(tensor: torch.Tensor, angle: torch.Tensor, center: Union[None, torch.Tensor] = None, mode: str = 'bilinear', align_corners: bool = False) → torch.Tensor[source]

Rotate the image anti-clockwise about the centre.

See Rotate for details.

translate(tensor: torch.Tensor, translation: torch.Tensor, align_corners: bool = False) → torch.Tensor[source]

Translate the tensor in pixel units.

See Translate for details.

scale(tensor: torch.Tensor, scale_factor: torch.Tensor, center: Union[None, torch.Tensor] = None, align_corners: bool = False) → torch.Tensor[source]

Scales the input image.

See Scale for details.

shear(tensor: torch.Tensor, shear: torch.Tensor, align_corners: bool = False) → torch.Tensor[source]

Shear the tensor.

See Shear for details.

hflip(input: torch.Tensor) → torch.Tensor[source]

Horizontally flip a tensor image or a batch of tensor images. Input must be a tensor of shape (C, H, W) or a batch of tensors \((*, C, H, W)\).

Parameters

input (torch.Tensor) – input tensor

Returns

The horizontally flipped image tensor

Return type

torch.Tensor

vflip(input: torch.Tensor) → torch.Tensor[source]

Vertically flip a tensor image or a batch of tensor images. Input must be a tensor of shape (C, H, W) or a batch of tensors \((*, C, H, W)\).

Parameters

input (torch.Tensor) – input tensor

Returns

The vertically flipped image tensor

Return type

torch.Tensor

rot180(input: torch.Tensor) → torch.Tensor[source]

Rotate a tensor image or a batch of tensor images 180 degrees. Input must be a tensor of shape (C, H, W) or a batch of tensors \((*, C, H, W)\).

Parameters

input (torch.Tensor) – input tensor

Returns

The rotated image tensor

Return type

torch.Tensor
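
The three flip operations are closely related: rotating by 180 degrees is the same as flipping both horizontally and vertically. The following plain-Python sketch shows this on a single 2D image stored as nested lists. The helper names are hypothetical and operate on lists, not tensors.

```python
def hflip_2d(image):
    """Reverse each row (mirror left-right)."""
    return [row[::-1] for row in image]


def vflip_2d(image):
    """Reverse the order of the rows (mirror top-bottom)."""
    return [list(row) for row in image[::-1]]


def rot180_2d(image):
    """Rotate by 180 degrees: a vertical flip followed by a horizontal flip."""
    return hflip_2d(vflip_2d(image))


image = [[0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0],
         [0.0, 1.0, 1.0]]
print(hflip_2d(image))   # [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
print(vflip_2d(image))   # [[0.0, 1.0, 1.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
print(rot180_2d(image))  # [[1.0, 1.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
```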

resize(input: torch.Tensor, size: Union[int, Tuple[int, int]], interpolation: str = 'bilinear', align_corners: bool = False) → torch.Tensor[source]

Resize the input torch.Tensor to the given size.

See Resize for details.

class Rotate(angle: torch.Tensor, center: Union[None, torch.Tensor] = None, align_corners: bool = False)[source]

Rotate the tensor anti-clockwise about the centre.

Parameters
  • angle (torch.Tensor) – The angle through which to rotate. The tensor must have a shape of (B), where B is batch size.

  • center (torch.Tensor) – The center through which to rotate. The tensor must have a shape of (B, 2), where B is batch size and last dimension contains cx and cy.

  • align_corners (bool) – interpolation flag. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Returns

The rotated tensor.

Return type

torch.Tensor

class Translate(translation: torch.Tensor, align_corners: bool = False)[source]

Translate the tensor in pixel units.

Parameters
  • translation (torch.Tensor) – tensor containing the number of pixels to translate in the x and y direction. The tensor must have a shape of (B, 2), where B is the batch size and the last dimension contains dx and dy.

  • align_corners (bool) – interpolation flag. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Returns

The translated tensor.

Return type

torch.Tensor

class Scale(scale_factor: torch.Tensor, center: Union[None, torch.Tensor] = None, align_corners: bool = False)[source]

Scale the tensor by a factor.

Parameters
  • scale_factor (torch.Tensor) – The scale factor to apply. The tensor must have a shape of (B), where B is batch size.

  • center (torch.Tensor) – The center through which to scale. The tensor must have a shape of (B, 2), where B is batch size and last dimension contains cx and cy.

  • align_corners (bool) – interpolation flag. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Returns

The scaled tensor.

Return type

torch.Tensor

class Shear(shear: torch.Tensor, align_corners: bool = False)[source]

Shear the tensor.

Parameters
  • tensor (torch.Tensor) – The image tensor to be skewed.

  • shear (torch.Tensor) – tensor containing the angle to shear in the x and y direction. The tensor must have a shape of (B, 2), where B is batch size, last dimension contains shx shy.

  • align_corners (bool) – interpolation flag. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Returns

The skewed tensor.

Return type

torch.Tensor

class PyrDown(border_type: str = 'reflect', align_corners: bool = False)[source]

Blurs a tensor and downsamples it.

Parameters
  • border_type (str) – the padding mode to be applied before convolving. The expected modes are: 'constant', 'reflect', 'replicate' or 'circular'. Default: 'reflect'.

  • align_corners (bool) – interpolation flag. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Returns

the downsampled tensor.

Return type

torch.Tensor

Shape:
  • Input: \((B, C, H, W)\)

  • Output: \((B, C, H / 2, W / 2)\)

Examples

>>> input = torch.rand(1, 2, 4, 4)
>>> output = kornia.transform.PyrDown()(input)  # 1x2x2x2
class PyrUp(border_type: str = 'reflect', align_corners: bool = False)[source]

Upsamples a tensor and then blurs it.

Parameters
  • border_type (str) – the padding mode to be applied before convolving. The expected modes are: 'constant', 'reflect', 'replicate' or 'circular'. Default: 'reflect'.

  • align_corners (bool) – interpolation flag. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Returns

the upsampled tensor.

Return type

torch.Tensor

Shape:
  • Input: \((B, C, H, W)\)

  • Output: \((B, C, H * 2, W * 2)\)

Examples

>>> input = torch.rand(1, 2, 4, 4)
>>> output = kornia.transform.PyrUp()(input)  # 1x2x8x8
class ScalePyramid(n_levels: int = 3, init_sigma: float = 1.6, min_size: int = 5, double_image: bool = False)[source]

Creates a scale pyramid of an image, usually used for local feature detection. Images are consecutively smoothed with Gaussian blur and downscaled.

Parameters
  • n_levels (int) – number of the levels in octave.

  • init_sigma (float) – initial blur level.

  • min_size (int) – the minimum size of the octave in pixels. Default: 5.

  • double_image (bool) – add 2x upscaled image as 1st level of pyramid. OpenCV SIFT does this. Default: False.

Returns

  • 1st output: images.

  • 2nd output: sigmas (coefficients for scale conversion).

  • 3rd output: pixelDists (coefficients for coordinate conversion).

Return type

Tuple(List(Tensors), List(Tensors), List(Tensors))

Shape:
  • Input: \((B, C, H, W)\)

  • Output 1st: \([(B, NL, C, H, W), (B, NL, C, H/2, W/2), ...]\)

  • Output 2nd: \([(B, NL), (B, NL), (B, NL), ...]\)

  • Output 3rd: \([(B, NL), (B, NL), (B, NL), ...]\)

Examples

>>> input = torch.rand(2, 4, 100, 100)
>>> sp, sigmas, pds = kornia.ScalePyramid(3, 15)(input)
class Hflip[source]

Horizontally flip a tensor image or a batch of tensor images. Input must be a tensor of shape (C, H, W) or a batch of tensors \((*, C, H, W)\).

Parameters

input (torch.Tensor) – input tensor

Returns

The horizontally flipped image tensor

Return type

torch.Tensor

Examples

>>> input = torch.tensor([[[
...     [0., 0., 0.],
...     [0., 0., 0.],
...     [0., 1., 1.]]]])
>>> kornia.hflip(input)
tensor([[[[0., 0., 0.],
          [0., 0., 0.],
          [1., 1., 0.]]]])
class Vflip[source]

Vertically flip a tensor image or a batch of tensor images. Input must be a tensor of shape (C, H, W) or a batch of tensors \((*, C, H, W)\).

Parameters

input (torch.Tensor) – input tensor

Returns

The vertically flipped image tensor

Return type

torch.Tensor

Examples

>>> input = torch.tensor([[[
...     [0., 0., 0.],
...     [0., 0., 0.],
...     [0., 1., 1.]]]])
>>> kornia.vflip(input)
tensor([[[[0., 1., 1.],
          [0., 0., 0.],
          [0., 0., 0.]]]])
class Rot180[source]

Rotate a tensor image or a batch of tensor images 180 degrees. Input must be a tensor of shape (C, H, W) or a batch of tensors \((*, C, H, W)\).

Parameters

input (torch.Tensor) – input tensor

Examples

>>> input = torch.tensor([[[
...     [0., 0., 0.],
...     [0., 0., 0.],
...     [0., 1., 1.]]]])
>>> kornia.rot180(input)
tensor([[[[1., 1., 0.],
          [0., 0., 0.],
          [0., 0., 0.]]]])
class Resize(size: Union[int, Tuple[int, int]], interpolation: str = 'bilinear', align_corners: bool = False)[source]

Resize the input torch.Tensor to the given size.

Parameters
  • size (int, tuple(int, int)) – Desired output size. If size is a sequence like (h, w), output size will be matched to this. If size is an int, the smaller edge of the image will be matched to this number, i.e., if height > width, then the image will be rescaled to (size * height / width, size).

  • interpolation (str) – algorithm used for upsampling: ‘nearest’ | ‘linear’ | ‘bilinear’ | ‘bicubic’ | ‘trilinear’ | ‘area’. Default: ‘bilinear’.

  • align_corners (bool) – interpolation flag. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Returns

The resized tensor.

Return type

torch.Tensor
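
The rule for an integer `size` can be written out as a small calculation. The sketch below uses a hypothetical helper, `resolve_output_size`. It assumes the width > height case mirrors the documented height > width case and that fractional results are truncated to integers; the library's actual rounding may differ.

```python
def resolve_output_size(size, height, width):
    """Return the (height, width) an image would be resized to.

    An int matches the smaller edge and keeps the aspect ratio;
    a (h, w) sequence is used as given.
    """
    if isinstance(size, int):
        if height > width:
            # Width is the smaller edge: it becomes `size`.
            return (int(size * height / width), size)
        # Height is the smaller (or equal) edge: it becomes `size`.
        return (size, int(size * width / height))
    return tuple(size)


print(resolve_output_size(100, 200, 100))       # (200, 100)
print(resolve_output_size(100, 100, 200))       # (100, 200)
print(resolve_output_size((50, 60), 200, 100))  # (50, 60)
```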