kornia.geometry.transform

The functions in this section perform various geometrical transformations of 2D images.

warp_perspective(src: torch.Tensor, M: torch.Tensor, dsize: Tuple[int, int], flags: str = 'bilinear', border_mode: str = 'zeros', align_corners: bool = False) → torch.Tensor[source]

Applies a perspective transformation to an image.

The function warp_perspective transforms the source image using the specified matrix:

\[\text{dst} (x, y) = \text{src} \left( \frac{M_{11} x + M_{12} y + M_{13}}{M_{31} x + M_{32} y + M_{33}} , \frac{M_{21} x + M_{22} y + M_{23}}{M_{31} x + M_{32} y + M_{33}} \right )\]
Parameters
  • src (torch.Tensor) – input image with shape \((B, C, H, W)\).

  • M (torch.Tensor) – transformation matrix with shape \((B, 3, 3)\).

  • dsize (tuple) – size of the output image (height, width).

  • flags (str) – interpolation mode to calculate output values ‘bilinear’ | ‘nearest’. Default: ‘bilinear’.

  • border_mode (str) – padding mode for outside grid values ‘zeros’ | ‘border’ | ‘reflection’. Default: ‘zeros’.

  • align_corners (bool) – interpolation flag. Default: False.

Returns

the warped input image \((B, C, H, W)\).

Return type

torch.Tensor
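
The formula above can be checked by hand for a single pixel. The following is a minimal sketch in plain Python, not part of the library: the helper name perspective_map and the use of nested lists instead of tensors are assumptions made for illustration. It evaluates the formula literally, without any interpolation or border handling.

```python
def perspective_map(M, x, y):
    """Evaluate the perspective formula for one pixel (x, y).

    M is a 3x3 matrix given as a nested list (hypothetical helper,
    not a kornia function).
    """
    denom = M[2][0] * x + M[2][1] * y + M[2][2]
    out_x = (M[0][0] * x + M[0][1] * y + M[0][2]) / denom
    out_y = (M[1][0] * x + M[1][1] * y + M[1][2]) / denom
    return out_x, out_y


# The identity matrix leaves coordinates unchanged.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(perspective_map(identity, 3.0, 2.0))

# A non-trivial bottom row makes the division depend on the position.
M = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [0.0, 0.5, 1.0]]
print(perspective_map(M, 4.0, 2.0))
```

With the second matrix the denominator at (4, 2) is 2, so both numerators are halved.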

warp_affine(src: torch.Tensor, M: torch.Tensor, dsize: Tuple[int, int], flags: str = 'bilinear', padding_mode: str = 'zeros', align_corners: bool = False) → torch.Tensor[source]

Applies an affine transformation to a tensor.

The function warp_affine transforms the source tensor using the specified matrix:

\[\text{dst}(x, y) = \text{src} \left( M_{11} x + M_{12} y + M_{13} , M_{21} x + M_{22} y + M_{23} \right )\]
Parameters
  • src (torch.Tensor) – input tensor of shape \((B, C, H, W)\).

  • M (torch.Tensor) – affine transformation of shape \((B, 2, 3)\).

  • dsize (Tuple[int, int]) – size of the output image (height, width).

  • flags (str) – interpolation mode to calculate output values ‘bilinear’ | ‘nearest’. Default: ‘bilinear’.

  • padding_mode (str) – padding mode for outside grid values ‘zeros’ | ‘border’ | ‘reflection’. Default: ‘zeros’.

  • align_corners (bool) – mode for grid_generation. Default: False.

Returns

the warped tensor with shape \((B, C, H, W)\).

Return type

torch.Tensor
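
As a rough illustration of the affine formula, the sketch below evaluates it for one coordinate pair in plain Python. The helper names affine_map and to_homogeneous are hypothetical and the matrices are nested lists rather than tensors. The second helper shows that a 2x3 affine matrix corresponds to a 3x3 perspective matrix whose last row is [0, 0, 1].

```python
def affine_map(M, x, y):
    """Evaluate the 2x3 affine formula at (x, y); M is a nested list."""
    out_x = M[0][0] * x + M[0][1] * y + M[0][2]
    out_y = M[1][0] * x + M[1][1] * y + M[1][2]
    return out_x, out_y


def to_homogeneous(M):
    """Append the row [0, 0, 1] so the 2x3 matrix becomes a 3x3 one."""
    return [list(M[0]), list(M[1]), [0.0, 0.0, 1.0]]


# A pure translation by (5, -2).
shift = [[1.0, 0.0, 5.0], [0.0, 1.0, -2.0]]
print(affine_map(shift, 1.0, 1.0))
print(to_homogeneous(shift))
```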

warp_projective(src: torch.Tensor, M: torch.Tensor, dsize: Tuple[int, int, int], flags: str = 'bilinear', padding_mode: str = 'zeros', align_corners: bool = True) → torch.Tensor[source]

Applies a projective transformation to a 3D tensor.

Warning

This API signature is experimental and may change in the future.

Parameters
  • src (torch.Tensor) – input tensor of shape \((B, C, D, H, W)\).

  • M (torch.Tensor) – projective transformation matrix of shape \((B, 3, 4)\).

  • dsize (Tuple[int, int, int]) – size of the output image (depth, height, width).

  • flags (str) – interpolation mode to calculate output values ‘bilinear’ | ‘nearest’. Default: ‘bilinear’.

  • padding_mode (str) – padding mode for outside grid values ‘zeros’ | ‘border’ | ‘reflection’. Default: ‘zeros’.

  • align_corners (bool) – mode for grid_generation. Default: True.

Returns

the warped 3d tensor with shape \((B, C, D, H, W)\).

Return type

torch.Tensor

get_perspective_transform(src, dst)[source]

Calculates a perspective transform from four pairs of the corresponding points.

The function calculates the matrix of a perspective transform so that:

\[\begin{split}\begin{bmatrix} t_{i}x_{i}^{'} \\ t_{i}y_{i}^{'} \\ t_{i} \\ \end{bmatrix} = \textbf{map_matrix} \cdot \begin{bmatrix} x_{i} \\ y_{i} \\ 1 \\ \end{bmatrix}\end{split}\]

where

\[dst(i) = (x_{i}^{'},y_{i}^{'}), src(i) = (x_{i}, y_{i}), i = 0,1,2,3\]
Parameters
  • src (Tensor) – coordinates of quadrangle vertices in the source image.

  • dst (Tensor) – coordinates of the corresponding quadrangle vertices in the destination image.

Returns

the perspective transformation.

Return type

Tensor

Shape:
  • Input: \((B, 4, 2)\) and \((B, 4, 2)\)

  • Output: \((B, 3, 3)\)
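
One way to obtain such a matrix from four point pairs is to fix the bottom-right entry to 1 and solve the resulting 8x8 linear system. The sketch below does this in plain Python for a single set of points. It is an illustration under that normalization assumption, not necessarily how the library solves the problem, and the helper names solve_linear and perspective_from_points are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [A | b], copied so the inputs are left untouched.
    aug = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def perspective_from_points(src, dst):
    """Build the 3x3 matrix mapping four src points onto four dst points."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        # Two equations per point pair, with the last matrix entry set to 1.
        A.append([x, y, 1.0, 0.0, 0.0, 0.0, -xp * x, -xp * y])
        b.append(xp)
        A.append([0.0, 0.0, 0.0, x, y, 1.0, -yp * x, -yp * y])
        b.append(yp)
    h = solve_linear(A, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


# Unit square mapped to a square scaled by 2 and shifted by (2, 3).
src = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
dst = [(2.0, 3.0), (4.0, 3.0), (4.0, 5.0), (2.0, 5.0)]
M = perspective_from_points(src, dst)
print(M)
```

For this input the result should be close to a pure scale-and-shift matrix, with a bottom row near [0, 0, 1].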

get_projective_transform(center: torch.Tensor, angles: torch.Tensor, scales: torch.Tensor) → torch.Tensor[source]

Calculates the projection matrix for a 3D rotation.

Warning

This API signature is experimental and may change in the future.

The function computes the projection matrix given the center and angles per axis.

Parameters
  • center (torch.Tensor) – center of the rotation in the source with shape \((B, 3)\).

  • angles (torch.Tensor) – angle axis vector containing the rotation angles in degrees in the form of (rx, ry, rz) with shape \((B, 3)\). Internally it calls Rodrigues to compute the rotation matrix from axis-angle.

  • scales (torch.Tensor) – isotropic scale factor.

Returns

the projection matrix of 3D rotation with shape \((B, 3, 4)\).

Return type

torch.Tensor

get_rotation_matrix2d(center: torch.Tensor, angle: torch.Tensor, scale: torch.Tensor) → torch.Tensor[source]

Calculates an affine matrix of 2D rotation.

The function calculates the following matrix:

\[\begin{split}\begin{bmatrix} \alpha & \beta & (1 - \alpha) \cdot \text{x} - \beta \cdot \text{y} \\ -\beta & \alpha & \beta \cdot \text{x} + (1 - \alpha) \cdot \text{y} \end{bmatrix}\end{split}\]

where

\[\begin{split}\alpha = \text{scale} \cdot \cos(\text{angle}) \\ \beta = \text{scale} \cdot \sin(\text{angle})\end{split}\]

The transformation maps the rotation center to itself. If this is not the target, adjust the shift.

Parameters
  • center (Tensor) – center of the rotation in the source image.

  • angle (Tensor) – rotation angle in degrees. Positive values mean counter-clockwise rotation (the coordinate origin is assumed to be the top-left corner).

  • scale (Tensor) – isotropic scale factor.

Returns

the affine matrix of 2D rotation.

Return type

Tensor

Shape:
  • Input: \((B, 2)\), \((B)\) and \((B)\)

  • Output: \((B, 2, 3)\)
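
The matrix definition can be reproduced for a single (unbatched) input with the standard library alone. The sketch below is an illustration only: rotation_matrix2d is a hypothetical helper that takes plain floats and returns a nested list, with the angle given in degrees as described above.

```python
import math


def rotation_matrix2d(center, angle_deg, scale):
    """Build the 2x3 rotation matrix from the formula, for one sample."""
    cx, cy = center
    alpha = scale * math.cos(math.radians(angle_deg))
    beta = scale * math.sin(math.radians(angle_deg))
    return [
        [alpha, beta, (1.0 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1.0 - alpha) * cy],
    ]


# Rotation by 45 degrees about the origin with unit scale.
M = rotation_matrix2d((0.0, 0.0), 45.0, 1.0)
print([[round(v, 4) for v in row] for row in M])

# The rotation center is a fixed point of the transformation.
cx, cy = 3.0, 4.0
R = rotation_matrix2d((cx, cy), 30.0, 2.0)
fixed_x = R[0][0] * cx + R[0][1] * cy + R[0][2]
fixed_y = R[1][0] * cx + R[1][1] * cy + R[1][2]
print(fixed_x, fixed_y)
```

The second part checks the statement that the center maps to itself: the translation column cancels the rotated and scaled center exactly, up to floating-point rounding.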

Example

>>> center = torch.zeros(1, 2)
>>> scale = torch.ones(1)
>>> angle = 45. * torch.ones(1)
>>> kornia.get_rotation_matrix2d(center, angle, scale)
tensor([[[ 0.7071,  0.7071,  0.0000],
         [-0.7071,  0.7071,  0.0000]]])

remap(tensor: torch.Tensor, map_x: torch.Tensor, map_y: torch.Tensor, align_corners: bool = False) → torch.Tensor[source]

Applies a generic geometrical transformation to a tensor.

The function remap transforms the source tensor using the specified map:

\[\text{dst}(x, y) = \text{src}(map_x(x, y), map_y(x, y))\]
Parameters
  • tensor (torch.Tensor) – the tensor to remap with shape (B, D, H, W), where D is the number of channels.

  • map_x (torch.Tensor) – the flow in the x-direction in pixel coordinates. The tensor must be in the shape of (B, H, W).

  • map_y (torch.Tensor) – the flow in the y-direction in pixel coordinates. The tensor must be in the shape of (B, H, W).

  • align_corners (bool) – interpolation flag. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Returns

the warped tensor.

Return type

torch.Tensor
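
The mapping rule can be illustrated on a single 2D channel with plain lists. The sketch below is a simplification and not the library implementation: remap_nearest is a hypothetical helper that uses nearest-neighbour lookup and returns zero for coordinates outside the source, whereas the library function interpolates.

```python
def remap_nearest(src, map_x, map_y):
    """dst[y][x] = src[map_y[y][x]][map_x[y][x]], zero outside the source."""
    height, width = len(src), len(src[0])
    dst = []
    for row_x, row_y in zip(map_x, map_y):
        out_row = []
        for sx, sy in zip(row_x, row_y):
            ix, iy = int(round(sx)), int(round(sy))
            inside = 0 <= ix < width and 0 <= iy < height
            out_row.append(src[iy][ix] if inside else 0.0)
        dst.append(out_row)
    return dst


# A 2x2 image of ones, sampled with a pixel grid shifted by +1 in x and y.
src = [[1.0, 1.0], [1.0, 1.0]]
map_x = [[1.0, 2.0], [1.0, 2.0]]
map_y = [[1.0, 1.0], [2.0, 2.0]]
result = remap_nearest(src, map_x, map_y)
print(result)
```

Only the top-left output pixel samples a location inside the source, so it is the only non-zero entry.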

Example

>>> grid = kornia.utils.create_meshgrid(2, 2, False)  # 1x2x2x2
>>> grid += 1  # apply offset in both directions
>>> input = torch.ones(1, 1, 2, 2)
>>> kornia.remap(input, grid[..., 0], grid[..., 1])   # 1x1x2x2
tensor([[[[1., 0.],
          [0., 0.]]]])

invert_affine_transform(matrix: torch.Tensor) → torch.Tensor[source]

Inverts an affine transformation.

The function computes the inverse of an affine transformation represented by a 2×3 matrix:

\[\begin{split}\begin{bmatrix} a_{11} & a_{12} & b_{1} \\ a_{21} & a_{22} & b_{2} \\ \end{bmatrix}\end{split}\]

The result is also a 2×3 matrix of the same type as the input matrix.

Parameters

matrix (torch.Tensor) – original affine transform. The tensor must be in the shape of (B, 2, 3).

Returns

the reverse affine transform.

Return type

torch.Tensor
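
For a single 2×3 matrix the inverse has a closed form: invert the 2×2 linear part and apply it, negated, to the translation. The sketch below shows this with nested lists. The helper name invert_affine is hypothetical, and the approach is a textbook one that may differ from what the library does internally.

```python
def invert_affine(M):
    """Invert a 2x3 affine matrix [[a11, a12, b1], [a21, a22, b2]]."""
    (a11, a12, b1), (a21, a22, b2) = M
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the linear part of the transform is singular")
    # Inverse of the 2x2 linear part.
    i11, i12 = a22 / det, -a12 / det
    i21, i22 = -a21 / det, a11 / det
    # The inverse translation is -A^{-1} b.
    return [
        [i11, i12, -(i11 * b1 + i12 * b2)],
        [i21, i22, -(i21 * b1 + i22 * b2)],
    ]


M = [[2.0, 0.0, 4.0], [0.0, 4.0, -8.0]]
inv = invert_affine(M)

# Round trip: apply M to a point, then the inverse.
x, y = 1.0, 2.0
fx = M[0][0] * x + M[0][1] * y + M[0][2]
fy = M[1][0] * x + M[1][1] * y + M[1][2]
bx = inv[0][0] * fx + inv[0][1] * fy + inv[0][2]
by = inv[1][0] * fx + inv[1][1] * fy + inv[1][2]
print(inv, (bx, by))
```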

projection_from_Rt(rmat: torch.Tensor, tvec: torch.Tensor) → torch.Tensor[source]

Compute the projection matrix from Rotation and translation.

Warning

This API signature is experimental and may change in the future.

Concatenates the batch of rotations and translations such that \(P = [R | t]\).

Parameters
  • rmat (torch.Tensor) – the rotation matrix with shape \((*, 3, 3)\).

  • tvec (torch.Tensor) – the translation vector with shape \((*, 3, 1)\).

Returns

the projection matrix with shape \((*, 3, 4)\).

Return type

torch.Tensor
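
The concatenation \(P = [R | t]\) amounts to appending the translation as a fourth column. A minimal sketch for one unbatched pair, using nested lists and a hypothetical helper name projection_from_rt:

```python
def projection_from_rt(rmat, tvec):
    """Concatenate a 3x3 rotation and a 3x1 translation into a 3x4 matrix."""
    return [list(r_row) + list(t_row) for r_row, t_row in zip(rmat, tvec)]


R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [[1.0], [2.0], [3.0]]
P = projection_from_rt(R, t)
print(P)
```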

center_crop(tensor: torch.Tensor, size: Tuple[int, int], interpolation: str = 'bilinear', align_corners: bool = True) → torch.Tensor[source]

Crops the given tensor at the center.

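Parameters
  • tensor (torch.Tensor) – the input tensor to crop.

  • size (Tuple[int, int]) – the height and width of the cropped output.

  • interpolation (str) – interpolation mode used to compute output values. Default: ‘bilinear’.

  • align_corners (bool) – interpolation flag. Default: True.

Returns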

the output tensor with patches.

Return type

torch.Tensor
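
The geometry of a center crop can be sketched with plain slicing. This is an illustration only: center_crop_2d is a hypothetical helper that works on one 2D channel and assumes integer offsets, whereas the library function accepts an interpolation mode and may resample.

```python
def center_crop_2d(image, size):
    """Cut a (height, width) window from the middle of a 2D nested list."""
    out_h, out_w = size
    in_h, in_w = len(image), len(image[0])
    top = (in_h - out_h) // 2
    left = (in_w - out_w) // 2
    return [row[left:left + out_w] for row in image[top:top + out_h]]


image = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0, 16.0],
]
crop = center_crop_2d(image, (2, 4))
print(crop)
```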

Examples

>>> input = torch.tensor([[
        [1., 2., 3., 4.],
        [5., 6., 7., 8.],
        [9., 10., 11., 12.],
        [13., 14., 15., 16.],
     ]])
>>> kornia.center_crop(input, (2, 4))
tensor([[[ 5.0000,  6.0000,  7.0000,  8.0000],
         [ 9.0000, 10.0000, 11.0000, 12.0000]]])

crop_and_resize(tensor: torch.Tensor, boxes: torch.Tensor, size: Tuple[int, int], interpolation: str = 'bilinear', align_corners: bool = False) → torch.Tensor[source]

Extracts crops from the input tensor and resizes them.

Parameters
  • tensor (torch.Tensor) – the reference tensor of shape BxCxHxW.

  • boxes (torch.Tensor) – a tensor containing the coordinates of the bounding boxes to be extracted. The tensor must have the shape of Bx4x2, where each box is defined in the following (clockwise) order: top-left, top-right, bottom-right and bottom-left. The coordinates must be in the x, y order.

  • size (Tuple[int, int]) – a tuple with the height and width that will be used to resize the extracted patches.

  • align_corners (bool) – mode for grid_generation. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details

Returns

tensor containing the patches with shape BxN1xN2

Return type

torch.Tensor

Example

>>> input = torch.tensor([[
        [1., 2., 3., 4.],
        [5., 6., 7., 8.],
        [9., 10., 11., 12.],
        [13., 14., 15., 16.],
    ]])
>>> boxes = torch.tensor([[
        [1., 1.],
        [2., 1.],
        [2., 2.],
        [1., 2.],
    ]])  # 1x4x2
>>> kornia.crop_and_resize(input, boxes, (2, 2))
tensor([[[ 6.0000,  7.0000],
         [10.0000, 11.0000]]])

pyrdown(input: torch.Tensor, border_type: str = 'reflect', align_corners: bool = False) → torch.Tensor[source]

Blurs a tensor and downsamples it.

See PyrDown for details.

pyrup(input: torch.Tensor, border_type: str = 'reflect', align_corners: bool = False) → torch.Tensor[source]

Upsamples a tensor and then blurs it.

See PyrUp for details.

build_pyramid(input: torch.Tensor, max_level: int, border_type: str = 'reflect', align_corners: bool = False) → List[torch.Tensor][source]

Constructs the Gaussian pyramid for an image.

The function constructs a list of images and builds the Gaussian pyramid by recursively applying pyrdown to the previously built pyramid layers.

Parameters
  • input (torch.Tensor) – the tensor to be used to construct the pyramid.

  • max_level (int) – 0-based index of the last (the smallest) pyramid layer. It must be non-negative.

  • border_type (str) – the padding mode to be applied before convolving. The expected modes are: 'constant', 'reflect', 'replicate' or 'circular'. Default: 'reflect'.

  • align_corners (bool) – interpolation flag. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Shape:
  • Input: \((B, C, H, W)\)

  • Output \([(B, NL, C, H, W), (B, NL, C, H/2, W/2), ...]\)

affine(tensor: torch.Tensor, matrix: torch.Tensor, mode: str = 'bilinear', align_corners: bool = False) → torch.Tensor[source]

Apply an affine transformation to the image.

Parameters
  • tensor (torch.Tensor) – The image tensor to be warped in shapes of \((H, W)\), \((D, H, W)\) and \((B, C, H, W)\).

  • matrix (torch.Tensor) – The 2x3 affine transformation matrix.

  • mode (str) – ‘bilinear’ | ‘nearest’

  • align_corners (bool) – interpolation flag. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Returns

The warped image.

Return type

torch.Tensor

rotate(tensor: torch.Tensor, angle: torch.Tensor, center: Union[None, torch.Tensor] = None, mode: str = 'bilinear', align_corners: bool = False) → torch.Tensor[source]

Rotate the image anti-clockwise about the centre.

See Rotate for details.

translate(tensor: torch.Tensor, translation: torch.Tensor, align_corners: bool = False) → torch.Tensor[source]

Translate the tensor in pixel units.

See Translate for details.

scale(tensor: torch.Tensor, scale_factor: torch.Tensor, center: Union[None, torch.Tensor] = None, align_corners: bool = False) → torch.Tensor[source]

Scales the input image.

See Scale for details.

shear(tensor: torch.Tensor, shear: torch.Tensor, align_corners: bool = False) → torch.Tensor[source]

Shear the tensor.

See Shear for details.

hflip(input: torch.Tensor) → torch.Tensor[source]

Horizontally flip a tensor image or a batch of tensor images. Input must be a tensor of shape (C, H, W) or a batch of tensors \((*, C, H, W)\).

Parameters

input (torch.Tensor) – input tensor

Returns

The horizontally flipped image tensor

Return type

torch.Tensor

vflip(input: torch.Tensor) → torch.Tensor[source]

Vertically flip a tensor image or a batch of tensor images. Input must be a tensor of shape (C, H, W) or a batch of tensors \((*, C, H, W)\).

Parameters

input (torch.Tensor) – input tensor

Returns

The vertically flipped image tensor

Return type

torch.Tensor

rot180(input: torch.Tensor) → torch.Tensor[source]

Rotate a tensor image or a batch of tensor images 180 degrees. Input must be a tensor of shape (C, H, W) or a batch of tensors \((*, C, H, W)\).

Parameters

input (torch.Tensor) – input tensor

Returns

The rotated image tensor

Return type

torch.Tensor
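
The three flips described above differ only in which axes are reversed. The sketch below shows this on one 2D channel stored as a nested list. The helper names are hypothetical and the code is independent of the library functions.

```python
def flip_horizontal(image):
    """Reverse the order of the columns in every row."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Reverse the order of the rows."""
    return [list(row) for row in image[::-1]]


def rotate_180(image):
    """A 180 degree rotation is a vertical flip followed by a horizontal flip."""
    return flip_horizontal(flip_vertical(image))


image = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0],
]
print(flip_horizontal(image))
print(flip_vertical(image))
print(rotate_180(image))
```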

resize(input: torch.Tensor, size: Union[int, Tuple[int, int]], interpolation: str = 'bilinear', align_corners: bool = False) → torch.Tensor[source]

Resize the input torch.Tensor to the given size.

See Resize for details.

class Rotate(angle: torch.Tensor, center: Union[None, torch.Tensor] = None, align_corners: bool = False)[source]

Rotate the tensor anti-clockwise about the centre.

Parameters
  • angle (torch.Tensor) – The angle through which to rotate. The tensor must have a shape of (B), where B is batch size.

  • center (torch.Tensor) – The center through which to rotate. The tensor must have a shape of (B, 2), where B is batch size and last dimension contains cx and cy.

  • align_corners (bool) – interpolation flag. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Returns

The rotated tensor.

Return type

torch.Tensor

class Translate(translation: torch.Tensor, align_corners: bool = False)[source]

Translate the tensor in pixel units.

Parameters
  • translation (torch.Tensor) – tensor containing the amount of pixels to translate in the x and y direction. The tensor must have a shape of (B, 2), where B is batch size, last dimension contains dx dy.

  • align_corners (bool) – interpolation flag. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Returns

The translated tensor.

Return type

torch.Tensor

class Scale(scale_factor: torch.Tensor, center: Union[None, torch.Tensor] = None, align_corners: bool = False)[source]

Scale the tensor by a factor.

Parameters
  • scale_factor (torch.Tensor) – The scale factor to apply. The tensor must have a shape of (B), where B is batch size.

  • center (torch.Tensor) – The center through which to scale. The tensor must have a shape of (B, 2), where B is batch size and last dimension contains cx and cy.

  • align_corners (bool) – interpolation flag. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Returns

The scaled tensor.

Return type

torch.Tensor

class Shear(shear: torch.Tensor, align_corners: bool = False)[source]

Shear the tensor.

Parameters
  • tensor (torch.Tensor) – The image tensor to be skewed.

  • shear (torch.Tensor) – tensor containing the angle to shear in the x and y direction. The tensor must have a shape of (B, 2), where B is batch size, last dimension contains shx shy.

  • align_corners (bool) – interpolation flag. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Returns

The skewed tensor.

Return type

torch.Tensor

class PyrDown(border_type: str = 'reflect', align_corners: bool = False)[source]

Blurs a tensor and downsamples it.

Parameters
  • border_type (str) – the padding mode to be applied before convolving. The expected modes are: 'constant', 'reflect', 'replicate' or 'circular'. Default: 'reflect'.

  • align_corners (bool) – interpolation flag. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Returns

the downsampled tensor.

Return type

torch.Tensor

Shape:
  • Input: \((B, C, H, W)\)

  • Output: \((B, C, H / 2, W / 2)\)

Examples

>>> input = torch.rand(1, 2, 4, 4)
>>> output = kornia.transform.PyrDown()(input)  # 1x2x2x2

class PyrUp(border_type: str = 'reflect', align_corners: bool = False)[source]

Upsamples a tensor and then blurs it.

Parameters
  • border_type (str) – the padding mode to be applied before convolving. The expected modes are: 'constant', 'reflect', 'replicate' or 'circular'. Default: 'reflect'.

  • align_corners (bool) – interpolation flag. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Returns

the upsampled tensor.

Return type

torch.Tensor

Shape:
  • Input: \((B, C, H, W)\)

  • Output: \((B, C, H * 2, W * 2)\)

Examples

>>> input = torch.rand(1, 2, 4, 4)
>>> output = kornia.transform.PyrUp()(input)  # 1x2x8x8

class ScalePyramid(n_levels: int = 3, init_sigma: float = 1.6, min_size: int = 15, double_image: bool = False)[source]

Creates a scale pyramid of an image, usually used for local feature detection. Images are consecutively smoothed with Gaussian blur and downscaled.

Parameters
  • n_levels (int) – number of levels in each octave.

  • init_sigma (float) – initial blur level.

  • min_size (int) – the minimum size of the octave in pixels. Default: 15.

  • double_image (bool) – add a 2x upscaled image as the first level of the pyramid. OpenCV SIFT does this. Default: False.

Returns

  • 1st output: images

  • 2nd output: sigmas (coefficients for scale conversion)

  • 3rd output: pixelDists (coefficients for coordinate conversion)

Return type

Tuple(List(Tensors), List(Tensors), List(Tensors))

Shape:
  • Input: \((B, C, H, W)\)

  • Output 1st: \([(B, C, NL, H, W), (B, C, NL, H/2, W/2), ...]\)

  • Output 2nd: \([(B, NL), (B, NL), (B, NL), ...]\)

  • Output 3rd: \([(B, NL), (B, NL), (B, NL), ...]\)

Examples

>>> input = torch.rand(2, 4, 100, 100)
>>> sp, sigmas, pds = kornia.ScalePyramid(3, 15)(input)

class Hflip[source]

Horizontally flip a tensor image or a batch of tensor images. Input must be a tensor of shape (C, H, W) or a batch of tensors \((*, C, H, W)\).

Parameters

input (torch.Tensor) – input tensor

Returns

The horizontally flipped image tensor

Return type

torch.Tensor

Examples

>>> input = torch.tensor([[[
    [0., 0., 0.],
    [0., 0., 0.],
    [0., 1., 1.]]]])
>>> kornia.hflip(input)
tensor([[[[0., 0., 0.],
          [0., 0., 0.],
          [1., 1., 0.]]]])

class Vflip[source]

Vertically flip a tensor image or a batch of tensor images. Input must be a tensor of shape (C, H, W) or a batch of tensors \((*, C, H, W)\).

Parameters

input (torch.Tensor) – input tensor

Returns

The vertically flipped image tensor

Return type

torch.Tensor

Examples

>>> input = torch.tensor([[[
    [0., 0., 0.],
    [0., 0., 0.],
    [0., 1., 1.]]]])
>>> kornia.vflip(input)
tensor([[[[0., 1., 1.],
          [0., 0., 0.],
          [0., 0., 0.]]]])

class Rot180[source]

Rotate a tensor image or a batch of tensor images 180 degrees. Input must be a tensor of shape (C, H, W) or a batch of tensors \((*, C, H, W)\).

Parameters

input (torch.Tensor) – input tensor

Examples

>>> input = torch.tensor([[[
    [0., 0., 0.],
    [0., 0., 0.],
    [0., 1., 1.]]]])
>>> kornia.rot180(input)
tensor([[[[1., 1., 0.],
          [0., 0., 0.],
          [0., 0., 0.]]]])

class Resize(size: Union[int, Tuple[int, int]], interpolation: str = 'bilinear', align_corners: bool = False)[source]

Resize the input torch.Tensor to the given size.

Parameters
  • size (int, tuple(int, int)) – Desired output size. If size is a sequence like (h, w), the output size will be matched to this. If size is an int, the smaller edge of the image will be matched to this number, i.e., if height > width, then the image will be rescaled to (size * height / width, size).

  • interpolation (str) – algorithm used for upsampling: ‘nearest’ | ‘linear’ | ‘bilinear’ | ‘bicubic’ | ‘trilinear’ | ‘area’. Default: ‘bilinear’.

  • align_corners (bool) – interpolation flag. Default: False. See https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate for details.

Returns

The resized tensor.

Return type

torch.Tensor
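
The rule for the size argument can be written out as a small helper. This is a sketch under assumptions, not the library code: resize_output_size is a hypothetical name, the branch for width >= height is inferred by symmetry from the documented height > width case, and truncation to int is an assumed rounding choice.

```python
def resize_output_size(height, width, size):
    """Return the (height, width) of the output for a given size argument."""
    if isinstance(size, int):
        if height > width:
            # Documented case: the smaller edge (width) is matched to size.
            return int(size * height / width), size
        # Assumed symmetric case: the smaller edge (height) is matched to size.
        return size, int(size * width / height)
    # A (h, w) sequence is used as the output size directly.
    return tuple(size)


print(resize_output_size(200, 100, 50))
print(resize_output_size(100, 200, 50))
print(resize_output_size(100, 200, (30, 40)))
```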