kornia.augmentation

The classes in this section perform various data augmentation operations.

Kornia provides Torchvision-like augmentation APIs, although it may not reproduce Torchvision results exactly, because Kornia aligns with OpenCV functionalities rather than PIL. Besides, Kornia uses pure floating-point computation, which guarantees better precision without any float -> uint8 conversions. Specifically, the functions that differ are:

  • AdjustContrast

  • AdjustBrightness

  • RandomRectangleErasing

For a detailed comparison, please check out the Colab: Kornia vs. Torchvision.

Containers

This is the base class for creating a new transform. The user only needs to override generate_parameters, apply_transform and, optionally, compute_transformation.

class AugmentationBase(return_transform: bool = False)[source]
generate_parameters(input_shape: torch.Size) → Dict[str, torch.Tensor][source]
compute_transformation(input: torch.Tensor, params: Dict[str, torch.Tensor]) → torch.Tensor[source]
apply_transform(input: torch.Tensor, params: Dict[str, torch.Tensor]) → torch.Tensor[source]

Create your own transformation:

import torch
import kornia as K

from kornia.augmentation import AugmentationBase

class MyRandomTransform(AugmentationBase):
   def __init__(self, angle: float = 1.0, return_transform: bool = False) -> None:
      super(MyRandomTransform, self).__init__(return_transform)
      # scaling applied to the randomly sampled angles
      self.angle = angle

   def generate_parameters(self, input_shape: torch.Size):
      # generate the random parameters for your use case.
      angles_rad: torch.Tensor = torch.rand(input_shape[0]) * K.pi
      angles_deg = K.rad2deg(angles_rad) * self.angle
      return dict(angles=angles_deg)

   def compute_transformation(self, input, params):
      # compute the rotation matrix from the sampled angles
      angles: torch.Tensor = params['angles'].type_as(input)
      height, width = input.shape[-2:]
      center = torch.tensor([[width / 2, height / 2]]).type_as(input)
      center = center.expand(angles.shape[0], -1)
      transform = K.get_rotation_matrix2d(
         center, angles, torch.ones_like(angles))
      return transform

   def apply_transform(self, input, params):
      # compute the transformation matrix
      transform = self.compute_transformation(input, params)

      # apply the transformation and return both image and matrix
      height, width = input.shape[-2:]
      output = K.warp_affine(input, transform, (height, width))
      return (output, transform)
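
A minimal usage sketch for the transform defined above (the batch and image sizes are illustrative):

input = torch.rand(2, 3, 32, 32)  # batch of two RGB images
aug = MyRandomTransform(angle=1.0, return_transform=True)
output, transform = aug(input)  # the per-sample transformation matrices come along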

Module

Kornia augmentation implementations can easily be used in a Torchvision-like style with nn.Sequential.

import kornia.augmentation as K
import torch.nn as nn

transform = nn.Sequential(
   K.RandomAffine(360),
   K.ColorJitter(0.2, 0.3, 0.2, 0.3)
)

Kornia augmentation implementations have two additional parameters compared to Torchvision: return_transform and same_on_batch. The former provides the ability to undo one geometry transformation, while the latter can be used to control the randomness of a batched transformation. To enable these behaviours, simply set the flags to True.

import torch.nn as nn

import kornia.augmentation as K

class MyAugmentationPipeline(nn.Module):
   def __init__(self) -> None:
      super(MyAugmentationPipeline, self).__init__()
      self.aff = K.RandomAffine(
         360, return_transform=True, same_on_batch=True
      )
      self.jit = K.ColorJitter(0.2, 0.3, 0.2, 0.3, same_on_batch=True)

   def forward(self, input):
      input, transform = self.aff(input)
      input, transform = self.jit((input, transform))
      return input, transform
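
Since the accumulated matrix describes the geometric part of the pipeline, it can be inverted to warp the result back; a sketch under the assumption that kornia.warp_perspective is available and the image size is known (the color jitter itself is not undone this way):

import torch
import kornia

pipe = MyAugmentationPipeline()
images = torch.rand(2, 3, 64, 64)
out, transform = pipe(images)

# undo the geometry by warping with the inverse 3x3 matrices
restored = kornia.warp_perspective(out, torch.inverse(transform), dsize=(64, 64))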

Example for semantic segmentation using low-level randomness control:

import torch.nn as nn

import kornia.augmentation as K

class MyAugmentationPipeline(nn.Module):
   def __init__(self) -> None:
      super(MyAugmentationPipeline, self).__init__()
      self.aff = K.RandomAffine(360)
      self.jit = K.ColorJitter(0.2, 0.3, 0.2, 0.3)

   def forward(self, input, mask):
      assert input.shape == mask.shape, (
         f"Input shape should be consistent with mask shape, "
         f"while got {input.shape}, {mask.shape}")

      aff_params = self.aff.generate_parameters(input.shape)
      input = self.aff(input, aff_params)
      mask = self.aff(mask, aff_params)

      jit_params = self.jit.generate_parameters(input.shape)
      input = self.jit(input, jit_params)
      mask = self.jit(mask, jit_params)
      return input, mask
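
A short usage sketch of the pipeline above (both tensors must share one shape to satisfy the assert; the values are illustrative):

import torch

pipe = MyAugmentationPipeline()
img = torch.rand(1, 3, 32, 32)
mask = torch.rand(1, 3, 32, 32)
img_t, mask_t = pipe(img, mask)  # identical random parameters are applied to both
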
class CenterCrop(size: Union[int, Tuple[int, int]], return_transform: bool = False)[source]

Crops the given torch.Tensor at the center.

Parameters
  • size (sequence or int) – Desired output size of the crop. If size is an int instead of sequence like (h, w), a square crop (size, size) is made.

  • return_transform (bool) – if True return the matrix describing the transformation applied to each input tensor. Default: False.

Examples

>>> rng = torch.manual_seed(0)
>>> inputs = torch.randn(1, 1, 3, 3)
>>> aug = CenterCrop(2)
>>> aug(inputs)
tensor([[[[-0.1425, -1.1266],
          [-0.0373, -0.6562]]]])
class ColorJitter(brightness: Union[torch.Tensor, float, Tuple[float, float], List[float]] = 0.0, contrast: Union[torch.Tensor, float, Tuple[float, float], List[float]] = 0.0, saturation: Union[torch.Tensor, float, Tuple[float, float], List[float]] = 0.0, hue: Union[torch.Tensor, float, Tuple[float, float], List[float]] = 0.0, return_transform: bool = False, same_on_batch: bool = False)[source]

Changes the brightness, contrast, saturation and hue of a given tensor image or a batch of tensor images randomly. Input should be a tensor of shape (C, H, W) or a batch of tensors \((*, C, H, W)\).

Parameters
  • brightness (float or tuple) – Default value is 0

  • contrast (float or tuple) – Default value is 0

  • saturation (float or tuple) – Default value is 0

  • hue (float or tuple) – Default value is 0

  • return_transform (bool) – if True return the matrix describing the transformation applied to each input tensor. If False and the input is a tuple, the applied transformation won't be concatenated.

  • same_on_batch (bool) – apply the same transformation across the batch. Default: False

Examples

>>> rng = torch.manual_seed(0)
>>> inputs = torch.ones(1, 3, 3, 3)
>>> aug = ColorJitter(0.1, 0.1, 0.1, 0.1)
>>> aug(inputs)
tensor([[[[0.9993, 0.9993, 0.9993],
          [0.9993, 0.9993, 0.9993],
          [0.9993, 0.9993, 0.9993]],
<BLANKLINE>
         [[0.9993, 0.9993, 0.9993],
          [0.9993, 0.9993, 0.9993],
          [0.9993, 0.9993, 0.9993]],
<BLANKLINE>
         [[0.9993, 0.9993, 0.9993],
          [0.9993, 0.9993, 0.9993],
          [0.9993, 0.9993, 0.9993]]]])
class Denormalize(mean: Union[torch.Tensor, float], std: Union[torch.Tensor, float])[source]

Denormalize a tensor image or a batch of tensor images.

Input must be a tensor of shape (C, H, W) or a batch of tensors \((*, C, H, W)\).

Given mean: (M1,...,Mn) and std: (S1,..,Sn) for n channels, this transform will denormalize each channel of the input torch.Tensor i.e. input[channel] = (input[channel] * std[channel]) + mean[channel]

Parameters
  • mean (torch.Tensor or float) – Mean for each channel.

  • std (torch.Tensor or float) – Standard deviation for each channel.
class Normalize(mean: Union[torch.Tensor, float], std: Union[torch.Tensor, float])[source]

Normalize a tensor image or a batch of tensor images with mean and standard deviation.

Input must be a tensor of shape (C, H, W) or a batch of tensors \((*, C, H, W)\).

Given mean: (M1,...,Mn) and std: (S1,..,Sn) for n channels, this transform will normalize each channel of the input torch.Tensor i.e. input[channel] = (input[channel] - mean[channel]) / std[channel]

Parameters
  • mean (torch.Tensor or float) – Mean for each channel.

  • std (torch.Tensor or float) – Standard deviation for each channel.
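
By construction, Denormalize inverts Normalize; a minimal sketch (the mean and std values are illustrative):

import torch
from kornia.augmentation import Denormalize, Normalize

x = torch.rand(1, 3, 4, 4)
mean = torch.tensor([0.5, 0.5, 0.5])
std = torch.tensor([0.2, 0.2, 0.2])

norm = Normalize(mean, std)
denorm = Denormalize(mean, std)

# (x - mean) / std followed by x * std + mean recovers the input
assert torch.allclose(denorm(norm(x)), x, atol=1e-6)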
class RandomAffine(degrees: Union[float, Tuple[float, float]], translate: Optional[Tuple[float, float]] = None, scale: Optional[Tuple[float, float]] = None, shear: Union[float, Tuple[float, float], None] = None, resample: Union[str, int, kornia.constants.Resample] = 'BILINEAR', return_transform: bool = False, same_on_batch: bool = False, align_corners: bool = False)[source]

Random affine transformation of the image keeping center invariant.

Parameters
  • degrees (float or tuple) – Range of degrees to select from. If degrees is a number instead of sequence like (min, max), the range of degrees will be (-degrees, +degrees). Set to 0 to deactivate rotations.

  • translate (tuple, optional) – tuple of maximum absolute fraction for horizontal and vertical translations. For example translate=(a, b), then horizontal shift is randomly sampled in the range -img_width * a < dx < img_width * a and vertical shift is randomly sampled in the range -img_height * b < dy < img_height * b. Will not translate by default.

  • scale (tuple, optional) – scaling factor interval, e.g (a, b), then scale is randomly sampled from the range a <= scale <= b. Will keep original scale by default.

  • shear (sequence or float, optional) – Range of degrees to select from. If shear is a number, a shear parallel to the x axis in the range (-shear, +shear) will be applied. Else if shear is a tuple or list of 2 values, a shear parallel to the x axis in the range (shear[0], shear[1]) will be applied. Else if shear is a tuple or list of 4 values, an x-axis shear in (shear[0], shear[1]) and a y-axis shear in (shear[2], shear[3]) will be applied. Will not apply shear by default.

  • resample (int, str or kornia.Resample) – Default: Resample.BILINEAR

  • return_transform (bool) – if True return the matrix describing the transformation applied to each input tensor. Default: False.

  • same_on_batch (bool) – apply the same transformation across the batch. Default: False

Examples

>>> rng = torch.manual_seed(0)
>>> input = torch.rand(1, 1, 3, 3)
>>> aug = RandomAffine((-15., 20.), return_transform=True)
>>> aug(input)
(tensor([[[[0.3961, 0.7310, 0.1574],
          [0.1781, 0.3074, 0.5648],
          [0.4804, 0.8379, 0.4234]]]]), tensor([[[ 0.9923, -0.1241,  0.1319],
         [ 0.1241,  0.9923, -0.1164],
         [ 0.0000,  0.0000,  1.0000]]]))
class RandomCrop(size: Tuple[int, int], padding: Union[int, Tuple[int, int], Tuple[int, int, int, int], None] = None, pad_if_needed: Optional[bool] = False, fill: int = 0, padding_mode: str = 'constant', return_transform: bool = False, same_on_batch: bool = False, align_corners: bool = False)[source]

Crops a random patch of the given size from the input tensor.

Parameters
  • size (tuple) – Desired output size of the crop, like (h, w).

  • padding (int or sequence, optional) – Optional padding on each border of the image. Default is None, i.e. no padding. If a sequence of length 4 is provided, it is used to pad left, top, right, bottom borders respectively. If a sequence of length 2 is provided, it is used to pad left/right, top/bottom borders, respectively.

  • pad_if_needed (boolean) – It will pad the image if smaller than the desired size to avoid raising an exception. Since cropping is done after padding, the padding seems to be done at a random offset.

  • fill – Pixel fill value for constant fill. Default is 0. If a tuple of length 3, it is used to fill R, G, B channels respectively. This value is only used when the padding_mode is constant

  • padding_mode – Type of padding. Should be: constant, edge, reflect or symmetric. Default is constant.

  • return_transform (bool) – if True return the matrix describing the transformation applied to each input tensor. If False and the input is a tuple, the applied transformation won't be concatenated.

  • same_on_batch (bool) – apply the same transformation across the batch. Default: False

Examples

>>> rng = torch.manual_seed(0)
>>> inputs = torch.randn(1, 1, 3, 3)
>>> aug = RandomCrop((2, 2))
>>> aug(inputs)
tensor([[[[-0.6562, -1.0009],
          [ 0.2223, -0.5507]]]])
class RandomErasing(p: float = 0.5, scale: Tuple[float, float] = (0.02, 0.33), ratio: Tuple[float, float] = (0.3, 3.3), value: float = 0.0, return_transform: bool = False, same_on_batch: bool = False)[source]

Erases a randomly selected rectangle for each image in the batch, filling it with the given value. The rectangle will have an area equal to the original image area multiplied by a value uniformly sampled from the range [scale[0], scale[1]) and an aspect ratio sampled from [ratio[0], ratio[1]).

Parameters
  • p (float) – probability that the random erasing operation will be performed.

  • scale (Tuple[float, float]) – range of proportion of erased area against input image.

  • ratio (Tuple[float, float]) – range of aspect ratio of erased area.

  • same_on_batch (bool) – apply the same transformation across the batch. Default: False

Examples

>>> rng = torch.manual_seed(0)
>>> inputs = torch.ones(1, 1, 3, 3)
>>> rec_er = RandomErasing(1.0, (.4, .8), (.3, 1/.3))
>>> rec_er(inputs)
tensor([[[[1., 0., 0.],
          [1., 0., 0.],
          [1., 0., 0.]]]])
class RandomGrayscale(p: float = 0.1, return_transform: bool = False, same_on_batch: bool = False)[source]

Applies a random grayscale transformation to the image with probability p.

Parameters
  • p (float) – probability of the image to be transformed to grayscale. Default value is 0.1

  • return_transform (bool) – if True return the matrix describing the transformation applied to each input tensor. If False and the input is a tuple, the applied transformation won't be concatenated.

  • same_on_batch (bool) – apply the same transformation across the batch. Default: False

Examples

>>> rng = torch.manual_seed(0)
>>> inputs = torch.randn((1, 3, 3, 3))
>>> rec_er = RandomGrayscale(p=1.0)
>>> rec_er(inputs)
tensor([[[[-1.1344, -0.1330,  0.1517],
          [-0.0791,  0.6711, -0.1413],
          [-0.1717, -0.9023,  0.0819]],
<BLANKLINE>
         [[-1.1344, -0.1330,  0.1517],
          [-0.0791,  0.6711, -0.1413],
          [-0.1717, -0.9023,  0.0819]],
<BLANKLINE>
         [[-1.1344, -0.1330,  0.1517],
          [-0.0791,  0.6711, -0.1413],
          [-0.1717, -0.9023,  0.0819]]]])
class RandomHorizontalFlip(p: float = 0.5, return_transform: bool = False, same_on_batch: bool = False, align_corners: bool = False)[source]

Horizontally flip a tensor image or a batch of tensor images randomly with a given probability. Input should be a tensor of shape (C, H, W) or a batch of tensors \((*, C, H, W)\). If the input is a tuple, it is assumed that the first element contains the aforementioned tensors and the second, the corresponding transformation matrix that has been applied to them. In this case the module will horizontally flip the tensors and concatenate the corresponding transformation matrix to the previous one. This is especially useful when using this functionality as part of an nn.Sequential module.

Parameters
  • p (float) – probability of the image being flipped. Default value is 0.5

  • return_transform (bool) – if True return the matrix describing the transformation applied to each input tensor. If False and the input is a tuple, the applied transformation won't be concatenated.

  • same_on_batch (bool) – apply the same transformation across the batch. Default: False

Examples

>>> input = torch.tensor([[[[0., 0., 0.],
...                         [0., 0., 0.],
...                         [0., 1., 1.]]]])
>>> seq = nn.Sequential(RandomHorizontalFlip(p=1.0, return_transform=True),
...                     RandomHorizontalFlip(p=1.0, return_transform=True))
>>> seq(input)
(tensor([[[[0., 0., 0.],
          [0., 0., 0.],
          [0., 1., 1.]]]]), tensor([[[1., 0., 0.],
         [0., 1., 0.],
         [0., 0., 1.]]]))
class RandomMotionBlur(kernel_size: Union[int, Tuple[int, int]], angle: Union[float, Tuple[float, float]], direction: Union[float, Tuple[float, float]], border_type: Union[int, str, kornia.constants.BorderType] = 'CONSTANT', return_transform: bool = False)[source]

Blurs a tensor image using the motion filter. The same transformation is applied across the whole batch.

Parameters
  • kernel_size (int or Tuple[int, int]) – motion kernel width and height (odd and positive). If int, the kernel will have a fixed size. If Tuple[int, int], it will randomly generate the value from the range.

  • angle (float or Tuple[float, float]) – angle of the motion blur in degrees (anti-clockwise rotation). If float, it will generate the value from (-angle, angle).

  • direction (float or Tuple[float, float]) – forward/backward direction of the motion blur. Lower values towards -1.0 will point the motion blur towards the back (with angle provided via angle), while higher values towards 1.0 will point the motion blur forward. A value of 0.0 leads to a uniformly (but still angled) motion blur. If float, it will generate the value from (-direction, direction). If Tuple[int, int], it will randomly generate the value from the range.

  • border_type (int, str or kornia.BorderType) – the padding mode to be applied before convolving. CONSTANT = 0, REFLECT = 1, REPLICATE = 2, CIRCULAR = 3. Default: BorderType.CONSTANT.

Shape:
  • Input: \((B, C, H, W)\)

  • Output: \((B, C, H, W)\)

Examples

>>> rng = torch.manual_seed(0)
>>> input = torch.rand(1, 1, 5, 5)
>>> motion_blur = RandomMotionBlur(3, 35., 0.5)
>>> motion_blur(input)
tensor([[[[0.2724, 0.5235, 0.3796, 0.2433, 0.2210],
          [0.3233, 0.5494, 0.5746, 0.5407, 0.3910],
          [0.2101, 0.3865, 0.3072, 0.2510, 0.1902],
          [0.2973, 0.6174, 0.6530, 0.4360, 0.2797],
          [0.3804, 0.6217, 0.5535, 0.4855, 0.4249]]]])
class RandomPerspective(distortion_scale: float = 0.5, p: float = 0.5, interpolation: Union[str, int, kornia.constants.Resample] = 'BILINEAR', return_transform: bool = False, same_on_batch: bool = False, align_corners: bool = False)[source]

Performs a random perspective transformation of the given torch.Tensor with a given probability.

Parameters
  • p (float) – probability of the image being perspectively transformed. Default value is 0.5

  • distortion_scale (float) – it controls the degree of distortion and ranges from 0 to 1. Default value is 0.5.

  • interpolation (int, str or kornia.Resample) – Default: Resample.BILINEAR

  • return_transform (bool) – if True return the matrix describing the transformation applied to each input tensor. Default: False.

  • same_on_batch (bool) – apply the same transformation across the batch. Default: False

Examples

>>> rng = torch.manual_seed(0)
>>> inputs= torch.tensor([[[[1., 0., 0.],
...                         [0., 1., 0.],
...                         [0., 0., 1.]]]])
>>> aug = RandomPerspective(0.5, 1.0)
>>> aug(inputs)
tensor([[[[0.0000, 0.2289, 0.0000],
          [0.0000, 0.4800, 0.0000],
          [0.0000, 0.0000, 0.0000]]]])
class RandomResizedCrop(size: Tuple[int, int], scale: Tuple[float, float] = (0.08, 1.0), ratio: Tuple[float, float] = (1.75, 1.3333333333333333), interpolation: Union[str, int, kornia.constants.Resample] = 'BILINEAR', return_transform: bool = False, same_on_batch: bool = False, align_corners: bool = False)[source]

Crops a random patch of the input and resizes it to the given size.

Parameters
  • size (Tuple[int, int]) – expected output size of each edge

  • scale – range of the size of the cropped region, relative to the original size

  • ratio – range of the aspect ratio of the cropped region

  • interpolation (int, str or kornia.Resample) – Default: Resample.BILINEAR

  • return_transform (bool) – if True return the matrix describing the transformation applied to each input tensor. If False and the input is a tuple, the applied transformation won't be concatenated.

  • same_on_batch (bool) – apply the same transformation across the batch. Default: False

Example

>>> rng = torch.manual_seed(0)
>>> inputs = torch.tensor([[[0., 1., 2.],
...                         [3., 4., 5.],
...                         [6., 7., 8.]]])
>>> aug = RandomResizedCrop(size=(3, 3), scale=(3., 3.), ratio=(2., 2.))
>>> aug(inputs)
tensor([[[[3.7500, 4.7500, 5.7500],
          [5.2500, 6.2500, 7.2500],
          [4.5000, 5.2500, 6.0000]]]])
class RandomRotation(degrees: Union[torch.Tensor, float, Tuple[float, float], List[float]], interpolation: Union[str, int, kornia.constants.Resample] = 'BILINEAR', return_transform: bool = False, same_on_batch: bool = False, align_corners: bool = False)[source]

Rotate a tensor image or a batch of tensor images by a random amount of degrees. Input should be a tensor of shape (C, H, W) or a batch of tensors \((*, C, H, W)\). If the input is a tuple, it is assumed that the first element contains the aforementioned tensors and the second, the corresponding transformation matrix that has been applied to them. In this case the module will rotate the tensors and concatenate the corresponding transformation matrix to the previous one. This is especially useful when using this functionality as part of an nn.Sequential module.

Parameters
  • degrees (sequence or float or tensor) – range of degrees to select from. If degrees is a number, the range of degrees to select from will be (-degrees, +degrees).

  • interpolation (int, str or kornia.Resample) – Default: Resample.BILINEAR

  • return_transform (bool) – if True return the matrix describing the transformation applied to each input tensor. If False and the input is a tuple, the applied transformation won't be concatenated.

  • same_on_batch (bool) – apply the same transformation across the batch. Default: False

Examples

>>> rng = torch.manual_seed(0)
>>> input = torch.tensor([[1., 0., 0., 2.],
...                       [0., 0., 0., 0.],
...                       [0., 1., 2., 0.],
...                       [0., 0., 1., 2.]])
>>> seq = RandomRotation(degrees=45.0, return_transform=True)
>>> seq(input)
(tensor([[[[0.9824, 0.0088, 0.0000, 1.9649],
          [0.0000, 0.0029, 0.0000, 0.0176],
          [0.0029, 1.0000, 1.9883, 0.0000],
          [0.0000, 0.0088, 1.0117, 1.9649]]]]), tensor([[[ 1.0000, -0.0059,  0.0088],
         [ 0.0059,  1.0000, -0.0088],
         [ 0.0000,  0.0000,  1.0000]]]))
class RandomVerticalFlip(p: float = 0.5, return_transform: bool = False, same_on_batch: bool = False)[source]

Vertically flip a tensor image or a batch of tensor images randomly with a given probability. Input should be a tensor of shape (C, H, W) or a batch of tensors \((*, C, H, W)\). If the input is a tuple, it is assumed that the first element contains the aforementioned tensors and the second, the corresponding transformation matrix that has been applied to them. In this case the module will vertically flip the tensors and concatenate the corresponding transformation matrix to the previous one. This is especially useful when using this functionality as part of an nn.Sequential module.

Parameters
  • p (float) – probability of the image being flipped. Default value is 0.5

  • return_transform (bool) – if True return the matrix describing the transformation applied to each input tensor. If False and the input is a tuple, the applied transformation won't be concatenated.

  • same_on_batch (bool) – apply the same transformation across the batch. Default: False

Examples

>>> input = torch.tensor([[[[0., 0., 0.],
...                         [0., 0., 0.],
...                         [0., 1., 1.]]]])
>>> seq = RandomVerticalFlip(p=1.0, return_transform=True)
>>> seq(input)
(tensor([[[[0., 1., 1.],
          [0., 0., 0.],
          [0., 0., 0.]]]]), tensor([[[ 1.,  0.,  0.],
         [ 0., -1.,  3.],
         [ 0.,  0.,  1.]]]))

Functional

apply_adjust_brightness(input: torch.Tensor, params: Dict[str, torch.Tensor]) → torch.Tensor[source]

Wrapper for adjust_brightness for Torchvision-like param settings.

Parameters
  • input (torch.Tensor) – Image/Input to be adjusted in the shape of (*, N).

  • params (Dict[str, torch.Tensor]) –

    • params[‘brightness_factor’]: Brightness adjust factor per element in the batch. 0 gives a black image, 1 does not modify the input image and 2 gives a white image, while any other number modifies the brightness.

Returns

Adjusted image.

Return type

torch.Tensor
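
A minimal call sketch (the import path kornia.augmentation.functional and the factor value are assumptions for illustration):

import torch
from kornia.augmentation.functional import apply_adjust_brightness

x = torch.ones(1, 3, 3, 3) * 0.5
params = {'brightness_factor': torch.tensor([1.2])}  # one factor per batch element
out = apply_adjust_brightness(x, params)  # same shape as the input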

apply_adjust_contrast(input: torch.Tensor, params: Dict[str, torch.Tensor]) → torch.Tensor[source]

Wrapper for adjust_contrast for Torchvision-like param settings.

Parameters
  • input (torch.Tensor) – Image to be adjusted in the shape of (*, N).

  • params (Dict[str, torch.Tensor]) –

    • params[‘contrast_factor’]: Contrast adjust factor per element in the batch. 0 generates a completely black image, 1 does not modify the input image, while any other non-negative number modifies the contrast by this factor.

Returns

Adjusted image.

Return type

torch.Tensor

apply_adjust_gamma(input: torch.Tensor, params: Dict[str, torch.Tensor]) → torch.Tensor[source]

Perform gamma correction on an image.

Parameters
  • input (torch.Tensor) – Image/Tensor to be adjusted in the shape of (*, N).

  • params (Dict[str, torch.Tensor]) –

    • params[‘gamma_factor’]: Non-negative real number, the same as \(\gamma\) in the gamma correction equation. A gamma larger than 1 makes the shadows darker, while a gamma smaller than 1 makes dark regions lighter.

Returns

Adjusted image.

Return type

torch.Tensor

apply_adjust_hue(input: torch.Tensor, params: Dict[str, torch.Tensor]) → torch.Tensor[source]

Wrapper for adjust_hue for Torchvision-like param settings.

Parameters
  • input (torch.Tensor) – Image/Tensor to be adjusted in the shape of (*, N).

  • params (Dict[str, torch.Tensor]) –

    • params[‘hue_factor’]: How much to shift the hue channel. Should be in [-0.5, 0.5]. 0.5 and -0.5 give a complete reversal of the hue channel in HSV space in the positive and negative direction respectively, and 0 means no shift. Therefore, both -0.5 and 0.5 will give an image with complementary colors while 0 gives the original image.

Returns

Adjusted image.

Return type

torch.Tensor

apply_adjust_saturation(input: torch.Tensor, params: Dict[str, torch.Tensor]) → torch.Tensor[source]

Wrapper for adjust_saturation for Torchvision-like param settings.

Parameters
  • input (torch.Tensor) – Image/Tensor to be adjusted in the shape of (*, N).

  • params (Dict[str, torch.Tensor]) –

    • params[‘saturation_factor’]: How much to adjust the saturation. 0 will give a black and white image, 1 will give the original image, while 2 will enhance the saturation by a factor of 2.

Returns

Adjusted image.

Return type

torch.Tensor

apply_affine(input: torch.Tensor, params: Dict[str, torch.Tensor]) → torch.Tensor[source]

Random affine transformation of the image keeping center invariant.

Parameters
  • input (torch.Tensor) – Tensor to be transformed with shape (H, W), (C, H, W), (*, C, H, W).

  • params (Dict[str, torch.Tensor]) –

    • params[‘angle’]: Degrees of rotation.

    • params[‘translations’]: Horizontal and vertical translations.

    • params[‘center’]: Rotation center.

    • params[‘scale’]: Scaling params.

    • params[‘sx’]: Shear param toward x-axis.

    • params[‘sy’]: Shear param toward y-axis.

    • params[‘resample’]: Integer tensor. NEAREST = 0, BILINEAR = 1.

    • params[‘align_corners’]: Boolean tensor.

Returns

The transformed input

Return type

torch.Tensor

apply_color_jitter(input: torch.Tensor, params: Dict[str, torch.Tensor]) → torch.Tensor[source]

Apply color jitter on a tensor image or a batch of tensor images with the given random parameters. Input should be a tensor of shape (H, W), (C, H, W) or a batch of tensors \((*, C, H, W)\).

Parameters
  • input (torch.Tensor) – Tensor to be transformed with shape (H, W), (C, H, W), (*, C, H, W).

  • params (Dict[str, torch.Tensor]) –

    • params[‘brightness_factor’]: The brightness factor.

    • params[‘contrast_factor’]: The contrast factor.

    • params[‘hue_factor’]: The hue factor.

    • params[‘saturation_factor’]: The saturation factor.

    • params[‘order’]: The order of applying color transforms.

    0 is brightness, 1 is contrast, 2 is saturation, 3 is hue.

Returns

The color-jittered input

Return type

torch.Tensor
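
A sketch of the expected params layout based on the list above (the import path and exact dtypes are assumptions):

import torch
from kornia.augmentation.functional import apply_color_jitter

x = torch.rand(1, 3, 4, 4)
params = {
    'brightness_factor': torch.tensor([1.1]),
    'contrast_factor': torch.tensor([1.0]),
    'hue_factor': torch.tensor([0.0]),
    'saturation_factor': torch.tensor([1.0]),
    'order': torch.tensor([0, 1, 2, 3]),  # brightness, contrast, saturation, hue
}
out = apply_color_jitter(x, params)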

apply_crop(input: torch.Tensor, params: Dict[str, torch.Tensor]) → torch.Tensor[source]

Apply cropping by src bounding box and dst bounding box. Order: top-left, top-right, bottom-right and bottom-left. The coordinates must be in the x, y order.

Parameters
  • input (torch.Tensor) – input image.

  • params (Dict[str, torch.Tensor]) –

    • params[‘src’]: The applied cropping src matrix \((*, 4, 2)\).

    • params[‘dst’]: The applied cropping dst matrix \((*, 4, 2)\).

    • params[‘interpolation’]: Integer tensor. NEAREST = 0, BILINEAR = 1.

    • params[‘align_corners’]: Boolean tensor.

Returns

The cropped input.

Return type

torch.Tensor

apply_erase_rectangles(input: torch.Tensor, params: Dict[str, torch.Tensor]) → torch.Tensor[source]

Generate a {0, 1} mask with drawn rectangles, whose parameters are defined by params and whose size is given by input.size().

Parameters
  • input (torch.Tensor) – input image.

  • params (Dict[str, torch.Tensor]) –

    • params[‘widths’]: widths tensor

    • params[‘heights’]: heights tensor

    • params[‘xs’]: x positions tensor

    • params[‘ys’]: y positions tensor

    • params[‘values’]: the value to fill in

Returns

Erased image.

Return type

torch.Tensor
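
A hedged sketch with one rectangle per image (the import path, tensor dtypes and key semantics are assumptions based on the list above):

import torch
from kornia.augmentation.functional import apply_erase_rectangles

x = torch.ones(1, 1, 5, 5)
params = {
    'widths': torch.tensor([2]),   # rectangle width per image
    'heights': torch.tensor([2]),  # rectangle height per image
    'xs': torch.tensor([1]),       # top-left x per image
    'ys': torch.tensor([1]),       # top-left y per image
    'values': torch.tensor([0.]),  # fill value per image
}
out = apply_erase_rectangles(x, params)  # a 2x2 block at (1, 1) is filled with 0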

apply_grayscale(input: torch.Tensor, params: Dict[str, torch.Tensor]) → torch.Tensor[source]

Apply grayscale transformation on a tensor image or a batch of tensor images with the given random parameters. Input should be a tensor of shape (3, H, W) or a batch of tensors \((*, 3, H, W)\).

Parameters
  • input (torch.Tensor) – Tensor to be transformed with shape (H, W), (C, H, W), (*, C, H, W).

  • params (Dict[str, torch.Tensor]) –

    • params[‘batch_prob’]: A boolean tensor indicating whether to transform an image in a batch.

Returns

The grayscaled input

Return type

torch.Tensor

apply_hflip(input: torch.Tensor, params: Dict[str, torch.Tensor]) → torch.Tensor[source]

Apply horizontal flip on a tensor image or a batch of tensor images with the given random parameters. Input should be a tensor of shape (H, W), (C, H, W) or a batch of tensors \((*, C, H, W)\).

Parameters
  • input (torch.Tensor) – Tensor to be transformed with shape (H, W), (C, H, W), (*, C, H, W).

  • params (Dict[str, torch.Tensor]) –

    • params[‘batch_prob’]: A boolean tensor indicating whether to transform an image in a batch.

Returns

The horizontally flipped input

Return type

torch.Tensor
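
A minimal sketch of the batch_prob mechanism (the import path is an assumption):

import torch
from kornia.augmentation.functional import apply_hflip

x = torch.rand(2, 3, 4, 4)
params = {'batch_prob': torch.tensor([True, False])}  # flip only the first image
out = apply_hflip(x, params)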

apply_motion_blur(input: torch.Tensor, params: Dict[str, torch.Tensor]) → torch.Tensor[source]

Perform motion blur on an image.

The input image is expected to be in the range of [0, 1].

Parameters
  • input (torch.Tensor) – Image/Tensor to be adjusted in the shape of (*, C, H, W).

  • params (Dict[str, torch.Tensor]) –

    • params[‘ksize_factor’]: motion kernel width and height (odd and positive).

    • params[‘angle_factor’]: angle of the motion blur in degrees (anti-clockwise rotation).

    • params[‘direction_factor’]: forward/backward direction of the motion blur. Lower values towards -1.0 will point the motion blur towards the back (with angle provided via angle), while higher values towards 1.0 will point the motion blur forward. A value of 0.0 leads to a uniformly (but still angled) motion blur.

    • params[‘border_type’]: the padding mode to be applied before convolving. CONSTANT = 0, REFLECT = 1, REPLICATE = 2, CIRCULAR = 3. Default: BorderType.CONSTANT.

Returns

Adjusted image with the same shape as the input \((*, C, H, W)\).

Return type

torch.Tensor

apply_perspective(input: torch.Tensor, params: Dict[str, torch.Tensor]) → torch.Tensor[source]

Perform perspective transform of the given torch.Tensor or batch of tensors.

Parameters
  • input (torch.Tensor) – Tensor to be transformed with shape (H, W), (C, H, W), (*, C, H, W).

  • params (Dict[str, torch.Tensor]) –

    • params[‘batch_prob’]: A boolean tensor indicating whether to transform an image in a batch.

    • params[‘start_points’]: Tensor containing [top-left, top-right, bottom-right, bottom-left] of the original image with shape Bx4x2.

    • params[‘end_points’]: Tensor containing [top-left, top-right, bottom-right, bottom-left] of the transformed image with shape Bx4x2.

    • params[‘interpolation’]: Integer tensor. NEAREST = 0, BILINEAR = 1.

    • params[‘align_corners’]: Boolean tensor.

Returns

Perspectively transformed tensor.

Return type

torch.Tensor

apply_rotation(input: torch.Tensor, params: Dict[str, torch.Tensor]) → torch.Tensor[source]

Rotate a tensor image or a batch of tensor images a random amount of degrees. Input should be a tensor of shape (C, H, W) or a batch of tensors \((*, C, H, W)\).

Parameters
Returns

The rotated input

Return type

torch.Tensor

apply_vflip(input: torch.Tensor, params: Dict[str, torch.Tensor]) → torch.Tensor[source]

Apply vertical flip on a tensor image or a batch of tensor images with the given random parameters. Input should be a tensor of shape (H, W), (C, H, W) or a batch of tensors \((*, C, H, W)\).

Parameters
  • input (torch.Tensor) – Tensor to be transformed with shape (H, W), (C, H, W), (*, C, H, W).

  • params (Dict[str, torch.Tensor]) –

    • params[‘batch_prob’]: A boolean tensor indicating whether to transform an image in a batch.

Returns

The vertically flipped input

Return type

torch.Tensor

color_jitter(input: torch.Tensor, brightness: Union[torch.Tensor, float, Tuple[float, float], List[float]] = 0.0, contrast: Union[torch.Tensor, float, Tuple[float, float], List[float]] = 0.0, saturation: Union[torch.Tensor, float, Tuple[float, float], List[float]] = 0.0, hue: Union[torch.Tensor, float, Tuple[float, float], List[float]] = 0.0, return_transform: bool = False) → Union[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]][source]

Generate params and apply operation on input tensor.

See random_color_jitter_generator() for details. See apply_color_jitter() for details.
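
A minimal sketch of this one-call wrapper (the import path and argument values are assumptions for illustration):

import torch
from kornia.augmentation.functional import color_jitter

x = torch.rand(1, 3, 4, 4)
# samples the factors internally, then applies them
out = color_jitter(x, brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1)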

compute_affine_transformation(input: torch.Tensor, params: Dict[str, torch.Tensor]) → torch.Tensor[source]

Compute the applied transformation matrix \((*, 3, 3)\).

Parameters
  • input (torch.Tensor) – Tensor to be transformed with shape (H, W), (C, H, W), (*, C, H, W).

  • params (Dict[str, torch.Tensor]) –

    • params[‘angle’]: Degrees of rotation.

    • params[‘translations’]: Horizontal and vertical translations.

    • params[‘center’]: Rotation center.

    • params[‘scale’]: Scaling params.

    • params[‘sx’]: Shear param toward x-axis.

    • params[‘sy’]: Shear param toward y-axis.

    • params[‘resample’]: Integer tensor. NEAREST = 0, BILINEAR = 1.

    • params[‘align_corners’]: Boolean tensor.

Returns

The applied transformation matrix \((*, 3, 3)\)

Return type

torch.Tensor

compute_crop_transformation(input: torch.Tensor, params: Dict[str, torch.Tensor])[source]

Compute the applied transformation matrix \((*, 3, 3)\).

Parameters
  • input (torch.Tensor) – input image.

  • params (Dict[str, torch.Tensor]) –

    • params[‘src’]: The applied cropping src matrix \((*, 4, 2)\).

    • params[‘dst’]: The applied cropping dst matrix \((*, 4, 2)\).

Returns

The applied transformation matrix \((*, 3, 3)\)

Return type

torch.Tensor

compute_hflip_transformation(input: torch.Tensor, params: Dict[str, torch.Tensor]) → torch.Tensor[source]

Compute the applied transformation matrix \((*, 3, 3)\).

Parameters
  • input (torch.Tensor) – Tensor to be transformed with shape (H, W), (C, H, W), (*, C, H, W).

  • params (Dict[str, torch.Tensor]) –

    • params[‘batch_prob’]: A boolean tensor indicating whether to transform an image in a batch.

Returns

The applied transformation matrix \((*, 3, 3)\)

Return type

torch.Tensor
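
A sketch of retrieving the matrices without applying the flip (the import path is an assumption):

import torch
from kornia.augmentation.functional import compute_hflip_transformation

x = torch.rand(2, 3, 4, 4)
params = {'batch_prob': torch.tensor([True, False])}
mats = compute_hflip_transformation(x, params)  # (B, 3, 3); identity where not flipped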

compute_intensity_transformation(input: torch.Tensor, params: Dict[str, torch.Tensor])[source]

Compute the applied transformation matrix \((*, 3, 3)\).

Parameters
  • input (torch.Tensor) – Tensor to be transformed with shape (H, W), (C, H, W), (*, C, H, W).

  • params (Dict[str, torch.Tensor]) –

    • params[‘batch_prob’]: A boolean tensor indicating whether to transform an image in a batch.

Returns

The applied transformation matrix \((*, 3, 3)\). Returns identity transformations.

Return type

torch.Tensor

compute_perspective_transformation(input: torch.Tensor, params: Dict[str, torch.Tensor]) → torch.Tensor[source]

Compute the applied transformation matrix \((*, 3, 3)\).

Parameters
  • input (torch.Tensor) – Tensor to be transformed with shape (H, W), (C, H, W), (*, C, H, W).

  • params (Dict[str, torch.Tensor]) –

    • params[‘batch_prob’]: A boolean tensor indicating whether to transform an image in a batch.

    • params[‘start_points’]: Tensor containing [top-left, top-right, bottom-right, bottom-left] of the original image with shape Bx4x2.

    • params[‘end_points’]: Tensor containing [top-left, top-right, bottom-right, bottom-left] of the transformed image with shape Bx4x2.

Returns

The applied transformation matrix \((*, 3, 3)\)

Return type

torch.Tensor

compute_rotate_tranformation(input: torch.Tensor, params: Dict[str, torch.Tensor])[source]

Compute the applied transformation matrix \((*, 3, 3)\).

Parameters
Returns

The applied transformation matrix \((*, 3, 3)\)

Return type

torch.Tensor

compute_vflip_transformation(input: torch.Tensor, params: Dict[str, torch.Tensor]) → torch.Tensor[source]

Compute the applied transformation matrix \((*, 3, 3)\).

Parameters
  • input (torch.Tensor) – Tensor to be transformed with shape (H, W), (C, H, W), (*, C, H, W).

  • params (Dict[str, torch.Tensor]) –

    • params[‘batch_prob’]: A boolean tensor indicating whether to transform an image in a batch.

Returns

The applied transformation matrix \((*, 3, 3)\)

Return type

torch.Tensor

random_affine(input: torch.Tensor, degrees: Union[float, Tuple[float, float]], translate: Optional[Tuple[float, float]] = None, scale: Optional[Tuple[float, float]] = None, shear: Union[float, Tuple[float, float], None] = None, resample: Union[str, int, kornia.constants.Resample] = 'BILINEAR', return_transform: bool = False) → Union[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]][source]

Generate params and apply operation on input tensor.

See random_affine_generator() for details. See apply_affine() for details.

random_grayscale(input: torch.Tensor, p: float = 0.5, return_transform: bool = False)[source]

Generate params and apply operation on input tensor.

See random_prob_generator() for details. See apply_grayscale() for details.

random_hflip(input: torch.Tensor, p: float = 0.5, return_transform: bool = False) → Union[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]][source]

Generate params and apply operation on input tensor.

See random_prob_generator() for details. See apply_hflip() for details.
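
A minimal sketch (the import path is an assumption; p=1.0 forces the flip so the returned matrix is non-trivial):

import torch
from kornia.augmentation.functional import random_hflip

x = torch.rand(1, 1, 3, 3)
out, mat = random_hflip(x, p=1.0, return_transform=True)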

random_perspective(input: torch.Tensor, distortion_scale: float = 0.5, p: float = 0.5, return_transform: bool = False) → Union[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]][source]

Generate params and apply operation on input tensor.

See random_perspective_generator() for details. See apply_perspective() for details.

random_rectangle_erase(input: torch.Tensor, p: float = 0.5, scale: Tuple[float, float] = (0.02, 0.33), ratio: Tuple[float, float] = (0.3, 3.3), return_transform: bool = False) → Union[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]][source]

Function that erases a randomly selected rectangle for each image in the batch, setting its values to zero. The rectangle will have an area equal to the original image area multiplied by a value uniformly sampled from the range [scale[0], scale[1]) and an aspect ratio sampled from [ratio[0], ratio[1]).

Parameters
  • input (torch.Tensor) – input images.

  • scale (Tuple[float, float]) – range of proportion of erased area against input image.

  • ratio (Tuple[float, float]) – range of aspect ratio of erased area.

See random_rectangles_params_generator() for details. See apply_erase_rectangles() for details.

random_rotation(input: torch.Tensor, degrees: Union[torch.Tensor, float, Tuple[float, float], List[float]], return_transform: bool = False) → Union[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]][source]

Generate params and apply operation on input tensor.

See random_rotation_generator() for details. See apply_rotation() for details.

random_vflip(input: torch.Tensor, p: float = 0.5, return_transform: bool = False) → Union[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]][source]

Generate params and apply operation on input tensor.

See random_prob_generator() for details. See apply_vflip() for details.