Image Augmentations

Transforms2D

Set of operators to perform data augmentation on 2D image tensors.

Intensity

class kornia.augmentation.ColorJiggle(brightness=0.0, contrast=0.0, saturation=0.0, hue=0.0, same_on_batch=False, p=1.0, keepdim=False)

Apply a random transformation to the brightness, contrast, saturation and hue of a torch.Tensor image.

Parameters:

p (float, optional) – probability of applying the transformation. Default: 1.0
brightness (Union[Tensor, float, Tuple[float, float], List[float]], optional) – The brightness factor to apply. Default: 0.0
contrast (Union[Tensor, float, Tuple[float, float], List[float]], optional) – The contrast factor to apply. Default: 0.0
saturation (Union[Tensor, float, Tuple[float, float], List[float]], optional) – The saturation factor to apply. Default: 0.0
hue (Union[Tensor, float, Tuple[float, float], List[float]], optional) – The hue factor to apply. Default: 0.0
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, H, W)\)

Note

This function internally uses kornia.enhance.adjust_brightness(), kornia.enhance.adjust_contrast(), kornia.enhance.adjust_saturation(), kornia.enhance.adjust_hue().

Examples

>>> rng = torch.manual_seed(0)
>>> inputs = torch.ones(1, 3, 3, 3)
>>> aug = ColorJiggle(0.1, 0.1, 0.1, 0.1, p=1.)
>>> aug(inputs)
tensor([[[[0.9993, 0.9993, 0.9993],
          [0.9993, 0.9993, 0.9993],
          [0.9993, 0.9993, 0.9993]],

         [[0.9993, 0.9993, 0.9993],
          [0.9993, 0.9993, 0.9993],
          [0.9993, 0.9993, 0.9993]],

         [[0.9993, 0.9993, 0.9993],
          [0.9993, 0.9993, 0.9993],
          [0.9993, 0.9993, 0.9993]]]])

To apply the exact augmentation again, you can reuse the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = ColorJiggle(0.1, 0.1, 0.1, 0.1, p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)
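
The replay pattern in the doctest above can be sketched without Kornia. This is a minimal plain-Python illustration, not Kornia's implementation: `ToyRandomBrightness` and its `shift` parameter are hypothetical names, and only the `_params` attribute and the `params=` keyword mirror the doctest.

```python
import random


class ToyRandomBrightness:
    """Toy augmentation that remembers the parameters of its last sampled call."""

    def __init__(self, low, high):
        self.low, self.high = low, high
        self._params = None  # set whenever a call samples new parameters

    def __call__(self, pixels, params=None):
        if params is None:
            # Fresh call: draw a new random shift and remember it.
            params = {"shift": random.uniform(self.low, self.high)}
            self._params = params
        # Apply whichever parameters are in effect (new or replayed).
        return [p + params["shift"] for p in pixels]


aug = ToyRandomBrightness(-0.1, 0.1)
pixels = [0.2, 0.5, 0.8]
first = aug(pixels)                          # samples and stores parameters
replayed = aug(pixels, params=aug._params)   # reuses them, no new sampling
print(first == replayed)
```

Because the second call skips sampling, both calls apply the same shift and the outputs match element for element.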

class kornia.augmentation.ColorJitter(brightness=0.0, contrast=0.0, saturation=0.0, hue=0.0, same_on_batch=False, p=1.0, keepdim=False, order=None)

Apply a random transformation to the brightness, contrast, saturation and hue of a torch.Tensor image.

This implementation aligns with PIL, so its output is close to TorchVision's. However, it does not follow color theory and is not actively maintained. Prefer kornia.augmentation.ColorJiggle().

Parameters:

brightness (Union[Tensor, float, Tuple[float, float], List[float]], optional) – The brightness factor to apply. Default: 0.0
contrast (Union[Tensor, float, Tuple[float, float], List[float]], optional) – The contrast factor to apply. Default: 0.0
saturation (Union[Tensor, float, Tuple[float, float], List[float]], optional) – The saturation factor to apply. Default: 0.0
hue (Union[Tensor, float, Tuple[float, float], List[float]], optional) – The hue factor to apply. Default: 0.0
silence_instantiation_warning – if True, silence the warning at instantiation.
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
p (float, optional) – probability of applying the transformation. Default: 1.0
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, H, W)\)

Note

This function internally uses kornia.enhance.adjust_brightness_accumulative(), kornia.enhance.adjust_contrast_with_mean_subtraction(), kornia.enhance.adjust_saturation_with_gray_subtraction(), kornia.enhance.adjust_hue().

Examples

>>> rng = torch.manual_seed(0)
>>> inputs = torch.ones(1, 3, 3, 3)
>>> aug = ColorJitter(0.1, 0.1, 0.1, 0.1, p=1.)
>>> aug(inputs)
tensor([[[[0.9993, 0.9993, 0.9993],
          [0.9993, 0.9993, 0.9993],
          [0.9993, 0.9993, 0.9993]],

         [[0.9993, 0.9993, 0.9993],
          [0.9993, 0.9993, 0.9993],
          [0.9993, 0.9993, 0.9993]],

         [[0.9993, 0.9993, 0.9993],
          [0.9993, 0.9993, 0.9993],
          [0.9993, 0.9993, 0.9993]]]])

To apply the exact augmentation again, you can reuse the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = ColorJitter(0.1, 0.1, 0.1, 0.1, p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomAutoContrast(clip_output=True, same_on_batch=False, p=1.0, keepdim=False)

Apply random auto-contrast to a torch.Tensor image.

Parameters:

p (float, optional) – probability of applying the transformation. Default: 1.0
clip_output (bool, optional) – if True, clip the output. Default: True
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\)
Output: \((B, C, H, W)\)

Note

This function internally uses kornia.enhance.normalize_min_max()
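
Min-max normalization, the operation named in the note above, stretches values so the smallest maps to a lower bound and the largest to an upper bound. The following is a plain-Python sketch of that idea, not the Kornia function: the helper name `min_max_stretch`, its defaults, and the `eps` guard against division by zero are assumptions made for illustration.

```python
def min_max_stretch(values, min_val=0.0, max_val=1.0, eps=1e-6):
    """Linearly rescale values so they span roughly [min_val, max_val]."""
    lo, hi = min(values), max(values)
    # eps keeps the division finite for a constant input (hi == lo).
    scale = (max_val - min_val) / (hi - lo + eps)
    return [(v - lo) * scale + min_val for v in values]


# A low-contrast signal occupying only [0.2, 0.4] is stretched to about [0, 1].
stretched = min_max_stretch([0.2, 0.3, 0.4])
print([round(v, 3) for v in stretched])
```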

class kornia.augmentation.RandomBoxBlur(kernel_size=(3, 3), border_type='reflect', normalized=True, same_on_batch=False, p=0.5, keepdim=False)

Add random blur with a box filter to an image tensor.

Parameters:

kernel_size (Tuple[int, int], optional) – the blurring kernel size. Default: (3, 3)
border_type (str, optional) – the padding mode to be applied before convolving. The expected modes are: constant, reflect, replicate or circular. Default: "reflect"
normalized (bool, optional) – if True, L1 norm of the kernel is set to 1. Default: True
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
p (float, optional) – probability of applying the transformation. Default: 0.5
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Note

This function internally uses kornia.filters.box_blur().

Examples

>>> img = torch.ones(1, 1, 24, 24)
>>> out = RandomBoxBlur((7, 7))(img)
>>> out.shape
torch.Size([1, 1, 24, 24])

To apply the exact augmentation again, you can reuse the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = RandomBoxBlur((7, 7), p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)
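
The `normalized` parameter above is described as setting the L1 norm of the kernel to 1. A minimal plain-Python sketch of that normalization step follows; the helper `l1_normalize` is a hypothetical name, and the all-ones starting kernel is an illustrative assumption rather than a statement about Kornia's internals.

```python
def l1_normalize(kernel):
    """Scale a 2D kernel so the absolute values of its entries sum to 1."""
    norm = sum(abs(w) for row in kernel for w in row)
    return [[w / norm for w in row] for row in kernel]


# A 3x3 kernel of ones, used here only as an illustrative starting point.
ones = [[1.0] * 3 for _ in range(3)]
kernel = l1_normalize(ones)
total = sum(abs(w) for row in kernel for w in row)
print(round(total, 6))
```

With unit L1 norm, each output pixel is a plain average of its neighborhood, so blurring a constant image leaves its values unchanged away from border effects.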

class kornia.augmentation.RandomBrightness(brightness=(1.0, 1.0), clip_output=True, same_on_batch=False, p=1.0, keepdim=False)

Apply a random transformation to the brightness of a torch.Tensor image.

This implementation aligns with PIL, so the output is close to TorchVision's.

Parameters:

brightness (Tuple[float, float], optional) – the brightness factor to apply. Default: (1.0, 1.0)
clip_output (bool, optional) – if True, clip the output. Default: True
silence_instantiation_warning – if True, silence the warning at instantiation.
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
p (float, optional) – probability of applying the transformation. Default: 1.0
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, H, W)\)

Note

This function internally uses kornia.enhance.adjust_brightness()

Examples

>>> rng = torch.manual_seed(0)
>>> inputs = torch.rand(1, 3, 3, 3)
>>> aug = RandomBrightness(brightness = (0.5,2.),p=1.)
>>> aug(inputs)
tensor([[[[0.0505, 0.3225, 0.0000],
          [0.0000, 0.0000, 0.1883],
          [0.0443, 0.4507, 0.0099]],

         [[0.1866, 0.0000, 0.0000],
          [0.0000, 0.0000, 0.0000],
          [0.0728, 0.2519, 0.3543]],

         [[0.0000, 0.0000, 0.2359],
          [0.4694, 0.0000, 0.4284],
          [0.0000, 0.1072, 0.5070]]]])

To apply the exact augmentation again, you can reuse the previous parameter state:

>>> input = torch.rand(1, 3, 32, 32)
>>> aug = RandomBrightness((0.8,1.2), p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomChannelDropout(num_drop_channels=1, fill_value=0.0, same_on_batch=False, p=0.5, keepdim=False)

Apply random channel dropout to a batch of images.

Parameters:

num_drop_channels (int, optional) – number of channels to drop randomly. Default: 1
fill_value (float, optional) – value to fill the dropped channels with. Default: 0.0
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
p (float, optional) – probability of applying the transformation. Default: 0.5
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\)
Output: \((C, H, W)\) or \((B, C, H, W)\)

Note

If num_drop_channels is set to 1, one randomly chosen channel is dropped for each image in the batch.
If num_drop_channels is set to 2, two randomly chosen channels are dropped for each image in the batch.
If num_drop_channels is set to 3, three randomly chosen channels are dropped for each image in the batch (the whole image, for a 3-channel input).

Examples

>>> rng = torch.manual_seed(1)
>>> img = torch.ones(1, 3, 3, 3)
>>> aug = RandomChannelDropout(num_drop_channels=1, fill_value=0.0, p=1.0)
>>> aug(img)
tensor([[[[1., 1., 1.],
          [1., 1., 1.],
          [1., 1., 1.]],

         [[0., 0., 0.],
          [0., 0., 0.],
          [0., 0., 0.]],

         [[1., 1., 1.],
          [1., 1., 1.],
          [1., 1., 1.]]]])

To apply the exact augmentation again, you can reuse the previous parameter state:

>>> input = torch.rand(1, 3, 32, 32)
>>> aug = RandomChannelDropout(num_drop_channels=1, fill_value=0.0, p=1.0)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)
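
The per-image behaviour described in the note above (pick `num_drop_channels` channels at random and overwrite them with `fill_value`) can be sketched in plain Python. This is an illustration under assumptions, not Kornia's code: the helper `drop_channels`, the nested-list image layout, and the `rng` argument are hypothetical.

```python
import random


def drop_channels(image, num_drop_channels=1, fill_value=0.0, rng=random):
    """image is a list of channels; each channel is a list of rows."""
    # Choose distinct channel indices to drop for this image.
    dropped = rng.sample(range(len(image)), num_drop_channels)
    out = []
    for c, channel in enumerate(image):
        if c in dropped:
            # Overwrite every pixel of a dropped channel with fill_value.
            out.append([[fill_value] * len(row) for row in channel])
        else:
            # Copy untouched channels so the input is not modified.
            out.append([row[:] for row in channel])
    return out, dropped


image = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(3)]  # 3 channels, 2x2 each
out, dropped = drop_channels(image, num_drop_channels=2, rng=random.Random(0))
print(sorted(dropped))
```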

class kornia.augmentation.RandomChannelShuffle(same_on_batch=False, p=0.5, keepdim=False)

Shuffle the channels of a batch of multi-dimensional images.

Parameters:

same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
p (float, optional) – probability of applying the transformation. Default: 0.5
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Examples

>>> rng = torch.manual_seed(0)
>>> img = torch.arange(1*2*2*2.).view(1,2,2,2)
>>> RandomChannelShuffle()(img)
tensor([[[[4., 5.],
          [6., 7.]],

         [[0., 1.],
          [2., 3.]]]])

To apply the exact augmentation again, you can reuse the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = RandomChannelShuffle(p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomClahe(clip_limit=(40.0, 40.0), grid_size=(8, 8), slow_and_differentiable=False, same_on_batch=False, p=0.5, keepdim=False)

Apply CLAHE equalization on the input torch.Tensor randomly.

Parameters:

clip_limit (tuple[float, float], optional) – threshold value for contrast limiting. If 0 clipping is disabled. Default: (40.0, 40.0)
grid_size (tuple[int, int], optional) – number of tiles to be cropped in each direction (GH, GW). Default: (8, 8)
slow_and_differentiable (bool, optional) – flag to select the implementation. Default: False
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
p (float, optional) – probability of applying the transformation. Default: 0.5
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Note

This function internally uses kornia.enhance.equalize_clahe().

Examples

>>> img = torch.rand(1, 10, 20)
>>> aug = RandomClahe()
>>> res = aug(img)
>>> res.shape
torch.Size([1, 1, 10, 20])

>>> img = torch.rand(2, 3, 10, 20)
>>> aug = RandomClahe()
>>> res = aug(img)
>>> res.shape
torch.Size([2, 3, 10, 20])

To apply the exact augmentation again, you can reuse the previous parameter state:

>>> input = torch.rand(1, 3, 32, 32)
>>> aug = RandomClahe(p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomContrast(contrast=(1.0, 1.0), clip_output=True, same_on_batch=False, p=1.0, keepdim=False)

Apply a random transformation to the contrast of a torch.Tensor image.

This implementation aligns with PIL, so the output is close to TorchVision's.

Parameters:

contrast (Tuple[float, float], optional) – the contrast factor to apply. Default: (1.0, 1.0)
clip_output (bool, optional) – if True, clip the output. Default: True
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
p (float, optional) – probability of applying the transformation. Default: 1.0
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, H, W)\)

Note

This function internally uses kornia.enhance.adjust_contrast()

Examples

>>> rng = torch.manual_seed(0)
>>> inputs = torch.rand(1, 3, 3, 3)
>>> aug = RandomContrast(contrast = (0.5, 2.), p = 1.)
>>> aug(inputs)
tensor([[[[0.2750, 0.4258, 0.0490],
          [0.0732, 0.1704, 0.3514],
          [0.2716, 0.4969, 0.2525]],

         [[0.3505, 0.1934, 0.2227],
          [0.0124, 0.0936, 0.1629],
          [0.2874, 0.3867, 0.4434]],

         [[0.0893, 0.1564, 0.3778],
          [0.5072, 0.2201, 0.4845],
          [0.2325, 0.3064, 0.5281]]]])

To apply the exact augmentation again, you can reuse the previous parameter state:

>>> input = torch.rand(1, 3, 32, 32)
>>> aug = RandomContrast((0.8,1.2), p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomEqualize(same_on_batch=False, p=0.5, keepdim=False)

Equalize given tensor image or a batch of tensor images randomly.

Parameters:

p (float, optional) – Probability to equalize an image. Default: 0.5
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, H, W)\)

Note

This function internally uses kornia.enhance.equalize().

Examples

>>> rng = torch.manual_seed(0)
>>> input = torch.rand(1, 1, 5, 5)
>>> equalize = RandomEqualize(p=1.)
>>> equalize(input)
tensor([[[[0.4963, 0.7682, 0.0885, 0.1320, 0.3074],
          [0.6341, 0.4901, 0.8964, 0.4556, 0.6323],
          [0.3489, 0.4017, 0.0223, 0.1689, 0.2939],
          [0.5185, 0.6977, 0.8000, 0.1610, 0.2823],
          [0.6816, 0.9152, 0.3971, 0.8742, 0.4194]]]])

To apply the exact augmentation again, you can reuse the previous parameter state:

>>> input = torch.rand(1, 3, 32, 32)
>>> aug = RandomEqualize(p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomDissolving(step_range=(100, 500), version='1.5', p=0.5, keepdim=False, **kwargs)

Perform dissolving transformation using StableDiffusion models.

Based on [SZZ+24], the dissolving transformation is essentially applying one-step reverse diffusion. Our implementation currently supports HuggingFace implementations of SD 1.4, 1.5 and 2.1. SD 1.X tends to remove more details than SD 2.1.

Example outputs by model version:
SD 1.4: https://raw.githubusercontent.com/kornia/data/main/dslv-sd-1.4.png
SD 1.5: https://raw.githubusercontent.com/kornia/data/main/dslv-sd-1.5.png
SD 2.1: https://raw.githubusercontent.com/kornia/data/main/dslv-sd-2.1.png

Parameters:

p (float, optional) – probability of applying the transformation. Default: 0.5
version (str, optional) – the version of the stable diffusion model. Default: "1.5"
step_range (Tuple[float, float], optional) – the range of diffusion model steps. The higher the step, the stronger the dissolving effect. Default: (100, 500)
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False
**kwargs (Any) – additional arguments for .from_pretrained for HF StableDiffusionPipeline.

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\).
Output: \((B, C, H, W)\)

class kornia.augmentation.RandomGamma(gamma=(1.0, 1.0), gain=(1.0, 1.0), same_on_batch=False, p=1.0, keepdim=False)

Apply a random transformation to the gamma of a torch.Tensor image.

This implementation aligns with PIL, so the output is close to TorchVision's.

Parameters:

p (float, optional) – probability of applying the transformation. Default: 1.0
gamma (Tuple[float, float], optional) – the gamma factor to apply. Default: (1.0, 1.0)
gain (Tuple[float, float], optional) – the gain factor to apply. Default: (1.0, 1.0)
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, H, W)\)

Note

This function internally uses kornia.enhance.adjust_gamma()

Examples

>>> rng = torch.manual_seed(0)
>>> inputs = torch.rand(1, 3, 3, 3)
>>> aug = RandomGamma((0.5,2.),(1.5,1.5),p=1.)
>>> aug(inputs)
tensor([[[[1.0000, 1.0000, 0.3912],
          [0.4883, 0.7801, 1.0000],
          [1.0000, 1.0000, 0.9702]],

         [[1.0000, 0.8368, 0.9048],
          [0.1824, 0.5597, 0.7609],
          [1.0000, 1.0000, 1.0000]],

         [[0.5452, 0.7441, 1.0000],
          [1.0000, 0.8990, 1.0000],
          [0.9267, 1.0000, 1.0000]]]])

To apply the exact augmentation again, you can reuse the previous parameter state:

>>> input = torch.rand(1, 3, 32, 32)
>>> aug = RandomGamma((0.8,1.2), p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)
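
The roles of `gamma` and `gain` can be illustrated with the usual gamma-correction formula, out = gain * in ** gamma, followed by clipping. The sketch below is plain Python and assumes pixel values in [0, 1] and clipping to that range (consistent with the values saturating at 1.0000 in the first example above); `toy_adjust_gamma` is a hypothetical helper, not the Kornia function.

```python
def toy_adjust_gamma(pixels, gamma=1.0, gain=1.0):
    """Apply gain * p ** gamma to each pixel, then clip to [0, 1]."""
    return [min(max(gain * (p ** gamma), 0.0), 1.0) for p in pixels]


# gamma < 1 brightens mid-tones; gain > 1 can push values past 1, which are clipped.
result = toy_adjust_gamma([0.25, 1.0], gamma=0.5, gain=1.5)
print(result)
```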

class kornia.augmentation.RandomGaussianBlur(kernel_size, sigma, border_type='reflect', separable=True, same_on_batch=False, p=0.5, keepdim=False)

Randomly apply Gaussian blur to a given tensor image or a batch of tensor images.

The standard deviation is sampled for each instance.

Parameters:

kernel_size (Union[Tuple[int, int], int]) – the size of the kernel.
sigma (Union[Tuple[float, float], Tensor]) – the range for the standard deviation of the kernel.
border_type (str, optional) – the padding mode to be applied before convolving. The expected modes are: constant, reflect, replicate or circular. Default: "reflect"
separable (bool, optional) – run as composition of two 1d-convolutions. Default: True
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
p (float, optional) – probability of applying the transformation. Default: 0.5
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False
silence_instantiation_warning – if True, silence the warning at instantiation.

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, H, W)\)

Note

This function internally uses kornia.filters.gaussian_blur2d().

Examples

>>> rng = torch.manual_seed(0)
>>> input = torch.rand(1, 1, 5, 5)
>>> blur = RandomGaussianBlur((3, 3), (0.1, 2.0), p=1.)
>>> blur(input)
tensor([[[[0.5941, 0.5833, 0.5022, 0.4384, 0.3934],
          [0.5310, 0.4964, 0.4113, 0.3637, 0.3472],
          [0.4991, 0.4997, 0.4312, 0.3620, 0.3081],
          [0.6082, 0.5667, 0.4954, 0.3825, 0.3508],
          [0.7042, 0.6849, 0.6275, 0.4753, 0.4105]]]])

To apply the exact augmentation again, you can reuse the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = RandomGaussianBlur((3, 3), (0.1, 2.0), p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomGaussianIllumination(gain=(0.01, 0.15), center=(0.1, 0.9), sigma=(0.2, 1.0), sign=(-1.0, 1.0), p=0.5, same_on_batch=False, keepdim=False)

Applies random 2D Gaussian illumination patterns to a batch of images.

Parameters:

gain (Union[float, Tuple[float, float], None], optional) – Range for the gain factor (intensity) applied to the generated illumination. Default: (0.01, 0.15)
center (Union[float, Tuple[float, float], None], optional) – The center coordinates of the Gaussian distribution, expressed as a percentage of the spatial dimensions \((H, W)\). Default: (0.1, 0.9)
sigma (Union[float, Tuple[float, float], None], optional) – The sigma values (standard deviation) of the Gaussian distribution, expressed as a percentage of the spatial dimensions \((H, W)\). Default: (0.2, 1.0)
sign (Union[float, Tuple[float, float], None], optional) – Range for the sign of the Gaussian distribution. If only one sign is needed, pass it as a tuple or a float. Default: (-1.0, 1.0)
p (float, optional) – Probability of applying the transformation. Default: 0.5
same_on_batch (bool, optional) – If True, apply the same transformation across the entire batch. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\)
Output: \((B, C, H, W)\)

Note

The generated random numbers are not reproducible across different devices and dtypes. By default, the parameters will be generated on CPU. This can be changed by calling self.set_rng_device_and_dtype(device="cuda", dtype=torch.float64).

Examples

>>> rng = torch.manual_seed(1)
>>> input = torch.ones(1, 3, 3, 3) * 0.5
>>> aug = RandomGaussianIllumination(gain=0.5, p=1.)
>>> aug(input)
tensor([[[[0.7266, 1.0000, 0.7266],
          [0.6621, 0.9121, 0.6621],
          [0.5000, 0.6911, 0.5000]],

         [[0.7266, 1.0000, 0.7266],
          [0.6621, 0.9121, 0.6621],
          [0.5000, 0.6911, 0.5000]],

         [[0.7266, 1.0000, 0.7266],
          [0.6621, 0.9121, 0.6621],
          [0.5000, 0.6911, 0.5000]]]])

To apply the exact augmentation again, you can reuse the previous parameter state:

>>> input = torch.rand(1, 3, 32, 32)
>>> aug = RandomGaussianIllumination(p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomGaussianNoise(mean=0.0, std=1.0, same_on_batch=False, p=0.5, keepdim=False)

Add Gaussian noise to a batch of multi-dimensional images.

Parameters:

mean (float, optional) – The mean of the gaussian distribution. Default: 0.0
std (float, optional) – The standard deviation of the gaussian distribution. Default: 1.0
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
p (float, optional) – probability of applying the transformation. Default: 0.5
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Examples

>>> rng = torch.manual_seed(0)
>>> img = torch.ones(1, 1, 2, 2)
>>> RandomGaussianNoise(mean=0., std=1., p=1.)(img)
tensor([[[[ 2.5410,  0.7066],
          [-1.1788,  1.5684]]]])

To apply the exact augmentation again, you can reuse the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = RandomGaussianNoise(mean=0., std=1., p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)
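
Additive Gaussian noise amounts to drawing one sample from a normal distribution with the given `mean` and `std` for every pixel and adding it. The plain-Python sketch below illustrates this; `add_gaussian_noise` and its `rng` argument are hypothetical, and the absence of clipping is an assumption suggested by the first example above, whose output contains values such as 2.5410 and -1.1788.

```python
import random


def add_gaussian_noise(pixels, mean=0.0, std=1.0, rng=random):
    """Add an independent N(mean, std) sample to every pixel; no clipping."""
    return [p + rng.gauss(mean, std) for p in pixels]


noisy = add_gaussian_noise([1.0, 1.0, 1.0, 1.0], rng=random.Random(0))
# With std=0 the noise is deterministic, so every pixel is shifted by the mean.
shifted = add_gaussian_noise([1.0, 1.0], mean=0.5, std=0.0)
print(shifted)
```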

class kornia.augmentation.RandomGrayscale(rgb_weights=None, same_on_batch=False, p=0.1, keepdim=False)

Randomly convert an image to grayscale with probability p.

Works for multispectral imagery too (e.g. satellite data with 4-13+ bands): for a non-RGB channel count the grayscale is the weighted average across all channels, broadcast back to the input channel count. This makes the augmentation usable outside the 3-channel RGB regime.

Parameters:

rgb_weights (Optional[Tensor], optional) – Per-channel weights applied when reducing to grayscale — one weight per input channel (three, for the usual RGB case). If None, RGB inputs use the standard luminance weights and multispectral inputs weight every band equally. The weights should sum to one. Default: None
p (float, optional) – probability of the image to be transformed to grayscale. Default: 0.1
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, H, W)\)

Note

For 3-channel RGB inputs this uses kornia.color.rgb_to_grayscale(); multispectral inputs use a weighted channel average.

Examples

>>> rng = torch.manual_seed(0)
>>> inputs = torch.randn((1, 3, 3, 3))
>>> aug = RandomGrayscale(p=1.0)
>>> aug(inputs)
tensor([[[[-1.1344, -0.1330,  0.1517],
          [-0.0791,  0.6711, -0.1413],
          [-0.1717, -0.9023,  0.0819]],

         [[-1.1344, -0.1330,  0.1517],
          [-0.0791,  0.6711, -0.1413],
          [-0.1717, -0.9023,  0.0819]],

         [[-1.1344, -0.1330,  0.1517],
          [-0.0791,  0.6711, -0.1413],
          [-0.1717, -0.9023,  0.0819]]]])

To apply the exact augmentation again, you can reuse the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = RandomGrayscale(p=1.0)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)
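
The weighted-average reduction described above, with the gray value broadcast back to the input channel count, can be sketched for a single pixel in plain Python. The helper `to_grayscale` is hypothetical, and the luminance weights 0.299, 0.587 and 0.114 are an assumption about what "standard luminance weights" means here.

```python
def to_grayscale(pixel, weights=None):
    """pixel holds one value per channel; returns the gray value repeated per channel."""
    n = len(pixel)
    if weights is None:
        # Assumed defaults: luminance weights for RGB, equal weights otherwise.
        weights = [0.299, 0.587, 0.114] if n == 3 else [1.0 / n] * n
    gray = sum(w * v for w, v in zip(weights, pixel))
    # Broadcast back so the output keeps the input channel count.
    return [gray] * n


print(to_grayscale([1.0, 0.0, 0.0]))       # RGB pixel
print(to_grayscale([1.0, 2.0, 3.0, 4.0]))  # 4-band pixel, equal weights
```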

class kornia.augmentation.RandomHue(hue=(0.0, 0.0), same_on_batch=False, p=1.0, keepdim=False)

Apply a random transformation to the hue of a torch.Tensor image.

This implementation aligns with PIL, so the output is close to TorchVision's.

Parameters:

hue (Tuple[float, float], optional) – the hue factor to apply. Default: (0.0, 0.0)
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
p (float, optional) – probability of applying the transformation. Default: 1.0
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, H, W)\)

Note

This function internally uses kornia.enhance.adjust_hue()

Examples

>>> rng = torch.manual_seed(0)
>>> inputs = torch.rand(1, 3, 3, 3)
>>> aug = RandomHue(hue = (-0.5,0.5),p=1.)
>>> aug(inputs)
tensor([[[[0.3993, 0.2823, 0.6816],
          [0.6117, 0.2090, 0.4081],
          [0.4693, 0.5529, 0.9527]],

         [[0.1610, 0.5962, 0.4971],
          [0.9152, 0.3971, 0.8742],
          [0.4194, 0.6771, 0.7162]],

         [[0.6323, 0.7682, 0.0885],
          [0.0223, 0.1689, 0.2939],
          [0.5185, 0.8964, 0.4556]]]])

To apply the exact augmentation again, you can reuse the previous parameter state:

>>> input = torch.rand(1, 3, 32, 32)
>>> aug = RandomHue((-0.2,0.2), p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomInvert(max_val=1.0, same_on_batch=False, p=0.5, keepdim=False)

Invert the tensor image values randomly.

Parameters:

max_val (Union[float, Tensor], optional) – The expected maximum value in the input tensor. Its shape must match the input tensor shape, or at least be broadcastable to it. Default: 1.0
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
p (float, optional) – probability of applying the transformation. Default: 0.5
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Note

This function internally uses kornia.enhance.invert().

Examples

>>> rng = torch.manual_seed(0)
>>> img = torch.rand(1, 1, 5, 5)
>>> inv = RandomInvert()
>>> inv(img)
tensor([[[[0.4963, 0.7682, 0.0885, 0.1320, 0.3074],
          [0.6341, 0.4901, 0.8964, 0.4556, 0.6323],
          [0.3489, 0.4017, 0.0223, 0.1689, 0.2939],
          [0.5185, 0.6977, 0.8000, 0.1610, 0.2823],
          [0.6816, 0.9152, 0.3971, 0.8742, 0.4194]]]])

To apply the exact augmentation again, you can reuse the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = RandomInvert(p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)
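
Inversion relative to `max_val` is commonly defined as `max_val - value`. A plain-Python sketch under that assumption follows; the helper name `invert_pixels` is hypothetical.

```python
def invert_pixels(pixels, max_val=1.0):
    """Mirror each value around max_val / 2 by computing max_val - p."""
    return [max_val - p for p in pixels]


print(invert_pixels([0.0, 0.25, 1.0]))
print(invert_pixels([0, 64, 255], max_val=255))
```

Applying the inversion twice with the same `max_val` returns the original values, which is a quick sanity check for any implementation.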

class kornia.augmentation.RandomJPEG(jpeg_quality=50.0, same_on_batch=False, p=1.0, keepdim=False)

Applies random (differentiable) JPEG coding to a torch.Tensor image.

Parameters:

jpeg_quality (Union[Tensor, float, Tuple[float, float], List[float]], optional) – The range of compression rates to be applied. Default: 50.0
p (float, optional) – probability of applying the transformation. Default: 1.0
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\)
Output: \((B, C, H, W)\)

Note

This function internally uses kornia.enhance.jpeg_codec_differentiable().

Examples

>>> import torch
>>> rng = torch.manual_seed(0)
>>> images = 0.1904 * torch.ones(2, 3, 32, 32)
>>> aug = RandomJPEG(jpeg_quality=(1.0, 50.0), p=1.)
>>> images_jpeg = aug(images)

To apply the exact augmentation again, you can reuse the previous parameter state:

>>> images = 0.1904 * torch.ones(2, 3, 32, 32)
>>> aug = RandomJPEG(jpeg_quality=20.0, p=1.)  # Samples a JPEG quality from the range [30.0, 70.0]
>>> (aug(images) == aug(images, params=aug._params)).all()
tensor(True)
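
The inline comment in the doctest above implies that a scalar `jpeg_quality` is expanded into a range around a center value, while a tuple is used as the range directly. A minimal plain-Python sketch of that expansion follows. The center of 50.0 is inferred from that comment rather than confirmed on this page, any clamping to valid quality bounds is ignored, and `quality_range` is a hypothetical helper.

```python
import random


def quality_range(jpeg_quality, center=50.0):
    """Turn a scalar or a (low, high) pair into a sampling range."""
    if isinstance(jpeg_quality, (int, float)):
        # Assumed rule: a scalar q means [center - q, center + q].
        return (center - jpeg_quality, center + jpeg_quality)
    low, high = jpeg_quality
    return (float(low), float(high))


low, high = quality_range(20.0)
sampled = random.uniform(low, high)  # one quality value drawn from the range
print((low, high))
```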

class kornia.augmentation.RandomLinearCornerIllumination(gain=(0.01, 0.2), sign=(-1.0, 1.0), p=0.5, same_on_batch=False, keepdim=False)

Applies random 2D linear illumination patterns, originating from a corner, to a batch of images.

Parameters:

gain (Union[float, Tuple[float, float], None], optional) – Range for the gain factor (intensity) applied to the generated illumination. Default: (0.01, 0.2)
sign (Union[float, Tuple[float, float], None], optional) – Range for the sign of the distribution. If only one sign is needed, pass it as a tuple or a float. Default: (-1.0, 1.0)
p (float, optional) – Probability of applying the transformation. Default: 0.5
same_on_batch (bool, optional) – If True, apply the same transformation across the entire batch. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\)
Output: \((B, C, H, W)\)

Examples

>>> rng = torch.manual_seed(1)
>>> input = torch.ones(1, 3, 3, 3) * 0.5
>>> aug = RandomLinearCornerIllumination(gain=0.25, p=1.)
>>> aug(input)
tensor([[[[0.3750, 0.4375, 0.5000],
          [0.3125, 0.3750, 0.4375],
          [0.2500, 0.3125, 0.3750]],

         [[0.3750, 0.4375, 0.5000],
          [0.3125, 0.3750, 0.4375],
          [0.2500, 0.3125, 0.3750]],

         [[0.3750, 0.4375, 0.5000],
          [0.3125, 0.3750, 0.4375],
          [0.2500, 0.3125, 0.3750]]]])

To apply the exact augmentation again, you can reuse the previous parameter state:

>>> input = torch.rand(1, 3, 32, 32)
>>> aug = RandomLinearCornerIllumination(p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomLinearIllumination(gain=(0.01, 0.2), sign=(-1.0, 1.0), p=0.5, same_on_batch=False, keepdim=False)[source]¶

Applies random 2D Linear illumination patterns to a batch of images.

Parameters:

gain (Union[float, Tuple[float, float], None], optional) – Range for the gain factor (intensity) applied to the generated illumination. Default: (0.01, 0.2)
sign (Union[float, Tuple[float, float], None], optional) – Range for the sign of the distribution. If only one sign is needed, insert only as a tuple or float. Default: (-1.0, 1.0)
p (float, optional) – Probability of applying the transformation. Default: 0.5
same_on_batch (bool, optional) – If True, apply the same transformation across the entire batch. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\)
Output: \((B, C, H, W)\)

Examples

>>> rng = torch.manual_seed(1)
>>> input = torch.ones(1, 3, 3, 3) * 0.5
>>> aug = RandomLinearIllumination(gain=0.25, p=1.)
>>> aug(input)
tensor([[[[0.2500, 0.2500, 0.2500],
          [0.3750, 0.3750, 0.3750],
          [0.5000, 0.5000, 0.5000]],

         [[0.2500, 0.2500, 0.2500],
          [0.3750, 0.3750, 0.3750],
          [0.5000, 0.5000, 0.5000]],

         [[0.2500, 0.2500, 0.2500],
          [0.3750, 0.3750, 0.3750],
          [0.5000, 0.5000, 0.5000]]]])

To apply the exact augmentation again, you may take advantage of the previous parameter state:

>>> input = torch.rand(1, 3, 32, 32)
>>> aug = RandomLinearIllumination(p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomMedianBlur(kernel_size=(3, 3), same_on_batch=False, p=0.5, keepdim=False)[source]¶

Add random blur with a median filter to an image tensor.

Parameters:

kernel_size (Tuple[int, int], optional) – the blurring kernel size. Default: (3, 3)
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
p (float, optional) – probability of applying the transformation. Default: 0.5
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Note

This function internally uses kornia.filters.median_blur().

Examples

>>> img = torch.ones(1, 1, 4, 4)
>>> out = RandomMedianBlur((3, 3), p = 1)(img)
>>> out.shape
torch.Size([1, 1, 4, 4])
>>> out
tensor([[[[0., 1., 1., 0.],
          [1., 1., 1., 1.],
          [1., 1., 1., 1.],
          [0., 1., 1., 0.]]]])

To apply the exact augmentation again, you may take advantage of the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = RandomMedianBlur((7, 7), p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)
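
The zeros at the corners of the first example output can be reproduced with a minimal pure-Python sketch of a 3x3 median filter. It assumes zero padding at the border, which is consistent with that output but is an assumption rather than a statement about how kornia.filters.median_blur() is implemented. The helper name median_blur_3x3 is hypothetical.

```python
from statistics import median


def median_blur_3x3(image):
    """Apply a 3x3 median filter to a 2D list, padding the border with zeros."""
    height, width = len(image), len(image[0])
    output = []
    for row in range(height):
        out_row = []
        for col in range(width):
            window = []
            for d_row in (-1, 0, 1):
                for d_col in (-1, 0, 1):
                    r, c = row + d_row, col + d_col
                    inside = 0 <= r < height and 0 <= c < width
                    # Assumption: positions outside the image count as 0.0.
                    window.append(image[r][c] if inside else 0.0)
            out_row.append(median(window))
        output.append(out_row)
    return output


ones = [[1.0] * 4 for _ in range(4)]
blurred = median_blur_3x3(ones)
for row in blurred:
    print(row)
```

At a corner the window holds five padded zeros and four ones, so the median is 0. Along an edge it holds three zeros and six ones, so the median stays 1.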

class kornia.augmentation.RandomMotionBlur(kernel_size, angle, direction, border_type=BorderType.CONSTANT.name, resample=Resample.NEAREST.name, same_on_batch=False, p=0.5, keepdim=False)[source]¶

Perform motion blur on 2D images (4D torch.Tensor).

Parameters:

p (float, optional) – probability of applying the transformation. Default: 0.5
kernel_size (Union[int, Tuple[int, int]]) – motion kernel size (odd and positive). If int, the kernel will have a fixed size. If Tuple[int, int], it will randomly generate the value from the range batch-wise.
angle (Union[Tensor, float, Tuple[float, float]]) – angle of the motion blur in degrees (anti-clockwise rotation). If float, it will generate the value from (-angle, angle).
direction (Union[Tensor, float, Tuple[float, float]]) – forward/backward direction of the motion blur. Lower values towards -1.0 will point the motion blur towards the back (with angle provided via angle), while higher values towards 1.0 will point the motion blur forward. A value of 0.0 leads to a uniform (but still angled) motion blur. If float, it will generate the value from (-direction, direction). If Tuple[float, float], it will randomly generate the value from the range.
border_type (Union[int, str, BorderType], optional) – the padding mode to be applied before convolving. CONSTANT = 0, REFLECT = 1, REPLICATE = 2, CIRCULAR = 3. Default: BorderType.CONSTANT.name
resample (Union[str, int, Resample], optional) – the interpolation mode. Default: Resample.NEAREST.name
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, H, W)\)

Note

Input torch.Tensor must be float and normalized into [0, 1] for the best differentiability support. Additionally, this function accepts another transformation torch.Tensor (\((B, 3, 3)\)), then the applied transformation will be merged into the input transformation torch.Tensor and returned.

Please set resample to 'bilinear' if more meaningful gradients are wanted.

Note

This function internally uses kornia.filters.motion_blur().

Examples

>>> rng = torch.manual_seed(0)
>>> input = torch.ones(1, 1, 5, 5)
>>> motion_blur = RandomMotionBlur(3, 35., 0.5, p=1.)
>>> motion_blur(input)
tensor([[[[0.5773, 1.0000, 1.0000, 1.0000, 0.7561],
          [0.5773, 1.0000, 1.0000, 1.0000, 0.7561],
          [0.5773, 1.0000, 1.0000, 1.0000, 0.7561],
          [0.5773, 1.0000, 1.0000, 1.0000, 0.7561],
          [0.5773, 1.0000, 1.0000, 1.0000, 0.7561]]]])

To apply the exact augmentation again, you may take advantage of the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = RandomMotionBlur(3, 35., 0.5, p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)
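
The rule that a scalar angle or direction is expanded to a symmetric range can be sketched in plain Python. The helpers as_symmetric_range and sample_motion_blur_params are hypothetical names, and uniform sampling is an assumption made for illustration, not a description of the library's sampler.

```python
import random


def as_symmetric_range(value):
    """Return (low, high): a scalar x becomes (-x, x), a pair is kept as given."""
    if isinstance(value, (int, float)):
        return (-float(value), float(value))
    low, high = value
    return (float(low), float(high))


def sample_motion_blur_params(angle, direction, rng):
    """Draw one angle and one direction from their ranges (uniform, by assumption)."""
    angle_low, angle_high = as_symmetric_range(angle)
    dir_low, dir_high = as_symmetric_range(direction)
    return {
        "angle": rng.uniform(angle_low, angle_high),
        "direction": rng.uniform(dir_low, dir_high),
    }


rng = random.Random(0)
params = sample_motion_blur_params(35.0, 0.5, rng)
print(as_symmetric_range(35.0))  # (-35.0, 35.0)
print(params)
```

Passing a pair such as (10., 20.) leaves the range untouched, so only the scalar form is symmetric around zero.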

class kornia.augmentation.RandomPlanckianJitter(mode='blackbody', select_from=None, same_on_batch=False, p=0.5, keepdim=False)[source]¶

Apply planckian jitter transformation to input torch.Tensor.

This is a physics-based color augmentation that creates realistic variations in chromaticity and can simulate illumination changes in the scene.

See [ZBTvdW22] for more details.

Parameters:

mode (str, optional) – ‘blackbody’ or ‘CIED’. Default: "blackbody"
select_from (Union[int, List[int], None], optional) – choose a list of jitters to apply from. blackbody range [0-24], CIED range [0-22]. Default: None
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
p (float, optional) – probability of applying the transformation. Default: 0.5
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\)
Output: \((B, C, H, W)\)

Note

Input torch.Tensor must be float and normalized into [0, 1].

Examples

To apply planckian jitter based on mode

>>> rng = torch.manual_seed(0)
>>> input = torch.randn(1, 3, 2, 2)
>>> aug = RandomPlanckianJitter(mode='CIED')
>>> aug(input)
tensor([[[[ 1.0000, -0.2389],
          [-1.7740,  0.4628]],

         [[-1.0845, -1.3986],
          [ 0.4033,  0.8380]],

         [[-0.9228, -0.5175],
          [-0.7654,  0.2335]]]])

To apply planckian jitter on image(s) from list of interested jitters

>>> rng = torch.manual_seed(0)
>>> input = torch.randn(2, 3, 2, 2)
>>> aug = RandomPlanckianJitter(mode='blackbody', select_from=[23, 24, 1, 2])
>>> aug(input)
tensor([[[[-1.1258, -1.1524],
          [-0.2506, -0.4339]],

         [[ 0.8487,  0.6920],
          [-0.3160, -2.1152]],

         [[ 0.4681, -0.1577],
          [ 1.4437,  0.2660]]],


        [[[ 0.1268,  0.6658],
          [-0.1093, -0.0850]],

         [[ 0.9318,  1.0000],
          [ 1.0000,  0.0537]],

         [[ 0.9134, -0.6101],
          [-1.2430, -3.4228]]]])

class kornia.augmentation.RandomPlasmaBrightness(roughness=(0.1, 0.7), intensity=(0.0, 1.0), same_on_batch=False, p=0.5, keepdim=False)[source]¶

Adds brightness to the image based on a fractal map generated by the diamond square algorithm.

This is based on the original paper: TorMentor: Deterministic dynamic-path, data augmentations with fractals. See: [NCR+22] for more details.

Note

This function internally uses kornia.contrib.diamond_square().

Parameters:

roughness (Tuple[float, float], optional) – value to scale during the recursion in the generation of the fractal map. Default: (0.1, 0.7)
intensity (Tuple[float, float], optional) – value that scales the intensity values of the generated maps. Default: (0.0, 1.0)
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
p (float, optional) – probability of applying the transformation. Default: 0.5
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Examples

>>> rng = torch.manual_seed(0)
>>> img = torch.ones(1, 1, 3, 4)
>>> RandomPlasmaBrightness(roughness=(0.1, 0.7), p=1.)(img)
tensor([[[[0.6415, 1.0000, 0.3142, 0.6836],
          [1.0000, 0.5593, 0.5556, 0.4566],
          [0.5809, 1.0000, 0.7005, 1.0000]]]])

class kornia.augmentation.RandomPlasmaContrast(roughness=(0.1, 0.7), same_on_batch=False, p=0.5, keepdim=False)[source]¶

Adds contrast to the image based on a fractal map generated by the diamond square algorithm.

This is based on the original paper: TorMentor: Deterministic dynamic-path, data augmentations with fractals. See: [NCR+22] for more details.

Note

This function internally uses kornia.contrib.diamond_square().

Parameters:

roughness (Tuple[float, float], optional) – value to scale during the recursion in the generation of the fractal map. Default: (0.1, 0.7)
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
p (float, optional) – probability of applying the transformation. Default: 0.5
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Examples

>>> rng = torch.manual_seed(0)
>>> img = torch.ones(1, 1, 3, 4)
>>> RandomPlasmaContrast(roughness=(0.1, 0.7), p=1.)(img)
tensor([[[[0.9651, 1.0000, 1.0000, 1.0000],
          [1.0000, 0.9103, 0.8038, 0.9263],
          [0.6882, 1.0000, 0.9544, 1.0000]]]])

class kornia.augmentation.RandomPlasmaShadow(roughness=(0.1, 0.7), shade_intensity=(-1.0, 0.0), shade_quantity=(0.0, 1.0), same_on_batch=False, p=0.5, keepdim=False)[source]¶

Adds shadows to the image based on a fractal map generated by the diamond square algorithm.

This is based on the original paper: TorMentor: Deterministic dynamic-path, data augmentations with fractals. See: [NCR+22] for more details.

Note

This function internally uses kornia.contrib.diamond_square().

Parameters:

roughness (Tuple[float, float], optional) – value to scale during the recursion in the generation of the fractal map. Default: (0.1, 0.7)
shade_intensity (Tuple[float, float], optional) – value that scales the intensity values of the generated maps. Default: (-1.0, 0.0)
shade_quantity (Tuple[float, float], optional) – value to select the pixels to mask. Default: (0.0, 1.0)
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
p (float, optional) – probability of applying the transformation. Default: 0.5
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Examples

>>> rng = torch.manual_seed(0)
>>> img = torch.ones(1, 1, 3, 4)
>>> RandomPlasmaShadow(roughness=(0.1, 0.7), p=1.)(img)
tensor([[[[0.7682, 1.0000, 1.0000, 1.0000],
          [1.0000, 1.0000, 1.0000, 1.0000],
          [1.0000, 1.0000, 1.0000, 1.0000]]]])

class kornia.augmentation.RandomPosterize(bits=3, same_on_batch=False, p=0.5, keepdim=False)[source]¶

Posterize given torch.Tensor image or a batch of torch.Tensor images randomly.

Parameters:

p (float, optional) – probability of applying the transformation. Default: 0.5
bits (Union[float, Tuple[float, float], Tensor], optional) – Integer in the range (0, 8], in which 0 gives a black image and 8 gives the original. If int x, bits will be generated from (x, 8) and then converted to int. If tuple (x, y), bits will be generated from (x, y) and then converted to int. Default: 3
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, H, W)\)

Note

This function internally uses kornia.enhance.posterize().

Examples

>>> rng = torch.manual_seed(0)
>>> input = torch.rand(1, 1, 5, 5)
>>> posterize = RandomPosterize(3., p=1.)
>>> posterize(input)
tensor([[[[0.4863, 0.7529, 0.0784, 0.1255, 0.2980],
          [0.6275, 0.4863, 0.8941, 0.4549, 0.6275],
          [0.3451, 0.3922, 0.0157, 0.1569, 0.2824],
          [0.5176, 0.6902, 0.8000, 0.1569, 0.2667],
          [0.6745, 0.9098, 0.3922, 0.8627, 0.4078]]]])

To apply the exact augmentation again, you may take advantage of the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = RandomPosterize(3., p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)
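
What the bits parameter does to a single value can be sketched in plain Python. The sketch assumes that a value in [0, 1] is truncated to an 8-bit integer and that only the most significant bits are kept. The helper name posterize_value is hypothetical and this is not the code of kornia.enhance.posterize().

```python
def posterize_value(value, bits):
    """Keep the `bits` most significant bits of an 8-bit quantized value in [0, 1]."""
    if not 0 <= bits <= 8:
        raise ValueError("bits must lie between 0 and 8")
    quantized = int(value * 255)      # assumption: truncate to an 8-bit integer
    mask = 256 - (1 << (8 - bits))    # e.g. bits=6 -> 0b11111100
    return (quantized & mask) / 255


print(round(posterize_value(0.4963, bits=6), 4))  # 0.4863
print(round(posterize_value(0.7682, bits=6), 4))  # 0.7529
print(posterize_value(0.7682, bits=0))            # 0.0, a black pixel
```

With bits=8 the mask keeps every bit, so only the 8-bit quantization remains. With bits=0 the mask clears every bit, which gives black.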

class kornia.augmentation.RandomRain(number_of_drops=(1000, 2000), drop_height=(5, 20), drop_width=(-5, 5), same_on_batch=False, p=0.5, keepdim=False)[source]¶

Add Random Rain to the image.

Parameters:

p (float, optional) – probability of applying the transformation. Default: 0.5
number_of_drops (tuple[int, int], optional) – number of drops per image. Default: (1000, 2000)
drop_height (tuple[int, int], optional) – height of the drop in the image (same for each drop in one image). Default: (5, 20)
drop_width (tuple[int, int], optional) – width of the drop in the image (same for each drop in one image). Default: (-5, 5)

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\)
Output: \((B, C, H, W)\)

Examples

>>> rng = torch.manual_seed(0)
>>> input = torch.rand(1, 1, 5, 5)
>>> rain = RandomRain(p=1,drop_height=(1,2),drop_width=(1,2),number_of_drops=(1,1))
>>> rain(input)
tensor([[[[0.4963, 0.7843, 0.0885, 0.1320, 0.3074],
          [0.6341, 0.4901, 0.8964, 0.4556, 0.6323],
          [0.3489, 0.4017, 0.0223, 0.1689, 0.2939],
          [0.5185, 0.6977, 0.8000, 0.1610, 0.2823],
          [0.6816, 0.9152, 0.3971, 0.8742, 0.4194]]]])

class kornia.augmentation.RandomRGBShift(r_shift_limit=0.5, g_shift_limit=0.5, b_shift_limit=0.5, same_on_batch=False, p=0.5, keepdim=False)[source]¶

Randomly shift each channel of an image.

Parameters:

r_shift_limit (float, optional) – maximum value up to which the shift value can be generated for the red channel; recommended interval is [0, 1], should always be positive. Default: 0.5
g_shift_limit (float, optional) – maximum value up to which the shift value can be generated for the green channel; recommended interval is [0, 1], should always be positive. Default: 0.5
b_shift_limit (float, optional) – maximum value up to which the shift value can be generated for the blue channel; recommended interval is [0, 1], should always be positive. Default: 0.5
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
p (float, optional) – probability of applying the transformation. Default: 0.5
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Note

Input torch.Tensor must be float and normalized into [0, 1].

Examples

>>> import torch
>>> rng = torch.manual_seed(0)
>>> inp = torch.rand(1, 3, 5, 5)
>>> aug = RandomRGBShift(0, 0, 0)
>>> ((inp == aug(inp)).double()).all()
tensor(True)

>>> rng = torch.manual_seed(0)
>>> inp = torch.rand(1, 3, 5, 5)
>>> inp
tensor([[[[0.4963, 0.7682, 0.0885, 0.1320, 0.3074],
          [0.6341, 0.4901, 0.8964, 0.4556, 0.6323],
          [0.3489, 0.4017, 0.0223, 0.1689, 0.2939],
          [0.5185, 0.6977, 0.8000, 0.1610, 0.2823],
          [0.6816, 0.9152, 0.3971, 0.8742, 0.4194]],

         [[0.5529, 0.9527, 0.0362, 0.1852, 0.3734],
          [0.3051, 0.9320, 0.1759, 0.2698, 0.1507],
          [0.0317, 0.2081, 0.9298, 0.7231, 0.7423],
          [0.5263, 0.2437, 0.5846, 0.0332, 0.1387],
          [0.2422, 0.8155, 0.7932, 0.2783, 0.4820]],

         [[0.8198, 0.9971, 0.6984, 0.5675, 0.8352],
          [0.2056, 0.5932, 0.1123, 0.1535, 0.2417],
          [0.7262, 0.7011, 0.2038, 0.6511, 0.7745],
          [0.4369, 0.5191, 0.6159, 0.8102, 0.9801],
          [0.1147, 0.3168, 0.6965, 0.9143, 0.9351]]]])
>>> aug = RandomRGBShift(p=1.)
>>> aug(inp)
tensor([[[[0.9374, 1.0000, 0.5297, 0.5732, 0.7486],
          [1.0000, 0.9313, 1.0000, 0.8968, 1.0000],
          [0.7901, 0.8429, 0.4635, 0.6100, 0.7351],
          [0.9597, 1.0000, 1.0000, 0.6022, 0.7234],
          [1.0000, 1.0000, 0.8383, 1.0000, 0.8606]],

         [[0.6524, 1.0000, 0.1357, 0.2847, 0.4729],
          [0.4046, 1.0000, 0.2754, 0.3693, 0.2502],
          [0.1312, 0.3076, 1.0000, 0.8226, 0.8418],
          [0.6258, 0.3432, 0.6841, 0.1327, 0.2382],
          [0.3417, 0.9150, 0.8927, 0.3778, 0.5815]],

         [[0.3850, 0.5623, 0.2636, 0.1328, 0.4005],
          [0.0000, 0.1584, 0.0000, 0.0000, 0.0000],
          [0.2914, 0.2663, 0.0000, 0.2163, 0.3397],
          [0.0021, 0.0843, 0.1811, 0.3754, 0.5453],
          [0.0000, 0.0000, 0.2617, 0.4795, 0.5003]]]])
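
The saturated 1.0000 and 0.0000 entries in the output above suggest that each channel is shifted by a constant and then clamped to [0, 1]. A minimal sketch of that idea follows. The helper name shift_channel and the two shift values are illustrative assumptions, not values read from the library.

```python
def shift_channel(channel, shift):
    """Add `shift` to every value of a flat channel and clamp the result to [0, 1]."""
    return [min(max(value + shift, 0.0), 1.0) for value in channel]


red = [0.4963, 0.7682, 0.0885]
blue = [0.8198, 0.2056]
shifted_red = shift_channel(red, 0.4411)     # illustrative positive shift
shifted_blue = shift_channel(blue, -0.4348)  # illustrative negative shift
print([round(value, 4) for value in shifted_red])
print([round(value, 4) for value in shifted_blue])
```

A shift of zero returns the input unchanged, which matches the RandomRGBShift(0, 0, 0) example.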

class kornia.augmentation.RandomSaltAndPepperNoise(amount=(0.01, 0.06), salt_vs_pepper=(0.4, 0.6), p=0.5, same_on_batch=False, keepdim=False)[source]¶

Apply random Salt and Pepper noise to input images.

Parameters:

amount (Union[float, Tuple[float, float], None], optional) – A float or a tuple representing the range for the amount of noise to apply. Default: (0.01, 0.06)
salt_vs_pepper (Union[float, Tuple[float, float], None], optional) – A float or a tuple representing the range for the ratio of Salt to Pepper noise. Default: (0.4, 0.6)
p (float, optional) – The probability of applying the transformation. Default: 0.5
same_on_batch (bool, optional) – If True, apply the same transformation across the entire batch. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\)
Output: \((B, C, H, W)\)

Note

The amount parameter controls the intensity of the noise, while salt_vs_pepper controls the ratio of Salt to Pepper noise.

The values for amount and salt_vs_pepper should be between 0 and 1. The recommended value for salt_vs_pepper is 0.5, and for amount, values less than 0.2 are recommended.

If amount and salt_vs_pepper are floats (single values), the transformation is applied with these exact values rather than with values sampled from a range. However, the masks are still generated randomly using these exact parameters.

Examples

>>> rng = torch.manual_seed(5)
>>> inputs = torch.rand(1, 3, 3, 3)
>>> aug = RandomSaltAndPepperNoise(amount=0.5, salt_vs_pepper=0.5, p=1.)
>>> aug(inputs)
tensor([[[[1.0000, 0.0000, 0.0000],
          [1.0000, 1.0000, 0.1166],
          [0.1644, 0.7379, 0.0000]],

         [[1.0000, 0.0000, 0.0000],
          [1.0000, 1.0000, 0.7150],
          [0.5793, 0.9809, 0.0000]],

         [[1.0000, 0.0000, 0.0000],
          [1.0000, 1.0000, 0.7850],
          [0.9752, 0.0903, 0.0000]]]])

To apply the exact augmentation again, you may take advantage of the previous parameter state:

>>> input = torch.rand(1, 3, 32, 32)
>>> aug = RandomSaltAndPepperNoise(amount=0.05, salt_vs_pepper=0.5, p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)
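
The roles of amount and salt_vs_pepper described in the note can be sketched on a flat list of pixels. This is a simplified illustration under stated assumptions: each pixel is corrupted independently with probability amount, and a corrupted pixel becomes salt (1.0) with probability salt_vs_pepper. The helper name salt_and_pepper is hypothetical, and the library's mask generation may differ.

```python
import random


def salt_and_pepper(pixels, amount, salt_vs_pepper, rng):
    """Replace roughly a fraction `amount` of the pixels with 1.0 (salt) or 0.0 (pepper)."""
    noisy = []
    for value in pixels:
        if rng.random() < amount:
            # Corrupt this pixel; `salt_vs_pepper` is the chance that it becomes salt.
            noisy.append(1.0 if rng.random() < salt_vs_pepper else 0.0)
        else:
            noisy.append(value)
    return noisy


rng = random.Random(5)
pixels = [0.5] * 1000
noisy = salt_and_pepper(pixels, amount=0.05, salt_vs_pepper=0.5, rng=rng)
changed = sum(1 for value in noisy if value != 0.5)
print(changed)  # close to 50 on average for amount=0.05
```

Even with fixed amount and salt_vs_pepper, which pixels are corrupted still depends on the random generator.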

class kornia.augmentation.RandomSaturation(saturation=(1.0, 1.0), same_on_batch=False, p=1.0, keepdim=False)[source]¶

Apply a random transformation to the saturation of a torch.Tensor image.

This implementation aligns with PIL. Hence, the output is close to TorchVision.

Parameters:

p (float, optional) – probability of applying the transformation. Default: 1.0
saturation (Tuple[float, float], optional) – the saturation factor to apply. Default: (1.0, 1.0)
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, H, W)\)

Note

This function internally uses kornia.enhance.adjust_saturation().

Examples

>>> rng = torch.manual_seed(0)
>>> inputs = torch.rand(1, 3, 3, 3)
>>> aug = RandomSaturation(saturation = (0.5,2.),p=1.)
>>> aug(inputs)
tensor([[[[0.5569, 0.7682, 0.3529],
          [0.4811, 0.3474, 0.7411],
          [0.5028, 0.8964, 0.6772]],

         [[0.6323, 0.5358, 0.5265],
          [0.4203, 0.2706, 0.5525],
          [0.5185, 0.7863, 0.8681]],

         [[0.3711, 0.4989, 0.6816],
          [0.9152, 0.3971, 0.8742],
          [0.4636, 0.7060, 0.9527]]]])

To apply the exact augmentation again, you may take advantage of the previous parameter state:

>>> input = torch.rand(1, 3, 32, 32)
>>> aug = RandomSaturation((0.8,1.2), p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomSharpness(sharpness=0.5, same_on_batch=False, p=0.5, keepdim=False)[source]¶

Sharpen given torch.Tensor image or a batch of torch.Tensor images randomly.

Parameters:

p (float, optional) – probability of applying the transformation. Default: 0.5
sharpness (Union[Tensor, float, Tuple[float, float]], optional) – factor of sharpness strength. Must be above 0. Default: 0.5
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, H, W)\)

Note

This function internally uses kornia.enhance.sharpness().

Examples

>>> rng = torch.manual_seed(0)
>>> input = torch.rand(1, 1, 5, 5)
>>> sharpness = RandomSharpness(1., p=1.)
>>> sharpness(input)
tensor([[[[0.4963, 0.7682, 0.0885, 0.1320, 0.3074],
          [0.6341, 0.4810, 0.7367, 0.4177, 0.6323],
          [0.3489, 0.4428, 0.1562, 0.2443, 0.2939],
          [0.5185, 0.6462, 0.7050, 0.2288, 0.2823],
          [0.6816, 0.9152, 0.3971, 0.8742, 0.4194]]]])

To apply the exact augmentation again, you may take advantage of the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = RandomSharpness(1., p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomSnow(snow_coefficient=(0.5, 0.5), brightness=(2, 2), same_on_batch=False, p=1.0, keepdim=False)[source]¶

Generates a snow effect on a given torch.Tensor image or a batch of torch.Tensor images.

Parameters:

snow_coefficient (Tuple[float, float], optional) – A tuple of floats (lower and upper bound) between 0 and 1 that control the amount of snow to add to the image. Default: (0.5, 0.5)
brightness (Tuple[float, float], optional) – A tuple of floats (lower and upper bound) greater than 1 that controls the brightness of the snow. Default: (2, 2)
same_on_batch (bool, optional) – If True, apply the same transformation to each image in a batch. Default: False.
p (float, optional) – Probability of applying the transformation. Default: 1.0.
keepdim (bool, optional) – Keep the output torch.Tensor with the same shape as input. Default: False.

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\)
Output: \((B, C, H, W)\)

Examples

>>> inputs = torch.rand(2, 3, 4, 4)
>>> snow = kornia.augmentation.RandomSnow(p=1.0, snow_coefficient=(0.1, 0.6), brightness=(1.0, 5.0))
>>> output = snow(inputs)
>>> output.shape
torch.Size([2, 3, 4, 4])

class kornia.augmentation.RandomSolarize(thresholds=0.1, additions=0.1, same_on_batch=False, p=0.5, keepdim=False)[source]¶

Solarize given torch.Tensor image or a batch of torch.Tensor images randomly.

Parameters:

p (float, optional) – probability of applying the transformation. Default: 0.5
thresholds (Union[Tensor, float, Tuple[float, float], List[float]], optional) – If float x, threshold will be generated from (0.5 - x, 0.5 + x). If tuple (x, y), threshold will be generated from (x, y). Default: 0.1
additions (Union[Tensor, float, Tuple[float, float], List[float]], optional) – If float x, addition will be generated from (-x, x). If tuple (x, y), addition will be generated from (x, y). Default: 0.1
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, H, W)\)

Note

This function internally uses kornia.enhance.solarize().

Examples

>>> rng = torch.manual_seed(0)
>>> input = torch.rand(1, 1, 5, 5)
>>> solarize = RandomSolarize(0.1, 0.1, p=1.)
>>> solarize(input)
tensor([[[[0.4132, 0.1412, 0.1790, 0.2226, 0.3980],
          [0.2754, 0.4194, 0.0130, 0.4538, 0.2771],
          [0.4394, 0.4923, 0.1129, 0.2594, 0.3844],
          [0.3909, 0.2118, 0.1094, 0.2516, 0.3728],
          [0.2278, 0.0000, 0.4876, 0.0353, 0.5100]]]])

To apply the exact augmentation again, you may take advantage of the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = RandomSolarize(0.1, 0.1, p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)
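
How scalar thresholds and additions expand to ranges, and what solarization does to one value, can be sketched as follows. The sketch assumes the common definition of solarization: the addition is applied first, the result is clamped to [0, 1], and values at or above the threshold are inverted. The helper names and the numeric threshold and addition are illustrative assumptions, not the library's code or sampled values.

```python
def threshold_range(thresholds):
    """A scalar x maps to (0.5 - x, 0.5 + x); a pair is returned unchanged."""
    if isinstance(thresholds, (int, float)):
        return (0.5 - thresholds, 0.5 + thresholds)
    return tuple(thresholds)


def addition_range(additions):
    """A scalar x maps to (-x, x); a pair is returned unchanged."""
    if isinstance(additions, (int, float)):
        return (-additions, additions)
    return tuple(additions)


def solarize_value(value, threshold, addition):
    """Add `addition`, clamp to [0, 1], then invert values at or above `threshold`."""
    shifted = min(max(value + addition, 0.0), 1.0)
    return shifted if shifted < threshold else 1.0 - shifted


print(threshold_range(0.1))
print(addition_range(0.1))
print(round(solarize_value(0.4963, threshold=0.55, addition=0.0905), 4))  # inverted
print(round(solarize_value(0.0885, threshold=0.55, addition=0.0905), 4))  # kept
```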

Geometric¶

class kornia.augmentation.CenterCrop(size, align_corners=True, resample=Resample.BILINEAR.name, p=1.0, keepdim=False, cropping_mode='slice')[source]¶

Crop a given image torch.Tensor at the center.

Parameters:

size (Union[int, Tuple[int, int]]) – Desired output size (out_h, out_w) of the crop. If integer, out_h = out_w = size. If Tuple[int, int], out_h = size[0], out_w = size[1].
align_corners (bool, optional) – interpolation flag. Default: True
resample (Union[str, int, Resample], optional) – The interpolation mode. Default: Resample.BILINEAR.name
p (float, optional) – probability of applying the transformation for the whole batch. Default: 1.0
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False
cropping_mode (str, optional) – The used algorithm to crop. slice will use advanced slicing to extract the torch.Tensor based on the sampled indices. resample will use warp_affine using the affine transformation to extract and resize at once. Use slice for efficiency, or resample for proper differentiability. Default: "slice"

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, out_h, out_w)\)

Note

This function internally uses kornia.geometry.transform.crop_by_boxes().

Examples

>>> import torch
>>> rng = torch.manual_seed(0)
>>> inputs = torch.randn(1, 1, 4, 4)
>>> inputs
tensor([[[[-1.1258, -1.1524, -0.2506, -0.4339],
          [ 0.8487,  0.6920, -0.3160, -2.1152],
          [ 0.3223, -1.2633,  0.3500,  0.3081],
          [ 0.1198,  1.2377,  1.1168, -0.2473]]]])
>>> aug = CenterCrop(2, p=1., cropping_mode="resample")
>>> out = aug(inputs)
>>> out
tensor([[[[ 0.6920, -0.3160],
          [-1.2633,  0.3500]]]])
>>> aug.inverse(out, padding_mode="border")
tensor([[[[ 0.6920,  0.6920, -0.3160, -0.3160],
          [ 0.6920,  0.6920, -0.3160, -0.3160],
          [-1.2633, -1.2633,  0.3500,  0.3500],
          [-1.2633, -1.2633,  0.3500,  0.3500]]]])

To apply the exact augmentation again, you may take advantage of the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = CenterCrop(2, p=1., cropping_mode="resample")
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)
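
The slice cropping mode can be illustrated with a small pure-Python sketch that computes the crop window and extracts it by slicing. The helper name center_crop is hypothetical, and using integer division for odd size differences is an assumption.

```python
def center_crop(image, out_h, out_w):
    """Crop a 2D list to (out_h, out_w) around its center using plain slicing."""
    height, width = len(image), len(image[0])
    if out_h > height or out_w > width:
        raise ValueError("crop size must not exceed the image size")
    top = (height - out_h) // 2   # assumption: round down for odd differences
    left = (width - out_w) // 2
    return [row[left:left + out_w] for row in image[top:top + out_h]]


image = [
    [-1.1258, -1.1524, -0.2506, -0.4339],
    [0.8487, 0.6920, -0.3160, -2.1152],
    [0.3223, -1.2633, 0.3500, 0.3081],
    [0.1198, 1.2377, 1.1168, -0.2473],
]
print(center_crop(image, 2, 2))  # [[0.692, -0.316], [-1.2633, 0.35]]
```

For the 4x4 input from the first example this selects rows 1 to 2 and columns 1 to 2, which are the same four values shown in the cropped output.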

class kornia.augmentation.PadTo(size, pad_mode='constant', pad_value=0, keepdim=False)[source]¶

Pad the given sample to a specific size. Always occurs (p=1.0).

Parameters:

size (Tuple[int, int]) – a tuple of ints in the format (height, width) that give the spatial dimensions to pad inputs to.
pad_mode (str, optional) – the type of padding to perform on the image (valid values are those accepted by torch.nn.functional.pad). Default: "constant"
pad_value (float, optional) – fill value for ‘constant’ padding applied to the image. Default: 0
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, H, W)\)

Note

This function internally uses torch.nn.functional.pad().

Examples

>>> import torch
>>> img = torch.tensor([[[[0., 0., 0.],
...                       [0., 0., 0.],
...                       [0., 0., 0.]]]])
>>> pad = PadTo((4, 5), pad_value=1.)
>>> out = pad(img)
>>> out
tensor([[[[0., 0., 0., 1., 1.],
          [0., 0., 0., 1., 1.],
          [0., 0., 0., 1., 1.],
          [1., 1., 1., 1., 1.]]]])
>>> pad.inverse(out)
tensor([[[[0., 0., 0.],
          [0., 0., 0.],
          [0., 0., 0.]]]])
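
The example above pads on the right and bottom and then recovers the original with inverse. A minimal pure-Python sketch of that behavior for constant padding follows. The helper names pad_to and unpad are hypothetical, and the sketch does not call torch.nn.functional.pad().

```python
def pad_to(image, size, pad_value=0.0):
    """Pad a 2D list on the right and bottom until it reaches size = (height, width)."""
    target_h, target_w = size
    height, width = len(image), len(image[0])
    if target_h < height or target_w < width:
        raise ValueError("target size must not be smaller than the image")
    padded = [list(row) + [pad_value] * (target_w - width) for row in image]
    padded += [[pad_value] * target_w for _ in range(target_h - height)]
    return padded


def unpad(padded, original_size):
    """Undo pad_to by cropping back to the original (height, width)."""
    height, width = original_size
    return [row[:width] for row in padded[:height]]


img = [[0.0, 0.0, 0.0] for _ in range(3)]
out = pad_to(img, (4, 5), pad_value=1.0)
for row in out:
    print(row)
print(unpad(out, (3, 3)) == img)  # True
```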

class kornia.augmentation.RandomAffine(degrees, translate=None, scale=None, shear=None, resample=Resample.BILINEAR.name, same_on_batch=False, align_corners=False, padding_mode=SamplePadding.ZEROS.name, fill_value=None, p=0.5, keepdim=False)[source]¶

Apply a random 2D affine transformation to a torch.Tensor image.

The transformation is computed so that the image center is kept invariant.

Parameters:

degrees (Union[Tensor, float, Tuple[float, float]]) – Range of degrees to select from. If degrees is a number instead of sequence like (min, max), the range of degrees will be (-degrees, +degrees). Set to 0 to deactivate rotations.
translate (Union[Tensor, Tuple[float, float], None], optional) – tuple of maximum absolute fraction for horizontal and vertical translations. For example translate=(a, b), then horizontal shift is randomly sampled in the range -img_width * a < dx < img_width * a and vertical shift is randomly sampled in the range -img_height * b < dy < img_height * b. Will not translate by default. Default: None
scale (Union[Tensor, Tuple[float, float], Tuple[float, float, float, float], None], optional) – scaling factor interval. If (a, b) represents isotropic scaling, the scale is randomly sampled from the range a <= scale <= b. If (a, b, c, d), the scale is randomly sampled from the range a <= scale_x <= b, c <= scale_y <= d. Will keep original scale by default. Default: None
shear (Union[Tensor, float, Tuple[float, float], None], optional) – Range of degrees to select from. If float, a shear parallel to the x axis in the range (-shear, +shear) will be applied. If (a, b), a shear parallel to the x axis in the range (a, b) will be applied. If (a, b, c, d), then x-axis shear in (shear[0], shear[1]) and y-axis shear in (shear[2], shear[3]) will be applied. Will not apply shear by default. Default: None
resample (Union[str, int, Resample], optional) – resample mode from “nearest” (0) or “bilinear” (1). Default: Resample.BILINEAR.name
padding_mode (Union[str, int, SamplePadding], optional) – padding mode from “zeros” (0), “border” (1), “reflection” (2) or “fill” (3). Default: SamplePadding.ZEROS.name
fill_value (Union[float, int, Tensor, None], optional) – the value to be filled in the padding area when padding_mode=”fill”. Can be a float, int, or a torch.Tensor of shape (C) or (1). Default: None
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
align_corners (bool, optional) – interpolation flag. Default: False
p (float, optional) – probability of applying the transformation. Default: 0.5
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False
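
The range conventions described for degrees, translate and scale can be sketched in plain Python. This is an illustrative reading of the parameter descriptions, not kornia's implementation; the helper names `expand_degrees`, `sample_translation` and `sample_scale` are hypothetical.

```python
import random

def expand_degrees(degrees):
    """A single number d becomes the symmetric range (-d, +d)."""
    if isinstance(degrees, (int, float)):
        return (-float(degrees), float(degrees))
    lo, hi = degrees
    return (float(lo), float(hi))

def sample_translation(translate, width, height, rng):
    """translate=(a, b): dx within +/- width * a, dy within +/- height * b."""
    if translate is None:
        return (0.0, 0.0)  # no translation by default
    a, b = translate
    return (rng.uniform(-width * a, width * a),
            rng.uniform(-height * b, height * b))

def sample_scale(scale, rng):
    """(a, b) is isotropic; (a, b, c, d) samples scale_x and scale_y separately."""
    if scale is None:
        return (1.0, 1.0)  # keep the original scale by default
    if len(scale) == 2:
        s = rng.uniform(scale[0], scale[1])
        return (s, s)
    a, b, c, d = scale
    return (rng.uniform(a, b), rng.uniform(c, d))

rng = random.Random(0)
print(expand_degrees(15))  # (-15.0, 15.0)
print(sample_translation((0.1, 0.2), 32, 64, rng))
print(sample_scale((0.8, 1.2, 0.5, 1.5), rng))
```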

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, H, W)\)

Note

This function internally uses kornia.geometry.transform.warp_affine().

Examples

>>> import torch
>>> rng = torch.manual_seed(0)
>>> input = torch.rand(1, 1, 3, 3)
>>> aug = RandomAffine((-15., 20.), p=1.)
>>> out = aug(input)
>>> out, aug.transform_matrix
(tensor([[[[0.3961, 0.7310, 0.1574],
          [0.1781, 0.3074, 0.5648],
          [0.4804, 0.8379, 0.4234]]]]), tensor([[[ 0.9923, -0.1241,  0.1319],
         [ 0.1241,  0.9923, -0.1164],
         [ 0.0000,  0.0000,  1.0000]]]))
>>> aug.inverse(out)
tensor([[[[0.3890, 0.6573, 0.1865],
          [0.2063, 0.3074, 0.5459],
          [0.3892, 0.7896, 0.4224]]]])
>>> input
tensor([[[[0.4963, 0.7682, 0.0885],
          [0.1320, 0.3074, 0.6341],
          [0.4901, 0.8964, 0.4556]]]])
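
The transform_matrix printed above has the structure of a rotation about the image center. The following sketch rebuilds such a matrix in plain Python, assuming the center is ((W - 1) / 2, (H - 1) / 2); that convention is consistent with the printed values but is not stated in the source, and `rotation_about_center` is a hypothetical helper.

```python
import math

def rotation_about_center(angle_rad, width, height):
    """3x3 homogeneous matrix rotating pixel coordinates about the image center."""
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0  # assumed center convention
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [
        [c, -s, cx - c * cx + s * cy],
        [s, c, cy - s * cx - c * cy],
        [0.0, 0.0, 1.0],
    ]

# Angle recovered from the rotation part of the printed transform_matrix.
angle = math.atan2(0.1241, 0.9923)
M = rotation_about_center(angle, width=3, height=3)
for row in M:
    print([round(v, 4) for v in row])
```

Up to rounding, the translation column matches the printed 0.1319 and -0.1164, and the center pixel (1, 1) maps to itself.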

To apply exactly the same augmentation again, you can reuse the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = RandomAffine((-15., 20.), p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomCrop(size, padding=None, pad_if_needed=False, fill=0, padding_mode='constant', resample=Resample.BILINEAR.name, same_on_batch=False, align_corners=True, p=1.0, keepdim=False, cropping_mode='slice')[source]¶

Crop random patches of a given size from a torch.Tensor image.

Parameters:

size (Tuple[int, int]) – Desired output size (out_h, out_w) of the crop. Must be Tuple[int, int], then out_h = size[0], out_w = size[1].
padding (Union[int, Tuple[int, int], Tuple[int, int, int, int], None], optional) – Optional padding on each border of the image. Default is None, i.e no padding. If a sequence of length 4 is provided, it is used to F.pad left, top, right, bottom borders respectively. If a sequence of length 2 is provided, it is used to F.pad left/right, top/bottom borders, respectively. Default: None
pad_if_needed (Optional[bool], optional) – It will F.pad the image if smaller than the desired size to avoid raising an exception. Since cropping is done after padding, the padding seems to be done at a random offset. Default: False
fill (int, optional) – Pixel fill value for constant fill. Default is 0. If a tuple of length 3, it is used to fill R, G, B channels respectively. This value is only used when the padding_mode is constant. Default: 0
padding_mode (str, optional) – Type of padding. Should be: constant, reflect, replicate. Default: "constant"
resample (Union[str, int, Resample], optional) – the interpolation mode. Default: Resample.BILINEAR.name
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
align_corners (bool, optional) – interpolation flag. Default: True
p (float, optional) – probability of applying the transformation for the whole batch. Default: 1.0
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False
cropping_mode (str, optional) – The used algorithm to crop. slice will use advanced slicing to extract the torch.Tensor based on the sampled indices. resample will use warp_affine using the affine transformation to extract and resize at once. Use slice for efficiency, or resample for proper differentiability. Default: "slice"
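
The idea behind the slice cropping mode can be sketched with nested lists: sample a top-left corner that keeps the window inside the image, then extract the window by slicing. This is a minimal illustration, not kornia's code; `crop_slice` and `sample_top_left` are hypothetical helpers.

```python
import random

def crop_slice(img, top, left, out_h, out_w):
    """Extract an out_h x out_w window with plain slicing."""
    return [row[left:left + out_w] for row in img[top:top + out_h]]

def sample_top_left(height, width, out_h, out_w, rng):
    """Pick a top-left corner so that the window stays inside the image."""
    return rng.randint(0, height - out_h), rng.randint(0, width - out_w)

img = [[0., 1., 2.], [3., 4., 5.], [6., 7., 8.]]
top, left = sample_top_left(3, 3, 2, 2, random.Random(0))
print(crop_slice(img, top, left, 2, 2))
# The window at top=1, left=0 is the patch printed in the RandomCrop example.
print(crop_slice(img, 1, 0, 2, 2))  # [[3.0, 4.0], [6.0, 7.0]]
```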

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, out_h, out_w)\)

Examples

>>> import torch
>>> _ = torch.manual_seed(0)
>>> inputs = torch.arange(1*1*3*3.).view(1, 1, 3, 3)
>>> aug = RandomCrop((2, 2), p=1., cropping_mode="resample")
>>> out = aug(inputs)
>>> out
tensor([[[[3., 4.],
          [6., 7.]]]])
>>> aug.inverse(out, padding_mode="replicate")
tensor([[[[3., 4., 4.],
          [3., 4., 4.],
          [6., 7., 7.]]]])

To apply exactly the same augmentation again, you can reuse the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = RandomCrop((2, 2), p=1., cropping_mode="resample")
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomElasticTransform(kernel_size=(63, 63), sigma=(32.0, 32.0), alpha=(1.0, 1.0), align_corners=False, resample=Resample.BILINEAR.name, padding_mode='zeros', same_on_batch=False, p=0.5, keepdim=False)[source]¶

Add random elastic transformation to a torch.Tensor image.

Parameters:

kernel_size (Tuple[int, int], optional) – the size of the Gaussian kernel. Default: (63, 63)
sigma (Tuple[float, float], optional) – The standard deviation of the Gaussian in the y and x directions, respectively. Larger sigma results in smaller pixel displacements. Default: (32.0, 32.0)
alpha (Tuple[float, float], optional) – The scaling factor that controls the intensity of the deformation in the y and x directions, respectively. Default: (1.0, 1.0)
align_corners (bool, optional) – Interpolation flag used by grid_sample. Default: False
resample (Union[str, int, Resample], optional) – Interpolation mode used by grid_sample. Either ‘nearest’ (0) or ‘bilinear’ (1). Default: Resample.BILINEAR.name
padding_mode (str, optional) – The padding used by `grid_sample`. Either ‘zeros’, ‘border’ or ‘reflection’. Default: "zeros"
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
p (float, optional) – probability of applying the transformation. Default: 0.5
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False
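
The statement that a larger sigma yields smaller displacements can be checked with a short calculation, assuming the displacement field is random noise smoothed by a normalized Gaussian kernel and then scaled by alpha. The sketch below works in 1D with hypothetical helpers and is not kornia's implementation.

```python
import math

def gaussian_kernel1d(size, sigma):
    """Normalized 1D Gaussian kernel with `size` taps."""
    half = (size - 1) / 2.0
    weights = [math.exp(-((i - half) ** 2) / (2.0 * sigma ** 2)) for i in range(size)]
    total = sum(weights)
    return [w / total for w in weights]

def noise_gain(size, sigma):
    """Std of i.i.d. unit-variance noise after filtering: sqrt(sum of squared weights)."""
    return math.sqrt(sum(w * w for w in gaussian_kernel1d(size, sigma)))

# The gain shrinks as sigma grows, so the smoothed displacements get smaller.
for sigma in (1.0, 8.0, 32.0):
    print(sigma, round(noise_gain(63, sigma), 3))
```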

Note

This function internally uses kornia.geometry.transform.elastic_transform2d().

Examples

>>> import torch
>>> img = torch.ones(1, 1, 2, 2)
>>> out = RandomElasticTransform()(img)
>>> out.shape
torch.Size([1, 1, 2, 2])

To apply exactly the same augmentation again, you can reuse the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = RandomElasticTransform(p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomErasing(scale=(0.02, 0.33), ratio=(0.3, 3.3), value=0.0, same_on_batch=False, p=0.5, keepdim=False)[source]¶

Erase a random rectangle of a torch.Tensor image with probability p.

The operator removes image parts and fills the selected rectangle with value (zero by default) for each of the images in the batch.

The rectangle will have an area equal to the original image area multiplied by a value uniformly sampled from the range [scale[0], scale[1]) and an aspect ratio sampled from [ratio[0], ratio[1]).

Parameters:

scale (Union[Tensor, Tuple[float, float]], optional) – range of proportion of erased area against input image. Default: (0.02, 0.33)
ratio (Union[Tensor, Tuple[float, float]], optional) – range of aspect ratio of erased area. Default: (0.3, 3.3)
value (float, optional) – the value to fill the erased area. Default: 0.0
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
p (float, optional) – probability that the random erasing operation will be performed. Default: 0.5
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False
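
How an area fraction and an aspect ratio determine the erased rectangle can be sketched as follows. The sketch assumes the aspect ratio means box width divided by box height, which the source does not specify, and `sample_erase_box` is a hypothetical helper rather than kornia's sampler.

```python
import math
import random

def sample_erase_box(height, width, scale, ratio, rng):
    """Sample (box_h, box_w) from an area fraction and an aspect ratio.

    Assumption: aspect ratio = box_w / box_h.
    """
    area = rng.uniform(scale[0], scale[1]) * height * width
    aspect = rng.uniform(ratio[0], ratio[1])
    box_h = math.sqrt(area / aspect)
    box_w = math.sqrt(area * aspect)
    # Clamp to the image so the rectangle always fits.
    return min(box_h, float(height)), min(box_w, float(width))

rng = random.Random(0)
box_h, box_w = sample_erase_box(32, 32, (0.02, 0.33), (0.3, 3.3), rng)
print(round(box_h, 2), round(box_w, 2), round(box_h * box_w / (32 * 32), 3))
```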

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, H, W)\)

Examples

>>> rng = torch.manual_seed(0)
>>> inputs = torch.ones(1, 1, 3, 3)
>>> aug = RandomErasing((.4, .8), (.3, 1/.3), p=0.5)
>>> aug(inputs)
tensor([[[[1., 0., 0.],
          [1., 0., 0.],
          [1., 0., 0.]]]])

To apply exactly the same augmentation again, you can reuse the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = RandomErasing((.4, .8), (.3, 1/.3), p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomFisheye(center_x, center_y, gamma, same_on_batch=False, p=0.5, keepdim=False)[source]¶

Add random camera radial distortion.

Parameters:

center_x (Tensor) – Range to sample from for the x-coordinate of the center, with shape (2,).
center_y (Tensor) – Range to sample from for the y-coordinate of the center, with shape (2,).
gamma (Tensor) – Range to sample from for the gamma values with respect to the optical center, with shape (2,).
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
p (float, optional) – probability of applying the transformation. Default: 0.5
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Examples

>>> import torch
>>> img = torch.ones(1, 1, 2, 2)
>>> center_x = torch.tensor([-.3, .3])
>>> center_y = torch.tensor([-.3, .3])
>>> gamma = torch.tensor([.9, 1.])
>>> out = RandomFisheye(center_x, center_y, gamma)(img)
>>> out.shape
torch.Size([1, 1, 2, 2])

To apply exactly the same augmentation again, you can reuse the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = RandomFisheye(center_x, center_y, gamma, p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomHorizontalFlip(p=0.5, p_batch=1.0, same_on_batch=False, keepdim=False)[source]¶

Apply a random horizontal flip to a torch.Tensor image or a batch of torch.Tensor images.

The flip is applied with a given probability.

Input should be a torch.Tensor of shape (C, H, W) or a batch of tensors \((B, C, H, W)\). If the input is a tuple, the first element is assumed to contain the aforementioned tensors and the second the corresponding transformation matrix that has been applied to them. In this case the module will horizontally flip the tensors and concatenate the corresponding transformation matrix to the previous one. This is especially useful when using this functionality as part of an nn.Sequential module.

Parameters:

p (float, optional) – probability of the image being flipped. Default: 0.5
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, H, W)\)

Note

This function internally uses kornia.geometry.transform.hflip().

Examples

>>> import torch
>>> input = torch.tensor([[[[0., 0., 0.],
...                         [0., 0., 0.],
...                         [0., 1., 1.]]]])
>>> seq = RandomHorizontalFlip(p=1.0)
>>> seq(input), seq.transform_matrix
(tensor([[[[0., 0., 0.],
          [0., 0., 0.],
          [1., 1., 0.]]]]), tensor([[[-1.,  0.,  2.],
         [ 0.,  1.,  0.],
         [ 0.,  0.,  1.]]]))
>>> seq.inverse(seq(input)).equal(input)
True
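
The transform_matrix printed above maps column x to column W - 1 - x. A plain-Python sketch (with hypothetical helper names) rebuilds it for W = 3 and shows that applying it twice returns a pixel to its original position, which is why the inverse recovers the input exactly.

```python
def hflip_matrix(width):
    """Homogeneous matrix mapping column x to column (width - 1 - x)."""
    return [[-1.0, 0.0, float(width - 1)],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]

def apply_to_point(matrix, x, y):
    """Apply a 3x3 affine homography to the pixel coordinate (x, y)."""
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])

M = hflip_matrix(3)
print(M)                        # [[-1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(apply_to_point(M, 0, 2))  # (2.0, 2.0)
# Flipping twice returns every pixel to where it started.
print(apply_to_point(M, *apply_to_point(M, 0, 2)))  # (0.0, 2.0)
```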

To apply exactly the same augmentation again, you can reuse the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> seq = RandomHorizontalFlip(p=1.0)
>>> (seq(input) == seq(input, params=seq._params)).all()
tensor(True)

class kornia.augmentation.RandomPerspective(distortion_scale=0.5, resample=Resample.BILINEAR.name, same_on_batch=False, align_corners=False, p=0.5, keepdim=False, sampling_method='basic')[source]¶

Apply a random perspective transformation to an image torch.Tensor with a given probability.

Parameters:

distortion_scale (Union[Tensor, float], optional) – the degree of distortion, ranged from 0 to 1. Default: 0.5
resample (Union[str, int, Resample], optional) – the interpolation method to use. Default: Resample.BILINEAR.name
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False.
align_corners (bool, optional) – interpolation flag. Default: False
p (float, optional) – probability of the image being perspectively transformed. Default: 0.5
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False
sampling_method (str, optional) – 'basic' | 'area_preserving'. If 'basic', samples by translating the image corners randomly inwards. If 'area_preserving', samples by randomly translating the image corners in any direction. Preserves area on average. See https://arxiv.org/abs/2104.03308 for further details. Default: 'basic'
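
The 'basic' sampling method can be sketched as moving each image corner inwards by a random amount. The sketch assumes the maximum shift is distortion_scale times half the image extent, which is an assumption and not taken from the source; `sample_basic_corners` is a hypothetical helper.

```python
import random

def sample_basic_corners(height, width, distortion_scale, rng):
    """Move each image corner inwards by a random amount ('basic' sampling)."""
    max_dx = distortion_scale * width / 2.0   # assumed maximum horizontal shift
    max_dy = distortion_scale * height / 2.0  # assumed maximum vertical shift
    # Corners in (x, y) order: top-left, top-right, bottom-right, bottom-left.
    start = [(0.0, 0.0), (width - 1.0, 0.0),
             (width - 1.0, height - 1.0), (0.0, height - 1.0)]
    inward = [(1, 1), (-1, 1), (-1, -1), (1, -1)]  # signs pointing into the image
    end = [(x + sx * rng.uniform(0.0, max_dx), y + sy * rng.uniform(0.0, max_dy))
           for (x, y), (sx, sy) in zip(start, inward)]
    return start, end

start, end = sample_basic_corners(32, 32, 0.5, random.Random(0))
for s, e in zip(start, end):
    print(s, "->", (round(e[0], 2), round(e[1], 2)))
```

The resulting pairs of start and end points are what a perspective warp would then be fitted to.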

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, H, W)\)

Note

This function internally uses kornia.geometry.transform.warp_perspective().

Examples

>>> rng = torch.manual_seed(0)
>>> inputs= torch.tensor([[[[1., 0., 0.],
...                         [0., 1., 0.],
...                         [0., 0., 1.]]]])
>>> aug = RandomPerspective(0.5, p=0.5)
>>> out = aug(inputs)
>>> out
tensor([[[[0.0000, 0.2289, 0.0000],
          [0.0000, 0.4800, 0.0000],
          [0.0000, 0.0000, 0.0000]]]])
>>> aug.inverse(out)
tensor([[[[0.0500, 0.0961, 0.0000],
          [0.2011, 0.3144, 0.0000],
          [0.0031, 0.0130, 0.0053]]]])

To apply exactly the same augmentation again, you can reuse the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = RandomPerspective(0.5, p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomResizedCrop(size, scale=(0.08, 1.0), ratio=(3.0 / 4.0, 4.0 / 3.0), resample=Resample.BILINEAR.name, same_on_batch=False, align_corners=True, p=1.0, keepdim=False, cropping_mode='slice')[source]¶

Crop a random patch from an image torch.Tensor and resize it to a given size.

Parameters:

size (Tuple[int, int]) – Desired output size (out_h, out_w) of each edge. Must be Tuple[int, int], then out_h = size[0], out_w = size[1].
scale (Union[Tensor, Tuple[float, float]], optional) – range of the area of the crop, relative to the area of the original image. Default: (0.08, 1.0)
ratio (Union[Tensor, Tuple[float, float]], optional) – range of the aspect ratio of the crop, before resizing. Default: (3.0 / 4.0, 4.0 / 3.0)
resample (Union[str, int, Resample], optional) – the interpolation mode. Default: Resample.BILINEAR.name
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
align_corners (bool, optional) – interpolation flag. Default: True
p (float, optional) – probability of the augmentation being applied. Default: 1.0
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False
cropping_mode (str, optional) – The used algorithm to crop. slice will use advanced slicing to extract the torch.Tensor based on the sampled indices. resample will use warp_affine using the affine transformation to extract and resize at once. Use slice for efficiency, or resample for proper differentiability. Default: "slice"

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, out_h, out_w)\)

Examples

>>> rng = torch.manual_seed(0)
>>> inputs = torch.tensor([[[0., 1., 2.],
...                         [3., 4., 5.],
...                         [6., 7., 8.]]])
>>> aug = RandomResizedCrop(size=(3, 3), scale=(3., 3.), ratio=(2., 2.), p=1., cropping_mode="resample")
>>> out = aug(inputs)
>>> out
tensor([[[[1.0000, 1.5000, 2.0000],
          [4.0000, 4.5000, 5.0000],
          [7.0000, 7.5000, 8.0000]]]])
>>> aug.inverse(out, padding_mode="border")
tensor([[[[1., 1., 2.],
          [4., 4., 5.],
          [7., 7., 8.]]]])
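
The values 1.5, 4.5 and 7.5 in the output above are what linear interpolation with align_corners=True produces when a two-column crop is stretched to three columns. The sketch below reproduces this in 1D with a hypothetical helper; it assumes the sampled crop covered columns 1 to 2 of the input, which is consistent with the printed output.

```python
def resize_row_linear(row, out_len):
    """1D linear interpolation with the align_corners=True sampling rule."""
    in_len = len(row)
    if out_len == 1:
        return [row[0]]
    out = []
    for i in range(out_len):
        pos = i * (in_len - 1) / (out_len - 1)  # endpoints map onto endpoints
        lo = int(pos)
        hi = min(lo + 1, in_len - 1)
        frac = pos - lo
        out.append(row[lo] * (1.0 - frac) + row[hi] * frac)
    return out

# Columns 1..2 of the 3x3 input from the example, stretched back to width 3.
crop = [[1., 2.], [4., 5.], [7., 8.]]
print([resize_row_linear(r, 3) for r in crop])
# [[1.0, 1.5, 2.0], [4.0, 4.5, 5.0], [7.0, 7.5, 8.0]]
```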

To apply exactly the same augmentation again, you can reuse the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = RandomResizedCrop(size=(3, 3), scale=(3., 3.), ratio=(2., 2.), p=1., cropping_mode="resample")
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomRotation90(times, resample=Resample.BILINEAR.name, same_on_batch=False, align_corners=True, p=0.5, keepdim=False)[source]¶

Apply a random 90 * n degree rotation to a torch.Tensor image or a batch of torch.Tensor images.

Parameters:

times (tuple[int, int]) – the range of the number of 90-degree rotations to apply.
resample (Union[str, int, Resample], optional) – the interpolation mode. Default: Resample.BILINEAR.name
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
align_corners (bool, optional) – interpolation flag. Default: True
p (float, optional) – probability of applying the transformation. Default: 0.5
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, H, W)\)

Note

This function internally uses kornia.geometry.transform.affine(). This version is relatively slow as it operates based on affine transformations.
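
For comparison with the affine-based approach mentioned in this note, a 90-degree rotation can also be written as a pure index permutation, which involves no interpolation and therefore no rounding residue. This plain-Python sketch is only an illustration with a hypothetical helper name; it reproduces the values of the times=(1, 1) example in this section.

```python
def rot90_ccw(img):
    """Rotate a 2D nested list by 90 degrees counter-clockwise, without interpolation."""
    return [list(col) for col in zip(*img)][::-1]

img = [[1., 0., 0., 2.],
       [0., 0., 0., 0.],
       [0., 1., 2., 0.],
       [0., 0., 1., 2.]]
for row in rot90_ccw(img):
    print(row)
# [2.0, 0.0, 0.0, 2.0]
# [0.0, 0.0, 2.0, 1.0]
# [0.0, 0.0, 1.0, 0.0]
# [1.0, 0.0, 0.0, 0.0]
```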

Examples

>>> rng = torch.manual_seed(1)
>>> input = torch.tensor([[1., 0., 0., 2.],
...                       [0., 0., 0., 0.],
...                       [0., 1., 2., 0.],
...                       [0., 0., 1., 2.]])
>>> aug = RandomRotation90(times=(1, 1), p=1.)
>>> out = aug(input)
>>> out
tensor([[[[2.0000e+00, 0.0000e+00, 0.0000e+00, 2.0000e+00],
          [0.0000e+00, 0.0000e+00, 2.0000e+00, 1.0000e+00],
          [5.9605e-08, 0.0000e+00, 1.0000e+00, 0.0000e+00],
          [1.0000e+00, 5.9605e-08, 0.0000e+00, 0.0000e+00]]]])
>>> aug.transform_matrix
tensor([[[-4.3711e-08,  1.0000e+00,  1.1921e-07],
         [-1.0000e+00, -4.3711e-08,  3.0000e+00],
         [ 0.0000e+00,  0.0000e+00,  1.0000e+00]]])
>>> inv = aug.inverse(out)

To apply exactly the same augmentation again, you can reuse the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = RandomRotation90(times=(-1, 1), p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomRotation(degrees, resample=Resample.BILINEAR.name, same_on_batch=False, align_corners=True, p=0.5, keepdim=False)[source]¶

Apply a random rotation to a torch.Tensor image or a batch of torch.Tensor images given an amount of degrees.

Parameters:

degrees (Union[Tensor, float, Tuple[float, float], List[float]]) – range of degrees to select from. If degrees is a number the range of degrees to select from will be (-degrees, +degrees).
resample (Union[str, int, Resample], optional) – the interpolation mode. Default: Resample.BILINEAR.name
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
align_corners (bool, optional) – interpolation flag. Default: True
p (float, optional) – probability of applying the transformation. Default: 0.5
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, H, W)\)

Note

This function internally uses kornia.geometry.transform.affine().

Examples

>>> rng = torch.manual_seed(0)
>>> input = torch.tensor([[1., 0., 0., 2.],
...                       [0., 0., 0., 0.],
...                       [0., 1., 2., 0.],
...                       [0., 0., 1., 2.]])
>>> aug = RandomRotation(degrees=45.0, p=1.)
>>> out = aug(input)
>>> out
tensor([[[[0.9824, 0.0088, 0.0000, 1.9649],
          [0.0000, 0.0029, 0.0000, 0.0176],
          [0.0029, 1.0000, 1.9883, 0.0000],
          [0.0000, 0.0088, 1.0117, 1.9649]]]])
>>> aug.transform_matrix
tensor([[[ 1.0000, -0.0059,  0.0088],
         [ 0.0059,  1.0000, -0.0088],
         [ 0.0000,  0.0000,  1.0000]]])
>>> inv = aug.inverse(out)

To apply exactly the same augmentation again, you can reuse the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = RandomRotation(degrees=45.0, p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomShear(shear, resample=Resample.BILINEAR.name, same_on_batch=False, align_corners=False, padding_mode=SamplePadding.ZEROS.name, p=0.5, keepdim=False)[source]¶

Apply a random 2D shear transformation to a torch.Tensor image.

The transformation is computed so that the image center is kept invariant.

Parameters:

shear (Union[Tensor, float, Tuple[float, float], Tuple[float, float, float, float]]) – Range of degrees to select from. If float, a shear parallel to the x axis in the range (-shear, +shear) will be applied. If (a, b), a shear parallel to the x axis in the range (a, b) will be applied. If (a, b, c, d), then x-axis shear in (shear[0], shear[1]) and y-axis shear in (shear[2], shear[3]) will be applied.
resample (Union[str, int, Resample], optional) – resample mode from “nearest” (0) or “bilinear” (1). Default: Resample.BILINEAR.name
padding_mode (Union[str, int, SamplePadding], optional) – padding mode from “zeros” (0), “border” (1) or “reflection” (2). Default: SamplePadding.ZEROS.name
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
align_corners (bool, optional) – interpolation flag. Default: False
p (float, optional) – probability of applying the transformation. Default: 0.5
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False
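
The three accepted forms of the shear argument can be normalized to a single 4-tuple (x_min, x_max, y_min, y_max). The sketch below is an illustrative reading of the parameter description, assuming a 2-tuple gives the x-axis range with no y-axis shear; `expand_shear` is a hypothetical helper, not part of kornia.

```python
def expand_shear(shear):
    """Return (x_min, x_max, y_min, y_max) in degrees."""
    if isinstance(shear, (int, float)):
        return (-float(shear), float(shear), 0.0, 0.0)
    values = tuple(float(v) for v in shear)
    if len(values) == 2:
        return values + (0.0, 0.0)  # assumed: x-axis range only
    if len(values) == 4:
        return values
    raise ValueError("shear must be a number, a 2-tuple or a 4-tuple")

print(expand_shear(10))                  # (-10.0, 10.0, 0.0, 0.0)
print(expand_shear((-15., 20.)))         # (-15.0, 20.0, 0.0, 0.0)
print(expand_shear((-5., 2., 5., 10.)))  # (-5.0, 2.0, 5.0, 10.0)
```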

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\)
Output: \((B, C, H, W)\)

Note

This function internally uses kornia.geometry.transform.warp_affine().

Examples

>>> import torch
>>> rng = torch.manual_seed(0)
>>> input = torch.rand(1, 1, 3, 3)
>>> aug = RandomShear((-5., 2., 5., 10.), p=1.)
>>> out = aug(input)
>>> out, aug.transform_matrix
(tensor([[[[0.4403, 0.7614, 0.1516],
          [0.1753, 0.3074, 0.6127],
          [0.4438, 0.8924, 0.4061]]]]), tensor([[[ 1.0000,  0.0100, -0.0100],
         [-0.1183,  0.9988,  0.1194],
         [ 0.0000,  0.0000,  1.0000]]]))
>>> aug.inverse(out)
tensor([[[[0.4045, 0.7577, 0.1393],
          [0.2071, 0.3074, 0.5582],
          [0.3958, 0.8868, 0.4265]]]])

To apply exactly the same augmentation again, you can reuse the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = RandomShear((-15., 20.), p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomThinPlateSpline(scale=0.2, align_corners=False, padding_mode=SamplePadding.ZEROS.name, same_on_batch=False, p=0.5, keepdim=False)[source]¶

Add random noise to the Thin Plate Spline algorithm.

Parameters:

scale (float, optional) – the scale factor to apply to the destination points. Default: 0.2
align_corners (bool, optional) – Interpolation flag used by grid_sample. Default: False
mode – Interpolation mode used by grid_sample. Either ‘bilinear’ or ‘nearest’.
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
p (float, optional) – probability of applying the transformation. Default: 0.5
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Note

This function internally uses kornia.geometry.transform.warp_image_tps().

Examples

>>> img = torch.ones(1, 1, 2, 2)
>>> out = RandomThinPlateSpline()(img)
>>> out.shape
torch.Size([1, 1, 2, 2])

To apply exactly the same augmentation again, you can reuse the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> aug = RandomThinPlateSpline(p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomVerticalFlip(p=0.5, p_batch=1.0, same_on_batch=False, keepdim=False)[source]¶

Apply a random vertical flip to a torch.Tensor image or a batch of torch.Tensor images with a given probability.

Parameters:

p (float, optional) – probability of the image being flipped. Default: 0.5
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, H, W)\) or \((B, C, H, W)\), Optional: \((B, 3, 3)\)
Output: \((B, C, H, W)\)

Note

This function internally uses kornia.geometry.transform.vflip().

Examples

>>> import torch
>>> input = torch.tensor([[[[0., 0., 0.],
...                         [0., 0., 0.],
...                         [0., 1., 1.]]]])
>>> seq = RandomVerticalFlip(p=1.0)
>>> seq(input), seq.transform_matrix
(tensor([[[[0., 1., 1.],
          [0., 0., 0.],
          [0., 0., 0.]]]]), tensor([[[ 1.,  0.,  0.],
         [ 0., -1.,  2.],
         [ 0.,  0.,  1.]]]))
>>> seq.inverse(seq(input)).equal(input)
True
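
Because each flip is described by a 3x3 matrix, chained augmentations can be summarized by a matrix product. The sketch below, using hypothetical helpers in plain Python, composes a horizontal flip followed by a vertical flip for a 3x3 image; the product is the matrix of a 180-degree rotation about the image center.

```python
def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def vflip_matrix(height):
    """Homogeneous matrix mapping row y to row (height - 1 - y)."""
    return [[1.0, 0.0, 0.0], [0.0, -1.0, float(height - 1)], [0.0, 0.0, 1.0]]

def hflip_matrix(width):
    """Homogeneous matrix mapping column x to column (width - 1 - x)."""
    return [[-1.0, 0.0, float(width - 1)], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Horizontal flip first, then vertical flip: later transforms multiply on the left.
combined = matmul3(vflip_matrix(3), hflip_matrix(3))
print(combined)
```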

To apply exactly the same augmentation again, you can reuse the previous parameter state:

>>> input = torch.randn(1, 3, 32, 32)
>>> seq = RandomVerticalFlip(p=1.0)
>>> (seq(input) == seq(input, params=seq._params)).all()
tensor(True)

Mix¶

class kornia.augmentation.RandomCutMixV2(num_mix=1, cut_size=None, beta=None, same_on_batch=False, p=1.0, keepdim=False, data_keys=None, use_correct_lambda=False)[source]¶

Apply CutMix augmentation to a batch of torch.Tensor images.

Implementation for CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features [YHO+19].

The function returns (inputs, labels), in which the inputs is the torch.Tensor that contains the cutmix images while the labels is a \((\text{num_mix}, B, 3)\) torch.Tensor that contains (label_batch, label_permuted_batch, lambda) for each cutmix.

The implementation is based on the following repository: https://github.com/clovaai/CutMix-PyTorch.

Parameters:

height – the height of the input image.
width – the width of the input image.
p (float, optional) – probability for applying an augmentation to a batch. This param controls the augmentation probabilities batch-wise. Default: 1.0
num_mix (int, optional) – cut mix times. Default: 1
beta (Union[Tensor, float, None], optional) – hyperparameter for generating cut size from beta distribution. Beta cannot be set to 0 after torch 1.8.0. If None, it will be set to 1. Default: None
cut_size (Union[Tensor, Tuple[float, float], None], optional) – controlling the minimum and maximum cut ratio from [0, 1]. If None, it will be set to [0, 1], which means no restriction. Default: None
same_on_batch (bool, optional) – apply the same transformation across the batch. This flag will not maintain permutation order. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False
use_correct_lambda (bool, optional) – if True, compute lambda according to the CutMix paper (lam = 1 - area_ratio). Defaults to False (lam = area_ratio) for backward compatibility, but will raise a deprecation warning when False. Default: False

Inputs:

Input image tensors, shape of \((B, C, H, W)\).
Raw labels, shape of \((B)\).

Returns:

Adjusted image, shape of \((B, C, H, W)\).
Raw labels, permuted labels and lambdas for each mix, shape of \((num_mix, B, 3)\).

Return type:

Tuple[torch.Tensor, torch.Tensor]

Note

This implementation randomly cutmixes images within a batch, so a larger batch size is preferable.

Examples

>>> rng = torch.manual_seed(3)
>>> input = torch.rand(2, 1, 3, 3)
>>> input[0] = torch.ones((1, 3, 3))
>>> label = torch.tensor([0, 1])
>>> cutmix = RandomCutMixV2(data_keys=["input", "class"], use_correct_lambda=True)
>>> cutmix(input, label)
[tensor([[[[0.8879, 0.4510, 1.0000],
          [0.1498, 0.4015, 1.0000],
          [1.0000, 1.0000, 1.0000]]],


        [[[1.0000, 1.0000, 0.7995],
          [1.0000, 1.0000, 0.0542],
          [0.4594, 0.1756, 0.9492]]]]), tensor([[[0.0000, 1.0000, 0.5556],
         [1.0000, 0.0000, 0.5556]]])]
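
The lambda value 0.5556 in the example above follows from the size of the pasted box: a 2x2 box in a 3x3 image covers 4/9 of the area, and with use_correct_lambda=True the weight is 1 minus that ratio. A minimal sketch with a hypothetical helper:

```python
def cutmix_lambda(cut_h, cut_w, height, width, use_correct_lambda):
    """Label-mixing weight derived from the size of the pasted box."""
    area_ratio = (cut_h * cut_w) / float(height * width)
    return 1.0 - area_ratio if use_correct_lambda else area_ratio

# A 2x2 box pasted into a 3x3 image, as in the example above.
print(round(cutmix_lambda(2, 2, 3, 3, use_correct_lambda=True), 4))   # 0.5556
print(round(cutmix_lambda(2, 2, 3, 3, use_correct_lambda=False), 4))  # 0.4444
```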

class kornia.augmentation.RandomJigsaw(grid=(4, 4), data_keys=None, p=0.5, same_on_batch=False, keepdim=False, ensure_perm=True)[source]¶

RandomJigsaw augmentation.

Make Jigsaw puzzles for each image individually. To mix different images in a batch, refer to kornia.augmentation.RandomMosaic.

Parameters:

grid (Tuple[int, int], optional) – the Jigsaw puzzle grid. e.g. (2, 2) means each output will mix image patches in a 2x2 grid. Default: (4, 4)
ensure_perm (bool, optional) – to ensure the nonidentical patch permutation generation against the original one. Default: True
data_keys (Optional[List[Union[str, int, DataKey]]], optional) – the input type sequential for applying augmentations. Accepts “input”, “image”, “mask”, “bbox”, “bbox_xyxy”, “bbox_xywh”, “keypoints”, “class”, “label”. Default: None
p (float, optional) – probability of applying the transformation for the whole batch. Default: 0.5
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Examples

>>> jigsaw = RandomJigsaw((4, 4))
>>> input = torch.randn(8, 3, 256, 256)
>>> out = jigsaw(input)
>>> out.shape
torch.Size([8, 3, 256, 256])
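
The jigsaw operation can be sketched on a nested list: split the image into grid patches, reorder them with a permutation, and reassemble. The second helper illustrates the ensure_perm idea of rejecting the identity permutation. Both helpers are hypothetical and only approximate what the augmentation does.

```python
import random

def jigsaw(img, grid, perm):
    """Reassemble an image from its grid patches, reordered by `perm` (row-major)."""
    rows, cols = grid
    ph, pw = len(img) // rows, len(img[0]) // cols
    patches = [[row[c * pw:(c + 1) * pw] for row in img[r * ph:(r + 1) * ph]]
               for r in range(rows) for c in range(cols)]
    shuffled = [patches[k] for k in perm]
    out = []
    for r in range(rows):
        for y in range(ph):
            line = []
            for c in range(cols):
                line.extend(shuffled[r * cols + c][y])
            out.append(line)
    return out

def random_nonidentity_perm(n, rng):
    """Shuffle until the permutation differs from the identity."""
    identity = list(range(n))
    if n < 2:
        return identity  # nothing to permute
    perm = identity[:]
    while perm == identity:
        rng.shuffle(perm)
    return perm

img = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
print(jigsaw(img, (2, 2), [1, 0, 3, 2]))
# [[2, 3, 0, 1], [6, 7, 4, 5], [10, 11, 8, 9], [14, 15, 12, 13]]
print(jigsaw(img, (2, 2), random_nonidentity_perm(4, random.Random(0))))
```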

class kornia.augmentation.RandomMixUpV2(lambda_val=None, same_on_batch=False, p=1.0, keepdim=False, data_keys=None)[source]¶

Apply MixUp augmentation to a batch of torch.Tensor images.

Implementation for mixup: Beyond Empirical Risk Minimization [ZnYNDLP18].

The function returns (inputs, labels), in which the inputs is the torch.Tensor that contains the mixup images while the labels is a \((B, 3)\) torch.Tensor that contains (label_batch, label_permuted_batch, lambda) for each image.

The implementation builds on the following repository: https://github.com/hongyi-zhang/mixup/blob/master/cifar/utils.py.

The loss and accuracy are computed as:

import torch
import torch.nn.functional as F

def loss_mixup(y, logits):
    # y[:, 0]: original labels, y[:, 1]: permuted labels, y[:, 2]: lambda
    criterion = F.cross_entropy
    loss_a = criterion(logits, y[:, 0].long(), reduction='none')
    loss_b = criterion(logits, y[:, 1].long(), reduction='none')
    return ((1 - y[:, 2]) * loss_a + y[:, 2] * loss_b).mean()

def acc_mixup(y, logits):
    # Per-sample accuracy, weighted by lambda in the same way as the loss.
    pred = torch.argmax(logits, dim=1).to(y.device)
    return (1 - y[:, 2]) * pred.eq(y[:, 0]).float() + y[:, 2] * pred.eq(y[:, 1]).float()

Parameters:

p (float, optional) – probability for applying an augmentation to a batch. This param controls the augmentation probabilities batch-wise. Default: 1.0
lambda_val (Union[Tensor, Tuple[float, float], None], optional) – min-max value of mixup strength. Default is 0-1. Default: None
same_on_batch (bool, optional) – apply the same transformation across the batch. This flag will not maintain permutation order. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Inputs:

Input image tensors, shape of \((B, C, H, W)\).
Label: raw labels, shape of \((B)\).

Returns:

Adjusted image, shape of \((B, C, H, W)\).
Raw labels, permuted labels and lambdas for each mix, shape of \((B, 3)\).

Return type:

Tuple[torch.Tensor, torch.Tensor]

Note

This implementation randomly mixes up images within a batch, so a larger batch size is preferable.

Examples

>>> rng = torch.manual_seed(1)
>>> input = torch.rand(2, 1, 3, 3)
>>> label = torch.tensor([0, 1])
>>> mixup = RandomMixUpV2(data_keys=["input", "class"])
>>> mixup(input, label)
[tensor([[[[0.7576, 0.2793, 0.4031],
          [0.7347, 0.0293, 0.7999],
          [0.3971, 0.7544, 0.5695]]],


        [[[0.4388, 0.6387, 0.5247],
          [0.6826, 0.3051, 0.4635],
          [0.4550, 0.5725, 0.4980]]]]), tensor([[0.0000, 0.0000, 0.1980],
        [1.0000, 1.0000, 0.4162]])]
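
The role of lambda can be sketched for a single sample. The blending rule below assumes lambda weights the permuted sample, mirroring the weighting used in loss_mixup above; it is an illustration with hypothetical helpers, not kornia's code.

```python
def mixup_pair(x, x_perm, lam):
    """Blend two samples element-wise; lam is the weight of the permuted sample."""
    return [(1.0 - lam) * a + lam * b for a, b in zip(x, x_perm)]

def mixup_loss(loss_a, loss_b, lam):
    """Same weighting as loss_mixup, for one sample."""
    return (1.0 - lam) * loss_a + lam * loss_b

print(mixup_pair([1.0, 0.0], [0.0, 1.0], 0.25))  # [0.75, 0.25]
print(mixup_loss(2.0, 4.0, 0.25))                # 2.5
```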

class kornia.augmentation.RandomMosaic(output_size=None, mosaic_grid=(2, 2), start_ratio_range=(0.3, 0.7), min_bbox_size=0.0, data_keys=None, p=0.7, keepdim=False, padding_mode='constant', resample=Resample.BILINEAR.name, align_corners=True, cropping_mode='slice')[source]¶

Mosaic augmentation.

https://raw.githubusercontent.com/kornia/data/main/random_mosaic.png

Given a certain number of images, the mosaic transform combines them into one output image. The output image is composed of parts from each sub-image. To shuffle the parts of each image individually, see kornia.augmentation.RandomJigsaw.

The mosaic transform steps are as follows:

Concatenate the selected images into a super-image.

Crop the output image from the super-image according to the top-left corner and crop size.

Parameters:

output_size (Optional[Tuple[int, int]], optional) – the output torch.Tensor width and height after mosaicing. Default: None
start_ratio_range (Tuple[float, float], optional) – top-left (x, y) position for cropping the mosaic images. Default: (0.3, 0.7)
mosaic_grid (Tuple[int, int], optional) – the number of images and image arrangement. e.g. (2, 2) means each output will mix 4 images in a 2x2 grid. Default: (2, 2)
min_bbox_size (float, optional) – minimum area of bounding boxes. Default: 0.0
data_keys (Optional[List[Union[str, int, DataKey]]], optional) – the input type sequential for applying augmentations. Accepts “input”, “image”, “mask”, “bbox”, “bbox_xyxy”, “bbox_xywh”, “keypoints”, “class”, “label”. Default: None
p (float, optional) – probability of applying the transformation for the whole batch. Default: 0.7
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False
padding_mode (str, optional) – Type of padding. Should be: constant, reflect, replicate. Default: "constant"
resample (Union[str, int, Resample], optional) – the interpolation mode. Default: Resample.BILINEAR.name
align_corners (bool, optional) – interpolation flag. Default: True
cropping_mode (str, optional) – The used algorithm to crop. slice will use advanced slicing to extract the torch.Tensor based on the sampled indices. resample will use warp_affine using the affine transformation to extract and resize at once. Use slice for efficiency, or resample for proper differentiability. Default: "slice"

Examples

>>> mosaic = RandomMosaic((300, 300), data_keys=["input", "bbox_xyxy"])
>>> boxes = torch.tensor([[
...     [70, 5, 150, 100],
...     [60, 180, 175, 220],
... ]]).repeat(8, 1, 1)
>>> input = torch.randn(8, 3, 224, 224)
>>> out = mosaic(input, boxes)
>>> out[0].shape, out[1].shape
(torch.Size([8, 3, 300, 300]), torch.Size([8, 8, 4]))
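
The two mosaic steps described above (concatenate, then crop) can be sketched without any dependencies on nested lists. This is only an illustration: `mosaic_2x2` is a hypothetical helper, the mapping from the start ratio to a pixel offset (ratio times the tile size) is an assumption, and bounding-box handling, padding and resampling are left out.

```python
def mosaic_2x2(tiles, start_ratio, out_h, out_w):
    """Combine four H x W tiles (row-major order) and crop an out_h x out_w window."""
    top_left, top_right, bottom_left, bottom_right = tiles
    h, w = len(top_left), len(top_left[0])
    # Step 1: concatenate the tiles into a (2H x 2W) super-image.
    top = [a + b for a, b in zip(top_left, top_right)]
    bottom = [a + b for a, b in zip(bottom_left, bottom_right)]
    super_image = top + bottom
    # Step 2: crop. Assumption: the top-left corner is ratio * tile size.
    x0 = int(start_ratio[0] * w)
    y0 = int(start_ratio[1] * h)
    return [row[x0:x0 + out_w] for row in super_image[y0:y0 + out_h]]


# Four constant 2 x 2 tiles make it easy to see which tile each pixel came from.
tiles = [[[value] * 2 for _ in range(2)] for value in range(4)]
crop = mosaic_2x2(tiles, start_ratio=(0.5, 0.5), out_h=2, out_w=2)
```

Here the crop window straddles the center of the super-image, so the result contains one pixel from each of the four tiles.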

class kornia.augmentation.RandomTransplantation(excluded_labels=None, p=0.5, p_batch=1.0, data_keys=None)[source]¶

RandomTransplantation augmentation.

Randomly transplant (copy and paste) image features and corresponding segmentation masks between images in a batch. The transplantation transform works as follows:

Based on the parameter p, a certain number of images in the batch are selected as acceptors of a transplantation.

For each acceptor, the image at index \((i - 1) \bmod B\) in the batch is selected as donor, i.e. the indices wrap around at the start of the batch.

From the donor, a random label is selected and the corresponding image features and segmentation mask are transplanted to the acceptor.

The augmentation is described in Semantic segmentation of surgical hyperspectral images under geometric domain shifts [SSSF+23].

Parameters:

excluded_labels (Union[Sequence[int], Tensor, None], optional) – sequence of labels which should not be transplanted from a donor. This can be useful if only parts of the image are annotated and the non-annotated regions (with a specific label index) should be excluded from the augmentation. If no label is left in the donor image, nothing is transplanted. Default: None
p (float, optional) – probability for applying an augmentation to an image. This parameter controls how many images in a batch receive a transplant. Default: 0.5
p_batch (float, optional) – probability for applying an augmentation to a batch. This param controls the augmentation probabilities batch-wise. Default: 1.0
data_keys (Optional[list[str | int | DataKey]], optional) – the input type sequential for applying augmentations. There must be at least one “mask” torch.Tensor. If no data keys are given, the first torch.Tensor is assumed to be DataKey.INPUT and the second torch.Tensor DataKey.MASK. Accepts “input”, “mask”. Default: None

Note

This augmentation requires that segmentation masks are available for all images in the batch and that at least some objects in the image are annotated.
When using this class directly (RandomTransplantation()(…)), it works for arbitrary spatial dimensions including 2D and 3D images. When wrapping in kornia.augmentation.AugmentationSequential, use kornia.augmentation.RandomTransplantation for 2D and kornia.augmentation.RandomTransplantation3D for 3D images.

Inputs:

Segmentation mask torch.Tensor which is used to determine the objects for transplantation: \((B, *)\).
(optional) Additional image or mask tensors where the features are transplanted based on the first segmentation mask: \((B, C, *)\) (DataKey.INPUT) or \((B, *)\) (DataKey.MASK).

Returns:

torch.Tensor:

Augmented mask tensors: \((B, *)\).

list[torch.Tensor]:

Augmented mask tensors: \((B, *)\).
Additional augmented image or mask tensors: \((B, C, *)\) (DataKey.INPUT) or \((B, *)\) (DataKey.MASK).

Return type:

torch.Tensor | list[torch.Tensor]

Examples

>>> import torch
>>> rng = torch.manual_seed(0)
>>> aug = RandomTransplantation(p=1.)
>>> image = torch.randn(2, 3, 5, 5)
>>> mask = torch.randint(0, 3, (2, 5, 5))
>>> mask
tensor([[[0, 0, 1, 1, 0],
         [1, 2, 0, 0, 0],
         [1, 2, 1, 1, 0],
         [0, 0, 0, 0, 2],
         [2, 2, 2, 0, 2]],

        [[2, 0, 0, 2, 1],
         [2, 1, 0, 2, 1],
         [2, 0, 1, 0, 2],
         [2, 2, 2, 0, 2],
         [2, 1, 0, 0, 0]]])
>>> image_out, mask_out = aug(image, mask)
>>> image_out.shape
torch.Size([2, 3, 5, 5])
>>> mask_out.shape
torch.Size([2, 5, 5])
>>> mask_out
tensor([[[2, 0, 1, 2, 0],
         [2, 2, 0, 2, 0],
         [2, 2, 1, 1, 2],
         [2, 2, 2, 0, 2],
         [2, 2, 2, 0, 2]],

        [[0, 0, 0, 2, 0],
         [2, 1, 0, 0, 0],
         [2, 0, 1, 0, 0],
         [0, 0, 0, 0, 2],
         [2, 1, 0, 0, 0]]])
>>> aug._params["selected_labels"]  # Image 0 received label 2 from image 1 and image 1 label 0 from image 0
tensor([2, 0])

You can apply the same augmentation again in which case the same objects get transplanted between the images:

>>> aug._params["selection"]  # The pixels (objects) which get transplanted
tensor([[[ True, False, False,  True, False],
         [ True, False, False,  True, False],
         [ True, False, False, False,  True],
         [ True,  True,  True, False,  True],
         [ True, False, False, False, False]],

        [[ True,  True, False, False,  True],
         [False, False,  True,  True,  True],
         [False, False, False, False,  True],
         [ True,  True,  True,  True, False],
         [False, False, False,  True, False]]])
>>> image2 = torch.zeros(2, 3, 5, 5)
>>> image2[1] = 1
>>> image2[:, 0]
tensor([[[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]],

        [[1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.]]])
>>> image_out2, mask_out2 = aug(image2, mask, params=aug._params)
>>> image_out2[:, 0]
tensor([[[1., 0., 0., 1., 0.],
         [1., 0., 0., 1., 0.],
         [1., 0., 0., 0., 1.],
         [1., 1., 1., 0., 1.],
         [1., 0., 0., 0., 0.]],

        [[0., 0., 1., 1., 0.],
         [1., 1., 0., 0., 0.],
         [1., 1., 1., 1., 0.],
         [0., 0., 0., 0., 1.],
         [1., 1., 1., 0., 1.]]])
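
The transplantation steps listed above can be sketched on plain lists. The helper `transplant` is hypothetical and takes the selected labels as an argument instead of sampling them, so that the result is deterministic. Each acceptor reads from the original (unmodified) donor, which is what the documented outputs suggest.

```python
def transplant(masks, labels):
    """Copy the pixels with label labels[i] from the donor of image i into image i.

    masks: list of B flat label lists of equal length.
    labels: labels[i] is the label taken from the donor of image i.
    """
    batch = len(masks)
    out = []
    for i, acceptor in enumerate(masks):
        donor = masks[(i - 1) % batch]  # donor index wraps around the batch
        out.append([d if d == labels[i] else a for a, d in zip(acceptor, donor)])
    return out


# First rows of the two masks from the example above, with selected labels [2, 0].
masks = [[0, 0, 1, 1, 0], [2, 0, 0, 2, 1]]
result = transplant(masks, labels=[2, 0])
```

The result reproduces the first rows of `mask_out` shown in the example.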

Transforms3D¶

Set of operators to perform data augmentation on 3D volumetric tensors.

Geometric¶

class kornia.augmentation.CenterCrop3D(size, align_corners=True, resample=Resample.BILINEAR.name, p=1.0, keepdim=False)[source]¶

Apply center crop on 3D volumes (5D torch.Tensor).

Parameters:

p (float, optional) – probability of applying the transformation for the whole batch. Default: 1.0
size (Tuple[int, int, int] or int) – Desired output size (out_d, out_h, out_w) of the crop. If integer, out_d = out_h = out_w = size. If Tuple[int, int, int], out_d = size[0], out_h = size[1], out_w = size[2].
resample (Union[str, int, Resample], optional) – resample mode from “nearest” (0) or “bilinear” (1). Default: Resample.BILINEAR.name
align_corners (bool, optional) – interpolation flag. Default: True
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, D, H, W)\) or \((B, C, D, H, W)\), Optional: \((B, 4, 4)\)
Output: \((B, C, out_d, out_h, out_w)\)

Note

The input torch.Tensor must be float and normalized to [0, 1] for the best differentiability support. Additionally, this function accepts another transformation torch.Tensor of shape \((B, 4, 4)\); the applied transformation is then merged into that input transformation torch.Tensor and returned.

Examples

>>> import torch
>>> rng = torch.manual_seed(0)
>>> inputs = torch.randn(1, 1, 2, 4, 6)
>>> inputs
tensor([[[[[-1.1258, -1.1524, -0.2506, -0.4339,  0.8487,  0.6920],
           [-0.3160, -2.1152,  0.3223, -1.2633,  0.3500,  0.3081],
           [ 0.1198,  1.2377,  1.1168, -0.2473, -1.3527, -1.6959],
           [ 0.5667,  0.7935,  0.5988, -1.5551, -0.3414,  1.8530]],

          [[ 0.7502, -0.5855, -0.1734,  0.1835,  1.3894,  1.5863],
           [ 0.9463, -0.8437, -0.6136,  0.0316, -0.4927,  0.2484],
           [ 0.4397,  0.1124,  0.6408,  0.4412, -0.1023,  0.7924],
           [-0.2897,  0.0525,  0.5229,  2.3022, -1.4689, -1.5867]]]]])
>>> aug = CenterCrop3D(2, p=1.)
>>> aug(inputs)
tensor([[[[[ 0.3223, -1.2633],
           [ 1.1168, -0.2473]],

          [[-0.6136,  0.0316],
           [ 0.6408,  0.4412]]]]])

To apply the exact same augmentation again, you can reuse the previous parameter state:

>>> input = torch.rand(1, 3, 32, 32, 32)
>>> aug = CenterCrop3D(24, p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)
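
As a rough illustration of which voxels a center crop keeps, the following dependency-free sketch computes the index range per axis. `center_crop_bounds` is a hypothetical helper, and rounding down when the margin is odd is an assumption. The actual class may also resample rather than slice.

```python
def center_crop_bounds(shape, size):
    """Return (start, stop) index pairs for the depth, height and width axes."""
    if isinstance(size, int):
        size = (size, size, size)
    bounds = []
    for dim, out in zip(shape, size):
        if out > dim:
            raise ValueError("crop size exceeds the volume size")
        start = (dim - out) // 2  # assumption: round down when the margin is odd
        bounds.append((start, start + out))
    return bounds


# Same sizes as the first example above: a (2, 4, 6) volume cropped to 2 x 2 x 2.
bounds = center_crop_bounds((2, 4, 6), 2)
```

For the \((2, 4, 6)\) volume this selects rows 1 to 2 and columns 2 to 3 of each depth slice, which are the values shown in the first example output.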

class kornia.augmentation.RandomAffine3D(degrees, translate=None, scale=None, shears=None, resample=Resample.BILINEAR.name, same_on_batch=False, align_corners=False, p=0.5, keepdim=False)[source]¶

Apply affine transformation to 3D volumes (5D torch.Tensor).

The transformation is computed so that the center is kept invariant.

Parameters:

degrees (Union[Tensor, float, Tuple[float, float], Tuple[float, float, float], Tuple[Tuple[float, float], Tuple[float, float], Tuple[float, float]]]) – Range of yaw (x-axis), pitch (y-axis), roll (z-axis) to select from. If degrees is a number, then yaw, pitch, roll will be generated from the range of (-degrees, +degrees). If degrees is a tuple of (min, max), then yaw, pitch, roll will be generated from the range of (min, max). If degrees is a list of floats [a, b, c], then yaw, pitch, roll will be generated from (-a, a), (-b, b) and (-c, c). If degrees is a list of tuple ((a, b), (m, n), (x, y)), then yaw, pitch, roll will be generated from (a, b), (m, n) and (x, y). Set to 0 to deactivate rotations.
translate (Union[Tensor, Tuple[float, float, float], None], optional) – tuple of maximum absolute fractions for horizontal, vertical and depth translations (dx, dy, dz). For example, with translate=(a, b, c), the horizontal shift is randomly sampled in the range -img_width * a < dx < img_width * a, the vertical shift in the range -img_height * b < dy < img_height * b, and the depth shift in the range -img_depth * c < dz < img_depth * c. Will not translate by default. Default: None
scale (Union[Tensor, Tuple[float, float], Tuple[Tuple[float, float], Tuple[float, float], Tuple[float, float]], None], optional) – scaling factor interval. If (a, b) represents isotropic scaling, the scale is randomly sampled from the range a <= scale <= b. If ((a, b), (c, d), (e, f)), the scale is randomly sampled from the range a <= scale_x <= b, c <= scale_y <= d, e <= scale_z <= f. Will keep original scale by default. Default: None
shears (Union[None, Tensor, float, Tuple[float, float], Tuple[float, float, float, float, float, float], Tuple[Tuple[float, float], Tuple[float, float], Tuple[float, float], Tuple[float, float], Tuple[float, float], Tuple[float, float]]], optional) – Range of degrees to select from. If shear is a number, a shear to the 6 facets in the range (-shear, +shear) will be applied. If shear is a tuple of 2 values, a shear to the 6 facets in the range (shear[0], shear[1]) will be applied. If shear is a tuple of 6 values, a shear to the i-th facet in the range (-shear[i], shear[i]) will be applied. If shear is a tuple of 6 tuples, a shear to the i-th facet in the range (-shear[i, 0], shear[i, 1]) will be applied. Default: None
resample (Union[str, int, Resample], optional) – resample mode from “nearest” (0) or “bilinear” (1). Default: Resample.BILINEAR.name
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
align_corners (bool, optional) – interpolation flag. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False.

Shape:

Input: \((C, D, H, W)\) or \((B, C, D, H, W)\), Optional: \((B, 4, 4)\)
Output: \((B, C, D, H, W)\)

Note

The input torch.Tensor must be float and normalized to [0, 1] for the best differentiability support. Additionally, this function accepts another transformation torch.Tensor of shape \((B, 4, 4)\); the applied transformation is then merged into that input transformation torch.Tensor and returned.

Examples

>>> import torch
>>> rng = torch.manual_seed(0)
>>> input = torch.rand(1, 1, 3, 3, 3)
>>> aug = RandomAffine3D((15., 20., 20.), p=1.)
>>> aug(input), aug.transform_matrix
(tensor([[[[[0.4503, 0.4763, 0.1680],
           [0.2029, 0.4267, 0.3515],
           [0.3195, 0.5436, 0.3706]],

          [[0.5255, 0.3508, 0.4858],
           [0.0795, 0.1689, 0.4220],
           [0.5306, 0.7234, 0.6879]],

          [[0.2971, 0.2746, 0.3471],
           [0.4924, 0.4960, 0.6460],
           [0.3187, 0.4556, 0.7596]]]]]), tensor([[[ 0.9722, -0.0603,  0.2262, -0.1381],
         [ 0.1131,  0.9669, -0.2286,  0.1486],
         [-0.2049,  0.2478,  0.9469,  0.0102],
         [ 0.0000,  0.0000,  0.0000,  1.0000]]]))

To apply the exact same augmentation again, you can reuse the previous parameter state:

>>> input = torch.rand(1, 3, 32, 32, 32)
>>> aug = RandomAffine3D((15., 20., 20.), p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)
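
To illustrate what "the center is kept invariant" means for the returned \(4 \times 4\) matrix, here is a minimal sketch for a single rotation about the z axis. The helper names are hypothetical and the center of a 3 x 3 x 3 volume is assumed to be the index coordinate (1, 1, 1). The full transform additionally combines yaw, pitch, roll, scale, shear and translation.

```python
import math


def rotation_about_center_z(angle_deg, center):
    """4x4 homogeneous rotation about the z axis that leaves `center` fixed."""
    cx, cy, _ = center
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    # The translation column is center - R @ center, so that R @ center + t == center.
    tx = cx - (c * cx - s * cy)
    ty = cy - (s * cx + c * cy)
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_point(matrix, point):
    """Apply the affine part of a 4x4 matrix to an (x, y, z) point."""
    x, y, z = point
    return tuple(row[0] * x + row[1] * y + row[2] * z + row[3] for row in matrix[:3])


matrix = rotation_about_center_z(20.0, (1.0, 1.0, 1.0))
center_after = transform_point(matrix, (1.0, 1.0, 1.0))
corner_after = transform_point(matrix, (0.0, 0.0, 0.0))
```

The non-zero last column is what compensates for rotating about the origin instead of the volume center; the corner moves, the center does not.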

class kornia.augmentation.RandomCrop3D(size, padding=None, pad_if_needed=False, fill=0, padding_mode='constant', resample=Resample.BILINEAR.name, same_on_batch=False, align_corners=True, p=1.0, keepdim=False)[source]¶

Apply random crop on 3D volumes (5D torch.Tensor).

Crops random sub-volumes of a given size.

Parameters:

p (float, optional) – probability of applying the transformation for the whole batch. Default: 1.0
size (Tuple[int, int, int]) – Desired output size (out_d, out_h, out_w) of the crop. Must be Tuple[int, int, int], then out_d = size[0], out_h = size[1], out_w = size[2].
padding (Union[int, Tuple[int, int, int], Tuple[int, int, int, int, int, int], None], optional) – Optional padding on each border of the image. Default is None, i.e. no padding. If a sequence of length 6 is provided, it is used to pad the left, top, right, bottom, front and back borders, respectively. If a sequence of length 3 is provided, it is used to pad the left/right, top/bottom and front/back borders, respectively. Default: None
pad_if_needed (Optional[bool], optional) – Pads the image if it is smaller than the desired size to avoid raising an exception. Since cropping is done after padding, the padding seems to be done at a random offset. Default: False
fill (int, optional) – Pixel fill value for constant fill. Default is 0. If a tuple of length 3, it is used to fill R, G, B channels respectively. This value is only used when the padding_mode is constant. Default: 0
padding_mode (str, optional) – Type of padding. Should be: constant, edge, reflect or symmetric. Default is constant. Default: "constant"
resample (Union[str, int, Resample], optional) – resample mode from “nearest” (0) or “bilinear” (1). Default: Resample.BILINEAR.name
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
align_corners (bool, optional) – interpolation flag. Default: True
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, D, H, W)\) or \((B, C, D, H, W)\), Optional: \((B, 4, 4)\)
Output: \((B, C, out_d, out_h, out_w)\)

Note

The input torch.Tensor must be float and normalized to [0, 1] for the best differentiability support. Additionally, this function accepts another transformation torch.Tensor of shape \((B, 4, 4)\); the applied transformation is then merged into that input transformation torch.Tensor and returned.

Examples

>>> import torch
>>> rng = torch.manual_seed(0)
>>> inputs = torch.randn(1, 1, 3, 3, 3)
>>> aug = RandomCrop3D((2, 2, 2), p=1.)
>>> aug(inputs)
tensor([[[[[-1.1258, -1.1524],
           [-0.4339,  0.8487]],

          [[-1.2633,  0.3500],
           [ 0.1665,  0.8744]]]]])
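
The accepted forms of the padding argument can be summarized with a small dependency-free sketch. Both helpers are hypothetical and only mirror the ordering stated in the parameter description (left, top, right, bottom, front, back); they do not pad any data.

```python
def expand_padding(padding):
    """Expand `padding` to (left, top, right, bottom, front, back)."""
    if padding is None:
        return (0, 0, 0, 0, 0, 0)
    if isinstance(padding, int):
        return (padding,) * 6
    if len(padding) == 3:
        left_right, top_bottom, front_back = padding
        return (left_right, top_bottom, left_right, top_bottom, front_back, front_back)
    if len(padding) == 6:
        return tuple(padding)
    raise ValueError("padding must be None, an int, or a sequence of length 3 or 6")


def padded_shape(shape, padding):
    """Size (depth, height, width) of a volume after padding each border."""
    depth, height, width = shape
    left, top, right, bottom, front, back = expand_padding(padding)
    return (depth + front + back, height + top + bottom, width + left + right)


shape_after = padded_shape((3, 3, 3), (1, 2, 3))
```

A crop of the requested size is then taken from the padded volume, which is why the crop size may exceed the unpadded input size when padding is used.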

To apply the exact same augmentation again, you can reuse the previous parameter state:

>>> input = torch.rand(1, 3, 32, 32, 32)
>>> aug = RandomCrop3D((24, 24, 24), p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)
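
The parameter-replay pattern shown above can be pictured with a toy class. This is a hypothetical stand-in written in plain Python, not the library implementation: it samples a crop start on a normal call, stores it in `_params`, and skips sampling when `params` is passed in.

```python
import random


class ToyRandomCrop1D:
    """Hypothetical stand-in for an augmentation that remembers its parameters."""

    def __init__(self, size, seed=None):
        self.size = size
        self._rng = random.Random(seed)
        self._params = {}

    def __call__(self, values, params=None):
        if params is None:
            # Sample a fresh crop start; it is stored below for later replay.
            params = {"start": self._rng.randint(0, len(values) - self.size)}
        self._params = params
        start = params["start"]
        return values[start:start + self.size]


aug = ToyRandomCrop1D(size=3, seed=0)
data = list(range(10))
first = aug(data)
replayed = aug(data, params=aug._params)
other = aug([v * 10 for v in data], params=aug._params)
```

Replaying the stored parameters on a second input (for example a label volume) applies the identical crop window to it.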

class kornia.augmentation.RandomDepthicalFlip3D(same_on_batch=False, p=0.5, keepdim=False)[source]¶

Apply random flip along the depth axis of 3D volumes (5D tensor).

Input should be a tensor of shape \((C, D, H, W)\) or a batch of tensors \((*, C, D, H, W)\). If the input is a tuple, it is assumed that the first element contains the aforementioned tensors and the second the corresponding transformation matrix that has been applied to them. In this case the module will flip the tensors along the depth axis and concatenate the corresponding transformation matrix to the previous one. This is especially useful when using this functionality as part of an nn.Sequential module.

Parameters:

p (float, optional) – probability of the image being flipped. Default: 0.5
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, D, H, W)\) or \((B, C, D, H, W)\), Optional: \((B, 4, 4)\)
Output: \((B, C, D, H, W)\)

Note

The input tensor must be float and normalized to [0, 1] for the best differentiability support. Additionally, this function accepts another transformation tensor of shape \((B, 4, 4)\); the applied transformation is then merged into that input transformation tensor and returned.

Examples

>>> import torch
>>> x = torch.eye(3).repeat(3, 1, 1)
>>> seq = RandomDepthicalFlip3D(p=1.0)
>>> seq(x), seq.transform_matrix
(tensor([[[[[1., 0., 0.],
           [0., 1., 0.],
           [0., 0., 1.]],

          [[1., 0., 0.],
           [0., 1., 0.],
           [0., 0., 1.]],

          [[1., 0., 0.],
           [0., 1., 0.],
           [0., 0., 1.]]]]]), tensor([[[ 1.,  0.,  0.,  0.],
         [ 0.,  1.,  0.,  0.],
         [ 0.,  0., -1.,  2.],
         [ 0.,  0.,  0.,  1.]]]))

To apply the exact same augmentation again, you can reuse the previous parameter state:

>>> input = torch.rand(1, 3, 32, 32, 32)
>>> aug = RandomDepthicalFlip3D(p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)
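
The transform_matrix printed in the first example can be read as the coordinate map \(z' = (D - 1) - z\). The sketch below builds such a matrix in plain Python; `flip_matrix` is a hypothetical helper, and the axis naming (x = width, y = height, z = depth) is an assumption inferred from which row carries the -1 in the documented flip matrices.

```python
def flip_matrix(axis, size):
    """4x4 homogeneous matrix mirroring one axis of a volume with `size` voxels."""
    index = {"x": 0, "y": 1, "z": 2}[axis]
    matrix = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
    matrix[index][index] = -1.0
    matrix[index][3] = float(size - 1)  # maps coordinate v to (size - 1) - v
    return matrix


# Depth flip for D = 3, as in the example above.
depth_flip = flip_matrix("z", 3)
```

For D = 3 the offset is 2, so slice 0 swaps with slice 2 and slice 1 stays in place.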

class kornia.augmentation.RandomHorizontalFlip3D(same_on_batch=False, p=0.5, keepdim=False)[source]¶

Apply random horizontal flip to 3D volumes (5D tensor).

Parameters:

p (float, optional) – probability of the image being flipped. Default: 0.5
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, D, H, W)\) or \((B, C, D, H, W)\), Optional: \((B, 4, 4)\)
Output: \((B, C, D, H, W)\)

Examples

>>> import torch
>>> x = torch.eye(3).repeat(3, 1, 1)
>>> seq = RandomHorizontalFlip3D(p=1.0)
>>> seq(x), seq.transform_matrix
(tensor([[[[[0., 0., 1.],
           [0., 1., 0.],
           [1., 0., 0.]],

          [[0., 0., 1.],
           [0., 1., 0.],
           [1., 0., 0.]],

          [[0., 0., 1.],
           [0., 1., 0.],
           [1., 0., 0.]]]]]), tensor([[[-1.,  0.,  0.,  2.],
         [ 0.,  1.,  0.,  0.],
         [ 0.,  0.,  1.,  0.],
         [ 0.,  0.,  0.,  1.]]]))

To apply the exact same augmentation again, you can reuse the previous parameter state:

>>> input = torch.rand(1, 3, 32, 32, 32)
>>> aug = RandomHorizontalFlip3D(p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)
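
On the data itself, a horizontal flip simply reverses the width axis. The following sketch reproduces the first example above with nested lists instead of tensors; `horizontal_flip` is a hypothetical helper.

```python
def horizontal_flip(volume):
    """Reverse the last (width) axis of a D x H x W nested list."""
    return [[row[::-1] for row in plane] for plane in volume]


eye = [[1.0 if r == c else 0.0 for c in range(3)] for r in range(3)]
volume = [eye, eye, eye]  # same content as torch.eye(3).repeat(3, 1, 1)
flipped = horizontal_flip(volume)
```

Each identity slice becomes an anti-diagonal matrix, and flipping a second time restores the original volume.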

class kornia.augmentation.RandomRotation3D(degrees, resample=Resample.BILINEAR.name, same_on_batch=False, align_corners=False, p=0.5, keepdim=False)[source]¶

Apply random rotations to 3D volumes (5D torch.Tensor).

Input should be a torch.Tensor of shape \((C, D, H, W)\) or a batch of tensors \((B, C, D, H, W)\). If the input is a tuple, it is assumed that the first element contains the aforementioned tensors and the second the corresponding transformation matrix that has been applied to them. In this case the module will rotate the tensors and concatenate the corresponding transformation matrix to the previous one. This is especially useful when using this functionality as part of an nn.Sequential module.

Parameters:

degrees (Union[Tensor, float, Tuple[float, float, float], Tuple[Tuple[float, float], Tuple[float, float], Tuple[float, float]]]) – Range of degrees to select from. If degrees is a number, then yaw, pitch, roll will be generated from the range of (-degrees, +degrees). If degrees is a tuple of (min, max), then yaw, pitch, roll will be generated from the range of (min, max). If degrees is a list of floats [a, b, c], then yaw, pitch, roll will be generated from (-a, a), (-b, b) and (-c, c). If degrees is a list of tuple ((a, b), (m, n), (x, y)), then yaw, pitch, roll will be generated from (a, b), (m, n) and (x, y). Set to 0 to deactivate rotations.
resample (Union[str, int, Resample], optional) – resample mode from “nearest” (0) or “bilinear” (1). Default: Resample.BILINEAR.name
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
align_corners (bool, optional) – interpolation flag. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, D, H, W)\) or \((B, C, D, H, W)\), Optional: \((B, 4, 4)\)
Output: \((B, C, D, H, W)\)

Note

The input torch.Tensor must be float and normalized to [0, 1] for the best differentiability support. Additionally, this function accepts another transformation torch.Tensor of shape \((B, 4, 4)\); the applied transformation is then merged into that input transformation torch.Tensor and returned.

Examples

>>> import torch
>>> rng = torch.manual_seed(0)
>>> input = torch.rand(1, 1, 3, 3, 3)
>>> aug = RandomRotation3D((15., 20., 20.), p=1.0)
>>> aug(input), aug.transform_matrix
(tensor([[[[[0.3819, 0.4886, 0.2111],
           [0.1196, 0.3833, 0.4722],
           [0.3432, 0.5951, 0.4223]],

          [[0.5553, 0.4374, 0.2780],
           [0.2423, 0.1689, 0.4009],
           [0.4516, 0.6376, 0.7327]],

          [[0.1605, 0.3112, 0.3673],
           [0.4931, 0.4620, 0.5700],
           [0.3505, 0.4685, 0.8092]]]]]), tensor([[[ 0.9722,  0.1131, -0.2049,  0.1196],
         [-0.0603,  0.9669,  0.2478, -0.1545],
         [ 0.2262, -0.2286,  0.9469,  0.0556],
         [ 0.0000,  0.0000,  0.0000,  1.0000]]]))

To apply the exact same augmentation again, you can reuse the previous parameter state:

>>> input = torch.rand(1, 3, 32, 32, 32)
>>> aug = RandomRotation3D((15., 20., 20.), p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)
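
The different forms accepted by the degrees argument all reduce to one (min, max) range each for yaw, pitch and roll. The sketch below spells out the rules from the parameter description in plain Python. `degree_ranges` is a hypothetical helper and is not claimed to match the library's internal validation.

```python
def degree_ranges(degrees):
    """Expand `degrees` into [(min, max)] ranges for yaw, pitch and roll."""

    def is_number(value):
        return isinstance(value, (int, float))

    if is_number(degrees):
        # A single number d means (-d, +d) for all three angles.
        return [(-float(degrees), float(degrees))] * 3
    values = list(degrees)
    if len(values) == 2 and all(is_number(v) for v in values):
        # (min, max) is shared by all three angles.
        return [(float(values[0]), float(values[1]))] * 3
    if len(values) == 3 and all(is_number(v) for v in values):
        # [a, b, c] means (-a, a), (-b, b), (-c, c).
        return [(-float(v), float(v)) for v in values]
    if len(values) == 3:
        # ((a, b), (m, n), (x, y)) is used as given.
        return [(float(lo), float(hi)) for lo, hi in values]
    raise ValueError("unsupported degrees specification")


ranges = degree_ranges((15.0, 20.0, 20.0))
```

For the value used in the example above, yaw is drawn from (-15, 15) and pitch and roll from (-20, 20).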

class kornia.augmentation.RandomVerticalFlip3D(same_on_batch=False, p=0.5, keepdim=False)[source]¶

Apply random vertical flip to 3D volumes (5D tensor).

Input should be a tensor of shape \((C, D, H, W)\) or a batch of tensors \((*, C, D, H, W)\). If the input is a tuple, it is assumed that the first element contains the aforementioned tensors and the second the corresponding transformation matrix that has been applied to them. In this case the module will vertically flip the tensors and concatenate the corresponding transformation matrix to the previous one. This is especially useful when using this functionality as part of an nn.Sequential module.

Parameters:

p (float, optional) – probability of the image being flipped. Default: 0.5
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, D, H, W)\) or \((B, C, D, H, W)\), Optional: \((B, 4, 4)\)
Output: \((B, C, D, H, W)\)

Examples

>>> import torch
>>> x = torch.eye(3).repeat(3, 1, 1)
>>> seq = RandomVerticalFlip3D(p=1.0)
>>> seq(x), seq.transform_matrix
(tensor([[[[[0., 0., 1.],
           [0., 1., 0.],
           [1., 0., 0.]],

          [[0., 0., 1.],
           [0., 1., 0.],
           [1., 0., 0.]],

          [[0., 0., 1.],
           [0., 1., 0.],
           [1., 0., 0.]]]]]), tensor([[[ 1.,  0.,  0.,  0.],
         [ 0., -1.,  0.,  2.],
         [ 0.,  0.,  1.,  0.],
         [ 0.,  0.,  0.,  1.]]]))

To apply the exact same augmentation again, you can reuse the previous parameter state:

>>> input = torch.rand(1, 3, 32, 32, 32)
>>> aug = RandomVerticalFlip3D(p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)
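
When a previous transformation matrix is passed along with the input, the flip has to be combined with it. The sketch below assumes that "concatenate" means composing the two \(4 \times 4\) matrices by multiplication, with the new flip applied after the previous transform. The helper `matmul4` and the example matrices are illustrative only.

```python
def matmul4(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)] for r in range(4)]


# Vertical flip for H = 3, as printed in the example above: y' = 2 - y.
vertical_flip = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
# A previous transform that shifted y by one voxel.
previous = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
combined = matmul4(vertical_flip, previous)  # shift first, then flip
twice = matmul4(vertical_flip, vertical_flip)
```

The combined matrix maps y to 1 - y, and applying the flip twice yields the identity, as expected for a mirror operation.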

Intensity¶

class kornia.augmentation.RandomEqualize3D(p=0.5, same_on_batch=False, keepdim=False)[source]¶

Apply random equalization to 3D volumes (5D tensor).

Parameters:

p (float, optional) – probability of the image being equalized. Default: 0.5
same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, D, H, W)\) or \((B, C, D, H, W)\), Optional: \((B, 4, 4)\)
Output: \((B, C, D, H, W)\)

Note

The input tensor must be float and normalized to [0, 1] for the best differentiability support. Additionally, this function accepts another transformation tensor of shape \((B, 4, 4)\); the applied transformation is then merged into that input transformation tensor and returned.

Examples

>>> import torch
>>> rng = torch.manual_seed(0)
>>> input = torch.rand(1, 1, 3, 3, 3)
>>> aug = RandomEqualize3D(p=1.0)
>>> aug(input)
tensor([[[[[0.4963, 0.7682, 0.0885],
           [0.1320, 0.3074, 0.6341],
           [0.4901, 0.8964, 0.4556]],

          [[0.6323, 0.3489, 0.4017],
           [0.0223, 0.1689, 0.2939],
           [0.5185, 0.6977, 0.8000]],

          [[0.1610, 0.2823, 0.6816],
           [0.9152, 0.3971, 0.8742],
           [0.4194, 0.5529, 0.9527]]]]])

To apply the exact same augmentation again, you can reuse the previous parameter state:

>>> input = torch.rand(1, 3, 32, 32, 32)
>>> aug = RandomEqualize3D(p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)

class kornia.augmentation.RandomMotionBlur3D(kernel_size, angle, direction, border_type=BorderType.CONSTANT.name, resample=Resample.NEAREST.name, same_on_batch=False, p=0.5, keepdim=False)[source]¶

Apply random motion blur on 3D volumes (5D torch.Tensor).

Parameters:

p (float, optional) – probability of applying the transformation. Default: 0.5
kernel_size (Union[int, Tuple[int, int]]) – motion kernel size (odd and positive). If int, the kernel will have a fixed size. If Tuple[int, int], the size is randomly generated from the given range, batch-wise.
angle (Union[Tensor, float, Tuple[float, float, float], Tuple[Tuple[float, float], Tuple[float, float], Tuple[float, float]]]) – Range of degrees to select from. If angle is a number, then yaw, pitch, roll will be generated from the range of (-angle, +angle). If angle is a tuple of (min, max), then yaw, pitch, roll will be generated from the range of (min, max). If angle is a list of floats [a, b, c], then yaw, pitch, roll will be generated from (-a, a), (-b, b) and (-c, c). If angle is a list of tuple ((a, b), (m, n), (x, y)), then yaw, pitch, roll will be generated from (a, b), (m, n) and (x, y). Set to 0 to deactivate rotations.
direction (Union[Tensor, float, Tuple[float, float]]) – forward/backward direction of the motion blur. Lower values towards -1.0 will point the motion blur towards the back (with angle provided via angle), while higher values towards 1.0 will point the motion blur forward. A value of 0.0 leads to a uniform (but still angled) motion blur. If float, the value is generated from (-direction, direction). If Tuple[float, float], the value is randomly generated from the given range.
border_type (Union[int, str, BorderType], optional) – the padding mode to be applied before convolving. CONSTANT = 0, REFLECT = 1, REPLICATE = 2, CIRCULAR = 3. Default: BorderType.CONSTANT.
resample (Union[str, int, Resample], optional) – resample mode from “nearest” (0) or “bilinear” (1). Default: Resample.NEAREST.name
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Shape:

Input: \((C, D, H, W)\) or \((B, C, D, H, W)\), Optional: \((B, 4, 4)\)
Output: \((B, C, D, H, W)\)

Note

The input torch.Tensor must be float and normalized to [0, 1] for the best differentiability support. Additionally, this function accepts another transformation torch.Tensor of shape \((B, 4, 4)\); the applied transformation is then merged into that input transformation torch.Tensor and returned.

Examples

>>> import torch
>>> rng = torch.manual_seed(0)
>>> input = torch.rand(1, 1, 3, 5, 5)
>>> motion_blur = RandomMotionBlur3D(3, 35., 0.5, p=1.)
>>> motion_blur(input)
tensor([[[[[0.1654, 0.4772, 0.2004, 0.3566, 0.2613],
           [0.4557, 0.3131, 0.4809, 0.2574, 0.2696],
           [0.2721, 0.5998, 0.3956, 0.5363, 0.1541],
           [0.3006, 0.4773, 0.6395, 0.2856, 0.3989],
           [0.4491, 0.5595, 0.1836, 0.3811, 0.1398]],

          [[0.1843, 0.4240, 0.3370, 0.1231, 0.2186],
           [0.4047, 0.3332, 0.1901, 0.5329, 0.3023],
           [0.3070, 0.3088, 0.4807, 0.4928, 0.2590],
           [0.2416, 0.4614, 0.7091, 0.5237, 0.1433],
           [0.1582, 0.4577, 0.2749, 0.1369, 0.1607]],

          [[0.2733, 0.4040, 0.4396, 0.2284, 0.3319],
           [0.3856, 0.6730, 0.4624, 0.3878, 0.3076],
           [0.4307, 0.4217, 0.2977, 0.5086, 0.5406],
           [0.3686, 0.2778, 0.5228, 0.7592, 0.6455],
           [0.2033, 0.3014, 0.4898, 0.6164, 0.3117]]]]])

To apply the exact same augmentation again, you can reuse the previous parameter state:

>>> input = torch.rand(1, 3, 32, 32, 32)
>>> aug = RandomMotionBlur3D(3, 35., 0.5, p=1.)
>>> (aug(input) == aug(input, params=aug._params)).all()
tensor(True)
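
How the kernel_size and direction arguments turn into concrete values can be sketched as follows. `sample_motion_params` is a hypothetical helper using only the standard library; treating both ends of the kernel size range as inclusive is an assumption, and the angle sampling is omitted.

```python
import random


def sample_motion_params(kernel_size, direction, rng):
    """Draw one (kernel size, direction) pair from the configured ranges."""
    if isinstance(kernel_size, int):
        low, high = kernel_size, kernel_size
    else:
        low, high = kernel_size
    # Kernel sizes must be odd and positive; assumption: the range is inclusive.
    candidates = [k for k in range(low, high + 1) if k > 0 and k % 2 == 1]
    if not candidates:
        raise ValueError("kernel_size must allow an odd, positive value")
    if isinstance(direction, (int, float)):
        d_low, d_high = -abs(direction), abs(direction)
    else:
        d_low, d_high = direction
    return rng.choice(candidates), rng.uniform(d_low, d_high)


rng = random.Random(0)
samples = [sample_motion_params((3, 7), 0.5, rng) for _ in range(50)]
fixed_size, _ = sample_motion_params(3, 0.5, rng)
```

With an int kernel size, as in the example above, only the direction (and the angle) vary between calls.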

Mix¶

class kornia.augmentation.RandomTransplantation3D(excluded_labels=None, p=0.5, p_batch=1.0, data_keys=None)[source]¶

RandomTransplantation3D augmentation.

3D version of the kornia.augmentation.RandomTransplantation augmentation intended to be used with kornia.augmentation.AugmentationSequential. The interface is identical to the 2D version.

Normalizations¶

Normalization operations are shape-agnostic for both 2D and 3D tensors.

class kornia.augmentation.Denormalize(mean, std, p=1.0, keepdim=False)[source]¶

Denormalize tensor images with mean and standard deviation.

\[\text{input[channel] = (input[channel] * std[channel]) + mean[channel]}\]

Where mean is \((M_1, ..., M_n)\) and std is \((S_1, ..., S_n)\) for n channels.

Parameters:

mean (Union[Tensor, Tuple[float], List[float], float]) – Mean for each channel.
std (Union[Tensor, Tuple[float], List[float], float]) – Standard deviations for each channel.
same_on_batch – apply the same transformation across the batch.
p (float, optional) – probability of applying the transformation. Default: 1.0
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Returns:

Denormalised tensor with the same size as the input \((*, C, H, W)\).

Note

This function internally uses kornia.enhance.denormalize().

Examples

>>> denorm = Denormalize(mean=torch.zeros(1, 4), std=torch.ones(1, 4))
>>> x = torch.rand(1, 4, 3, 3)
>>> out = denorm(x)
>>> out.shape
torch.Size([1, 4, 3, 3])

class kornia.augmentation.Normalize(mean, std, p=1.0, keepdim=False)[source]¶

Normalize tensor images with mean and standard deviation.

\[\text{input[channel] = (input[channel] - mean[channel]) / std[channel]}\]

Where mean is \((M_1, ..., M_n)\) and std is \((S_1, ..., S_n)\) for n channels.

Parameters:

mean (Tensor | tuple[float, ...] | list[float] | float) – Mean for each channel.
std (Tensor | tuple[float, ...] | list[float] | float) – Standard deviations for each channel.
p (float, optional) – probability of applying the transformation. Default: 1.0
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

Returns:

Normalised tensor with the same size as the input \((*, C, H, W)\).

Note

This function internally uses kornia.enhance.normalize().

Examples

>>> norm = Normalize(mean=torch.zeros(4), std=torch.ones(4))
>>> x = torch.rand(1, 4, 3, 3)
>>> out = norm(x)
>>> out.shape
torch.Size([1, 4, 3, 3])
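The two formulas above are inverses of each other when the same mean and std are used. The following is a minimal plain-Python sketch of that arithmetic, written without torch so each step is visible. The helper names `normalize` and `denormalize` and the sample numbers are illustrative assumptions; this is not how kornia implements the operators.

```python
# Plain-Python sketch of the per-channel formulas (hypothetical helpers,
# not kornia's implementation).

def normalize(channels, mean, std):
    """Apply (x - mean[c]) / std[c] to every value of channel c."""
    return [[(x - m) / s for x in values]
            for values, m, s in zip(channels, mean, std)]


def denormalize(channels, mean, std):
    """Apply x * std[c] + mean[c] to every value of channel c."""
    return [[x * s + m for x in values]
            for values, m, s in zip(channels, mean, std)]


# Two channels with three pixel values each (made-up numbers).
image = [[0.0, 0.5, 1.0], [0.2, 0.4, 0.6]]
mean = [0.5, 0.4]
std = [0.5, 0.2]

normalized = normalize(image, mean, std)
restored = denormalize(normalized, mean, std)

# Denormalizing with the same statistics recovers the input up to
# floating-point rounding.
round_trip_ok = all(
    abs(a - b) < 1e-9
    for row_a, row_b in zip(image, restored)
    for a, b in zip(row_a, row_b)
)
print(round_trip_ok)
```

With the operators documented here, the same round trip would presumably be obtained by passing the output of Normalize to a Denormalize built from identical mean and std values.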

Image Resize¶

class kornia.augmentation.LongestMaxSize(max_size, resample=Resample.BILINEAR.name, align_corners=True, p=1.0)[source]¶

Rescale an image so that its longest side is equal to max_size, keeping the aspect ratio of the initial image.

Parameters:

max_size (int) – maximum size of the image after the transformation.

class kornia.augmentation.Resize(size, side='short', resample=Resample.BILINEAR.name, align_corners=True, antialias=False, p=1.0, keepdim=False)[source]¶

Resize to size.

Parameters:

size (Union[int, Tuple[int, int]]) – Size (h, w) in pixels of the resized region or just one side.
side (str, optional) – Which side to resize, if size is only of type int. Default: "short"
resample (Union[str, int, Resample], optional) – Resampling mode. Default: Resample.BILINEAR.name
align_corners (bool, optional) – interpolation flag. Default: True
antialias (bool, optional) – if True, the image is filtered with a Gaussian before downscaling. No effect for upscaling. Default: False
p (float, optional) – probability of the augmentation being applied. Default: 1.0
keepdim (bool, optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). Default: False

class kornia.augmentation.SmallestMaxSize(max_size, resample=Resample.BILINEAR.name, align_corners=True, p=1.0)[source]¶

Rescale an image so that its shortest side is equal to max_size, keeping the aspect ratio of the initial image.

Parameters:

max_size (int) – length of the shortest side of the image after the transformation.
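To make the difference between LongestMaxSize and SmallestMaxSize concrete, here is a small sketch of the aspect-ratio-preserving size arithmetic. The helper `target_size` and its `side` argument are hypothetical names introduced for illustration, and rounding to the nearest integer is an assumption; the library may round differently.

```python
# Sketch of aspect-ratio-preserving resizing arithmetic.
# `target_size` is a hypothetical helper, not part of kornia.

def target_size(height, width, max_size, side="long"):
    """Return (new_height, new_width) so the chosen side equals max_size.

    side="long" mirrors the LongestMaxSize description,
    side="short" mirrors the SmallestMaxSize description.
    """
    if side not in ("long", "short"):
        raise ValueError("side must be 'long' or 'short'")
    reference = max(height, width) if side == "long" else min(height, width)
    scale = max_size / reference
    # Assumption: round to the nearest integer pixel count.
    return round(height * scale), round(width * scale)


# A 480x640 image whose longest side is scaled to 256 pixels.
longest = target_size(480, 640, 256, side="long")

# The same image whose shortest side is scaled to 240 pixels.
shortest = target_size(480, 640, 240, side="short")

print(longest, shortest)
```

In both cases a single scale factor is applied to height and width, which is what keeps the aspect ratio unchanged.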