Base Classes#

These base classes let you build new transforms on top of the predefined routine of kornia.augmentation. Any given augmentation can be classified as either rigid (e.g. affine transformations that manipulate images with standard transformation matrices) or non-rigid (e.g. cutting out a random area). At the image level, Kornia provides two rigid base classes, GeometricAugmentationBase2D, which modifies the geometric location of image pixels, and IntensityAugmentationBase2D, which preserves pixel locations, as well as the generic AugmentationBase2D, which allows more freedom for customized augmentation design.

The Predefined Augmentation Routine#

Every Kornia augmentation follows the same simple sample-apply routine.

  • sample: Kornia aims at flexible tensor-level augmentations that can apply different augmentations and probabilities to each image in a tensor.

    The sampling operation first samples a set of random parameters. The sampled augmentation state (parameters) is then stored in _params of the augmentation, so users can reproduce the same augmentation results.

  • apply: With generated or passed parameters, the augmentation will be performed accordingly.

    Apart from performing image tensor operations, Kornia also supports inverse operations that revert the transform operations, as well as other data modalities (datakeys in Kornia) such as masks, keypoints, and bounding boxes. Such features are better supported with AugmentationSequential. Notably, the augmentation pipeline for rigid operations is already implemented and requires no further effort. For non-rigid operations, the user may implement customized inverse and data modality operations, e.g. apply_transform_mask for applying transformations to mask tensors.

Custom Augmentation Classes#

For rigid transformations, IntensityAugmentationBase2D and GeometricAugmentationBase2D share exactly the same logic apart from the transformation matrix computation. Namely, an intensity augmentation always yields identity transformation matrices, because it does not change the geometric location of any pixel.

For a rigid geometric operation, compute_transformation and apply_transform need to be implemented, as well as compute_inverse_transformation and inverse_transform to compute its inverse.

class kornia.augmentation.GeometricAugmentationBase2D(p=0.5, p_batch=1.0, same_on_batch=False, keepdim=False)#

GeometricAugmentationBase2D base class for customized geometric augmentation implementations.

Parameters:
  • p (float, optional) – probability for applying an augmentation. This param controls the augmentation probabilities element-wise for a batch. Default: 0.5

  • p_batch (float, optional) – probability for applying an augmentation to a batch. This param controls the augmentation probabilities batch-wise. Default: 1.0

  • same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False

  • keepdim (bool, optional) – whether to keep the output shape the same as the input (True) or broadcast it to the batch form (False). Default: False

compute_transformation(input, params, flags)#
Return type:

Tensor

apply_transform(input, params, flags, transform=None)#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Return type:

Tensor

compute_inverse_transformation(transform)#

Compute the inverse transform of given transformation matrices.

Return type:

Tensor

inverse_transform(input, flags, transform=None, size=None)#

By default, the same transformation as in apply_transform is used.

Return type:

Tensor

For IntensityAugmentationBase2D, the user only needs to override apply_transform.

class kornia.augmentation.IntensityAugmentationBase2D(p=0.5, p_batch=1.0, same_on_batch=False, keepdim=False)#

IntensityAugmentationBase2D base class for customized intensity augmentation implementations.

Parameters:
  • p (float, optional) – probability for applying an augmentation. This param controls the augmentation probabilities element-wise for a batch. Default: 0.5

  • p_batch (float, optional) – probability for applying an augmentation to a batch. This param controls the augmentation probabilities batch-wise. Default: 1.0

  • same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False

  • keepdim (bool, optional) – whether to keep the output shape the same as the input (True) or broadcast it to the batch form (False). Default: False

apply_transform(input, params, flags, transform=None)#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Return type:

Tensor

The following snippet is a minimal example of a custom rigid geometric augmentation:

from typing import Any, Dict, Optional

import torch
import kornia as K
from torch import Tensor

from kornia.augmentation import GeometricAugmentationBase2D
from kornia.augmentation import random_generator as rg


class MyRandomTransform(GeometricAugmentationBase2D):

    def __init__(
        self,
        factor=(0., 1.),
        same_on_batch: bool = False,
        p: float = 1.0,
        keepdim: bool = False,
    ) -> None:
        super().__init__(p=p, same_on_batch=same_on_batch, keepdim=keepdim)
        # sample one "factor" per batch element from the given range
        self._param_generator = rg.PlainUniformGenerator((factor, "factor", None, None))

    def compute_transformation(self, input: Tensor, params: Dict[str, Tensor], flags: Dict[str, Any]) -> Tensor:
        # a simple identity transformation example: the factor is forced to 1
        factor = params["factor"].to(input) * 0. + 1
        # reshape to (B, 1, 1) so that it broadcasts over the (B, 3, 3) matrices
        return K.eye_like(3, input) * factor.view(-1, 1, 1)

    def apply_transform(
        self, input: Tensor, params: Dict[str, Tensor], flags: Dict[str, Any], transform: Optional[Tensor] = None
    ) -> Tensor:
        # reshape to (B, 1, 1, 1) so that it broadcasts over the (B, C, H, W) input
        factor = params["factor"].to(input).view(-1, 1, 1, 1)
        return input * factor

For non-rigid augmentations, the user may implement the apply_transform* and apply_non_transform* APIs as needed. Specifically, apply_transform* handles the elements of a tensor that are to be transformed, while apply_non_transform* handles the elements that are skipped by the augmentation. For example, a crop operation may change the size of only some elements of the batch, so the remaining elements need to be resized for the whole batch to remain a single tensor of uniform size.
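
The split between the two hooks can be sketched in plain PyTorch as below. This is an illustration of the idea only, not Kornia's implementation; the dispatch helper and the to_apply mask are hypothetical names introduced for this example.

```python
import torch


def dispatch(batch, to_apply, apply_transform, apply_non_transform):
    """Illustrative only: route each batch element to one of the two hooks."""
    # Start from the "skipped" branch so that every element has an output.
    output = apply_non_transform(batch).clone()
    if to_apply.any():
        # Overwrite the selected elements with their transformed version.
        output[to_apply] = apply_transform(batch[to_apply])
    return output


batch = torch.ones(4, 1, 2, 2)
to_apply = torch.tensor([True, False, True, False])

out = dispatch(
    batch,
    to_apply,
    apply_transform=lambda x: x * 2.0,  # transformed elements are doubled
    apply_non_transform=lambda x: x,    # skipped elements pass through
)
```

Both hooks must return elements of the same shape for the merge to work, which is why a size-changing operation such as a crop needs a matching resize in its apply_non_transform.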

class kornia.augmentation.AugmentationBase2D(p=0.5, p_batch=1.0, same_on_batch=False, keepdim=False)#

AugmentationBase2D base class for customized augmentation implementations.

AugmentationBase2D aims at offering a generic base class for a greater level of customization. If the subclass contains routine matrix-based transformations, RigidAffineAugmentationBase2D might be a better fit.

Parameters:
  • p (float, optional) – probability for applying an augmentation. This param controls the augmentation probabilities element-wise for a batch. Default: 0.5

  • p_batch (float, optional) – probability for applying an augmentation to a batch. This param controls the augmentation probabilities batch-wise. Default: 1.0

  • same_on_batch (bool, optional) – apply the same transformation across the batch. Default: False

  • keepdim (bool, optional) – whether to keep the output shape the same as the input (True) or broadcast it to the batch form (False). Default: False

apply_transform(input, params, flags, transform=None)#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Return type:

Tensor

apply_non_transform(input, params, flags, transform=None)#
Return type:

Tensor

apply_transform_mask(input, params, flags, transform=None)#

Process masks corresponding to the inputs that are transformed.

Return type:

Tensor

apply_non_transform_mask(input, params, flags, transform=None)#

Process masks corresponding to the inputs to which no transformation is applied.

Return type:

Tensor

apply_transform_box(input, params, flags, transform=None)#

Process boxes corresponding to the inputs that are transformed.

Return type:

Boxes

apply_non_transform_box(input, params, flags, transform=None)#

Process boxes corresponding to the inputs to which no transformation is applied.

Return type:

Boxes

apply_transform_keypoint(input, params, flags, transform=None)#

Process keypoints corresponding to the inputs that are transformed.

Return type:

Keypoints

apply_non_transform_keypoint(input, params, flags, transform=None)#

Process keypoints corresponding to the inputs to which no transformation is applied.

Return type:

Keypoints

apply_transform_class(input, params, flags, transform=None)#

Process class tags corresponding to the inputs that are transformed.

Return type:

Tensor

apply_non_transform_class(input, params, flags, transform=None)#

Process class tags corresponding to the inputs to which no transformation is applied.

Return type:

Tensor

Similar logic applies to 3D augmentations as well.

Some Further Notes#

Probabilities#

Kornia supports two types of randomness: element-level randomness p and batch-level randomness p_batch, as in _BasicAugmentationBase. Under the hood, operations like crop and resize are implemented with a fixed element-level probability of p=1, so that only batch-level randomness remains.
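
How the two probabilities combine can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than Kornia's code: the sample_apply_flags helper is a hypothetical name, and it assumes that the batch-level draw is made first and that same_on_batch shares a single element-level draw across the batch.

```python
import random


def sample_apply_flags(batch_size, p, p_batch, same_on_batch=False, rng=random):
    """Illustrative only: decide which batch elements get augmented."""
    # Batch-level draw: with probability 1 - p_batch, skip the whole batch.
    if rng.random() >= p_batch:
        return [False] * batch_size
    # Element-level draw: one shared decision, or one decision per element.
    if same_on_batch:
        return [rng.random() < p] * batch_size
    return [rng.random() < p for _ in range(batch_size)]


always = sample_apply_flags(4, p=1.0, p_batch=1.0)   # every element augmented
never = sample_apply_flags(4, p=1.0, p_batch=0.0)    # whole batch skipped
shared = sample_apply_flags(4, p=0.5, p_batch=1.0, same_on_batch=True)
```

With p fixed to 1, as described above for crop and resize, the outcome depends on the batch-level draw alone.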

Random Generators#

To automatically generate the corresponding __repr__ with all customized parameters, you may need to implement _param_generator by inheriting from RandomGeneratorBase to generate the random parameters, and put all static parameters inside self.flags. You may take advantage of PlainUniformGenerator to generate simple uniform parameters with less boilerplate code.

Random Reproducibility#

Plain augmentation base class without the functionality of transformation matrix calculations. By default, random computations happen on the CPU with torch.get_default_dtype(). To change this behaviour, use set_rng_device_and_dtype.