Augmentation Containers

The classes in this section are containers for augmenting different data formats (e.g. images, videos).

Augmentation Sequential

Kornia augmentations provide a simple on-device augmentation framework with support for various syntax sugars (e.g. returning the transformation matrix, inverting a geometric transform). On top of that, we provide an advanced augmentation container to ease the pain of building augmentation pipelines. This API also provides predefined routines for automating the processing of masks, bounding boxes, and keypoints.

class kornia.augmentation.container.AugmentationSequential(*args, data_keys=[DataKey.INPUT], same_on_batch=None, return_transform=None, keepdim=None, random_apply=False)[source]

AugmentationSequential for handling multiple input types like inputs, masks, keypoints at once.

Example outputs from the Kornia data augmentation tutorial:
https://kornia-tutorials.readthedocs.io/en/latest/_images/data_augmentation_sequential_5_1.png
https://kornia-tutorials.readthedocs.io/en/latest/_images/data_augmentation_sequential_7_0.png
Parameters
  • *args – a list of kornia augmentation modules.

  • data_keys (List[Union[str, int, DataKey]], optional) – the input type sequential for applying augmentations. Accepts “input”, “mask”, “bbox”, “bbox_xyxy”, “bbox_xywh”, “keypoints”. Default: [DataKey.INPUT]

  • same_on_batch (Optional[bool], optional) – apply the same transformation across the batch. If None, it will not overwrite the function-wise settings. Default: None

  • return_transform (Optional[bool], optional) – if True, return the matrix describing the transformation applied to each input. If None, it will not overwrite the function-wise settings. Default: None

  • keepdim (Optional[bool], optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). If None, it will not overwrite the function-wise settings. Default: None

  • random_apply (Union[int, bool, Tuple[int, int]], optional) – randomly select a sublist (order agnostic) of args to apply transformation. If int, a fixed number of transformations will be selected. If (a,), x number of transformations (a <= x <= len(args)) will be selected. If (a, b), x number of transformations (a <= x <= b) will be selected. If True, the whole list of args will be processed as a sequence in a random order. If False, the whole list of args will be processed as a sequence in original order. Default: False
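To illustrate the random_apply options just described, here is a minimal sketch (module choices and tensor sizes are illustrative, not part of the API):

# Minimal sketch of the random_apply options (illustrative modules and sizes).
import torch
import kornia
from kornia.augmentation.container import AugmentationSequential

augs = [
    kornia.augmentation.ColorJitter(0.1, 0.1, 0.1, 0.1, p=1.0),
    kornia.augmentation.RandomAffine(360, p=1.0),
    kornia.augmentation.RandomHorizontalFlip(p=1.0),
]

# Exactly 2 of the 3 augmentations, picked at random on every call.
aug_fixed = AugmentationSequential(*augs, random_apply=2)

# Between 1 and 3 augmentations, picked at random on every call.
aug_range = AugmentationSequential(*augs, random_apply=(1, 3))

# All augmentations, applied in a random order.
aug_shuffled = AugmentationSequential(*augs, random_apply=True)

x = torch.randn(2, 3, 5, 6)
out = aug_fixed(x)  # data_keys defaults to [DataKey.INPUT]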

Note

Mix augmentations (e.g. RandomMixUp, RandomCutMix) can only be applied to the “input” data key. It is not clear how to handle the corresponding conversions of masks, bounding boxes and keypoints.

Note

See the Kornia tutorials at https://kornia-tutorials.readthedocs.io for a working example.

Examples

>>> import torch
>>> import kornia
>>> from kornia.augmentation.container import AugmentationSequential
>>> input = torch.randn(2, 3, 5, 6)
>>> bbox = torch.tensor([[
...     [1., 1.],
...     [2., 1.],
...     [2., 2.],
...     [1., 2.],
... ]]).expand(2, -1, -1)
>>> points = torch.tensor([[[1., 1.]]]).expand(2, -1, -1)
>>> aug_list = AugmentationSequential(
...     kornia.augmentation.ColorJitter(0.1, 0.1, 0.1, 0.1, p=1.0),
...     kornia.augmentation.RandomAffine(360, p=1.0),
...     data_keys=["input", "mask", "bbox", "keypoints"],
...     return_transform=False,
...     same_on_batch=False,
...     random_apply=10,
... )
>>> out = aug_list(input, input, bbox, points)
>>> [o.shape for o in out]
[torch.Size([2, 3, 5, 6]), torch.Size([2, 3, 5, 6]), torch.Size([2, 4, 2]), torch.Size([2, 1, 2])]
>>> out_inv = aug_list.inverse(*out)
>>> [o.shape for o in out_inv]
[torch.Size([2, 3, 5, 6]), torch.Size([2, 3, 5, 6]), torch.Size([2, 4, 2]), torch.Size([2, 1, 2])]

This example demonstrates the integration of VideoSequential and AugmentationSequential.

Examples

>>> import torch
>>> import kornia
>>> from kornia.augmentation.container import AugmentationSequential, VideoSequential
>>> input = torch.randn(2, 3, 5, 6)[None]
>>> bbox = torch.tensor([[
...     [1., 1.],
...     [2., 1.],
...     [2., 2.],
...     [1., 2.],
... ]]).expand(2, -1, -1)[None]
>>> points = torch.tensor([[[1., 1.]]]).expand(2, -1, -1)[None]
>>> aug_list = AugmentationSequential(
...     VideoSequential(
...         kornia.augmentation.ColorJitter(0.1, 0.1, 0.1, 0.1, p=1.0),
...         kornia.augmentation.RandomAffine(360, p=1.0),
...     ),
...     data_keys=["input", "mask", "bbox", "keypoints"]
... )
>>> out = aug_list(input, input, bbox, points)
>>> [o.shape for o in out]
[torch.Size([1, 2, 3, 5, 6]), torch.Size([1, 2, 3, 5, 6]), torch.Size([1, 2, 4, 2]), torch.Size([1, 2, 1, 2])]
forward(*args, label=None, params=None, data_keys=None)[source]

Compute multiple tensors simultaneously according to self.data_keys.

Return type

Union[Tensor, Tuple[Tensor, Tensor], Tuple[Union[Tensor, Tuple[Tensor, Tensor]], Optional[Tensor]], List[Union[Tensor, Tuple[Tensor, Tensor]]], Tuple[List[Union[Tensor, Tuple[Tensor, Tensor]]], Optional[Tensor]]]
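As a sketch of the call-time arguments shown in this signature, data_keys can be overridden per call and previously sampled parameters can be replayed through params (assuming the recorded parameters are exposed as _params, as in the ImageSequential example below):

# Sketch: overriding data_keys at call time and replaying sampled parameters.
import torch
import kornia
from kornia.augmentation.container import AugmentationSequential

aug_list = AugmentationSequential(
    kornia.augmentation.RandomAffine(360, p=1.0),
    data_keys=["input"],
)

img = torch.randn(2, 3, 5, 6)
bbox = torch.tensor([[[1., 1.], [2., 1.], [2., 2.], [1., 2.]]]).expand(2, -1, -1)

# Override the constructor-level data_keys for this call only.
out_img, out_bbox = aug_list(img, bbox, data_keys=["input", "bbox"])

# Replay the same sampled parameters on the same inputs.
out_img2, out_bbox2 = aug_list(img, bbox, data_keys=["input", "bbox"], params=aug_list._params)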

inverse(*args, params=None, data_keys=None)[source]

Reverse the transformation applied.

The number of input tensors must align with the number of data_keys. If data_keys is not set, self.data_keys will be used by default.

Return type

Union[Tensor, List[Tensor]]
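A sketch of inverting only a subset of the outputs by passing data_keys explicitly (this relies only on the signature documented above):

# Sketch: invert only the bounding boxes produced by a previous forward pass.
import torch
import kornia
from kornia.augmentation.container import AugmentationSequential

aug_list = AugmentationSequential(
    kornia.augmentation.RandomAffine(360, p=1.0),
    data_keys=["input", "bbox"],
)

img = torch.randn(2, 3, 5, 6)
bbox = torch.tensor([[[1., 1.], [2., 1.], [2., 2.], [1., 2.]]]).expand(2, -1, -1)

out_img, out_bbox = aug_list(img, bbox)

# The number of tensors must match the number of data_keys passed here.
bbox_restored = aug_list.inverse(out_bbox, data_keys=["bbox"])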

ImageSequential

Kornia augmentations provide a simple on-device augmentation framework with support for various syntax sugars (e.g. returning the transformation matrix, inverting a geometric transform). Additionally, ImageSequential supports mixing image processing and augmentation modules in the same pipeline.

class kornia.augmentation.container.ImageSequential(*args, same_on_batch=None, return_transform=None, keepdim=None, random_apply=False, if_unsupported_ops='raise')[source]

Sequential for creating kornia image processing pipeline.

Parameters
  • *args – a list of kornia augmentation and image operation modules.

  • same_on_batch (Optional[bool], optional) – apply the same transformation across the batch. If None, it will not overwrite the function-wise settings. Default: None

  • return_transform (Optional[bool], optional) – if True, return the matrix describing the transformation applied to each input. If None, it will not overwrite the function-wise settings. Default: None

  • keepdim (Optional[bool], optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). If None, it will not overwrite the function-wise settings. Default: None

  • random_apply (Union[int, bool, Tuple[int, int]], optional) – randomly select a sublist (order agnostic) of args to apply transformation. If int, a fixed number of transformations will be selected. If (a,), x number of transformations (a <= x <= len(args)) will be selected. If (a, b), x number of transformations (a <= x <= b) will be selected. If True, the whole list of args will be processed as a sequence in a random order. If False, the whole list of args will be processed as a sequence in original order. Default: False

Note

Transformation matrix returned only considers the transformation applied in kornia.augmentation module. Those transformations in kornia.geometry will not be taken into account.

Examples

>>> import torch
>>> import kornia
>>> from kornia.augmentation.container import ImageSequential
>>> _ = torch.manual_seed(77)
>>> input, label = torch.randn(2, 3, 5, 6), torch.tensor([0, 1])
>>> aug_list = ImageSequential(
...     kornia.color.BgrToRgb(),
...     kornia.augmentation.ColorJitter(0.1, 0.1, 0.1, 0.1, p=1.0),
...     kornia.filters.MedianBlur((3, 3)),
...     kornia.augmentation.RandomAffine(360, p=1.0),
...     kornia.enhance.Invert(),
...     kornia.augmentation.RandomMixUp(p=1.0),
...     return_transform=True,
...     same_on_batch=True,
...     random_apply=10,
... )
>>> out, lab = aug_list(input, label=label)
>>> lab
tensor([[0.0000, 1.0000, 0.1214],
        [1.0000, 0.0000, 0.1214]])
>>> out[0].shape, out[1].shape
(torch.Size([2, 3, 5, 6]), torch.Size([2, 3, 3]))

Reproduce with provided params.

>>> out2, lab2 = aug_list(input, label=label, params=aug_list._params)
>>> torch.equal(out[0], out2[0]), torch.equal(out[1], out2[1]), torch.equal(lab[1], lab2[1])
(True, True, True)

forward(input, label=None, params=None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Return type

Union[Tensor, Tuple[Tensor, Tensor], Tuple[Union[Tensor, Tuple[Tensor, Tensor]], Tensor]]

PatchSequential

class kornia.augmentation.container.PatchSequential(*args, grid_size=(4, 4), padding='same', same_on_batch=None, keepdim=None, patchwise_apply=True, random_apply=False)[source]

Container for performing patch-level image data augmentation.

Example output from the Kornia patch sequential tutorial:
https://kornia-tutorials.readthedocs.io/en/latest/_images/data_patch_sequential_7_0.png

PatchSequential breaks input images into patches by a given grid size, and reassembles them back afterwards.

Different image processing and augmentation methods will be performed on each patch region as in [LYFC21].

Parameters
  • *args – a list of processing modules.

  • grid_size (Tuple[int, int], optional) – controls the grid board separation. Default: (4, 4)

  • padding (str, optional) – same or valid padding. With same padding, the input is padded to include all pixels if the tensor is not divisible by grid_size. With valid padding, the redundant border is removed. Default: 'same'

  • same_on_batch (Optional[bool], optional) – apply the same transformation across the batch. If None, it will not overwrite the function-wise settings. Default: None

  • keepdim (Optional[bool], optional) – whether to keep the output shape the same as input (True) or broadcast it to the batch form (False). If None, it will not overwrite the function-wise settings. Default: None

  • patchwise_apply (bool, optional) – whether to apply the image processing args patch-wise (see the sketch after this parameter list). If True, the number of args must equal the number of grid cells. If False, the image processing args will be applied as a sequence to all patches. Default: True

  • random_apply (Union[int, bool, Tuple[int, int]], optional) – randomly select a sublist (order agnostic) of args to apply transformation. If int (batchwise mode only), a fixed number of transformations will be selected. If (a,) (batchwise mode only), x number of transformations (a <= x <= len(args)) will be selected. If (a, b) (batchwise mode only), x number of transformations (a <= x <= b) will be selected. If True, the whole list of args will be processed in a random order. If False and not patchwise_apply, the whole list of args will be processed in original order. If False and patchwise_apply, the whole list of args will be processed in original order location-wisely. Default: False
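A minimal sketch of the patchwise_apply behaviour (tensor sizes and module choices here are illustrative):

# Sketch: patchwise_apply=False applies one shared sequence to every patch,
# so the number of args does not need to match the number of grid cells.
import torch
import kornia.augmentation as K
from kornia.augmentation.container import PatchSequential

input = torch.randn(2, 3, 224, 224)

seq = PatchSequential(
    K.ColorJitter(0.1, 0.1, 0.1, 0.1, p=0.5),
    K.RandomAffine(360, p=0.5),
    grid_size=(2, 2),
    patchwise_apply=False,
    padding="same",  # pad to cover all pixels when H or W is not divisible by grid_size
)

out = seq(input)
assert out.shape == input.shape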

Note

Transformation matrix returned only considers the transformation applied in kornia.augmentation module. Those transformations in kornia.geometry will not be taken into account.

Note

See the Kornia tutorials at https://kornia-tutorials.readthedocs.io for a working example.

Examples

>>> import torch
>>> import kornia.augmentation as K
>>> from kornia.augmentation.container import PatchSequential, ImageSequential
>>> input = torch.randn(2, 3, 224, 224)
>>> seq = PatchSequential(
...     ImageSequential(
...         K.ColorJitter(0.1, 0.1, 0.1, 0.1, p=0.5),
...         K.RandomPerspective(0.2, p=0.5),
...         K.RandomSolarize(0.1, 0.1, p=0.5),
...     ),
...     K.RandomAffine(360, p=1.0),
...     ImageSequential(
...         K.ColorJitter(0.1, 0.1, 0.1, 0.1, p=0.5),
...         K.RandomPerspective(0.2, p=0.5),
...         K.RandomSolarize(0.1, 0.1, p=0.5),
...     ),
...     K.RandomSolarize(0.1, 0.1, p=0.1),
...     grid_size=(2,2),
...     patchwise_apply=True,
...     same_on_batch=True,
...     random_apply=False,
... )
>>> out = seq(input)
>>> out.shape
torch.Size([2, 3, 224, 224])
>>> out1 = seq(input, params=seq._params)
>>> torch.equal(out, out1)
True
forward(input, label=None, params=None)[source]

Input transformation will be returned if input is a tuple.

Return type

Union[Tensor, Tuple[Tensor, Tensor], Tuple[Union[Tensor, Tuple[Tensor, Tensor]], Tensor]]

Video Data Augmentation

Video data is a special case of 3D volumetric data that contains both spatial and temporal information, and can be referred to as 2.5D rather than true 3D. In most applications, augmenting video data requires keeping the temporal dimension static, so that the same augmentations are performed on each frame. VideoSequential can be used for this purpose in the same way as nn.Sequential. Currently, VideoSequential supports the data formats \((B, C, T, H, W)\) and \((B, T, C, H, W)\).

import kornia.augmentation as K

transform = K.VideoSequential(
   K.RandomAffine(360),
   K.ColorJitter(0.2, 0.3, 0.2, 0.3),
   data_format="BCTHW",
   same_on_frame=True
)
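For instance, the transform above can be applied to a batch of clips in the (B, C, T, H, W) layout (tensor sizes here are illustrative):

import torch

clips = torch.randn(2, 3, 4, 32, 32)  # 2 clips, 3 channels, 4 frames, 32x32
out = transform(clips)
# With same_on_frame=True the same sampled parameters are reused on every
# frame, and the output keeps the input shape.
assert out.shape == clips.shape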
class kornia.augmentation.container.VideoSequential(*args, data_format='BTCHW', same_on_frame=True, random_apply=False)[source]

VideoSequential for processing 5-dim video data like (B, T, C, H, W) and (B, C, T, H, W).

VideoSequential is used in place of nn.Sequential for processing video data augmentations. By default, VideoSequential enables same_on_frame to make sure the same augmentations are applied across the temporal dimension. It will not affect other augmentation behaviours, such as the same_on_batch setting.

Parameters
  • *args – a list of augmentation modules.

  • data_format (str, optional) – only BCTHW and BTCHW are supported. Default: 'BTCHW'

  • same_on_frame (bool, optional) – apply the same transformation across the channel per frame. Default: True

  • random_apply (Union[int, bool, Tuple[int, int]], optional) – randomly select a sublist (order agnostic) of args to apply transformation. If int, a fixed number of transformations will be selected. If (a,), x number of transformations (a <= x <= len(args)) will be selected. If (a, b), x number of transformations (a <= x <= b) will be selected. If True, the whole list of args will be processed as a sequence in a random order. If False, the whole list of args will be processed as a sequence in original order. Default: False

Note

Transformation matrix returned only considers the transformation applied in kornia.augmentation module. Those transformations in kornia.geometry will not be taken into account.

Example

If same_on_frame is set to True, we expect the same augmentation to be applied to each frame:

>>> import torch
>>> import kornia
>>> from kornia.augmentation.container import VideoSequential
>>> input, label = torch.randn(2, 3, 1, 5, 6).repeat(1, 1, 4, 1, 1), torch.tensor([0, 1])
>>> aug_list = VideoSequential(
...     kornia.augmentation.ColorJitter(0.1, 0.1, 0.1, 0.1, p=1.0),
...     kornia.color.BgrToRgb(),
...     kornia.augmentation.RandomAffine(360, p=1.0),
...     random_apply=10,
...     data_format="BCTHW",
...     same_on_frame=True)
>>> output = aug_list(input)
>>> (output[0, :, 0] == output[0, :, 1]).all()
tensor(True)
>>> (output[0, :, 1] == output[0, :, 2]).all()
tensor(True)
>>> (output[0, :, 2] == output[0, :, 3]).all()
tensor(True)

If same_on_frame is set to False:

>>> aug_list = VideoSequential(
...     kornia.augmentation.ColorJitter(0.1, 0.1, 0.1, 0.1, p=1.0),
...     kornia.augmentation.RandomAffine(360, p=1.0),
...     kornia.augmentation.RandomMixUp(p=1.0),
...     data_format="BCTHW",
...     same_on_frame=False)
>>> output, lab = aug_list(input)
>>> output.shape, lab.shape
(torch.Size([2, 3, 4, 5, 6]), torch.Size([2, 4, 3]))
>>> (output[0, :, 0] == output[0, :, 1]).all()
tensor(False)

Reproduce with provided params.

>>> out2, lab2 = aug_list(input, label, params=aug_list._params)
>>> torch.equal(output, out2)
True

forward(input, label=None, params=None)[source]

Define the video computation performed.

Return type

Union[Tensor, Tuple[Tensor, Tensor], Tuple[Union[Tensor, Tuple[Tensor, Tensor]], Tensor]]