# kornia.utils

## Image

tensor_to_image(tensor, keepdim=False)

Converts a PyTorch tensor image to a numpy image.

If the tensor is on the GPU, it is copied back to the CPU.

Parameters
• tensor (Tensor) – image of the form $$(H, W)$$, $$(C, H, W)$$ or $$(B, C, H, W)$$.

• keepdim (bool, optional) – If False, squeezes the input image to match the shape $$(H, W, C)$$ or $$(H, W)$$. Default: False

Return type

np.ndarray

Returns

image of the form $$(H, W)$$, $$(H, W, C)$$ or $$(B, H, W, C)$$.
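The shape change from $$(C, H, W)$$ to $$(H, W, C)$$ can be illustrated with a small dependency-free sketch. Nested Python lists stand in for the tensor and the array here, and `chw_to_hwc` is a hypothetical helper written for illustration, not part of kornia.

```python
def chw_to_hwc(image):
    """Reorder a channels-first nested list (C, H, W) into channels-last (H, W, C)."""
    channels = len(image)
    height = len(image[0])
    width = len(image[0][0])
    return [
        [[image[c][h][w] for c in range(channels)] for w in range(width)]
        for h in range(height)
    ]


# A 2-channel image of height 2 and width 3.
# Channel 0 holds 0..5, channel 1 holds 10..15.
chw = [
    [[0, 1, 2], [3, 4, 5]],
    [[10, 11, 12], [13, 14, 15]],
]
hwc = chw_to_hwc(chw)

# Each pixel now carries its channel values together.
print(hwc[0][0])  # [0, 10]
print(hwc[1][2])  # [5, 15]
```

With real tensors the same reordering is an axis permutation followed by a conversion to numpy; the sketch only shows where each value ends up.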

image_to_tensor(image, keepdim=True)

Convert a numpy image to a PyTorch 4d tensor image.

Parameters
• image (np.ndarray) – image of the form $$(H, W, C)$$, $$(H, W)$$ or $$(B, H, W, C)$$.

• keepdim (bool, optional) – If False, unsqueezes the input image to match the shape $$(B, C, H, W)$$. Default: True

Return type

Tensor

Returns

tensor of the form $$(B, C, H, W)$$ if keepdim is False, $$(C, H, W)$$ otherwise.
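As a rough illustration of the return shapes described above, the sketch below moves channels to the front and adds a leading batch axis only when `keepdim` is False. It uses nested lists instead of arrays, and `hwc_to_chw` is a hypothetical name chosen for this example.

```python
def hwc_to_chw(image, keepdim=True):
    """Reorder a channels-last nested list (H, W, C) into channels-first.

    Returns (C, H, W) when keepdim is True, otherwise (1, C, H, W).
    """
    height = len(image)
    width = len(image[0])
    channels = len(image[0][0])
    chw = [
        [[image[h][w][c] for w in range(width)] for h in range(height)]
        for c in range(channels)
    ]
    # Wrapping in a list adds the leading batch axis of size 1.
    return chw if keepdim else [chw]


# A 2x2 image with 3 channels per pixel.
hwc = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]
chw = hwc_to_chw(hwc)                      # shape (3, 2, 2)
batched = hwc_to_chw(hwc, keepdim=False)   # shape (1, 3, 2, 2)

print(chw[0])        # [[1, 4], [7, 10]]
print(len(batched))  # 1
```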

draw_rectangle(image, rectangle, color=None, fill=None, width=1)

Draws N rectangles on a batch of image tensors.

Parameters
• image (Tensor) – tensor of shape BxCxHxW.

• rectangle (Tensor) – tensor of shape BxNx4 holding the rectangles to draw, where N is the number of rectangles per batch index. The 4 values are [x1, y1, x2, y2], i.e. (top_left.x, top_left.y, bot_right.x, bot_right.y).

• color (Optional[Tensor], optional) – a size 1, size 3, BxNx1, or BxNx3 tensor. If C is 3 and color has 1 channel, it is broadcast. Default: None

• fill (Optional[bool], optional) – if True, fills the boxes with color. Default: None

• width (int, optional) – The line width (Not implemented yet). Default: 1

Return type

Tensor

Returns

This operation modifies image in place, but also returns the drawn tensor for convenience, with the same shape as the input, BxCxHxW.

Example

>>> img = torch.rand(2, 3, 10, 12)
>>> rect = torch.tensor([[[0, 0, 4, 4]], [[4, 4, 10, 10]]])
>>> out = draw_rectangle(img, rect)
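To make the [x1, y1, x2, y2] convention and the fill flag concrete, here is a minimal sketch that draws one rectangle on a single-channel grid stored as a list of rows. It assumes both corners are inclusive, which the text above does not state, and `draw_rectangle_2d` is a hypothetical helper rather than the library function.

```python
def draw_rectangle_2d(image, rect, value=1, fill=False):
    """Draw one rectangle in place on a 2D grid given as a list of rows.

    rect is (x1, y1, x2, y2); both corners are treated as inclusive,
    which is an assumption made for this sketch.
    """
    x1, y1, x2, y2 = rect
    for y in range(y1, y2 + 1):
        for x in range(x1, x2 + 1):
            on_border = y in (y1, y2) or x in (x1, x2)
            if fill or on_border:
                image[y][x] = value
    # Like the documented function, modify in place and also return the grid.
    return image


canvas = [[0] * 6 for _ in range(5)]  # H=5 rows, W=6 columns
draw_rectangle_2d(canvas, (1, 1, 4, 3))
for row in canvas:
    print(row)

filled = draw_rectangle_2d([[0] * 6 for _ in range(5)], (1, 1, 4, 3), fill=True)
```

Note that x indexes columns and y indexes rows, so the rectangle spans `image[y1:y2 + 1]` vertically.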


## Grid

create_meshgrid(height, width, normalized_coordinates=True, device=device(type='cpu'), dtype=torch.float32)

Generates a coordinate grid for an image.

When the flag normalized_coordinates is set to True, the grid is normalized to the range $$[-1,1]$$ to be consistent with the PyTorch function torch.nn.functional.grid_sample().

Parameters
• height (int) – the image height (rows).

• width (int) – the image width (cols).

• normalized_coordinates (bool, optional) – whether to normalize coordinates in the range $$[-1,1]$$ in order to be consistent with the PyTorch function torch.nn.functional.grid_sample(). Default: True

• device (Optional[device], optional) – the device on which the grid will be generated. Default: device(type='cpu')

• dtype (dtype, optional) – the data type of the generated grid. Default: torch.float32

Return type

Tensor

Returns

grid tensor with shape $$(1, H, W, 2)$$.

Example

>>> create_meshgrid(2, 2)
tensor([[[[-1., -1.],
          [ 1., -1.]],

         [[-1.,  1.],
          [ 1.,  1.]]]])

>>> create_meshgrid(2, 2, normalized_coordinates=False)
tensor([[[[0., 0.],
          [1., 0.]],

         [[0., 1.],
          [1., 1.]]]])
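The two outputs above suggest that the last axis stores (x, y) and that normalization maps the first and last pixel to -1 and 1. A dependency-free sketch of that layout follows; `linspace` and `meshgrid_2d` are hypothetical helpers, the leading batch axis is omitted, and the behaviour for a size of 1 is a choice made here, not something the text specifies.

```python
def linspace(start, stop, count):
    """Return count evenly spaced values from start to stop inclusive."""
    if count == 1:
        # Choice made for this sketch: a single sample sits at start.
        return [float(start)]
    step = (stop - start) / (count - 1)
    return [start + step * i for i in range(count)]


def meshgrid_2d(height, width, normalized=True):
    """Build an (H, W, 2) nested list whose last axis is (x, y)."""
    if normalized:
        xs = linspace(-1.0, 1.0, width)
        ys = linspace(-1.0, 1.0, height)
    else:
        xs = [float(x) for x in range(width)]
        ys = [float(y) for y in range(height)]
    # Rows iterate over y, columns over x.
    return [[[x, y] for x in xs] for y in ys]


print(meshgrid_2d(2, 2))
# [[[-1.0, -1.0], [1.0, -1.0]], [[-1.0, 1.0], [1.0, 1.0]]]
print(meshgrid_2d(2, 2, normalized=False))
# [[[0.0, 0.0], [1.0, 0.0]], [[0.0, 1.0], [1.0, 1.0]]]
```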

create_meshgrid3d(depth, height, width, normalized_coordinates=True, device=device(type='cpu'), dtype=torch.float32)

Generates a coordinate grid for an image.

When the flag normalized_coordinates is set to True, the grid is normalized to the range $$[-1,1]$$ to be consistent with the PyTorch function torch.nn.functional.grid_sample().

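Parameters
• depth (int) – the image depth.

• height (int) – the image height (rows).

• width (int) – the image width (cols).

• normalized_coordinates (bool, optional) – whether to normalize coordinates in the range $$[-1,1]$$ in order to be consistent with the PyTorch function torch.nn.functional.grid_sample(). Default: True

• device (Optional[device], optional) – the device on which the grid will be generated. Default: device(type='cpu')

• dtype (dtype, optional) – the data type of the generated grid. Default: torch.float32
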
Return type

Tensor

Returns

grid tensor with shape $$(1, D, H, W, 3)$$.

## Pointcloud

save_pointcloud_ply(filename, pointcloud)

Utility function to save to disk a pointcloud in PLY format.

Parameters
• filename (str) – the path to save the pointcloud.

• pointcloud (Tensor) – tensor containing the pointcloud to save. The tensor must be in the shape of $$(*, 3)$$ where the last component is assumed to be a 3d point coordinate $$(X, Y, Z)$$.

Return type

None

load_pointcloud_ply(filename, header_size=8)

Utility function to load from disk a pointcloud in PLY format.

Parameters
• filename (str) – the path to the pointcloud.

• header_size (int, optional) – the size of the ply file header that will be skipped during loading. Default: 8

Return type

Tensor

Returns

tensor containing the loaded points with shape $$(*, 3)$$ where $$*$$ represents the number of points.
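For readers unfamiliar with PLY, the sketch below writes and reads a minimal ASCII PLY file using only the standard library. The header it writes happens to be eight lines long, which lines up with the documented header_size default of 8, but the exact header kornia produces is not shown in this section, so treat the layout as an assumption. `save_ply_ascii` and `load_ply_ascii` are hypothetical names.

```python
import os
import tempfile


def save_ply_ascii(filename, points):
    """Write (x, y, z) triples as an ASCII PLY file with an 8-line header."""
    header = [
        "ply",
        "format ascii 1.0",
        "comment written by an illustrative sketch",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "end_header",
    ]
    with open(filename, "w") as f:
        f.write("\n".join(header) + "\n")
        for x, y, z in points:
            f.write(f"{x} {y} {z}\n")


def load_ply_ascii(filename, header_size=8):
    """Read points back, skipping the first header_size lines."""
    with open(filename) as f:
        lines = f.read().splitlines()[header_size:]
    return [tuple(float(v) for v in line.split()) for line in lines if line.strip()]


points = [(0.0, 0.0, 0.0), (1.0, 2.5, -3.0)]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cloud.ply")
    save_ply_ascii(path, points)
    loaded = load_ply_ascii(path)

print(loaded)  # [(0.0, 0.0, 0.0), (1.0, 2.5, -3.0)]
```

Skipping a fixed number of header lines is simple but fragile: a file with extra comment lines or properties needs a different header_size.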

## Metrics

confusion_matrix(input, target, num_classes, normalized=False)

Compute confusion matrix to evaluate the accuracy of a classification.

Parameters
• input (Tensor) – tensor with estimated targets returned by a classifier. The shape can be $$(B, *)$$ and must contain integer values between 0 and K-1.

• target (Tensor) – tensor with ground truth (correct) target values. The shape can be $$(B, *)$$ and must contain integer values between 0 and K-1.

• num_classes (int) – total possible number of classes in target.

• normalized (bool, optional) – whether to return the confusion matrix normalized. Default: False

Return type

Tensor

Returns

a tensor containing the confusion matrix with shape $$(B, K, K)$$ where K is the number of classes.
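A confusion matrix for one sample can be sketched in a few lines of plain Python. The orientation used here, rows for ground truth and columns for predictions, is an assumption for illustration, as is the helper name `confusion_matrix_counts`; the batch axis and the normalized option are left out.

```python
def confusion_matrix_counts(predicted, target, num_classes):
    """Count (target, predicted) pairs into a K x K matrix.

    Rows index the ground-truth class and columns the predicted class
    (an orientation assumed for this sketch).
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(predicted, target):
        matrix[t][p] += 1
    return matrix


predicted = [0, 1, 2, 2, 1, 0]
target = [0, 1, 1, 2, 1, 2]
cm = confusion_matrix_counts(predicted, target, num_classes=3)
for row in cm:
    print(row)
# [1, 0, 0]
# [0, 2, 1]
# [1, 0, 1]
```

The diagonal holds the correctly classified counts, and every entry sums to the number of labelled elements.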

mean_iou(input, target, num_classes, eps=1e-06)

Calculate mean Intersection-Over-Union (mIOU).

The function internally computes the confusion matrix.

Parameters
• input (Tensor) – tensor with estimated targets returned by a classifier. The shape can be $$(B, *)$$ and must contain integer values between 0 and K-1.

• target (Tensor) – tensor with ground truth (correct) target values. The shape can be $$(B, *)$$ and must contain integer values between 0 and K-1.

• num_classes (int) – total possible number of classes in target.

Return type

Tensor

Returns

a tensor representing the mean intersection-over-union with shape $$(B, K)$$ where K is the number of classes.
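Per-class intersection-over-union can also be computed by counting directly, which gives the same numbers as reading them off a confusion matrix. The sketch below handles one sample; `per_class_iou` is a hypothetical helper, and adding eps only to the denominator is an assumption about how the stabilizer is applied.

```python
def per_class_iou(predicted, target, num_classes, eps=1e-6):
    """Return one IoU value per class for a single sample."""
    ious = []
    for k in range(num_classes):
        pairs = list(zip(predicted, target))
        # Elements where both prediction and ground truth are class k.
        intersection = sum(1 for p, t in pairs if p == k and t == k)
        # Elements where either prediction or ground truth is class k.
        union = sum(1 for p, t in pairs if p == k or t == k)
        # eps guards against division by zero for absent classes (assumed placement).
        ious.append(intersection / (union + eps))
    return ious


predicted = [0, 1, 2, 2, 1, 0]
target = [0, 1, 1, 2, 1, 2]
ious = per_class_iou(predicted, target, num_classes=3)
mean_iou_value = sum(ious) / len(ious)

print([round(v, 3) for v in ious])  # [0.5, 0.667, 0.333]
print(round(mean_iou_value, 3))     # 0.5
```

The per-class list corresponds to one row of the documented $$(B, K)$$ output; the final averaging over classes is an extra step added here.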

one_hot(labels, num_classes, device=None, dtype=None, eps=1e-06)

Converts an integer label x-D tensor to a one-hot (x+1)-D tensor.

Parameters
• labels (Tensor) – tensor with labels of shape $$(N, *)$$, where N is batch size. Each value is an integer representing correct classification.

• num_classes (int) – number of classes in labels.

• device (Optional[device], optional) – the desired device of returned tensor. Default: None

• dtype (Optional[dtype], optional) – the desired data type of returned tensor. Default: None

Return type

Tensor

Returns

the labels as a one-hot tensor of shape $$(N, C, *)$$, where C is the number of classes.

Examples

>>> labels = torch.LongTensor([[[0, 1], [2, 0]]])
>>> one_hot(labels, num_classes=3)
tensor([[[[1.0000e+00, 1.0000e-06],
          [1.0000e-06, 1.0000e+00]],

         [[1.0000e-06, 1.0000e+00],
          [1.0000e-06, 1.0000e-06]],

         [[1.0000e-06, 1.0000e-06],
          [1.0000e+00, 1.0000e-06]]]])
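The 1.0000e-06 entries in the output above indicate that eps ends up in the positions that would otherwise be zero. A minimal sketch for a flat list of labels follows; it assumes eps is added to every entry, which the printed output cannot confirm for the hot positions, and `one_hot_labels` is a hypothetical name.

```python
def one_hot_labels(labels, num_classes, eps=1e-6):
    """Encode integer labels of shape (N,) as an (N, C) nested list.

    eps is added to every entry (an assumption for this sketch), so the
    cold positions hold eps instead of an exact zero.
    """
    return [
        [(1.0 if c == label else 0.0) + eps for c in range(num_classes)]
        for label in labels
    ]


labels = [0, 2, 1]
encoded = one_hot_labels(labels, num_classes=3)

# The largest entry in each row recovers the original label.
recovered = [row.index(max(row)) for row in encoded]
print(recovered)  # [0, 2, 1]
```

Keeping the cold entries slightly above zero avoids taking the logarithm of zero when the encoding feeds into a log-based loss.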


## Memory

batched_forward(model, data, device, batch_size=128, **kwargs)

Convenience function that runs the forward pass in micro-batches.

Useful when a plain model.forward(data) does not fit into device memory, e.g. on a laptop GPU when running HardNet on an 8000x1x32x32 tensor. At the end, the output is transferred to the device of the input data tensor.

Parameters
• model (Module) – any torch model that outputs a single tensor.

• data (Tensor) – input data of Bx(Any) shape.

• device (device) – the device to run on.

• batch_size (int, optional) – “micro-batch” size. Default: 128

• **kwargs – any other arguments that model accepts.

Return type

Tensor

Returns

output of the model.

Example

>>> patches = torch.rand(8000, 1, 32, 32)
>>> sift = kornia.feature.SIFTDescriptor(32)
>>> desc_batched = batched_forward(sift, patches, torch.device('cpu'), 128)
>>> desc = sift(patches)
>>> assert torch.allclose(desc, desc_batched)
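The micro-batching idea itself is independent of any tensor library: slice the input along its first axis, run the model on each slice, and stitch the results back together. The sketch below shows that loop on plain lists. `batched_apply` and `double_all` are hypothetical stand-ins, and the device transfers that the documented function performs are omitted.

```python
def batched_apply(model, data, batch_size=128):
    """Run model on consecutive slices of data and concatenate the outputs."""
    outputs = []
    for start in range(0, len(data), batch_size):
        chunk = data[start:start + batch_size]
        outputs.extend(model(chunk))
    return outputs


seen_sizes = []  # records how many items each call received


def double_all(batch):
    """Stand-in model: doubles every element and logs the batch size."""
    seen_sizes.append(len(batch))
    return [2 * x for x in batch]


data = list(range(10))
result = batched_apply(double_all, data, batch_size=4)

print(seen_sizes)  # [4, 4, 2]
print(result)      # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```

The last chunk is smaller when the input length is not a multiple of batch_size, so a model used this way must not depend on a fixed batch size.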