kornia.image¶

Module to provide a high level API to process images.

class kornia.image.ImageSize(height, width)[source]¶

Data class to represent image shape.

Parameters:

height (int | Tensor) – image height.
width (int | Tensor) – image width.

Example

>>> size = ImageSize(3, 4)
>>> size.height
3
>>> size.width
4

height: int | Tensor¶

width: int | Tensor¶

class kornia.image.PixelFormat(color_space, bit_depth)[source]¶

Data class to represent the pixel format of an image.

Parameters:

color_space (ColorSpace) – color space.
bit_depth (int) – the number of bits per channel.

Example

>>> pixel_format = PixelFormat(ColorSpace.RGB, 8)
>>> pixel_format.color_space
<ColorSpace.RGB: 2>
>>> pixel_format.bit_depth
8

bit_depth: int¶

color_space: ColorSpace¶

class kornia.image.ChannelsOrder(value)[source]¶

Enum that represents the channels order of an image.

CHANNELS_FIRST = 0¶

CHANNELS_LAST = 1¶

class kornia.image.ImageLayout(image_size, channels, channels_order)[source]¶

Data class to represent the layout of an image.

Parameters:

image_size (ImageSize) – image size.
channels (int) – number of channels.
channels_order (ChannelsOrder) – channels order.

Example

>>> layout = ImageLayout(ImageSize(3, 4), 3, ChannelsOrder.CHANNELS_LAST)
>>> layout.image_size
ImageSize(height=3, width=4)
>>> layout.channels
3
>>> layout.channels_order
<ChannelsOrder.CHANNELS_LAST: 1>

channels: int¶

channels_order: ChannelsOrder¶

image_size: ImageSize¶
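As a rough illustration of how channels_order relates to the shape of the underlying tensor, the sketch below uses stand-in dataclasses (not the kornia classes themselves) and a hypothetical helper, expected_shape, to derive the shape that a layout describes. It assumes a single, unbatched image.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class ChannelsOrder(Enum):
    # Stand-in mirroring the enum values documented above.
    CHANNELS_FIRST = 0
    CHANNELS_LAST = 1


@dataclass(frozen=True)
class ImageSize:
    height: int
    width: int


@dataclass(frozen=True)
class ImageLayout:
    image_size: ImageSize
    channels: int
    channels_order: ChannelsOrder


def expected_shape(layout: ImageLayout) -> Tuple[int, int, int]:
    """Hypothetical helper: tensor shape described by an unbatched layout."""
    h, w = layout.image_size.height, layout.image_size.width
    if layout.channels_order is ChannelsOrder.CHANNELS_FIRST:
        return (layout.channels, h, w)  # CxHxW
    return (h, w, layout.channels)  # HxWxC


layout = ImageLayout(ImageSize(4, 5), 3, ChannelsOrder.CHANNELS_FIRST)
print(expected_shape(layout))  # (3, 4, 5)
```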

class kornia.image.Image(data, pixel_format, layout)[source]¶

Class that holds an Image torch.Tensor representation.

Note

Disclaimer: This class provides only the minimum functionality for image manipulation. Once you start experimenting with advanced torch.Tensor manipulation, you may run into unexpected polymorphic behaviours.

Warning

This API is experimental and may change in the future.

Parameters:

data (Tensor) – a torch.Tensor containing the image data.
pixel_format (PixelFormat) – a dataclass containing the pixel format information.
layout (ImageLayout) – a dataclass containing the image layout information.

Examples

>>> # from a torch.tensor
>>> data = torch.randint(0, 255, (3, 4, 5), dtype=torch.uint8)  # CxHxW
>>> pixel_format = PixelFormat(
...     color_space=ColorSpace.RGB,
...     bit_depth=8,
... )
>>> layout = ImageLayout(
...     image_size=ImageSize(4, 5),
...     channels=3,
...     channels_order=ChannelsOrder.CHANNELS_FIRST,
... )
>>> img = Image(data, pixel_format, layout)
>>> assert img.channels == 3

>>> # from a numpy array (like opencv)
>>> data = np.ones((4, 5, 3), dtype=np.uint8)  # HxWxC
>>> img = Image.from_numpy(data, color_space=ColorSpace.RGB)
>>> assert img.channels == 3
>>> assert img.width == 5
>>> assert img.height == 4

property channels: int¶: Return the number of channels of the image.

property channels_order: ChannelsOrder¶: Return the channels order.

clone()[source]¶

Return a copy of the image.

Return type:: Image

property data: Tensor¶: Return the underlying torch.Tensor data.

property device: device¶: Return the image device.

property dtype: dtype¶: Return the image data type.

float()[source]¶

Return the image as float.

Return type:: Image

classmethod from_dlpack(data)[source]¶

Construct an image torch.Tensor from a DLPack capsule.

Parameters:: data (Any) – a DLPack capsule from numpy, tvm or jax.
Return type:: Image

Example

>>> x = np.ones((4, 5, 3))
>>> img = Image.from_dlpack(x.__dlpack__())

classmethod from_file(file_path)[source]¶

Construct an image torch.Tensor from a file.

Parameters:: file_path (str | Path) – the path to the file to read the image from.
Return type:: Image

classmethod from_numpy(data, color_space=ColorSpace.RGB, channels_order=ChannelsOrder.CHANNELS_LAST)[source]¶

Construct an image torch.Tensor from a numpy array.

Parameters:

data (Any) – a numpy array containing the image data.
color_space (ColorSpace, optional) – the color space of the image. Default: ColorSpace.RGB
pixel_format – the pixel format of the image.
channels_order (ChannelsOrder, optional) – what dimension the channels are in the image torch.Tensor. Default: ChannelsOrder.CHANNELS_LAST

Return type:

Image

Example

>>> data = np.ones((4, 5, 3), dtype=np.uint8)  # HxWxC
>>> img = Image.from_numpy(data, color_space=ColorSpace.RGB)
>>> assert img.channels == 3
>>> assert img.width == 5
>>> assert img.height == 4

property height: int¶: Return the image height (rows).

property image_size: ImageSize¶: Return the image size.

property layout: ImageLayout¶: Return the image layout.

property pixel_format: PixelFormat¶: Return the pixel format.

print(max_width=256)[source]¶

Print the image torch.Tensor to the console.

Parameters:: max_width (int, optional) – the maximum width of the image to print. Default: 256
Return type:: None

img = Image.from_file("panda.png")
img.print()

https://github.com/kornia/data/blob/main/print_image.png?raw=true

property shape: tuple[int, ...]¶: Return the image shape.

to(device=None, dtype=None)[source]¶

Move the image to the given device and dtype.

Parameters:

device (Union[str, device, None], optional) – the device to move the image to. Default: None
dtype (Optional[dtype], optional) – the data type to cast the image to. Default: None

Returns:

the image moved to the given device and dtype.

Return type:

Image

to_bgr()[source]¶

Convert the image to BGR.

Return type:: Image

to_dlpack()[source]¶

Return a DLPack capsule from the image torch.Tensor.

Return type:: Any

to_gray()[source]¶

Convert the image to grayscale.

Return type:: Image

to_numpy()[source]¶

Return a NumPy array on the CPU from the image torch.Tensor.

Return type:: Any

to_rgb()[source]¶

Convert the image to RGB.

Return type:: Image

property width: int¶: Return the image width (columns).

write(file_path)[source]¶

Write the image to a file.

Currently, only writing to JPEG format is supported.

Parameters:: file_path (str | Path) – the path to the file to write the image to.
Return type:: None

Example

>>> data = np.ones((4, 5, 3), dtype=np.uint8)  # HxWxC
>>> img = Image.from_numpy(data)
>>> img.write("test.jpg")

Drawing¶

kornia.image.draw_line(image, p1, p2, color)[source]¶

Draw a single line into an image.

Parameters:

image (Tensor) – the input image on which to draw the line, with shape \((C, H, W)\).
p1 (Tensor) – the start point [x y] of the line with shape (2, ) or (B, 2).
p2 (Tensor) – the end point [x y] of the line with shape (2, ) or (B, 2).
color (Tensor) – the color of the line with shape \((C)\), where \(C\) is the number of channels of the image.

Return type:

Tensor

Returns:

the image containing the line.

Examples

>>> image = torch.zeros(1, 8, 8)
>>> draw_line(image, torch.tensor([6, 4]), torch.tensor([1, 4]), torch.tensor([255]))
tensor([[[  0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.],
         [  0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.],
         [  0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.],
         [  0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.],
         [  0., 255., 255., 255., 255., 255., 255.,   0.],
         [  0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.],
         [  0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.],
         [  0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.]]])
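To make the [x y] convention concrete, here is a minimal pure-Python sketch of line rasterization on a nested list. It is not kornia's implementation: draw_line_sketch is a hypothetical helper that steps along the longer axis and rounds to the nearest pixel. Note that x indexes columns and y indexes rows.

```python
from typing import List, Sequence


def draw_line_sketch(
    grid: List[List[int]], p1: Sequence[int], p2: Sequence[int], value: int
) -> List[List[int]]:
    """Return a copy of grid with a line from p1 to p2 (both [x, y]) set to value."""
    (x1, y1), (x2, y2) = p1, p2
    steps = max(abs(x2 - x1), abs(y2 - y1))
    out = [row[:] for row in grid]  # leave the input untouched
    for i in range(steps + 1):
        t = i / steps if steps else 0.0
        x = round(x1 + t * (x2 - x1))  # column index
        y = round(y1 + t * (y2 - y1))  # row index
        out[y][x] = value
    return out


canvas = [[0] * 8 for _ in range(8)]
result = draw_line_sketch(canvas, (6, 4), (1, 4), 255)
print(result[4])  # [0, 255, 255, 255, 255, 255, 255, 0]
```

With the same end points as the example above, the sketch fills row 4 from column 1 to column 6.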

kornia.image.draw_rectangle(image, rectangle, color=None, fill=None)[source]¶

Draw N rectangles on a batch of image tensors.

Parameters:

image (Tensor) – a tensor of shape BxCxHxW.
rectangle (Tensor) – the rectangles to draw, as a tensor of shape BxNx4, where N is the number of boxes to draw per batch index and 4 holds [x1, y1, x2, y2], i.e. (top_left.x, top_left.y, bot_right.x, bot_right.y).
color (Optional[Tensor], optional) – a size 1, size 3, BxNx1, or BxNx3 tensor. If C is 3 and color has 1 channel, it will be broadcast. Default: None
fill (Optional[bool], optional) – a flag used to fill the boxes with color if True. Default: None

Return type:

Tensor

Returns:

This operation modifies image in place but also returns the drawn tensor for convenience, with the same shape as the input, BxCxHxW.

Example

>>> img = torch.rand(2, 3, 10, 12)
>>> rect = torch.tensor([[[0, 0, 4, 4]], [[4, 4, 10, 10]]])
>>> out = draw_rectangle(img, rect)
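The [x1, y1, x2, y2] box format and the fill flag can be sketched in plain Python as follows. This is an illustrative stand-in, not kornia's code: draw_rectangle_sketch is a hypothetical helper working on one single-channel nested list, and it assumes both corners are inclusive. Like the function documented above, it modifies its input in place and also returns it.

```python
from typing import List, Sequence


def draw_rectangle_sketch(
    grid: List[List[int]], rect: Sequence[int], value: int, fill: bool = False
) -> List[List[int]]:
    """Draw one [x1, y1, x2, y2] box on grid in place (corners inclusive)."""
    x1, y1, x2, y2 = rect
    for y in range(y1, y2 + 1):
        for x in range(x1, x2 + 1):
            on_border = y in (y1, y2) or x in (x1, x2)
            if fill or on_border:
                grid[y][x] = value
    return grid


outline = draw_rectangle_sketch([[0] * 6 for _ in range(6)], (1, 1, 3, 3), 1)
filled = draw_rectangle_sketch([[0] * 6 for _ in range(6)], (1, 1, 3, 3), 1, fill=True)
print(sum(map(sum, outline)), sum(map(sum, filled)))  # 8 9
```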

kornia.image.draw_convex_polygon(images, polygons, colors)[source]¶

Draws convex polygons on a batch of image tensors.

Parameters:

images (Tensor) – a tensor of shape BxCxHxW.
polygons (Union[Tensor, List[Tensor]]) – represents polygons as points, either BxNx2 or a list of variable-length polygons. N is the number of points; 2 is (x, y).
colors (Tensor) – a B x 3 tensor or a size-3 tensor with the color to fill in.

Return type:

Tensor

Returns:

This operation modifies image in place but also returns the drawn tensor for convenience, with the same shape as the input, BxCxHxW.

Note

This function assumes a coordinate system (0, h - 1), (0, w - 1) in the image, with (0, 0) being the center of the top-left pixel and (w - 1, h - 1) being the center of the bottom-right pixel.

Example

>>> img = torch.rand(1, 3, 12, 16)
>>> poly = torch.tensor([[[4, 4], [12, 4], [12, 8], [4, 8]]])
>>> color = torch.tensor([[0.5, 0.5, 0.5]])
>>> out = draw_convex_polygon(img, poly, color)
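One way to picture filling a convex polygon under the pixel-center coordinate convention is a per-pixel inside test: a point lies inside a convex polygon when the cross products against every edge share a sign. The sketch below applies that test in pure Python. It is a swapped-in technique for illustration, not necessarily how kornia rasterizes, and fill_convex_polygon_sketch is a hypothetical name.

```python
from typing import List, Sequence, Tuple


def fill_convex_polygon_sketch(
    grid: List[List[float]], polygon: Sequence[Tuple[int, int]], value: float
) -> List[List[float]]:
    """Fill pixels whose centers lie inside or on the convex polygon, in place."""
    n = len(polygon)
    for y, row in enumerate(grid):
        for x in range(len(row)):
            crosses = []
            for i in range(n):
                (ax, ay), (bx, by) = polygon[i], polygon[(i + 1) % n]
                # z-component of (b - a) x (p - a)
                crosses.append((bx - ax) * (y - ay) - (by - ay) * (x - ax))
            if all(c >= 0 for c in crosses) or all(c <= 0 for c in crosses):
                row[x] = value
    return grid


canvas = [[0.0] * 16 for _ in range(12)]  # H=12, W=16
fill_convex_polygon_sketch(canvas, [(4, 4), (12, 4), (12, 8), (4, 8)], 0.5)
print(sum(v == 0.5 for row in canvas for v in row))  # 45
```

The polygon is the same one used in the example above; with inclusive boundaries it covers 9 columns by 5 rows.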

kornia.image.draw_point2d(image, points, color)[source]¶

Set one or more coordinates in a Tensor to a color.

Parameters:

image (Tensor) – the input image on which to draw the points with shape \((C, H, W)\) or \((H, W)\).
points (Tensor) – the [x, y] points to be drawn on the image.
color (Tensor) – the color of the pixel with shape \((C)\), where \(C\) is the number of channels of the image.

Return type:

Tensor

Returns:

The image with points set to the color.
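Because points are given as [x, y] while image data is indexed as [row][column], the two orders are swapped when writing a pixel. The sketch below shows this on a single-channel nested list; draw_points_sketch is a hypothetical helper for illustration, not the kornia function.

```python
from typing import List, Sequence, Tuple


def draw_points_sketch(
    grid: List[List[int]], points: Sequence[Tuple[int, int]], value: int
) -> List[List[int]]:
    """Set each [x, y] point to value; x is the column, y is the row."""
    for x, y in points:
        grid[y][x] = value
    return grid


canvas = [[0] * 4 for _ in range(3)]  # H=3, W=4
draw_points_sketch(canvas, [(3, 0), (1, 2)], 9)
print(canvas)  # [[0, 0, 0, 9], [0, 0, 0, 0], [0, 9, 0, 0]]
```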

Image Conversion¶

kornia.image.tensor_to_image(tensor, keepdim=False, force_contiguous=False)[source]¶

Convert a PyTorch tensor image to a numpy image.

If the tensor is on the GPU, it will be copied back to the CPU.

Parameters:

tensor (Tensor) – image of the form \((H, W)\), \((C, H, W)\) or \((B, C, H, W)\).
keepdim (bool, optional) – If False, squeeze the input image to match the shape \((H, W, C)\) or \((H, W)\). Default: False
force_contiguous (bool, optional) – If True, call contiguous on the tensor before converting it. Default: False

Return type:

Any

Returns:

image of the form \((H, W)\), \((H, W, C)\) or \((B, H, W, C)\).

Example

>>> img = torch.ones(1, 3, 3)
>>> tensor_to_image(img).shape
(3, 3)

>>> img = torch.ones(3, 4, 4)
>>> tensor_to_image(img).shape
(4, 4, 3)
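The core of the \((C, H, W)\) to \((H, W, C)\) conversion is an axis permutation. The following pure-Python sketch performs it on nested lists to show which element ends up where. The helper chw_to_hwc is hypothetical and ignores the squeezing and device handling described above.

```python
from typing import List


def chw_to_hwc(chw: List[List[List[int]]]) -> List[List[List[int]]]:
    """Permute a nested list from channels-first (C, H, W) to channels-last (H, W, C)."""
    c, h, w = len(chw), len(chw[0]), len(chw[0][0])
    return [[[chw[k][i][j] for k in range(c)] for j in range(w)] for i in range(h)]


# Three 2x2 channels: each output pixel collects one value per channel.
chw = [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]]
hwc = chw_to_hwc(chw)
print(hwc[0][0], hwc[1][1])  # [1, 5, 9] [4, 8, 12]
```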

kornia.image.image_to_tensor(image, keepdim=True)[source]¶

Convert a numpy image to a PyTorch 4d tensor image.

Parameters:

image (Any) – image of the form \((H, W, C)\), \((H, W)\) or \((B, H, W, C)\).
keepdim (bool, optional) – If False, unsqueeze the input image to match the shape \((B, C, H, W)\). Default: True

Return type:

Tensor

Returns:

tensor of the form \((B, C, H, W)\) if keepdim is False, \((C, H, W)\) otherwise.

Example

>>> img = np.ones((3, 3))
>>> image_to_tensor(img).shape
torch.Size([1, 3, 3])

>>> img = np.ones((4, 4, 1))
>>> image_to_tensor(img).shape
torch.Size([1, 4, 4])

>>> img = np.ones((4, 4, 3))
>>> image_to_tensor(img, keepdim=False).shape
torch.Size([1, 3, 4, 4])
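The shape behaviour shown in these examples can be summarised as a small rule table. The sketch below encodes it as a hypothetical shape-only helper, image_to_tensor_shape, inferred from the documented input and output forms. It works on shape tuples rather than real arrays and makes no claim about edge cases beyond the three examples.

```python
from typing import Tuple


def image_to_tensor_shape(shape: Tuple[int, ...], keepdim: bool = True) -> Tuple[int, ...]:
    """Predict the output shape for a numpy image of the given shape."""
    if len(shape) == 2:  # (H, W) -> (1, H, W)
        out = (1,) + tuple(shape)
    elif len(shape) == 3:  # (H, W, C) -> (C, H, W)
        h, w, c = shape
        out = (c, h, w)
    elif len(shape) == 4:  # (B, H, W, C) -> (B, C, H, W), already batched
        b, h, w, c = shape
        return (b, c, h, w)
    else:
        raise ValueError("expected a 2, 3 or 4 dimensional image shape")
    # keepdim=False adds a leading batch dimension of size 1.
    return out if keepdim else (1,) + out


print(image_to_tensor_shape((3, 3)))  # (1, 3, 3)
print(image_to_tensor_shape((4, 4, 1)))  # (1, 4, 4)
print(image_to_tensor_shape((4, 4, 3), keepdim=False))  # (1, 3, 4, 4)
```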

kornia.image.image_list_to_tensor(images)[source]¶

Convert a list of numpy images to a PyTorch 4d tensor image.

Parameters:

images (List[Any]) – list of images, each of the form \((H, W, C)\). Image shapes must be consistent.

Return type:

Tensor

Returns:

tensor of the form \((B, C, H, W)\).

Example

>>> imgs = [np.ones((4, 4, 1)), np.zeros((4, 4, 1))]
>>> image_list_to_tensor(imgs).shape
torch.Size([2, 1, 4, 4])

class kornia.image.ImageToTensor(keepdim=False)[source]¶

Converts a numpy image to a PyTorch 4d tensor image.

Parameters:: keepdim (bool, optional) – If False, unsqueeze the input image to match the shape \((B, C, H, W)\). Default: False

Image Printing¶

kornia.image.image_to_string(image, max_width=256)[source]¶

Obtain the closest xterm-256 approximation string from an image torch.Tensor.

The torch.Tensor must be either a float type in the range 0~1 or a long type in the range 0~255.

Parameters:

image (Tensor) – an RGB image with shape \((3, H, W)\).
max_width (int, optional) – maximum width of the input image. Default: 256

Return type:

str
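For readers unfamiliar with the xterm-256 palette, the sketch below shows one simple way to approximate an 8-bit RGB color with a palette index. It only considers the 6x6x6 color cube (indices 16 to 231) and snaps each channel independently, so it is a simplified stand-in and not necessarily the matching strategy used by image_to_string. The helper names are hypothetical.

```python
# Channel intensities used by the 6x6x6 color cube of the xterm-256 palette.
CUBE_LEVELS = (0, 95, 135, 175, 215, 255)


def nearest_level(v: int) -> int:
    """Index of the cube level closest to an 8-bit channel value."""
    return min(range(6), key=lambda i: abs(CUBE_LEVELS[i] - v))


def rgb_to_xterm256(r: int, g: int, b: int) -> int:
    """Approximate an RGB color by an index in the xterm-256 color cube."""
    return 16 + 36 * nearest_level(r) + 6 * nearest_level(g) + nearest_level(b)


def colored_cell(r: int, g: int, b: int) -> str:
    """Two spaces with the background set to the approximated color."""
    return f"\x1b[48;5;{rgb_to_xterm256(r, g, b)}m  \x1b[0m"


print(rgb_to_xterm256(255, 0, 0))  # 196
print(colored_cell(0, 255, 0))
```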

kornia.image.print_image(image, max_width=96)[source]¶

Print an image to the terminal.

Parameters:

image (Union[str, Tensor]) – path to a valid image file or a torch.Tensor.
max_width (int, optional) – maximum width to print to terminal. Default: 96

Return type:

None

Note

Need to use print_image(…).

Utilities¶

kornia.image.make_grid(tensor, n_row=None, padding=2)[source]¶

Convert a batched tensor to one image with padding in between.

Parameters:

tensor (Tensor) – A batched tensor of shape (B, C, H, W).
n_row (Optional[int], optional) – Number of images displayed in each row of the grid. Default: None
padding (int, optional) – The amount of padding to add between images. Default: 2

Returns:

The combined image grid.

Return type:

torch.Tensor
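To give a feel for the size of the combined image, the sketch below computes a grid canvas size under explicitly assumed rules: n_row images per row, a default of ceil(sqrt(B)) when n_row is None, and padding inserted only between neighbouring images. These rules are assumptions made for illustration. The actual output size of make_grid may differ, and grid_canvas_size is a hypothetical helper.

```python
import math
from typing import Optional, Tuple


def grid_canvas_size(
    b: int, h: int, w: int, n_row: Optional[int] = None, padding: int = 2
) -> Tuple[int, int]:
    """Assumed (height, width) of a grid holding b images of size h x w."""
    per_row = n_row if n_row is not None else math.ceil(math.sqrt(b))  # assumption
    rows = math.ceil(b / per_row)
    height = rows * h + (rows - 1) * padding  # padding only between images
    width = per_row * w + (per_row - 1) * padding
    return height, width


print(grid_canvas_size(4, 8, 8))  # (18, 18)
print(grid_canvas_size(5, 8, 8, n_row=3))  # (18, 28)
```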

kornia.image.perform_keep_shape_image(f)[source]¶

Apply f to an image of arbitrary leading dimensions (*, C, H, W).

It works by first viewing the image as (B, C, H, W), applying the function, and then viewing the result back in the original shape.

Return type:: Callable[..., Tensor]
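The view and re-view steps come down to shape arithmetic: all leading dimensions are multiplied into a single batch dimension and restored afterwards. The sketch below shows that bookkeeping on shape tuples only. The helper names are hypothetical, and it assumes the wrapped function may change C, H and W but not the batch size.

```python
from math import prod
from typing import Tuple


def to_batched_shape(shape: Tuple[int, ...]) -> Tuple[Tuple[int, ...], Tuple[int, ...]]:
    """Collapse (*, C, H, W) into (B, C, H, W) and remember the leading dims."""
    *lead, c, h, w = shape
    return (prod(lead), c, h, w), tuple(lead)  # prod of no dims is 1


def restore_shape(batched: Tuple[int, ...], lead: Tuple[int, ...]) -> Tuple[int, ...]:
    """Re-expand the batch dimension into the original leading dims."""
    return lead + tuple(batched[1:])


batched, lead = to_batched_shape((2, 3, 4, 5, 6))
print(batched, lead)  # (6, 4, 5, 6) (2, 3)
print(restore_shape(batched, lead))  # (2, 3, 4, 5, 6)
```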

kornia.image.perform_keep_shape_video(f)[source]¶

Apply f to a video of arbitrary leading dimensions (*, C, D, H, W).

It works by first viewing the video as (B, C, D, H, W), applying the function, and then viewing the result back in the original shape.

Return type:: Callable[..., Tensor]