kornia.geometry.boxes

Utilities for manipulating 2D and 3D bounding boxes.

class kornia.geometry.boxes.Boxes(boxes, raise_if_not_floating_point=True, mode='vertices_plus')

2D boxes containing N or BxN boxes.

Parameters:
  • boxes (Tensor | list[Tensor]) – 2D boxes, shape of \((N, 4, 2)\), \((B, N, 4, 2)\) or a list of \((N, 4, 2)\). See below for more details.

  • raise_if_not_floating_point (bool, optional) – flag to control floating point casting behaviour when boxes is not a floating point tensor. True to raise an error when boxes isn’t a floating point tensor, False to cast to float. Default: True

  • mode (str, optional) – the box format of the input boxes. Default: "vertices_plus"

Note

The 2D boxes format is defined as a floating point tensor of shape Nx4x2 or BxNx4x2, where each box is a quadrilateral defined by its 4 vertex coordinates (A, B, C, D). Coordinates must be in x, y order. The width and height of a box are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1. Examples of quadrilaterals are rectangles, rhombuses and trapezoids.
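
To make the width and height convention concrete, the following minimal sketch computes both values from the four vertices of a box in plain Python. It does not call kornia, and `quad_width_height` is a hypothetical helper introduced only for illustration.

```python
def quad_width_height(vertices, plus=True):
    """Return (width, height) of a box given as four (x, y) vertices.

    With plus=True the "+ 1" convention described in the note is used;
    with plus=False the plain difference is returned.
    """
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    extra = 1 if plus else 0
    width = max(xs) - min(xs) + extra
    height = max(ys) - min(ys) + extra
    return width, height


# Vertices in (x, y) order.
box = [(1.0, 1.0), (4.0, 1.0), (4.0, 3.0), (1.0, 3.0)]
print(quad_width_height(box))              # (4.0, 3.0)
print(quad_width_height(box, plus=False))  # (3.0, 2.0)
```

The same four vertices therefore describe a 4x3 box under the "+ 1" convention and a 3x2 box without it, which is why the mode has to be stated whenever boxes are created or exported.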

compute_area()

Returns the area of each box, shape of \((B, N)\).

Return type:

Tensor

property device: device

Returns boxes device.

property dtype: dtype

Returns boxes dtype.

classmethod from_tensor(boxes, mode='xyxy', validate_boxes=True)

Helper method to easily create Boxes from boxes stored in another format.

Parameters:
  • boxes (Tensor | list[Tensor]) – 2D boxes, shape of \((N, 4)\), \((B, N, 4)\), \((N, 4, 2)\) or \((B, N, 4, 2)\).

  • mode (str, optional) –

    The format in which the boxes are provided. Default: "xyxy"

    • ’xyxy’: boxes are assumed to be in the format xmin, ymin, xmax, ymax where width = xmax - xmin and height = ymax - ymin. With shape \((N, 4)\), \((B, N, 4)\).

    • ’xyxy_plus’: similar to ‘xyxy’ mode but where box width and height are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1. With shape \((N, 4)\), \((B, N, 4)\).

    • ’xywh’: boxes are assumed to be in the format xmin, ymin, width, height where width = xmax - xmin and height = ymax - ymin. With shape \((N, 4)\), \((B, N, 4)\).

    • ’vertices’: boxes are defined by their vertices points in the following clockwise order: top-left, top-right, bottom-right, bottom-left. Vertices coordinates are in (x,y) order. Finally, box width and height are defined as width = xmax - xmin and height = ymax - ymin. With shape \((N, 4, 2)\) or \((B, N, 4, 2)\).

    • ’vertices_plus’: similar to ‘vertices’ mode but where box width and height are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1. With shape \((N, 4, 2)\) or \((B, N, 4, 2)\).

  • validate_boxes (bool, optional) – check if boxes are valid rectangles or not. Valid rectangles are those with width and height >= 1 (>= 2 when mode ends with ‘_plus’ suffix). Default: True

Return type:

Boxes

Returns:

Boxes class containing the original boxes in the format specified by mode.

Examples

>>> boxes_xyxy = torch.as_tensor([[0, 3, 1, 4], [5, 1, 8, 4]])
>>> boxes = Boxes.from_tensor(boxes_xyxy, mode='xyxy')
>>> boxes.data  # (2, 4, 2)
tensor([[[0., 3.],
         [0., 3.],
         [0., 3.],
         [0., 3.]],

        [[5., 1.],
         [7., 1.],
         [7., 3.],
         [5., 3.]]])
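
The printed vertices follow from the two conventions involved: the input is given in 'xyxy' mode (width = xmax - xmin), while the class default mode is 'vertices_plus' (width = xmax - xmin + 1). The sketch below reproduces the values in plain Python. It assumes the conversion simply subtracts 1 from xmax and ymax, which is consistent with the output shown but is not taken from the kornia source; `xyxy_to_vertices_plus` is a hypothetical helper.

```python
def xyxy_to_vertices_plus(box):
    """Convert one (xmin, ymin, xmax, ymax) box to four (x, y) vertices.

    Assumption: 'xyxy' (width = xmax - xmin) maps to 'vertices_plus'
    (width = xmax - xmin + 1) by subtracting 1 from xmax and ymax.
    """
    xmin, ymin, xmax, ymax = (float(v) for v in box)
    xmax, ymax = xmax - 1.0, ymax - 1.0
    # Clockwise order: top-left, top-right, bottom-right, bottom-left.
    return [[xmin, ymin], [xmax, ymin], [xmax, ymax], [xmin, ymax]]


print(xyxy_to_vertices_plus([5, 1, 8, 4]))
# [[5.0, 1.0], [7.0, 1.0], [7.0, 3.0], [5.0, 3.0]]
print(xyxy_to_vertices_plus([0, 3, 1, 4]))
# [[0.0, 3.0], [0.0, 3.0], [0.0, 3.0], [0.0, 3.0]]
```

The first box has width and height 1 in 'xyxy' mode, so all four stored vertices coincide.
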
get_boxes_shape()

Compute boxes heights and widths.

Return type:

tuple[Tensor, Tensor]

Returns:

  • Boxes heights, shape of \((N,)\) or \((B,N)\).

  • Boxes widths, shape of \((N,)\) or \((B,N)\).

Example

>>> boxes_xyxy = torch.tensor([[[1,1,2,2],[1,1,3,2]]])
>>> boxes = Boxes.from_tensor(boxes_xyxy)
>>> boxes.get_boxes_shape()
(tensor([[1., 1.]]), tensor([[1., 2.]]))
merge(boxes, inplace=False)

Merges boxes.

Say, current instance holds \((B, N, 4, 2)\) and the incoming boxes holds \((B, M, 4, 2)\), the merge results in \((B, N + M, 4, 2)\).

Parameters:
  • boxes (Boxes) – 2D boxes.

  • inplace (bool, optional) – do transform in-place and return self. Default: False

Return type:

Boxes
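
The shape arithmetic of a merge can be sketched with nested lists standing in for tensors. This is only an illustration of concatenating along the N dimension; `merge_along_n` is a hypothetical helper, and raising on mismatched batch sizes is an assumption rather than documented behaviour.

```python
def merge_along_n(current, incoming):
    """Concatenate two batched box lists along the N dimension.

    Inputs are nested lists shaped (B, N, 4, 2) and (B, M, 4, 2);
    the result is shaped (B, N + M, 4, 2).
    """
    if len(current) != len(incoming):
        raise ValueError("batch sizes must match")
    return [a + b for a, b in zip(current, incoming)]


unit = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
current = [[unit, unit], [unit, unit]]  # B=2, N=2
incoming = [[unit], [unit]]             # B=2, M=1
merged = merge_along_n(current, incoming)
print(len(merged), len(merged[0]))      # 2 3
```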

pad(padding_size)

Pad a bounding box.

Parameters:

padding_size (Tensor) – (B, 4)

Return type:

Boxes

to(device=None, dtype=None)

Like torch.nn.Module.to() method.

Return type:

Boxes

to_mask(height, width)

Convert 2D boxes to masks. Covered area is 1 and the remaining is 0.

Parameters:
  • height (int) – height of the masked image/images.

  • width (int) – width of the masked image/images.

Return type:

Tensor

Returns:

The output mask tensor, shape of \((N, width, height)\) or \((B, N, width, height)\), with the same dtype as Boxes.dtype (it can be any floating point dtype).

Note

It is currently non-differentiable.

Examples

>>> boxes = Boxes(torch.tensor([[  # Equivalent to boxes = Boxes.from_tensor([[1,1,4,3]])
...        [1., 1.],
...        [4., 1.],
...        [4., 3.],
...        [1., 3.],
...   ]]))  # 1x4x2
>>> boxes.to_mask(5, 5)
tensor([[[0., 0., 0., 0., 0.],
         [0., 1., 1., 1., 1.],
         [0., 1., 1., 1., 1.],
         [0., 1., 1., 1., 1.],
         [0., 0., 0., 0., 0.]]])
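
The mask above can be reproduced with a small plain-Python rasterizer. This is a sketch under the assumption that both the minimum and maximum coordinates are inclusive (the "+ 1" convention) and that the box is axis-aligned; `box_to_mask` is a hypothetical helper that indexes rows by y and columns by x, not the kornia implementation.

```python
def box_to_mask(vertices, height, width):
    """Rasterize one axis-aligned box into a height x width grid of 0/1.

    Assumption: min and max coordinates are both inclusive, matching
    width = xmax - xmin + 1 and height = ymax - ymin + 1.
    """
    xs = [int(x) for x, _ in vertices]
    ys = [int(y) for _, y in vertices]
    mask = [[0.0] * width for _ in range(height)]
    # Clamp the box to the image bounds before filling.
    for row in range(max(min(ys), 0), min(max(ys), height - 1) + 1):
        for col in range(max(min(xs), 0), min(max(xs), width - 1) + 1):
            mask[row][col] = 1.0
    return mask


box = [(1.0, 1.0), (4.0, 1.0), (4.0, 3.0), (1.0, 3.0)]
mask = box_to_mask(box, 5, 5)
for row in mask:
    print(row)
```

Rows 1 to 3 and columns 1 to 4 are filled, which matches the 4x3 covered area in the example output.
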
to_tensor(mode=None, as_padded_sequence=False)

Cast Boxes to a tensor. mode controls which 2D boxes format should be used to represent boxes in the tensor.

Parameters:
  • mode (Optional[str], optional) –

    The output box format. Default: None. It can be one of:

    • ’xyxy’: boxes are defined as xmin, ymin, xmax, ymax where width = xmax - xmin and height = ymax - ymin.

    • ’xyxy_plus’: similar to ‘xyxy’ mode but where box width and height are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1.

    • ’xywh’: boxes are defined as xmin, ymin, width, height where width = xmax - xmin and height = ymax - ymin.

    • ’vertices’: boxes are defined by their vertices points in the following clockwise order: top-left, top-right, bottom-right, bottom-left. Vertices coordinates are in (x,y) order. Finally, box width and height are defined as width = xmax - xmin and height = ymax - ymin.

    • ’vertices_plus’: similar to ‘vertices’ mode but where box width and height are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1.

  • as_padded_sequence (bool, optional) – whether to keep the pads for a list of boxes. This parameter only applies when the boxes were created from a list of boxes with from_tensor. Default: False

Returns:

Boxes tensor in the mode format. The shape depends on the mode value:

  • ‘vertices’ or ‘vertices_plus’: \((N, 4, 2)\) or \((B, N, 4, 2)\).

  • Any other value: \((N, 4)\) or \((B, N, 4)\).

Examples

>>> boxes_xyxy = torch.as_tensor([[0, 3, 1, 4], [5, 1, 8, 4]])
>>> boxes = Boxes.from_tensor(boxes_xyxy)
>>> assert (boxes_xyxy == boxes.to_tensor(mode='xyxy')).all()
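
The relationship between the output modes can be sketched in plain Python. The example assumes the boxes are held internally as 'vertices_plus' vertices (the class default mode) and that exporting to a non-plus mode adds 1 back to the maximum coordinates; `vertices_plus_to_flat` is a hypothetical helper, not part of kornia.

```python
def vertices_plus_to_flat(vertices, mode="xyxy"):
    """Convert four (x, y) vertices stored as 'vertices_plus' to a flat box."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    xmin, ymin, xmax, ymax = min(xs), min(ys), max(xs), max(ys)
    if mode == "xyxy_plus":
        # Same convention as the storage: max coordinates are inclusive.
        return [xmin, ymin, xmax, ymax]
    if mode == "xyxy":
        # width = xmax - xmin, so the exported max is one past the stored max.
        return [xmin, ymin, xmax + 1, ymax + 1]
    if mode == "xywh":
        return [xmin, ymin, xmax - xmin + 1, ymax - ymin + 1]
    raise ValueError(f"unsupported mode: {mode}")


stored = [[5.0, 1.0], [7.0, 1.0], [7.0, 3.0], [5.0, 3.0]]
print(vertices_plus_to_flat(stored, "xyxy"))       # [5.0, 1.0, 8.0, 4.0]
print(vertices_plus_to_flat(stored, "xyxy_plus"))  # [5.0, 1.0, 7.0, 3.0]
print(vertices_plus_to_flat(stored, "xywh"))       # [5.0, 1.0, 3.0, 3.0]
```

The 'xyxy' result recovers the second input box of the example, which is the round trip the assertion checks.
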
transform_boxes(M, inplace=False)

Apply a transformation matrix to the 2D boxes.

Parameters:
  • M (Tensor) – The transformation matrix to be applied, shape of \((3, 3)\) or \((B, 3, 3)\).

  • inplace (bool, optional) – do transform in-place and return self. Default: False

Return type:

Boxes

Returns:

The transformed boxes.
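
As an illustration of what applying a \((3, 3)\) matrix to box vertices means, the sketch below treats each vertex as a homogeneous column vector (x, y, 1), multiplies it by M and divides by the last component. That convention is an assumption here, and `transform_vertices` is a hypothetical plain-Python helper rather than the kornia implementation.

```python
def transform_vertices(vertices, M):
    """Apply a 3x3 homogeneous matrix M to a list of (x, y) vertices."""
    out = []
    for x, y in vertices:
        xh = M[0][0] * x + M[0][1] * y + M[0][2]
        yh = M[1][0] * x + M[1][1] * y + M[1][2]
        w = M[2][0] * x + M[2][1] * y + M[2][2]
        # Divide by w to return from homogeneous to Cartesian coordinates.
        out.append((xh / w, yh / w))
    return out


# Scale x by 2 and shift y by 1.
M = [[2.0, 0.0, 0.0],
     [0.0, 1.0, 1.0],
     [0.0, 0.0, 1.0]]
box = [(1.0, 1.0), (4.0, 1.0), (4.0, 3.0), (1.0, 3.0)]
print(transform_vertices(box, M))
# [(2.0, 2.0), (8.0, 2.0), (8.0, 4.0), (2.0, 4.0)]
```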

transform_boxes_(M)

In-place version of Boxes.transform_boxes().

Return type:

Boxes

translate(size, method='warp', inplace=False)

Translates boxes by the provided size.

Parameters:
  • size (Tensor) – translate size for x, y direction, shape of \((B, 2)\).

  • method (str, optional) – “warp” or “fast”. Default: "warp"

  • inplace (bool, optional) – do transform in-place and return self. Default: False

Return type:

Boxes

Returns:

The transformed boxes.

trim(correspondence_preserve=False, inplace=False)

Trim out zero padded boxes.

Given box arrangements of shape \((4, 4, Box)\):

    Box  Box  Box  Box
    0    0    Box  Box
    0    Box  0    0
    0    0    0    0

If correspondence_preserve is True, the position of every box is kept and only pure zero layers are removed, resulting in shape \((4, 3, Box)\):

    Box  Box  Box  Box
    0    0    Box  Box
    0    Box  0    0

Otherwise, you will get \((4, 2, Box)\):

    Box  Box  Box  Box
    0    Box  Box  Box

Return type:

Boxes
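
The two trimming behaviours can be sketched on a grid of flags, assuming each row of the arrangement is one layer along N and each column is one batch item. `trim_grid` is a hypothetical plain-Python helper that mirrors the description above; it is not the kornia implementation.

```python
def trim_grid(grid, correspondence_preserve=False):
    """Trim zero padding from a grid given as layers (rows) of batch columns.

    grid[n][b] is True where batch item b holds a real box in layer n.
    """
    if correspondence_preserve:
        # Keep every box in its layer; drop only layers that are all zero.
        return [row for row in grid if any(row)]
    # Otherwise compact each batch column first, then keep only as many
    # layers as the fullest column needs.
    n_batch = len(grid[0])
    counts = [sum(row[b] for row in grid) for b in range(n_batch)]
    return [[n < counts[b] for b in range(n_batch)] for n in range(max(counts))]


B, Z = True, False
grid = [
    [B, B, B, B],
    [Z, Z, B, B],
    [Z, B, Z, Z],
    [Z, Z, Z, Z],
]
print(len(trim_grid(grid, correspondence_preserve=True)))  # 3
print(trim_grid(grid))
# [[True, True, True, True], [False, True, True, True]]
```

Preserving correspondence keeps three layers because the lone box in the third layer must stay where it is; without it, that box moves up and only two layers remain.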

unpad(padding_size)

Unpad a bounding box.

Parameters:

padding_size (Tensor) – (B, 4)

Return type:

Boxes

class kornia.geometry.boxes.Boxes3D(boxes, raise_if_not_floating_point=True, mode='xyzxyz_plus')

3D boxes containing N or BxN boxes.

Parameters:
  • boxes (Tensor) – 3D boxes, shape of \((N,8,3)\) or \((B,N,8,3)\). See below for more details.

  • raise_if_not_floating_point (bool, optional) – flag to control floating point casting behaviour when boxes is not a floating point tensor. True to raise an error when boxes isn’t a floating point tensor, False to cast to float. Default: True

Note

The 3D boxes format is defined as a floating point tensor of shape Nx8x3 or BxNx8x3, where each box is a hexahedron defined by its 8 vertex coordinates. Coordinates must be in x, y, z order. The width, height and depth of a box are defined as width = xmax - xmin + 1, height = ymax - ymin + 1 and depth = zmax - zmin + 1. Examples of hexahedrons are cubes and rhombohedrons.

property device: device

Returns boxes device.

property dtype: dtype

Returns boxes dtype.

classmethod from_tensor(boxes, mode='xyzxyz', validate_boxes=True)

Helper method to easily create Boxes3D from 3D boxes stored in another format.

Parameters:
  • boxes (Tensor) – 3D boxes, shape of \((N,6)\) or \((B,N,6)\).

  • mode (str, optional) –

    The format in which the 3D boxes are provided. Default: "xyzxyz"

    • ’xyzxyz’: boxes are assumed to be in the format xmin, ymin, zmin, xmax, ymax, zmax where width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.

    • ’xyzxyz_plus’: similar to ‘xyzxyz’ mode but where box width, height and depth are defined as width = xmax - xmin + 1, height = ymax - ymin + 1 and depth = zmax - zmin + 1.

    • ’xyzwhd’: boxes are assumed to be in the format xmin, ymin, zmin, width, height, depth where width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.

  • validate_boxes (bool, optional) – check if boxes are valid or not. Valid boxes are those with width, height and depth >= 1 (>= 2 when mode ends with ‘_plus’ suffix). Default: True

Return type:

Boxes3D

Returns:

Boxes3D class containing the original boxes in the format specified by mode.

Examples

>>> boxes_xyzxyz = torch.as_tensor([[0, 3, 6, 1, 4, 8], [5, 1, 3, 8, 4, 9]])
>>> boxes = Boxes3D.from_tensor(boxes_xyzxyz, mode='xyzxyz')
>>> boxes.data  # (2, 8, 3)
tensor([[[0., 3., 6.],
         [0., 3., 6.],
         [0., 3., 6.],
         [0., 3., 6.],
         [0., 3., 7.],
         [0., 3., 7.],
         [0., 3., 7.],
         [0., 3., 7.]],

        [[5., 1., 3.],
         [7., 1., 3.],
         [7., 3., 3.],
         [5., 3., 3.],
         [5., 1., 8.],
         [7., 1., 8.],
         [7., 3., 8.],
         [5., 3., 8.]]])
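
The eight vertices printed above can be reproduced in plain Python. The sketch assumes that the stored maximum coordinates are the 'xyzxyz' maxima minus 1 and that the front face (zmin) is listed before the back face (zmax); both assumptions are consistent with the output shown but are not taken from the kornia source. `xyzxyz_to_vertices_plus` is a hypothetical helper.

```python
def xyzxyz_to_vertices_plus(box):
    """Convert (xmin, ymin, zmin, xmax, ymax, zmax) to eight (x, y, z) vertices."""
    xmin, ymin, zmin, xmax, ymax, zmax = (float(v) for v in box)
    # Assumption: switch to the "+ 1" convention by making the maxima inclusive.
    xmax, ymax, zmax = xmax - 1.0, ymax - 1.0, zmax - 1.0
    front = [
        [xmin, ymin, zmin],  # front-top-left
        [xmax, ymin, zmin],  # front-top-right
        [xmax, ymax, zmin],  # front-bottom-right
        [xmin, ymax, zmin],  # front-bottom-left
    ]
    back = [[x, y, zmax] for x, y, _ in front]
    return front + back


for vertex in xyzxyz_to_vertices_plus([5, 1, 3, 8, 4, 9]):
    print(vertex)
```
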
get_boxes_shape()

Compute boxes depths, heights and widths.

Return type:

tuple[Tensor, Tensor, Tensor]

Returns:

  • Boxes depths, shape of \((N,)\) or \((B,N)\).

  • Boxes heights, shape of \((N,)\) or \((B,N)\).

  • Boxes widths, shape of \((N,)\) or \((B,N)\).

Example

>>> boxes_xyzxyz = torch.tensor([[ 0,  1,  2, 10, 21, 32], [3, 4, 5, 43, 54, 65]])
>>> boxes3d = Boxes3D.from_tensor(boxes_xyzxyz)
>>> boxes3d.get_boxes_shape()
(tensor([30., 60.]), tensor([20., 50.]), tensor([10., 40.]))
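
Note that the values come back in depth, height, width order, the reverse of the x, y, z order of the input. A plain-Python sketch of the same arithmetic for 'xyzxyz' boxes, using a hypothetical helper `xyzxyz_shape`:

```python
def xyzxyz_shape(box):
    """Return (depth, height, width) for one 'xyzxyz' box."""
    xmin, ymin, zmin, xmax, ymax, zmax = box
    return (zmax - zmin, ymax - ymin, xmax - xmin)


print(xyzxyz_shape([0, 1, 2, 10, 21, 32]))  # (30, 20, 10)
print(xyzxyz_shape([3, 4, 5, 43, 54, 65]))  # (60, 50, 40)
```
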
to(device=None, dtype=None)

Like torch.nn.Module.to() method.

Return type:

Boxes3D

to_mask(depth, height, width)

Convert 3D boxes to masks. Covered area is 1 and the remaining is 0.

Parameters:
  • depth (int) – depth of the masked image/images.

  • height (int) – height of the masked image/images.

  • width (int) – width of the masked image/images.

Return type:

Tensor

Returns:

The output mask tensor, shape of \((N, depth, width, height)\) or \((B, N, depth, width, height)\), with the same dtype as Boxes3D.dtype (it can be any floating point dtype).

Note

It is currently non-differentiable.

Examples

>>> boxes = Boxes3D(torch.tensor([[  # Equivalent to boxes = Boxes3D.from_tensor([[1,1,1,3,3,2]])
...     [1., 1., 1.],
...     [3., 1., 1.],
...     [3., 3., 1.],
...     [1., 3., 1.],
...     [1., 1., 2.],
...     [3., 1., 2.],
...     [3., 3., 2.],
...     [1., 3., 2.],
... ]]))  # 1x8x3
>>> boxes.to_mask(4, 5, 5)
tensor([[[[0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.]],

         [[0., 0., 0., 0., 0.],
          [0., 1., 1., 1., 0.],
          [0., 1., 1., 1., 0.],
          [0., 1., 1., 1., 0.],
          [0., 0., 0., 0., 0.]],

         [[0., 0., 0., 0., 0.],
          [0., 1., 1., 1., 0.],
          [0., 1., 1., 1., 0.],
          [0., 1., 1., 1., 0.],
          [0., 0., 0., 0., 0.]],

         [[0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.]]]])
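
The volume above can be reproduced with a plain-Python voxel fill. As with the 2D case, this is a sketch that assumes inclusive minimum and maximum coordinates and an axis-aligned box; `box3d_to_mask` is a hypothetical helper indexed as mask[z][y][x], not the kornia implementation.

```python
def box3d_to_mask(vertices, depth, height, width):
    """Fill a depth x height x width grid with 1.0 inside one axis-aligned box."""
    xs = [int(v[0]) for v in vertices]
    ys = [int(v[1]) for v in vertices]
    zs = [int(v[2]) for v in vertices]
    mask = [[[0.0] * width for _ in range(height)] for _ in range(depth)]
    # Clamp to the volume bounds; min and max coordinates are inclusive.
    for z in range(max(min(zs), 0), min(max(zs), depth - 1) + 1):
        for y in range(max(min(ys), 0), min(max(ys), height - 1) + 1):
            for x in range(max(min(xs), 0), min(max(xs), width - 1) + 1):
                mask[z][y][x] = 1.0
    return mask


corners = [
    (1, 1, 1), (3, 1, 1), (3, 3, 1), (1, 3, 1),
    (1, 1, 2), (3, 1, 2), (3, 3, 2), (1, 3, 2),
]
mask3d = box3d_to_mask(corners, 4, 5, 5)
# Number of covered voxels in each depth slice.
print([sum(map(sum, plane)) for plane in mask3d])  # [0.0, 9.0, 9.0, 0.0]
```
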
to_tensor(mode='xyzxyz')

Cast Boxes3D to a tensor. mode controls which 3D boxes format should be used to represent boxes in the tensor.

Parameters:

mode (str, optional) –

The output box format. Default: "xyzxyz"

  • ’xyzxyz’: boxes are in the format xmin, ymin, zmin, xmax, ymax, zmax where width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.

  • ’xyzxyz_plus’: similar to ‘xyzxyz’ mode but where box width, height and depth are defined as width = xmax - xmin + 1, height = ymax - ymin + 1 and depth = zmax - zmin + 1.

  • ’xyzwhd’: boxes are in the format xmin, ymin, zmin, width, height, depth where width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.

  • ’vertices’: boxes are defined by their vertices points in the following clockwise order: front-top-left, front-top-right, front-bottom-right, front-bottom-left, back-top-left, back-top-right, back-bottom-right, back-bottom-left. Vertices coordinates are in (x, y, z) order. Finally, box width, height and depth are defined as width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.

  • ’vertices_plus’: similar to ‘vertices’ mode but where box width, height and depth are defined as width = xmax - xmin + 1, height = ymax - ymin + 1 and depth = zmax - zmin + 1.

Returns:

3D boxes tensor in the mode format. The shape depends on the mode value:

  • ‘vertices’ or ‘vertices_plus’: \((N, 8, 3)\) or \((B, N, 8, 3)\).

  • Any other value: \((N, 6)\) or \((B, N, 6)\).

Note

It is currently non-differentiable due to a bug. See GitHub issue #1304.

Examples

>>> boxes_xyzxyz = torch.as_tensor([[0, 3, 6, 1, 4, 8], [5, 1, 3, 8, 4, 9]])
>>> boxes = Boxes3D.from_tensor(boxes_xyzxyz, mode='xyzxyz')
>>> assert (boxes.to_tensor(mode='xyzxyz') == boxes_xyzxyz).all()
transform_boxes(M, inplace=False)

Apply a transformation matrix to the 3D boxes.

Parameters:
  • M (Tensor) – The transformation matrix to be applied, shape of \((4, 4)\) or \((B, 4, 4)\).

  • inplace (bool, optional) – do transform in-place and return self. Default: False

Return type:

Boxes3D

Returns:

The transformed boxes.

transform_boxes_(M)

In-place version of Boxes3D.transform_boxes().

Return type:

Boxes3D