kornia.geometry.boxes

Module with utilities for manipulating 2D and 3D bounding boxes.

class kornia.geometry.boxes.Boxes(boxes, raise_if_not_floating_point=True, mode='vertices_plus')

2D boxes containing N or BxN boxes.

Parameters
  • boxes (Union[Tensor, List[Tensor]]) – 2D boxes, shape of \((N, 4, 2)\), \((B, N, 4, 2)\) or a list of \((N, 4, 2)\). See below for more details.

  • raise_if_not_floating_point (bool, optional) – flag to control floating point casting behaviour when boxes is not a floating point tensor. True to raise an error when boxes isn’t a floating point tensor, False to cast to float. Default: True

  • mode (str, optional) – the box format of the input boxes. Default: 'vertices_plus'

Note

The 2D boxes format is defined as a floating point tensor of shape Nx4x2 or BxNx4x2, where each box is a quadrilateral defined by its 4 vertex coordinates (A, B, C, D). Coordinates must be in x, y order. The width and height of a box are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1. Examples of quadrilaterals are rectangles, rhombuses and trapezoids.
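As a small illustration of the size convention in the note above, the following pure-Python sketch computes the height and width of a single quadrilateral. The helper name `box_size_plus` is hypothetical and is not part of kornia; it only assumes the "+ 1" definition quoted in the note.

```python
from typing import List, Tuple

Vertex = Tuple[float, float]  # one vertex in (x, y) order


def box_size_plus(vertices: List[Vertex]) -> Tuple[float, float]:
    """Return (height, width) of one box under the '+ 1' size convention.

    Hypothetical helper for illustration only, not a kornia function.
    """
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    return height, width


# Rectangle with vertices A, B, C, D given in (x, y) order.
quad = [(1.0, 1.0), (4.0, 1.0), (4.0, 3.0), (1.0, 3.0)]
height, width = box_size_plus(quad)
print(height, width)  # 3.0 4.0
```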

property device: torch.device

Returns boxes device.

Return type

device

property dtype: torch.dtype

Returns boxes dtype.

Return type

dtype

classmethod from_tensor(boxes, mode='xyxy', validate_boxes=True)

Helper method to easily create Boxes from boxes stored in another format.

Parameters
  • boxes (Union[Tensor, List[Tensor]]) – 2D boxes, shape of \((N, 4)\), \((B, N, 4)\), \((N, 4, 2)\) or \((B, N, 4, 2)\).

  • mode (str, optional) –

    The format in which the boxes are provided. Default: 'xyxy'

    • ’xyxy’: boxes are assumed to be in the format xmin, ymin, xmax, ymax where width = xmax - xmin and height = ymax - ymin. With shape \((N, 4)\), \((B, N, 4)\).

    • ’xyxy_plus’: similar to ‘xyxy’ mode but where box width and height are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1. With shape \((N, 4)\), \((B, N, 4)\).

    • ’xywh’: boxes are assumed to be in the format xmin, ymin, width, height where width = xmax - xmin and height = ymax - ymin. With shape \((N, 4)\), \((B, N, 4)\).

    • ’vertices’: boxes are defined by their vertices points in the following clockwise order: top-left, top-right, bottom-right, bottom-left. Vertices coordinates are in (x,y) order. Finally, box width and height are defined as width = xmax - xmin and height = ymax - ymin. With shape \((N, 4, 2)\) or \((B, N, 4, 2)\).

    • ’vertices_plus’: similar to ‘vertices’ mode but where box width and height are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1. With shape \((N, 4, 2)\) or \((B, N, 4, 2)\).

  • validate_boxes (bool, optional) – check if boxes are valid rectangles or not. Valid rectangles are those with width and height >= 1 (>= 2 when mode ends with ‘_plus’ suffix). Default: True

Return type

Boxes

Returns

Boxes class containing the original boxes in the format specified by mode.

Examples

>>> boxes_xyxy = torch.as_tensor([[0, 3, 1, 4], [5, 1, 8, 4]])
>>> boxes = Boxes.from_tensor(boxes_xyxy, mode='xyxy')
>>> boxes.data  # (2, 4, 2)
tensor([[[0., 3.],
         [0., 3.],
         [0., 3.],
         [0., 3.]],

        [[5., 1.],
         [7., 1.],
         [7., 3.],
         [5., 3.]]])
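
The output above is consistent with the boxes being held in the 'vertices_plus' convention (the constructor default), which would explain why xmax and ymax appear reduced by 1. The pure-Python sketch below reproduces that arithmetic for the 4-value modes. The helper `to_vertices_plus` is hypothetical, is not a kornia function, and rests on that assumption about the stored convention.

```python
from typing import List


def to_vertices_plus(box: List[float], mode: str = "xyxy") -> List[List[float]]:
    """Convert one 4-value box into 4 vertices in the '_plus' convention.

    Hypothetical helper for illustration only, not a kornia function.
    Vertices are ordered top-left, top-right, bottom-right, bottom-left.
    """
    if mode == "xyxy":
        xmin, ymin, xmax, ymax = box
        # width = xmax - xmin here, so the inclusive max corner is 1 smaller
        xmax, ymax = xmax - 1, ymax - 1
    elif mode == "xyxy_plus":
        xmin, ymin, xmax, ymax = box
    elif mode == "xywh":
        xmin, ymin, w, h = box
        xmax, ymax = xmin + w - 1, ymin + h - 1
    else:
        raise ValueError(f"unsupported mode: {mode}")
    return [[xmin, ymin], [xmax, ymin], [xmax, ymax], [xmin, ymax]]


# The second box of the example above, expressed in three input formats.
print(to_vertices_plus([5.0, 1.0, 8.0, 4.0], mode="xyxy"))
# [[5.0, 1.0], [7.0, 1.0], [7.0, 3.0], [5.0, 3.0]]
print(to_vertices_plus([5.0, 1.0, 7.0, 3.0], mode="xyxy_plus"))
print(to_vertices_plus([5.0, 1.0, 3.0, 3.0], mode="xywh"))
```

All three calls describe the same rectangle, so they print the same four vertices.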
get_boxes_shape()

Compute boxes heights and widths.

Return type

Tuple[Tensor, Tensor]

Returns

  • Boxes heights, shape of \((N,)\) or \((B,N)\).

  • Boxes widths, shape of \((N,)\) or \((B,N)\).

Example

>>> boxes_xyxy = torch.tensor([[[1,1,2,2],[1,1,3,2]]])
>>> boxes = Boxes.from_tensor(boxes_xyxy)
>>> boxes.get_boxes_shape()
(tensor([[1., 1.]]), tensor([[1., 2.]]))
to(device=None, dtype=None)

Like the torch.nn.Module.to() method.

Return type

Boxes

to_mask(height, width)

Convert 2D boxes to masks. Covered area is 1 and the remaining is 0.

Parameters
  • height (int) – height of the masked image/images.

  • width (int) – width of the masked image/images.

Return type

Tensor

Returns

the output mask tensor, shape of \((N, height, width)\) or \((B, N, height, width)\), with the same dtype as Boxes.dtype (it can be any floating point dtype).

Note

It is currently non-differentiable.

Examples

>>> boxes = Boxes(torch.tensor([[  # Equivalent to boxes = Boxes.from_tensor(torch.tensor([[1, 1, 4, 3]]), mode='xyxy_plus')
...        [1., 1.],
...        [4., 1.],
...        [4., 3.],
...        [1., 3.],
...   ]]))  # 1x4x2
>>> boxes.to_mask(5, 5)
tensor([[[0., 0., 0., 0., 0.],
         [0., 1., 1., 1., 1.],
         [0., 1., 1., 1., 1.],
         [0., 1., 1., 1., 1.],
         [0., 0., 0., 0., 0.]]])
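
To make the filled region in the example above easier to follow, here is a pure-Python sketch that rasterises one axis-aligned box with inclusive corners into a height x width grid. The helper `box_to_mask` is hypothetical and not a kornia function; clipping to the image bounds is an assumption of this sketch, not documented behaviour.

```python
from typing import List


def box_to_mask(
    xmin: int, ymin: int, xmax: int, ymax: int, height: int, width: int
) -> List[List[float]]:
    """Fill a height x width grid with 1.0 inside the box (corners inclusive).

    Hypothetical helper for illustration only, not a kornia function.
    Rows are indexed by y and columns by x.
    """
    mask = [[0.0] * width for _ in range(height)]
    # Clip to the grid (an assumption of this sketch).
    for y in range(max(ymin, 0), min(ymax, height - 1) + 1):
        for x in range(max(xmin, 0), min(xmax, width - 1) + 1):
            mask[y][x] = 1.0
    return mask


# Same box as the example above: x spans 1..4 and y spans 1..3.
mask = box_to_mask(1, 1, 4, 3, height=5, width=5)
for row in mask:
    print(row)
```

Rows 1 to 3 come out as `[0.0, 1.0, 1.0, 1.0, 1.0]` and the first and last rows stay at zero, matching the tensor shown above.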
to_tensor(mode=None, as_padded_sequence=False)

Cast Boxes to a tensor. mode controls which 2D boxes format should be used to represent boxes in the tensor.

Parameters
  • mode (Optional[str], optional) –

    The output box format. Default: None. It can be one of:

    • ’xyxy’: boxes are defined as xmin, ymin, xmax, ymax where width = xmax - xmin and height = ymax - ymin.

    • ’xyxy_plus’: similar to ‘xyxy’ mode but where box width and height are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1.

    • ’xywh’: boxes are defined as xmin, ymin, width, height where width = xmax - xmin and height = ymax - ymin.

    • ’vertices’: boxes are defined by their vertices points in the following clockwise order: top-left, top-right, bottom-right, bottom-left. Vertices coordinates are in (x,y) order. Finally, box width and height are defined as width = xmax - xmin and height = ymax - ymin.

    • ’vertices_plus’: similar to ‘vertices’ mode but where box width and height are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1.

  • as_padded_sequence (bool, optional) – whether to keep the pads for a list of boxes. This parameter is only valid if the boxes are from a box list. Default: False

Returns

Boxes tensor in the mode format. The shape depends on the mode value:

  • ‘vertices’ or ‘vertices_plus’: \((N, 4, 2)\) or \((B, N, 4, 2)\).

  • Any other value: \((N, 4)\) or \((B, N, 4)\).

Examples

>>> boxes_xyxy = torch.as_tensor([[0, 3, 1, 4], [5, 1, 8, 4]])
>>> boxes = Boxes.from_tensor(boxes_xyxy)
>>> assert (boxes_xyxy == boxes.to_tensor(mode='xyxy')).all()
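
The round trip asserted above can be pictured as the inverse of the "+ 1" bookkeeping. The pure-Python sketch below recovers the 4-value formats from four vertices, assuming those vertices are held in the 'vertices_plus' convention. The helper `from_vertices_plus` is hypothetical and is not part of kornia.

```python
from typing import List


def from_vertices_plus(vertices: List[List[float]], mode: str = "xyxy") -> List[float]:
    """Convert 4 vertices in the '_plus' convention into a 4-value box.

    Hypothetical helper for illustration only, not a kornia function.
    """
    xs = [v[0] for v in vertices]
    ys = [v[1] for v in vertices]
    xmin, xmax = min(xs), max(xs)
    ymin, ymax = min(ys), max(ys)
    if mode == "xyxy_plus":
        return [xmin, ymin, xmax, ymax]
    if mode == "xyxy":
        # width = xmax - xmin in this mode, so the max corner grows by 1
        return [xmin, ymin, xmax + 1, ymax + 1]
    if mode == "xywh":
        return [xmin, ymin, xmax - xmin + 1, ymax - ymin + 1]
    raise ValueError(f"unsupported mode: {mode}")


stored = [[5.0, 1.0], [7.0, 1.0], [7.0, 3.0], [5.0, 3.0]]
print(from_vertices_plus(stored, mode="xyxy"))       # [5.0, 1.0, 8.0, 4.0]
print(from_vertices_plus(stored, mode="xyxy_plus"))  # [5.0, 1.0, 7.0, 3.0]
print(from_vertices_plus(stored, mode="xywh"))       # [5.0, 1.0, 3.0, 3.0]
```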
transform_boxes(M, inplace=False)

Apply a transformation matrix to the 2D boxes.

Parameters
  • M (Tensor) – The transformation matrix to be applied, shape of \((3, 3)\) or \((B, 3, 3)\).

  • inplace (bool, optional) – do transform in-place and return self. Default: False

Return type

Boxes

Returns

The transformed boxes.
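
As a rough picture of what applying a \((3, 3)\) matrix to box vertices means, the pure-Python sketch below maps each (x, y) vertex through the matrix in homogeneous coordinates. It assumes every vertex is transformed independently as a homogeneous point. The helper `transform_vertices` is hypothetical and does not reproduce kornia's implementation.

```python
from typing import List

Matrix = List[List[float]]


def transform_vertices(vertices: List[List[float]], M: Matrix) -> List[List[float]]:
    """Apply a 3x3 homogeneous matrix to a list of (x, y) vertices.

    Hypothetical helper for illustration only, not a kornia function.
    """
    out = []
    for x, y in vertices:
        xh = M[0][0] * x + M[0][1] * y + M[0][2]
        yh = M[1][0] * x + M[1][1] * y + M[1][2]
        w = M[2][0] * x + M[2][1] * y + M[2][2]
        out.append([xh / w, yh / w])  # back from homogeneous coordinates
    return out


# Translation by (+2, -1).
translate = [
    [1.0, 0.0, 2.0],
    [0.0, 1.0, -1.0],
    [0.0, 0.0, 1.0],
]
quad = [[1.0, 1.0], [4.0, 1.0], [4.0, 3.0], [1.0, 3.0]]
print(transform_vertices(quad, translate))
# [[3.0, 0.0], [6.0, 0.0], [6.0, 2.0], [3.0, 2.0]]
```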

transform_boxes_(M)

In-place version of Boxes.transform_boxes().

Return type

Boxes

class kornia.geometry.boxes.Boxes3D(boxes, raise_if_not_floating_point=True, mode='xyzxyz_plus')

3D boxes containing N or BxN boxes.

Parameters
  • boxes (Tensor) – 3D boxes, shape of \((N,8,3)\) or \((B,N,8,3)\). See below for more details.

  • raise_if_not_floating_point (bool, optional) – flag to control floating point casting behaviour when boxes is not a floating point tensor. True to raise an error when boxes isn’t a floating point tensor, False to cast to float. Default: True

Note

The 3D boxes format is defined as a floating point tensor of shape Nx8x3 or BxNx8x3, where each box is a hexahedron defined by its 8 vertex coordinates. Coordinates must be in x, y, z order. The width, height and depth of a box are defined as width = xmax - xmin + 1, height = ymax - ymin + 1 and depth = zmax - zmin + 1. Examples of hexahedrons are cubes and rhombohedrons.

property device: torch.device

Returns boxes device.

Return type

device

property dtype: torch.dtype

Returns boxes dtype.

Return type

dtype

classmethod from_tensor(boxes, mode='xyzxyz', validate_boxes=True)

Helper method to easily create Boxes3D from 3D boxes stored in another format.

Parameters
  • boxes (Tensor) – 3D boxes, shape of \((N,6)\) or \((B,N,6)\).

  • mode (str, optional) –

    The format in which the 3D boxes are provided. Default: 'xyzxyz'

    • ’xyzxyz’: boxes are assumed to be in the format xmin, ymin, zmin, xmax, ymax, zmax where width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.

    • ’xyzxyz_plus’: similar to ‘xyzxyz’ mode but where box width, height and depth are defined as width = xmax - xmin + 1, height = ymax - ymin + 1 and depth = zmax - zmin + 1.

    • ’xyzwhd’: boxes are assumed to be in the format xmin, ymin, zmin, width, height, depth where width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.

  • validate_boxes (bool, optional) – check whether the boxes are valid. Valid boxes are those with width, height and depth >= 1 (>= 2 when mode ends with ‘_plus’ suffix). Default: True

Return type

Boxes3D

Returns

Boxes3D class containing the original boxes in the format specified by mode.

Examples

>>> boxes_xyzxyz = torch.as_tensor([[0, 3, 6, 1, 4, 8], [5, 1, 3, 8, 4, 9]])
>>> boxes = Boxes3D.from_tensor(boxes_xyzxyz, mode='xyzxyz')
>>> boxes.data  # (2, 8, 3)
tensor([[[0., 3., 6.],
         [0., 3., 6.],
         [0., 3., 6.],
         [0., 3., 6.],
         [0., 3., 7.],
         [0., 3., 7.],
         [0., 3., 7.],
         [0., 3., 7.]],

        [[5., 1., 3.],
         [7., 1., 3.],
         [7., 3., 3.],
         [5., 3., 3.],
         [5., 1., 8.],
         [7., 1., 8.],
         [7., 3., 8.],
         [5., 3., 8.]]])
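
The vertex ordering visible in the output above (front face first, then back face, each face going top-left, top-right, bottom-right, bottom-left) can be reproduced with a few lines of pure Python. The sketch assumes the vertices are held in the "+ 1" convention, so each max corner is reduced by 1, and that the front face is the one at zmin. The helper `to_vertices_plus_3d` is hypothetical and not a kornia function.

```python
from typing import List


def to_vertices_plus_3d(box: List[float]) -> List[List[float]]:
    """Convert one 'xyzxyz' box into 8 vertices in the '_plus' convention.

    Hypothetical helper for illustration only, not a kornia function.
    """
    xmin, ymin, zmin, xmax, ymax, zmax = box
    # 'xyzxyz' sizes exclude the '+ 1', so the inclusive max corner is 1 smaller.
    xmax, ymax, zmax = xmax - 1, ymax - 1, zmax - 1
    # One face: top-left, top-right, bottom-right, bottom-left.
    face = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    # Front face (zmin) first, then back face (zmax).
    return [[x, y, z] for z in (zmin, zmax) for x, y in face]


for vertex in to_vertices_plus_3d([5.0, 1.0, 3.0, 8.0, 4.0, 9.0]):
    print(vertex)
```

For the second box of the example this prints the same eight vertices as the tensor above, from `[5.0, 1.0, 3.0]` to `[5.0, 3.0, 8.0]`.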
get_boxes_shape()

Compute boxes depths, heights and widths.

Return type

Tuple[Tensor, Tensor, Tensor]

Returns

  • Boxes depths, shape of \((N,)\) or \((B,N)\).

  • Boxes heights, shape of \((N,)\) or \((B,N)\).

  • Boxes widths, shape of \((N,)\) or \((B,N)\).

Example

>>> boxes_xyzxyz = torch.tensor([[ 0,  1,  2, 10, 21, 32], [3, 4, 5, 43, 54, 65]])
>>> boxes3d = Boxes3D.from_tensor(boxes_xyzxyz)
>>> boxes3d.get_boxes_shape()
(tensor([30., 60.]), tensor([20., 50.]), tensor([10., 40.]))
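
The numbers in the example above follow directly from the 'xyzxyz' size definitions. A minimal pure-Python sketch of that arithmetic is shown below, returning the values in the same (depth, height, width) order. The helper `box_shape_3d` is hypothetical and not a kornia function.

```python
from typing import List, Tuple


def box_shape_3d(box: List[float]) -> Tuple[float, float, float]:
    """Return (depth, height, width) of one box given in 'xyzxyz' format.

    Hypothetical helper for illustration only, not a kornia function.
    """
    xmin, ymin, zmin, xmax, ymax, zmax = box
    return zmax - zmin, ymax - ymin, xmax - xmin


print(box_shape_3d([0.0, 1.0, 2.0, 10.0, 21.0, 32.0]))  # (30.0, 20.0, 10.0)
print(box_shape_3d([3.0, 4.0, 5.0, 43.0, 54.0, 65.0]))  # (60.0, 50.0, 40.0)
```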
to(device=None, dtype=None)

Like the torch.nn.Module.to() method.

Return type

Boxes3D

to_mask(depth, height, width)

Convert 3D boxes to masks. Covered area is 1 and the remaining is 0.

Parameters
  • depth (int) – depth of the masked image/images.

  • height (int) – height of the masked image/images.

  • width (int) – width of the masked image/images.

Return type

Tensor

Returns

the output mask tensor, shape of \((N, depth, height, width)\) or \((B, N, depth, height, width)\), with the same dtype as Boxes3D.dtype (it can be any floating point dtype).

Note

It is currently non-differentiable.

Examples

>>> boxes = Boxes3D(torch.tensor([[  # Equivalent to boxes = Boxes3D.from_tensor(torch.tensor([[1, 1, 1, 3, 3, 2]]), mode='xyzxyz_plus')
...     [1., 1., 1.],
...     [3., 1., 1.],
...     [3., 3., 1.],
...     [1., 3., 1.],
...     [1., 1., 2.],
...     [3., 1., 2.],
...     [3., 3., 2.],
...     [1., 3., 2.],
... ]]))  # 1x8x3
>>> boxes.to_mask(4, 5, 5)
tensor([[[[0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.]],

         [[0., 0., 0., 0., 0.],
          [0., 1., 1., 1., 0.],
          [0., 1., 1., 1., 0.],
          [0., 1., 1., 1., 0.],
          [0., 0., 0., 0., 0.]],

         [[0., 0., 0., 0., 0.],
          [0., 1., 1., 1., 0.],
          [0., 1., 1., 1., 0.],
          [0., 1., 1., 1., 0.],
          [0., 0., 0., 0., 0.]],

         [[0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.]]]])
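
The volume filled in the example above can be reproduced with a small pure-Python sketch that marks every voxel whose (x, y, z) index lies inside the box, corners included. The helper `box_to_mask_3d` is hypothetical and not a kornia function. Indexing the result as `mask[z][y][x]` is an assumption chosen to match the layout of the printed tensor.

```python
from typing import List


def box_to_mask_3d(
    box: List[int], depth: int, height: int, width: int
) -> List[List[List[float]]]:
    """Fill a depth x height x width grid with 1.0 inside an inclusive box.

    Hypothetical helper for illustration only, not a kornia function.
    `box` is (xmin, ymin, zmin, xmax, ymax, zmax) with inclusive corners.
    """
    xmin, ymin, zmin, xmax, ymax, zmax = box
    return [
        [
            [
                1.0 if (xmin <= x <= xmax and ymin <= y <= ymax and zmin <= z <= zmax) else 0.0
                for x in range(width)
            ]
            for y in range(height)
        ]
        for z in range(depth)
    ]


mask = box_to_mask_3d([1, 1, 1, 3, 3, 2], depth=4, height=5, width=5)
print(mask[1][1])  # [0.0, 1.0, 1.0, 1.0, 0.0]
print(sum(v for plane in mask for row in plane for v in row))  # 18.0
```

Only the two middle depth slices contain ones, each as a 3 x 3 block, which gives the 18 filled voxels.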
to_tensor(mode='xyzxyz')

Cast Boxes3D to a tensor. mode controls which 3D boxes format should be used to represent boxes in the tensor.

Parameters

mode (str, optional) –

The format in which the boxes are returned. Default: 'xyzxyz'

  • ’xyzxyz’: boxes are returned in the format xmin, ymin, zmin, xmax, ymax, zmax where width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.

  • ’xyzxyz_plus’: similar to ‘xyzxyz’ mode but where box width, height and depth are defined as width = xmax - xmin + 1, height = ymax - ymin + 1 and depth = zmax - zmin + 1.

  • ’xyzwhd’: boxes are returned in the format xmin, ymin, zmin, width, height, depth where width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.

  • ’vertices’: boxes are defined by their vertices points in the following clockwise order: front-top-left, front-top-right, front-bottom-right, front-bottom-left, back-top-left, back-top-right, back-bottom-right, back-bottom-left. Vertices coordinates are in (x, y, z) order. Finally, box width, height and depth are defined as width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.

  • ’vertices_plus’: similar to ‘vertices’ mode but where box width, height and depth are defined as width = xmax - xmin + 1, height = ymax - ymin + 1 and depth = zmax - zmin + 1.

Returns

3D Boxes tensor in the mode format. The shape depends on the mode value:

  • ‘vertices’ or ‘vertices_plus’: \((N, 8, 3)\) or \((B, N, 8, 3)\).

  • Any other value: \((N, 6)\) or \((B, N, 6)\).

Note

It is currently non-differentiable due to a bug. See GitHub issue #1304.

Examples

>>> boxes_xyzxyz = torch.as_tensor([[0, 3, 6, 1, 4, 8], [5, 1, 3, 8, 4, 9]])
>>> boxes = Boxes3D.from_tensor(boxes_xyzxyz, mode='xyzxyz')
>>> assert (boxes.to_tensor(mode='xyzxyz') == boxes_xyzxyz).all()
transform_boxes(M, inplace=False)

Apply a transformation matrix to the 3D boxes.

Parameters
  • M (Tensor) – The transformation matrix to be applied, shape of \((4, 4)\) or \((B, 4, 4)\).

  • inplace (bool, optional) – do transform in-place and return self. Default: False

Return type

Boxes3D

Returns

The transformed boxes.
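
For the 3D case the same idea applies with a \((4, 4)\) matrix. The pure-Python sketch below maps each (x, y, z) vertex through the matrix in homogeneous coordinates, assuming every vertex is transformed independently. The helper `transform_vertices_3d` is hypothetical and does not reproduce kornia's implementation.

```python
from typing import List

Matrix = List[List[float]]


def transform_vertices_3d(vertices: List[List[float]], M: Matrix) -> List[List[float]]:
    """Apply a 4x4 homogeneous matrix to a list of (x, y, z) vertices.

    Hypothetical helper for illustration only, not a kornia function.
    """
    out = []
    for x, y, z in vertices:
        p = (x, y, z, 1.0)  # homogeneous point
        xh, yh, zh, w = (sum(M[r][c] * p[c] for c in range(4)) for r in range(4))
        out.append([xh / w, yh / w, zh / w])
    return out


# Uniform scaling by a factor of 2.
scale = [
    [2.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
corners = [[1.0, 1.0, 1.0], [3.0, 3.0, 2.0]]
print(transform_vertices_3d(corners, scale))
# [[2.0, 2.0, 2.0], [6.0, 6.0, 4.0]]
```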

transform_boxes_(M)

In-place version of Boxes3D.transform_boxes().

Return type

Boxes3D