kornia.geometry.boxes¶

Module with useful functionalities for 2D and 3D bounding boxes manipulation.

class kornia.geometry.boxes.Boxes(boxes, raise_if_not_floating_point=True, mode='vertices_plus')¶

2D boxes containing N or BxN boxes.

Parameters:

boxes (Tensor | list[Tensor]) – 2D boxes, shape of \((N, 4, 2)\), \((B, N, 4, 2)\) or a list of \((N, 4, 2)\). See below for more details.
raise_if_not_floating_point (bool, optional) – flag to control floating point casting behaviour when boxes is not a floating point tensor. True to raise an error when boxes isn’t a floating point tensor, False to cast to float. Default: True
mode (str, optional) – the box format of the input boxes. Default: "vertices_plus"

Note

2D boxes format is defined as a floating data type tensor of shape Nx4x2 or BxNx4x2 where each box is a quadrilateral defined by its 4 vertices coordinates (A, B, C, D). Coordinates must be in x, y order. The width and height of a box are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1. Examples of quadrilaterals are rectangles, rhombuses and trapezoids.
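As a minimal illustration of the width and height convention stated in the note, the following pure-Python sketch computes both values from four (x, y) vertices. The helper name `box_size_plus` is hypothetical and is not part of kornia; it only mirrors the formula given above.

```python
def box_size_plus(vertices):
    """Return (width, height) of one box under the '+1' convention.

    `vertices` is a sequence of four (x, y) pairs. Illustrative helper
    only; this is not a kornia function.
    """
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    return width, height


# Four vertices of an axis-aligned rectangle, in (x, y) order.
quad = [(1.0, 1.0), (4.0, 1.0), (4.0, 3.0), (1.0, 3.0)]
size = box_size_plus(quad)
print(size)  # (4.0, 3.0)
```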

compute_area()¶

Compute the area of each box. Returns a tensor of shape \((B, N)\).

Return type: Tensor

property device: device¶: Returns boxes device.

property dtype: dtype¶: Returns boxes dtype.

classmethod from_tensor(boxes, mode='xyxy', validate_boxes=True)¶

Helper method to easily create Boxes from boxes stored in another format.

Parameters:

boxes (Tensor | list[Tensor]) – 2D boxes, shape of \((N, 4)\), \((B, N, 4)\), \((N, 4, 2)\) or \((B, N, 4, 2)\).
mode (str, optional) –
The format in which the boxes are provided. Default: "xyxy"
- ’xyxy’: boxes are assumed to be in the format xmin, ymin, xmax, ymax where width = xmax - xmin and height = ymax - ymin. With shape \((N, 4)\), \((B, N, 4)\).
- ’xyxy_plus’: similar to ‘xyxy’ mode but where box width and height are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1. With shape \((N, 4)\), \((B, N, 4)\).
- ’xywh’: boxes are assumed to be in the format xmin, ymin, width, height where width = xmax - xmin and height = ymax - ymin. With shape \((N, 4)\), \((B, N, 4)\).
- ’vertices’: boxes are defined by their vertices points in the following clockwise order: top-left, top-right, bottom-right, bottom-left. Vertices coordinates are in (x,y) order. Finally, box width and height are defined as width = xmax - xmin and height = ymax - ymin. With shape \((N, 4, 2)\) or \((B, N, 4, 2)\).
- ’vertices_plus’: similar to ‘vertices’ mode but where box width and height are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1. With shape \((N, 4, 2)\) or \((B, N, 4, 2)\).
validate_boxes (bool, optional) – check if boxes are valid rectangles or not. Valid rectangles are those with width and height >= 1 (>= 2 when mode ends with ‘_plus’ suffix). Default: True

Return type:

Boxes

Returns:

Boxes class containing the original boxes in the format specified by mode.

Examples

>>> boxes_xyxy = torch.as_tensor([[0, 3, 1, 4], [5, 1, 8, 4]])
>>> boxes = Boxes.from_tensor(boxes_xyxy, mode='xyxy')
>>> boxes.data  # (2, 4, 2)
tensor([[[0., 3.],
         [0., 3.],
         [0., 3.],
         [0., 3.]],

        [[5., 1.],
         [7., 1.],
         [7., 3.],
         [5., 3.]]])
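The output above is consistent with the boxes being held as ‘vertices_plus’ coordinates (the default mode of the Boxes constructor), so the maximum corner of an ‘xyxy’ box appears shifted by one. The pure-Python sketch below reproduces those numbers under that assumption. The function name `xyxy_to_vertices_plus` is hypothetical, and this is not kornia's implementation.

```python
def xyxy_to_vertices_plus(box):
    """Convert one (xmin, ymin, xmax, ymax) box to four (x, y) vertices.

    Assumption: the far corner is stored as xmax - 1, ymax - 1, so that
    width = x_far - xmin + 1 equals xmax - xmin. Illustrative only.
    """
    xmin, ymin, xmax, ymax = (float(v) for v in box)
    x_far, y_far = xmax - 1.0, ymax - 1.0
    # Clockwise order: top-left, top-right, bottom-right, bottom-left.
    return [[xmin, ymin], [x_far, ymin], [x_far, y_far], [xmin, y_far]]


boxes_xyxy = [[0, 3, 1, 4], [5, 1, 8, 4]]
vertices = [xyxy_to_vertices_plus(b) for b in boxes_xyxy]
print(vertices[1])  # [[5.0, 1.0], [7.0, 1.0], [7.0, 3.0], [5.0, 3.0]]
```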

get_boxes_shape()¶

Compute boxes heights and widths.

Return type:

tuple[Tensor, Tensor]

Returns:

Boxes heights, shape of \((N,)\) or \((B,N)\).
Boxes widths, shape of \((N,)\) or \((B,N)\).

Example

>>> boxes_xyxy = torch.tensor([[[1,1,2,2],[1,1,3,2]]])
>>> boxes = Boxes.from_tensor(boxes_xyxy)
>>> boxes.get_boxes_shape()
(tensor([[1., 1.]]), tensor([[1., 2.]]))

merge(boxes, inplace=False)¶

Merges boxes.

Say, current instance holds \((B, N, 4, 2)\) and the incoming boxes holds \((B, M, 4, 2)\), the merge results in \((B, N + M, 4, 2)\).

Parameters:

boxes (Boxes) – 2D boxes.
inplace (bool, optional) – do transform in-place and return self. Default: False

Return type:

Boxes
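The shape arithmetic of a merge can be sketched with nested lists standing in for tensors: concatenating along the box axis turns \((B, N, 4, 2)\) and \((B, M, 4, 2)\) into \((B, N + M, 4, 2)\). The helper `merge_boxes` below is hypothetical and only illustrates that bookkeeping; how kornia handles mismatched batch sizes is not covered here.

```python
def merge_boxes(current, incoming):
    """Concatenate two (B, N, 4, 2)-style nested lists along the box axis."""
    if len(current) != len(incoming):
        raise ValueError("both inputs must share the same batch size B")
    return [cur + inc for cur, inc in zip(current, incoming)]


unit = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
current = [[unit, unit], [unit, unit]]  # B=2, N=2
incoming = [[unit], [unit]]             # B=2, M=1
merged = merge_boxes(current, incoming)
print(len(merged), len(merged[0]))  # 2 3
```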

pad(padding_size)¶

Pad a bounding box.

Parameters: padding_size (Tensor) – (B, 4)
Return type: Boxes

to(device=None, dtype=None)¶

Like torch.nn.Module.to() method.

Return type: Boxes

to_mask(height, width)¶

Convert 2D boxes to masks. Covered area is 1 and the remaining is 0.

Parameters:

height (int) – height of the masked image/images.
width (int) – width of the masked image/images.

Return type:

Tensor

Returns:

the output mask tensor, shape of \((N, width, height)\) or \((B,N, width, height)\) and dtype of Boxes.dtype (it can be any floating point dtype).

Note

It is currently non-differentiable.

Examples

>>> boxes = Boxes(torch.tensor([[  # Equivalent to boxes = Boxes.from_tensor([[1,1,4,3]])
...        [1., 1.],
...        [4., 1.],
...        [4., 3.],
...        [1., 3.],
...   ]]))  # 1x4x2
>>> boxes.to_mask(5, 5)
tensor([[[0., 0., 0., 0., 0.],
         [0., 1., 1., 1., 1.],
         [0., 1., 1., 1., 1.],
         [0., 1., 1., 1., 1.],
         [0., 0., 0., 0., 0.]]])
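The mask in this example can be reproduced with a small pure-Python sketch that fills every pixel between the minimum and maximum vertex coordinates, both ends included. The helper `box_to_mask` is hypothetical, handles only axis-aligned boxes, and indexes the mask as rows (y) then columns (x); it is not kornia's implementation.

```python
def box_to_mask(vertices, height, width):
    """Rasterize one axis-aligned box into a height x width mask of 0/1.

    Bounds are inclusive, matching the '+1' width/height convention.
    Illustrative only; rotated quadrilaterals are not handled.
    """
    xs = [int(x) for x, _ in vertices]
    ys = [int(y) for _, y in vertices]
    mask = [[0.0] * width for _ in range(height)]
    # Clamp to the image so out-of-range boxes do not raise IndexError.
    for y in range(max(min(ys), 0), min(max(ys), height - 1) + 1):
        for x in range(max(min(xs), 0), min(max(xs), width - 1) + 1):
            mask[y][x] = 1.0
    return mask


mask = box_to_mask([(1.0, 1.0), (4.0, 1.0), (4.0, 3.0), (1.0, 3.0)], 5, 5)
for row in mask:
    print(row)
```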

to_tensor(mode=None, as_padded_sequence=False)¶

Cast Boxes to a tensor. mode controls which 2D boxes format should be used to represent boxes in the tensor.

Parameters:

mode (Optional[str], optional) –
the output box format. Default: None
- ’xyxy’: boxes are defined as xmin, ymin, xmax, ymax where width = xmax - xmin and height = ymax - ymin.
- ’xyxy_plus’: similar to ‘xyxy’ mode but where box width and height are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1.
- ’xywh’: boxes are defined as xmin, ymin, width, height where width = xmax - xmin and height = ymax - ymin.
- ’vertices’: boxes are defined by their vertices points in the following clockwise order: top-left, top-right, bottom-right, bottom-left. Vertices coordinates are in (x,y) order. Finally, box width and height are defined as width = xmax - xmin and height = ymax - ymin.
- ’vertices_plus’: similar to ‘vertices’ mode but where box width and height are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1.
as_padded_sequence (bool, optional) – whether to keep the pads for a list of boxes. This parameter is only valid if the boxes were created from a list of boxes via from_tensor. Default: False

Returns:

Boxes tensor in the mode format. The shape depends on the mode value:
‘vertices’ or ‘vertices_plus’: \((N, 4, 2)\) or \((B, N, 4, 2)\).
Any other value: \((N, 4)\) or \((B, N, 4)\).

Examples

>>> boxes_xyxy = torch.as_tensor([[0, 3, 1, 4], [5, 1, 8, 4]])
>>> boxes = Boxes.from_tensor(boxes_xyxy)
>>> assert (boxes_xyxy == boxes.to_tensor(mode='xyxy')).all()
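To make the relationship between the output modes concrete, the sketch below converts one box held as ‘vertices_plus’ coordinates into the three flat formats listed above. It is a pure-Python illustration with a hypothetical helper name, `vertices_plus_to`, and it assumes axis-aligned boxes; it is not kornia's code.

```python
def vertices_plus_to(vertices, mode):
    """Convert four (x, y) 'vertices_plus' points to a flat 4-value box."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    xmin, ymin = min(xs), min(ys)
    # '+1' convention: the stored far corner is one unit inside the box.
    width = max(xs) - xmin + 1
    height = max(ys) - ymin + 1
    if mode == "xyxy":
        return [xmin, ymin, xmin + width, ymin + height]
    if mode == "xyxy_plus":
        return [xmin, ymin, max(xs), max(ys)]
    if mode == "xywh":
        return [xmin, ymin, width, height]
    raise ValueError(f"unsupported mode: {mode!r}")


stored = [[5.0, 1.0], [7.0, 1.0], [7.0, 3.0], [5.0, 3.0]]
as_xyxy = vertices_plus_to(stored, "xyxy")
as_xyxy_plus = vertices_plus_to(stored, "xyxy_plus")
as_xywh = vertices_plus_to(stored, "xywh")
print(as_xyxy, as_xyxy_plus, as_xywh)
```

Under these assumptions the ‘xyxy’ result recovers the second input box of the example, `[5, 1, 8, 4]`.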

transform_boxes(M, inplace=False)¶

Apply a transformation matrix to the 2D boxes.

Parameters:

M (Tensor) – The transformation matrix to be applied, shape of \((3, 3)\) or \((B, 3, 3)\).
inplace (bool, optional) – do transform in-place and return self. Default: False

Return type:

Boxes

Returns:

The transformed boxes.
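Applying a \((3, 3)\) matrix to 2D points is usually done in homogeneous coordinates: each vertex (x, y) is extended to (x, y, 1), multiplied by M, and divided by the resulting third component. The sketch below shows that arithmetic in pure Python. The helper `transform_vertices` is hypothetical, and any adjustment kornia may make for the ‘+1’ convention is deliberately left out.

```python
def transform_vertices(vertices, matrix):
    """Apply a 3x3 homogeneous matrix to a list of (x, y) vertices."""
    out = []
    for x, y in vertices:
        xh = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
        yh = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
        w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
        out.append((xh / w, yh / w))
    return out


# Scale by 2, then shift by (+1, -1).
scale_and_shift = [[2.0, 0.0, 1.0], [0.0, 2.0, -1.0], [0.0, 0.0, 1.0]]
quad = [(1.0, 1.0), (4.0, 1.0), (4.0, 3.0), (1.0, 3.0)]
moved = transform_vertices(quad, scale_and_shift)
print(moved)  # [(3.0, 1.0), (9.0, 1.0), (9.0, 5.0), (3.0, 5.0)]
```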

transform_boxes_(M)¶

Inplace version of Boxes.transform_boxes()

Return type: Boxes

translate(size, method='warp', inplace=False)¶

Translates boxes by the provided size.

Parameters:

size (Tensor) – translate size for x, y direction, shape of \((B, 2)\).
method (str, optional) – “warp” or “fast”. Default: "warp"
inplace (bool, optional) – do transform in-place and return self. Default: False

Return type:

Boxes

Returns:

The transformed boxes.
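A translation can be computed either by adding the offsets directly to every vertex or by applying a homogeneous translation matrix; both give the same points. The pure-Python sketch below compares the two routes. Whether the “fast” and “warp” methods map onto these two routes is an assumption, not something stated above, and both helper names are hypothetical.

```python
def translate_direct(vertices, tx, ty):
    """Shift every (x, y) vertex by (tx, ty)."""
    return [(x + tx, y + ty) for x, y in vertices]


def translate_with_matrix(vertices, tx, ty):
    """Shift vertices by multiplying with a 3x3 translation matrix."""
    m = [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]
    return [
        (m[0][0] * x + m[0][1] * y + m[0][2],
         m[1][0] * x + m[1][1] * y + m[1][2])
        for x, y in vertices
    ]


quad = [(1.0, 1.0), (4.0, 1.0), (4.0, 3.0), (1.0, 3.0)]
direct = translate_direct(quad, 2.0, -1.0)
via_matrix = translate_with_matrix(quad, 2.0, -1.0)
print(direct)  # [(3.0, 0.0), (6.0, 0.0), (6.0, 2.0), (3.0, 2.0)]
```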

trim(correspondence_preserve=False, inplace=False)¶

Trim out zero padded boxes.

Given box arrangements of shape \((4, 4, Box)\):

–  Box  –  Box  –  Box  –  Box  –
–   0   –   0   –  Box  –  Box  –
–   0   –  Box  –   0   –   0   –
–   0   –   0   –   0   –   0   –

If correspondence_preserve is True, the arrangement of the remaining boxes is kept and only pure zero layers are removed, resulting in shape \((4, 3, Box)\):

–  Box  –  Box  –  Box  –  Box  –
–   0   –   0   –  Box  –  Box  –
–   0   –  Box  –   0   –   0   –

Otherwise, you will get \((4, 2, Box)\):

–  Box  –  Box  –  Box  –  Box  –
–   0   –  Box  –  Box  –  Box  –

Return type: Boxes
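The two trimming behaviours can be sketched on a plain grid in which each cell is either the string "Box" or 0, laid out as in the arrangement above. The function `trim_grid` below is a hypothetical pure-Python illustration of the described outcome, not kornia's implementation: with `correspondence_preserve=True` it only drops all-zero rows, and otherwise it packs the boxes of every column to the top before dropping the rows that become empty.

```python
def trim_grid(grid, correspondence_preserve=False):
    """Remove zero padding from a grid of 'Box' / 0 cells."""
    if correspondence_preserve:
        # Keep positions intact; drop only rows made entirely of zeros.
        return [list(row) for row in grid if any(cell != 0 for cell in row)]
    # Pack each column's boxes to the top, then pad to the tallest column.
    packed = [[cell for cell in col if cell != 0] for col in zip(*grid)]
    depth = max((len(col) for col in packed), default=0)
    padded = [col + [0] * (depth - len(col)) for col in packed]
    return [list(row) for row in zip(*padded)]


grid = [
    ["Box", "Box", "Box", "Box"],
    [0, 0, "Box", "Box"],
    [0, "Box", 0, 0],
    [0, 0, 0, 0],
]
kept = trim_grid(grid, correspondence_preserve=True)
packed = trim_grid(grid)
print(len(kept), len(packed))  # 3 2
```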

unpad(padding_size)¶

Unpad a bounding box.

Parameters: padding_size (Tensor) – (B, 4)
Return type: Boxes

class kornia.geometry.boxes.Boxes3D(boxes, raise_if_not_floating_point=True, mode='xyzxyz_plus')¶

3D boxes containing N or BxN boxes.

Parameters:

boxes (Tensor) – 3D boxes, shape of \((N,8,3)\) or \((B,N,8,3)\). See below for more details.
raise_if_not_floating_point (bool, optional) – flag to control floating point casting behaviour when boxes is not a floating point tensor. True to raise an error when boxes isn’t a floating point tensor, False to cast to float. Default: True

Note

3D boxes format is defined as a floating data type tensor of shape Nx8x3 or BxNx8x3 where each box is a hexahedron defined by its 8 vertices coordinates. Coordinates must be in x, y, z order. The width, height and depth of a box are defined as width = xmax - xmin + 1, height = ymax - ymin + 1 and depth = zmax - zmin + 1. Examples of hexahedrons are cubes and rhombohedrons.

property device: device¶: Returns boxes device.

property dtype: dtype¶: Returns boxes dtype.

classmethod from_tensor(boxes, mode='xyzxyz', validate_boxes=True)¶

Helper method to easily create Boxes3D from 3D boxes stored in another format.

Parameters:

boxes (Tensor) – 3D boxes, shape of \((N,6)\) or \((B,N,6)\).
mode (str, optional) –
The format in which the 3D boxes are provided. Default: "xyzxyz"
- ’xyzxyz’: boxes are assumed to be in the format xmin, ymin, zmin, xmax, ymax, zmax where width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.
- ’xyzxyz_plus’: similar to ‘xyzxyz’ mode but where box width, height and depth are defined as width = xmax - xmin + 1, height = ymax - ymin + 1 and depth = zmax - zmin + 1.
- ’xyzwhd’: boxes are assumed to be in the format xmin, ymin, zmin, width, height, depth where width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.
validate_boxes (bool, optional) – check if boxes are valid rectangles or not. Valid rectangles are those with width, height and depth >= 1 (>= 2 when mode ends with ‘_plus’ suffix). Default: True

Return type:

Boxes3D

Returns:

Boxes3D class containing the original boxes in the format specified by mode.

Examples

>>> boxes_xyzxyz = torch.as_tensor([[0, 3, 6, 1, 4, 8], [5, 1, 3, 8, 4, 9]])
>>> boxes = Boxes3D.from_tensor(boxes_xyzxyz, mode='xyzxyz')
>>> boxes.data  # (2, 8, 3)
tensor([[[0., 3., 6.],
         [0., 3., 6.],
         [0., 3., 6.],
         [0., 3., 6.],
         [0., 3., 7.],
         [0., 3., 7.],
         [0., 3., 7.],
         [0., 3., 7.]],

        [[5., 1., 3.],
         [7., 1., 3.],
         [7., 3., 3.],
         [5., 3., 3.],
         [5., 1., 8.],
         [7., 1., 8.],
         [7., 3., 8.],
         [5., 3., 8.]]])
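The vertex layout in this output can be reproduced with a short pure-Python sketch: the four front-face corners at zmin are followed by the same four corners at the far depth, with every maximum coordinate shifted by one. The shift and the helper name `xyzxyz_to_vertices_plus` are assumptions drawn from the printed numbers, not from kornia's source.

```python
def xyzxyz_to_vertices_plus(box):
    """Convert (xmin, ymin, zmin, xmax, ymax, zmax) to eight (x, y, z) points.

    Assumption: far coordinates are stored as max - 1 ('+1' convention).
    """
    xmin, ymin, zmin, xmax, ymax, zmax = (float(v) for v in box)
    x_far, y_far, z_far = xmax - 1.0, ymax - 1.0, zmax - 1.0
    # One face, clockwise: top-left, top-right, bottom-right, bottom-left.
    face = [(xmin, ymin), (x_far, ymin), (x_far, y_far), (xmin, y_far)]
    front = [[x, y, zmin] for x, y in face]
    back = [[x, y, z_far] for x, y in face]
    return front + back


corners = xyzxyz_to_vertices_plus([5, 1, 3, 8, 4, 9])
for corner in corners:
    print(corner)
```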

get_boxes_shape()¶

Compute boxes depths, heights and widths.

Return type:

tuple[Tensor, Tensor, Tensor]

Returns:

Boxes depths, shape of \((N,)\) or \((B,N)\).
Boxes heights, shape of \((N,)\) or \((B,N)\).
Boxes widths, shape of \((N,)\) or \((B,N)\).

Example

>>> boxes_xyzxyz = torch.tensor([[ 0,  1,  2, 10, 21, 32], [3, 4, 5, 43, 54, 65]])
>>> boxes3d = Boxes3D.from_tensor(boxes_xyzxyz)
>>> boxes3d.get_boxes_shape()
(tensor([30., 60.]), tensor([20., 50.]), tensor([10., 40.]))
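The return order (depths, then heights, then widths) is easy to get wrong, so here is a pure-Python sketch that reproduces the numbers of the example from the ‘xyzxyz’ definition depth = zmax - zmin, height = ymax - ymin and width = xmax - xmin. The helper `xyzxyz_shapes` is hypothetical.

```python
def xyzxyz_shapes(boxes):
    """Return (depths, heights, widths) lists for 'xyzxyz' boxes."""
    depths = [zmax - zmin for _, _, zmin, _, _, zmax in boxes]
    heights = [ymax - ymin for _, ymin, _, _, ymax, _ in boxes]
    widths = [xmax - xmin for xmin, _, _, xmax, _, _ in boxes]
    return depths, heights, widths


boxes_xyzxyz = [[0, 1, 2, 10, 21, 32], [3, 4, 5, 43, 54, 65]]
depths, heights, widths = xyzxyz_shapes(boxes_xyzxyz)
print(depths, heights, widths)  # [30, 60] [20, 50] [10, 40]
```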

to(device=None, dtype=None)¶

Like torch.nn.Module.to() method.

Return type: Boxes3D

to_mask(depth, height, width)¶

Convert 3D boxes to masks. Covered area is 1 and the remaining is 0.

Parameters:

depth (int) – depth of the masked image/images.
height (int) – height of the masked image/images.
width (int) – width of the masked image/images.

Return type:

Tensor

Returns:

the output mask tensor, shape of \((N, depth, width, height)\) or \((B,N, depth, width, height)\) and dtype of Boxes3D.dtype (it can be any floating point dtype).

Note

It is currently non-differentiable.

Examples

>>> boxes = Boxes3D(torch.tensor([[  # Equivalent to boxes = Boxes3D.from_tensor([[1,1,1,3,3,2]])
...     [1., 1., 1.],
...     [3., 1., 1.],
...     [3., 3., 1.],
...     [1., 3., 1.],
...     [1., 1., 2.],
...     [3., 1., 2.],
...     [3., 3., 2.],
...     [1., 3., 2.],
... ]]))  # 1x8x3
>>> boxes.to_mask(4, 5, 5)
tensor([[[[0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.]],

         [[0., 0., 0., 0., 0.],
          [0., 1., 1., 1., 0.],
          [0., 1., 1., 1., 0.],
          [0., 1., 1., 1., 0.],
          [0., 0., 0., 0., 0.]],

         [[0., 0., 0., 0., 0.],
          [0., 1., 1., 1., 0.],
          [0., 1., 1., 1., 0.],
          [0., 1., 1., 1., 0.],
          [0., 0., 0., 0., 0.]],

         [[0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.]]]])
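The 3D mask above follows the same inclusive-bounds idea as the 2D case, extended along the depth axis. The pure-Python sketch below fills a depth x height x width volume for one axis-aligned box; `box3d_to_mask` is a hypothetical helper that indexes the volume as [z][y][x], and it is not kornia's implementation.

```python
def box3d_to_mask(vertices, depth, height, width):
    """Rasterize one axis-aligned 3D box into a nested-list volume of 0/1."""
    xs = [int(v[0]) for v in vertices]
    ys = [int(v[1]) for v in vertices]
    zs = [int(v[2]) for v in vertices]
    volume = [[[0.0] * width for _ in range(height)] for _ in range(depth)]
    # Inclusive bounds, clamped to the volume.
    for z in range(max(min(zs), 0), min(max(zs), depth - 1) + 1):
        for y in range(max(min(ys), 0), min(max(ys), height - 1) + 1):
            for x in range(max(min(xs), 0), min(max(xs), width - 1) + 1):
                volume[z][y][x] = 1.0
    return volume


corners = [
    (1.0, 1.0, 1.0), (3.0, 1.0, 1.0), (3.0, 3.0, 1.0), (1.0, 3.0, 1.0),
    (1.0, 1.0, 2.0), (3.0, 1.0, 2.0), (3.0, 3.0, 2.0), (1.0, 3.0, 2.0),
]
volume = box3d_to_mask(corners, 4, 5, 5)
filled = sum(cell for plane in volume for row in plane for cell in row)
print(filled)  # 18.0
```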

to_tensor(mode='xyzxyz')¶

Cast Boxes3D to a tensor. mode controls which 3D boxes format should be used to represent boxes in the tensor.

Parameters:

mode (str, optional) –
the output box format. Default: "xyzxyz"
- ’xyzxyz’: boxes are defined as xmin, ymin, zmin, xmax, ymax, zmax where width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.
- ’xyzxyz_plus’: similar to ‘xyzxyz’ mode but where box width, height and depth are defined as width = xmax - xmin + 1, height = ymax - ymin + 1 and depth = zmax - zmin + 1.
- ’xyzwhd’: boxes are defined as xmin, ymin, zmin, width, height, depth where width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.
- ’vertices’: boxes are defined by their vertices points in the following clockwise order: front-top-left, front-top-right, front-bottom-right, front-bottom-left, back-top-left, back-top-right, back-bottom-right, back-bottom-left. Vertices coordinates are in (x, y, z) order. Finally, box width, height and depth are defined as width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.
- ’vertices_plus’: similar to ‘vertices’ mode but where box width, height and depth are defined as width = xmax - xmin + 1, height = ymax - ymin + 1 and depth = zmax - zmin + 1.

Returns:

3D Boxes tensor in the mode format. The shape depends on the mode value:
‘vertices’ or ‘vertices_plus’: \((N, 8, 3)\) or \((B, N, 8, 3)\).
Any other value: \((N, 6)\) or \((B, N, 6)\).

Note

It is currently non-differentiable due to a bug. See GitHub issue #1304.

Examples

>>> boxes_xyzxyz = torch.as_tensor([[0, 3, 6, 1, 4, 8], [5, 1, 3, 8, 4, 9]])
>>> boxes = Boxes3D.from_tensor(boxes_xyzxyz, mode='xyzxyz')
>>> assert (boxes.to_tensor(mode='xyzxyz') == boxes_xyzxyz).all()

transform_boxes(M, inplace=False)¶

Apply a transformation matrix to the 3D boxes.

Parameters:

M (Tensor) – The transformation matrix to be applied, shape of \((4, 4)\) or \((B, 4, 4)\).
inplace (bool, optional) – do transform in-place and return self. Default: False

Return type:

Boxes3D

Returns:

The transformed boxes.

transform_boxes_(M)¶

Inplace version of Boxes3D.transform_boxes()

Return type: Boxes3D
