# kornia.geometry.boxes

Module with useful functionality for manipulating 2D and 3D bounding boxes.

class kornia.geometry.boxes.Boxes(boxes, raise_if_not_floating_point=True, mode='vertices_plus')

2D boxes containing N or BxN boxes.

Parameters
• boxes (Union[Tensor, List[Tensor]]) – 2D boxes, shape of $$(N, 4, 2)$$, $$(B, N, 4, 2)$$ or a list of $$(N, 4, 2)$$. See below for more details.

• raise_if_not_floating_point (bool, optional) – flag to control floating point casting behaviour when boxes is not a floating point tensor. True to raise an error when boxes isn’t a floating point tensor, False to cast to float. Default: True

• mode (str, optional) – the box format of the input boxes. Default: 'vertices_plus'

Note

2D boxes format is defined as a floating data type tensor of shape Nx4x2 or BxNx4x2 where each box is a quadrilateral defined by its 4 vertex coordinates (A, B, C, D). Coordinates must be in x, y order. The width and height of a box are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1. Examples of quadrilaterals are rectangles, rhombuses and trapezoids.
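As a rough illustration of the `+ 1` size convention in the note above, the plain-Python sketch below derives width and height from four (x, y) vertices. The helper `box_size_plus` is hypothetical and is not part of kornia; it only mirrors the arithmetic stated in the note.

```python
def box_size_plus(vertices):
    """Return (width, height) of a box given as four (x, y) vertices.

    Hypothetical helper, not a kornia function. It applies the inclusive
    convention: width = xmax - xmin + 1 and height = ymax - ymin + 1.
    """
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    return width, height


# A rectangle whose extreme corners sit on pixels (1, 1) and (4, 3).
rect = [(1.0, 1.0), (4.0, 1.0), (4.0, 3.0), (1.0, 3.0)]
size = box_size_plus(rect)  # 4 pixels wide, 3 pixels high
```

Under this convention a box whose four vertices coincide still has width and height 1, because it covers a single pixel.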

compute_area()

Compute the area of each box. Returns a tensor of shape $$(B, N)$$.

Return type

Tensor

property device: device

Returns boxes device.

Return type

device

property dtype: dtype

Returns boxes dtype.

Return type

dtype

classmethod from_tensor(boxes, mode='xyxy', validate_boxes=True)

Helper method to easily create Boxes from boxes stored in another format.

Parameters
• boxes (Union[Tensor, List[Tensor]]) – 2D boxes, shape of $$(N, 4)$$, $$(B, N, 4)$$, $$(N, 4, 2)$$ or $$(B, N, 4, 2)$$.

• mode (str, optional) –

The format in which the boxes are provided. Default: 'xyxy'

• ’xyxy’: boxes are assumed to be in the format xmin, ymin, xmax, ymax where width = xmax - xmin and height = ymax - ymin. With shape $$(N, 4)$$, $$(B, N, 4)$$.

• ’xyxy_plus’: similar to ‘xyxy’ mode but where box width and height are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1. With shape $$(N, 4)$$, $$(B, N, 4)$$.

• ’xywh’: boxes are assumed to be in the format xmin, ymin, width, height where width = xmax - xmin and height = ymax - ymin. With shape $$(N, 4)$$, $$(B, N, 4)$$.

• ’vertices’: boxes are defined by their vertices points in the following clockwise order: top-left, top-right, bottom-right, bottom-left. Vertices coordinates are in (x,y) order. Finally, box width and height are defined as width = xmax - xmin and height = ymax - ymin. With shape $$(N, 4, 2)$$ or $$(B, N, 4, 2)$$.

• ’vertices_plus’: similar to ‘vertices’ mode but where box width and height are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1. With shape $$(N, 4, 2)$$ or $$(B, N, 4, 2)$$.

• validate_boxes (bool, optional) – check if boxes are valid rectangles or not. Valid rectangles are those with width and height >= 1 (>= 2 when mode ends with ‘_plus’ suffix). Default: True

Return type

Boxes

Returns

Boxes class containing the original boxes in the format specified by mode.

Examples

>>> boxes_xyxy = torch.as_tensor([[0, 3, 1, 4], [5, 1, 8, 4]])
>>> boxes = Boxes.from_tensor(boxes_xyxy, mode='xyxy')
>>> boxes.data  # (2, 4, 2)
tensor([[[0., 3.],
         [0., 3.],
         [0., 3.],
         [0., 3.]],

        [[5., 1.],
         [7., 1.],
         [7., 3.],
         [5., 3.]]])
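The example output above can be reproduced by hand. The sketch below is a plain-Python approximation of the conversion, assuming (as the output suggests) that boxes given in 'xyxy' mode are stored internally as 'vertices_plus', so the maximum corner shifts by one. The helper name `xyxy_to_vertices_plus` is hypothetical.

```python
def xyxy_to_vertices_plus(box):
    """Convert (xmin, ymin, xmax, ymax) in 'xyxy' mode to 'vertices_plus' corners.

    Hypothetical helper, not kornia's implementation.
    """
    xmin, ymin, xmax, ymax = (float(v) for v in box)
    # 'xyxy' measures width as xmax - xmin, while 'vertices_plus' measures it
    # as xmax - xmin + 1, so the stored maximum corner is one unit smaller.
    xmax, ymax = xmax - 1.0, ymax - 1.0
    # Clockwise order: top-left, top-right, bottom-right, bottom-left.
    return [[xmin, ymin], [xmax, ymin], [xmax, ymax], [xmin, ymax]]


vertices = [xyxy_to_vertices_plus(b) for b in [[0, 3, 1, 4], [5, 1, 8, 4]]]
```

The first box has width and height 1 in 'xyxy' mode, which is why all four of its stored vertices collapse onto the same point.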

get_boxes_shape()

Compute boxes heights and widths.

Return type

Tuple[Tensor, Tensor]

Returns

• Boxes heights, shape of $$(N,)$$ or $$(B,N)$$.

• Boxes widths, shape of $$(N,)$$ or $$(B,N)$$.

Example

>>> boxes_xyxy = torch.tensor([[[1,1,2,2],[1,1,3,2]]])
>>> boxes = Boxes.from_tensor(boxes_xyxy)
>>> boxes.get_boxes_shape()
(tensor([[1., 1.]]), tensor([[1., 2.]]))

merge(boxes, inplace=False)

Merges boxes.

If the current instance holds $$(B, N, 4, 2)$$ and the incoming boxes hold $$(B, M, 4, 2)$$, the merge results in $$(B, N + M, 4, 2)$$.

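Parameters
• boxes (Boxes) – the 2D boxes to merge into the current instance.

• inplace (bool, optional) – do the merge in-place and return self. Default: False

Return type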

Boxes
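The shape arithmetic of a merge can be pictured as a concatenation along the box dimension. The following is a minimal sketch on nested Python lists, not kornia's implementation; `merge_boxes` is a hypothetical helper, and it assumes both inputs share the same batch size B.

```python
def merge_boxes(current, incoming):
    """Concatenate two batched box lists along the box dimension.

    Hypothetical helper. `current` is shaped like (B, N, 4, 2) and
    `incoming` like (B, M, 4, 2); the result is shaped like (B, N + M, 4, 2).
    """
    if len(current) != len(incoming):
        raise ValueError("both inputs must share the same batch size B")
    return [cur + inc for cur, inc in zip(current, incoming)]


unit = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
current = [[unit, unit], [unit, unit]]   # B = 2, N = 2
incoming = [[unit], [unit]]              # B = 2, M = 1
merged = merge_boxes(current, incoming)  # B = 2, N + M = 3
```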

to(device=None, dtype=None)

Like torch.nn.Module.to() method.

Return type

Boxes

to_mask(height, width)

Convert 2D boxes to masks. Covered area is 1 and the remaining is 0.

Parameters
• height (int) – height of the masked image(s).

• width (int) – width of the masked image(s).

Return type

Tensor

Returns

The output mask tensor, shape of $$(N, width, height)$$ or $$(B, N, width, height)$$, with the same dtype as Boxes.dtype (it can be any floating point dtype).

Note

It is currently non-differentiable.

Examples

>>> boxes = Boxes(torch.tensor([[  # Equivalent to boxes = Boxes.from_tensor([[1,1,4,3]])
...        [1., 1.],
...        [4., 1.],
...        [4., 3.],
...        [1., 3.],
...   ]]))  # 1x4x2
>>> boxes.to_mask(5, 5)
tensor([[[0., 0., 0., 0., 0.],
         [0., 1., 1., 1., 1.],
         [0., 1., 1., 1., 1.],
         [0., 1., 1., 1., 1.],
         [0., 0., 0., 0., 0.]]])
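The mask in the example above can be reproduced with a small rasterization loop. This is a sketch in plain Python that assumes an axis-aligned box whose vertex coordinates are inclusive pixel indices; `box_to_mask` is a hypothetical helper and does not reflect how kornia fills the mask internally.

```python
def box_to_mask(vertices, height, width):
    """Rasterize one axis-aligned box into a height x width mask of 0.0 / 1.0.

    Hypothetical helper. Vertex coordinates are treated as inclusive pixel
    indices, matching the 'vertices_plus' convention.
    """
    xs = [int(x) for x, _ in vertices]
    ys = [int(y) for _, y in vertices]
    xmin, xmax, ymin, ymax = min(xs), max(xs), min(ys), max(ys)
    mask = [[0.0] * width for _ in range(height)]
    # Clamp to the image so boxes that extend past the border do not fail.
    for row in range(max(ymin, 0), min(ymax, height - 1) + 1):
        for col in range(max(xmin, 0), min(xmax, width - 1) + 1):
            mask[row][col] = 1.0
    return mask


mask = box_to_mask([(1.0, 1.0), (4.0, 1.0), (4.0, 3.0), (1.0, 3.0)], 5, 5)
```

Rows 1 to 3 and columns 1 to 4 are set, which matches the documented output for this box.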


to_tensor(mode=None, as_padded_sequence=False)

Cast Boxes to a tensor. mode controls which 2D box format should be used to represent the boxes in the tensor.

Parameters
• mode (Optional[str], optional) –

The output box format. Default: None. It could be:

• ’xyxy’: boxes are defined as xmin, ymin, xmax, ymax where width = xmax - xmin and height = ymax - ymin.

• ’xyxy_plus’: similar to ‘xyxy’ mode but where box width and height are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1.

• ’xywh’: boxes are defined as xmin, ymin, width, height where width = xmax - xmin and height = ymax - ymin.

• ’vertices’: boxes are defined by their vertices points in the following clockwise order: top-left, top-right, bottom-right, bottom-left. Vertices coordinates are in (x,y) order. Finally, box width and height are defined as width = xmax - xmin and height = ymax - ymin.

• ’vertices_plus’: similar to ‘vertices’ mode but where box width and height are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1.

• as_padded_sequence (bool, optional) – whether to keep the pads for a list of boxes. This parameter is only valid if the boxes are from a box list. Default: False

Return type

Tensor

Returns

Boxes tensor in the mode format. The shape depends on the mode value:

• ‘vertices’ or ‘vertices_plus’: $$(N, 4, 2)$$ or $$(B, N, 4, 2)$$.

• Any other value: $$(N, 4)$$ or $$(B, N, 4)$$.

Examples

>>> boxes_xyxy = torch.as_tensor([[0, 3, 1, 4], [5, 1, 8, 4]])
>>> boxes = Boxes.from_tensor(boxes_xyxy)
>>> assert (boxes_xyxy == boxes.to_tensor(mode='xyxy')).all()
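To make the relation between the flat formats concrete, the sketch below rewrites one box from 'xyxy' into 'xyxy_plus' and 'xywh' using only the width and height definitions listed above. The three helpers are hypothetical plain-Python functions, not kornia calls.

```python
def xyxy_to_xyxy_plus(box):
    """'xyxy_plus' keeps the same size with width = xmax - xmin + 1."""
    xmin, ymin, xmax, ymax = box
    return [xmin, ymin, xmax - 1, ymax - 1]


def xyxy_to_xywh(box):
    """'xywh' stores the size directly: width = xmax - xmin, height = ymax - ymin."""
    xmin, ymin, xmax, ymax = box
    return [xmin, ymin, xmax - xmin, ymax - ymin]


def xywh_to_xyxy(box):
    """Inverse of xyxy_to_xywh."""
    xmin, ymin, width, height = box
    return [xmin, ymin, xmin + width, ymin + height]


box = [5, 1, 8, 4]                   # second box of the example, 'xyxy'
as_plus = xyxy_to_xyxy_plus(box)     # same box in 'xyxy_plus'
as_xywh = xyxy_to_xywh(box)          # same box in 'xywh'
round_trip = xywh_to_xyxy(as_xywh)   # back to 'xyxy'
```

All three encodings describe a box of width 3 and height 3; only the bookkeeping of the maximum corner differs.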

transform_boxes(M, inplace=False)

Apply a transformation matrix to the 2D boxes.

Parameters
• M (Tensor) – The transformation matrix to be applied, shape of $$(3, 3)$$ or $$(B, 3, 3)$$.

• inplace (bool, optional) – do transform in-place and return self. Default: False

Return type

Boxes

Returns

The transformed boxes.
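Applying a $$(3, 3)$$ matrix to 2D vertices is the usual homogeneous-coordinate product. The sketch below shows that arithmetic for a single box in plain Python; `transform_vertices` is a hypothetical helper and the matrix values are made up for illustration.

```python
def transform_vertices(vertices, M):
    """Apply a 3x3 homogeneous transform M to a list of (x, y) vertices.

    Hypothetical helper, not kornia's implementation.
    """
    out = []
    for x, y in vertices:
        xh = M[0][0] * x + M[0][1] * y + M[0][2]
        yh = M[1][0] * x + M[1][1] * y + M[1][2]
        w = M[2][0] * x + M[2][1] * y + M[2][2]
        # Divide by the homogeneous coordinate to return to 2D.
        out.append([xh / w, yh / w])
    return out


# Scale by 2, then shift by (+1, -1).
M = [[2.0, 0.0, 1.0],
     [0.0, 2.0, -1.0],
     [0.0, 0.0, 1.0]]
box = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
moved = transform_vertices(box, M)
```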

transform_boxes_(M)

Inplace version of Boxes.transform_boxes().

Return type

Boxes

translate(size, method='warp', inplace=False)

Translates boxes by the provided size.

Parameters
• size (Tensor) – translate size for x, y direction, shape of $$(B, 2)$$.

• method (str, optional) – “warp” or “fast”. Default: 'warp'

• inplace (bool, optional) – do transform in-place and return self. Default: False

Return type

Boxes

Returns

The transformed boxes.

trim(correspondence_preserve=False, inplace=False)

Trim out zero padded boxes.

Given box arrangements of shape $$(4, 4, Box)$$:

    – Box – Box – Box – Box –
    –  0  –  0  – Box – Box –
    –  0  – Box –  0  –  0  –
    –  0  –  0  –  0  –  0  –

If correspondence_preserve is True, the remaining boxes keep their positions. Only pure zero layers will be removed, resulting in shape $$(4, 3, Box)$$:

    – Box – Box – Box – Box –
    –  0  –  0  – Box – Box –
    –  0  – Box –  0  –  0  –

Otherwise, you will get $$(4, 2, Box)$$:

    – Box – Box – Box – Box –
    –  0  – Box – Box – Box –

Return type

Boxes
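The two trimming behaviours can be sketched on a small grid of placeholders. This is an illustration only, under two assumptions not stated in the text: the first dimension is the batch B, and each row of the diagrams is one position along N (so each inner list below is one column of the diagram). The helpers `trim_preserve` and `trim_compact` are hypothetical.

```python
def trim_preserve(grid):
    """Drop positions that are zero padding in every batch; keep the rest in place."""
    n_total = len(grid[0])
    keep = [n for n in range(n_total) if any(batch[n] for batch in grid)]
    return [[batch[n] for n in keep] for batch in grid]


def trim_compact(grid):
    """Pack each batch's boxes to the front, then pad to the longest batch."""
    packed = [[box for box in batch if box] for batch in grid]
    longest = max(len(batch) for batch in packed)
    return [batch + [0] * (longest - len(batch)) for batch in packed]


# grid[b][n]: "Box" marks a real box, 0 marks zero padding.
grid = [
    ["Box", 0, 0, 0],
    ["Box", 0, "Box", 0],
    ["Box", "Box", 0, 0],
    ["Box", "Box", 0, 0],
]
preserved = trim_preserve(grid)  # shape (4, 3): only the all-zero layer is gone
compacted = trim_compact(grid)   # shape (4, 2): positions are no longer aligned
```

The compact form is smaller, but a box's index along N no longer tells you where it sat in the original tensor.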

class kornia.geometry.boxes.Boxes3D(boxes, raise_if_not_floating_point=True, mode='xyzxyz_plus')

3D boxes containing N or BxN boxes.

Parameters
• boxes (Tensor) – 3D boxes, shape of $$(N,8,3)$$ or $$(B,N,8,3)$$. See below for more details.

• raise_if_not_floating_point (bool, optional) – flag to control floating point casting behaviour when boxes is not a floating point tensor. True to raise an error when boxes isn’t a floating point tensor, False to cast to float. Default: True

Note

3D boxes format is defined as a floating data type tensor of shape Nx8x3 or BxNx8x3 where each box is a hexahedron defined by its 8 vertex coordinates. Coordinates must be in x, y, z order. The width, height and depth of a box are defined as width = xmax - xmin + 1, height = ymax - ymin + 1 and depth = zmax - zmin + 1. Examples of hexahedrons are cubes and rhombohedrons.

property device: device

Returns boxes device.

Return type

device

property dtype: dtype

Returns boxes dtype.

Return type

dtype

classmethod from_tensor(boxes, mode='xyzxyz', validate_boxes=True)

Helper method to easily create Boxes3D from 3D boxes stored in another format.

Parameters
• boxes (Tensor) – 3D boxes, shape of $$(N,6)$$ or $$(B,N,6)$$.

• mode (str, optional) –

The format in which the 3D boxes are provided. Default: 'xyzxyz'

• ’xyzxyz’: boxes are assumed to be in the format xmin, ymin, zmin, xmax, ymax, zmax where width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.

• ’xyzxyz_plus’: similar to ‘xyzxyz’ mode but where box width, height and depth are defined as width = xmax - xmin + 1, height = ymax - ymin + 1 and depth = zmax - zmin + 1.

• ’xyzwhd’: boxes are assumed to be in the format xmin, ymin, zmin, width, height, depth where width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.

• validate_boxes (bool, optional) – check if boxes are valid rectangles or not. Valid rectangles are those with width, height and depth >= 1 (>= 2 when mode ends with ‘_plus’ suffix). Default: True

Return type

Boxes3D

Returns

Boxes3D class containing the original boxes in the format specified by mode.

Examples

>>> boxes_xyzxyz = torch.as_tensor([[0, 3, 6, 1, 4, 8], [5, 1, 3, 8, 4, 9]])
>>> boxes = Boxes3D.from_tensor(boxes_xyzxyz, mode='xyzxyz')
>>> boxes.data  # (2, 8, 3)
tensor([[[0., 3., 6.],
         [0., 3., 6.],
         [0., 3., 6.],
         [0., 3., 6.],
         [0., 3., 7.],
         [0., 3., 7.],
         [0., 3., 7.],
         [0., 3., 7.]],

        [[5., 1., 3.],
         [7., 1., 3.],
         [7., 3., 3.],
         [5., 3., 3.],
         [5., 1., 8.],
         [7., 1., 8.],
         [7., 3., 8.],
         [5., 3., 8.]]])
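The eight vertices in the example output follow a front face (at zmin) then back face (at zmax) ordering. The sketch below rebuilds the second box by hand, assuming that 'xyzxyz' input is stored with the `+ 1` convention so each maximum coordinate shifts by one. `xyzxyz_to_vertices_plus` is a hypothetical helper, not a kornia function.

```python
def xyzxyz_to_vertices_plus(box):
    """Convert (xmin, ymin, zmin, xmax, ymax, zmax) to eight (x, y, z) vertices.

    Hypothetical helper. The maximum corner is reduced by one because the
    stored vertices use the inclusive 'plus' size convention.
    """
    xmin, ymin, zmin, xmax, ymax, zmax = (float(v) for v in box)
    xmax, ymax, zmax = xmax - 1.0, ymax - 1.0, zmax - 1.0
    # Front face: top-left, top-right, bottom-right, bottom-left at z = zmin.
    front = [
        [xmin, ymin, zmin],
        [xmax, ymin, zmin],
        [xmax, ymax, zmin],
        [xmin, ymax, zmin],
    ]
    # Back face: the same four corners at z = zmax.
    back = [[x, y, zmax] for x, y, _ in front]
    return front + back


vertices = xyzxyz_to_vertices_plus([5, 1, 3, 8, 4, 9])
```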

get_boxes_shape()

Compute boxes depths, heights and widths.

Return type

Tuple[Tensor, Tensor, Tensor]

Returns

• Boxes depths, shape of $$(N,)$$ or $$(B,N)$$.

• Boxes heights, shape of $$(N,)$$ or $$(B,N)$$.

• Boxes widths, shape of $$(N,)$$ or $$(B,N)$$.

Example

>>> boxes_xyzxyz = torch.tensor([[ 0,  1,  2, 10, 21, 32], [3, 4, 5, 43, 54, 65]])
>>> boxes3d = Boxes3D.from_tensor(boxes_xyzxyz)
>>> boxes3d.get_boxes_shape()
(tensor([30., 60.]), tensor([20., 50.]), tensor([10., 40.]))
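The values in the example above follow directly from the 'xyzxyz' definitions, and the return order is depth, height, width. A minimal plain-Python check, using a hypothetical helper `shape_from_xyzxyz`:

```python
def shape_from_xyzxyz(boxes):
    """Return (depths, heights, widths) for boxes given in 'xyzxyz' mode.

    Hypothetical helper: depth = zmax - zmin, height = ymax - ymin,
    width = xmax - xmin.
    """
    depths = [b[5] - b[2] for b in boxes]
    heights = [b[4] - b[1] for b in boxes]
    widths = [b[3] - b[0] for b in boxes]
    return depths, heights, widths


depths, heights, widths = shape_from_xyzxyz(
    [[0, 1, 2, 10, 21, 32], [3, 4, 5, 43, 54, 65]]
)
```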

to(device=None, dtype=None)

Like torch.nn.Module.to() method.

Return type

Boxes3D

to_mask(depth, height, width)

Convert 3D boxes to masks. Covered area is 1 and the remaining is 0.

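Parameters
• depth (int) – depth of the masked image(s).

• height (int) – height of the masked image(s).

• width (int) – width of the masked image(s).

Return type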

Tensor

Returns

The output mask tensor, shape of $$(N, depth, width, height)$$ or $$(B, N, depth, width, height)$$, with the same dtype as Boxes3D.dtype (it can be any floating point dtype).

Note

It is currently non-differentiable.

Examples

>>> boxes = Boxes3D(torch.tensor([[  # Equivalent to boxes = Boxes3D.from_tensor([[1,1,1,3,3,2]])
...     [1., 1., 1.],
...     [3., 1., 1.],
...     [3., 3., 1.],
...     [1., 3., 1.],
...     [1., 1., 2.],
...     [3., 1., 2.],
...     [3., 3., 2.],
...     [1., 3., 2.],
... ]]))  # 1x8x3
>>> boxes.to_mask(4, 5, 5)
tensor([[[[0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.]],

         [[0., 0., 0., 0., 0.],
          [0., 1., 1., 1., 0.],
          [0., 1., 1., 1., 0.],
          [0., 1., 1., 1., 0.],
          [0., 0., 0., 0., 0.]],

         [[0., 0., 0., 0., 0.],
          [0., 1., 1., 1., 0.],
          [0., 1., 1., 1., 0.],
          [0., 1., 1., 1., 0.],
          [0., 0., 0., 0., 0.]],

         [[0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.]]]])

to_tensor(mode='xyzxyz')

Cast Boxes3D to a tensor. mode controls which 3D box format should be used to represent the boxes in the tensor.

Parameters

mode (str, optional) –

The output box format. Default: 'xyzxyz'

• ’xyzxyz’: boxes are assumed to be in the format xmin, ymin, zmin, xmax, ymax, zmax where width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.

• ’xyzxyz_plus’: similar to ‘xyzxyz’ mode but where box width, height and depth are defined as width = xmax - xmin + 1, height = ymax - ymin + 1 and depth = zmax - zmin + 1.

• ’xyzwhd’: boxes are assumed to be in the format xmin, ymin, zmin, width, height, depth where width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.

• ’vertices’: boxes are defined by their vertex points in the following clockwise order: front-top-left, front-top-right, front-bottom-right, front-bottom-left, back-top-left, back-top-right, back-bottom-right, back-bottom-left. Vertex coordinates are in (x, y, z) order. Finally, box width, height and depth are defined as width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.

• ’vertices_plus’: similar to ‘vertices’ mode but where box width, height and depth are defined as width = xmax - xmin + 1, height = ymax - ymin + 1 and depth = zmax - zmin + 1.

Return type

Tensor

Returns

3D Boxes tensor in the mode format. The shape depends on the mode value:

• ‘vertices’ or ‘vertices_plus’: $$(N, 8, 3)$$ or $$(B, N, 8, 3)$$.

• Any other value: $$(N, 6)$$ or $$(B, N, 6)$$.

Note

It is currently non-differentiable due to a bug. See GitHub issue #1304.

Examples

>>> boxes_xyzxyz = torch.as_tensor([[0, 3, 6, 1, 4, 8], [5, 1, 3, 8, 4, 9]])
>>> boxes = Boxes3D.from_tensor(boxes_xyzxyz, mode='xyzxyz')
>>> assert (boxes.to_tensor(mode='xyzxyz') == boxes_xyzxyz).all()

transform_boxes(M, inplace=False)

Apply a transformation matrix to the 3D boxes.

Parameters
• M (Tensor) – The transformation matrix to be applied, shape of $$(4, 4)$$ or $$(B, 4, 4)$$.

• inplace (bool, optional) – do transform in-place and return self. Default: False

Return type

Boxes3D

Returns

The transformed boxes.
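For 3D boxes the same idea uses a $$(4, 4)$$ homogeneous matrix. The sketch below applies a pure translation to two corner points in plain Python; `transform_points_3d` is a hypothetical helper and the matrix is an illustrative choice, not a value taken from kornia.

```python
def transform_points_3d(points, M):
    """Apply a 4x4 homogeneous transform M to a list of (x, y, z) points.

    Hypothetical helper, not kornia's implementation.
    """
    out = []
    for x, y, z in points:
        p = (x, y, z, 1.0)
        # One dot product per matrix row gives the homogeneous result.
        xh, yh, zh, w = (sum(M[r][c] * p[c] for c in range(4)) for r in range(4))
        out.append([xh / w, yh / w, zh / w])
    return out


# Translation by (1, 2, 3).
M = [[1.0, 0.0, 0.0, 1.0],
     [0.0, 1.0, 0.0, 2.0],
     [0.0, 0.0, 1.0, 3.0],
     [0.0, 0.0, 0.0, 1.0]]
corners = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
moved = transform_points_3d(corners, M)
```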

transform_boxes_(M)

Inplace version of Boxes3D.transform_boxes().

Return type

Boxes3D