kornia.geometry.bbox

Module with utilities for manipulating 2D and 3D bounding boxes.

kornia.geometry.bbox.bbox_generator(x_start, y_start, width, height)

Generate 2D bounding boxes according to the provided start coords, width and height.

Parameters:
  • x_start (Tensor) – a tensor containing the x coordinates of the bounding boxes to be extracted. Shape must be a scalar tensor or \((B,)\).

  • y_start (Tensor) – a tensor containing the y coordinates of the bounding boxes to be extracted. Shape must be a scalar tensor or \((B,)\).

  • width (Tensor) – widths of the bounding boxes. Shape must be a scalar tensor or \((B,)\).

  • height (Tensor) – heights of the bounding boxes. Shape must be a scalar tensor or \((B,)\).

Return type:

Tensor

Returns:

the bounding box tensor \((B, 4, 2)\), with the corners of each box in clockwise order starting from the top-left.

Examples

>>> x_start = torch.tensor([0, 1])
>>> y_start = torch.tensor([1, 0])
>>> width = torch.tensor([5, 3])
>>> height = torch.tensor([7, 4])
>>> bbox_generator(x_start, y_start, width, height)
tensor([[[0, 1],
         [4, 1],
         [4, 7],
         [0, 7]],

        [[1, 0],
         [3, 0],
         [3, 3],
         [1, 3]]])
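
The corner layout in the output above can be reproduced with a few lines of plain Python. This is a minimal sketch of the convention inferred from the example (far corner = start + size - 1), not kornia's implementation; `corners_2d` is a hypothetical helper introduced only for illustration.

```python
def corners_2d(x_start, y_start, width, height):
    """Return the corners as [x, y] pairs: top-left, top-right, bottom-right, bottom-left."""
    # Assumption inferred from the example output: the far edge is start + size - 1.
    x_end = x_start + width - 1
    y_end = y_start + height - 1
    return [[x_start, y_start], [x_end, y_start], [x_end, y_end], [x_start, y_end]]


# Same inputs as the example above, one box per batch entry.
boxes = [corners_2d(x, y, w, h) for x, y, w, h in zip([0, 1], [1, 0], [5, 3], [7, 4])]
print(boxes)
```
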
kornia.geometry.bbox.bbox_generator3d(x_start, y_start, z_start, width, height, depth)

Generate 3D bounding boxes according to the provided start coords, width, height and depth.

Parameters:
  • x_start (Tensor) – a tensor containing the x coordinates of the bounding boxes to be extracted. Shape must be a scalar tensor or \((B,)\).

  • y_start (Tensor) – a tensor containing the y coordinates of the bounding boxes to be extracted. Shape must be a scalar tensor or \((B,)\).

  • z_start (Tensor) – a tensor containing the z coordinates of the bounding boxes to be extracted. Shape must be a scalar tensor or \((B,)\).

  • width (Tensor) – widths of the bounding boxes. Shape must be a scalar tensor or \((B,)\).

  • height (Tensor) – heights of the bounding boxes. Shape must be a scalar tensor or \((B,)\).

  • depth (Tensor) – depths of the bounding boxes. Shape must be a scalar tensor or \((B,)\).

Return type:

Tensor

Returns:

the 3D bounding box tensor \((B, 8, 3)\).

Examples

>>> x_start = torch.tensor([0, 3])
>>> y_start = torch.tensor([1, 4])
>>> z_start = torch.tensor([2, 5])
>>> width = torch.tensor([10, 40])
>>> height = torch.tensor([20, 50])
>>> depth = torch.tensor([30, 60])
>>> bbox_generator3d(x_start, y_start, z_start, width, height, depth)
tensor([[[ 0,  1,  2],
         [10,  1,  2],
         [10, 21,  2],
         [ 0, 21,  2],
         [ 0,  1, 32],
         [10,  1, 32],
         [10, 21, 32],
         [ 0, 21, 32]],

        [[ 3,  4,  5],
         [43,  4,  5],
         [43, 54,  5],
         [ 3, 54,  5],
         [ 3,  4, 65],
         [43,  4, 65],
         [43, 54, 65],
         [ 3, 54, 65]]])
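
The eight-corner ordering (front face first, then back face, each clockwise from the top-left) can be sketched in plain Python. Judging only from the example output above, the far corners equal start + size, without the "- 1" seen in the 2D example; the sketch follows that observation as an assumption. `corners_3d` is a hypothetical helper, not part of kornia.

```python
def corners_3d(x_start, y_start, z_start, width, height, depth):
    """Return 8 [x, y, z] corners: the front face, then the back face."""
    # Assumption taken from the example output: far corner = start + size.
    x_end, y_end, z_end = x_start + width, y_start + height, z_start + depth
    # One face, clockwise: top-left, top-right, bottom-right, bottom-left.
    face = [(x_start, y_start), (x_end, y_start), (x_end, y_end), (x_start, y_end)]
    return [[x, y, z] for z in (z_start, z_end) for x, y in face]


box = corners_3d(0, 1, 2, 10, 20, 30)
for corner in box:
    print(corner)
```
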
kornia.geometry.bbox.bbox_to_mask(boxes, width, height)

Convert 2D bounding boxes to masks. The area covered by a box is set to 1.0 and the remainder to 0.0.

Parameters:
  • boxes (Tensor) – a tensor containing the coordinates of the bounding boxes to be extracted. The tensor must have the shape of Bx4x2, where each box is defined in the following clockwise order: top-left, top-right, bottom-right and bottom-left. The coordinates must be in the x, y order.

  • width (int) – width of the masked image.

  • height (int) – height of the masked image.

Return type:

Tensor

Returns:

the output mask tensor.

Note

This function is currently non-differentiable.

Examples

>>> boxes = torch.tensor([[
...        [1., 1.],
...        [3., 1.],
...        [3., 2.],
...        [1., 2.],
...   ]])  # 1x4x2
>>> bbox_to_mask(boxes, 5, 5)
tensor([[[0., 0., 0., 0., 0.],
         [0., 1., 1., 1., 0.],
         [0., 1., 1., 1., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]]])
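
To make the rasterization explicit, here is a minimal plain-Python sketch that fills a single mask from one box, treating both corner coordinates as inclusive (as the example output suggests). It is an illustration under that assumption, not kornia's implementation; `box_to_mask` is a hypothetical helper.

```python
def box_to_mask(box, width, height):
    """Rasterize one axis-aligned box given as 4 [x, y] corners, clockwise from top-left."""
    (x0, y0), _, (x1, y1), _ = box  # only top-left and bottom-right are needed
    mask = [[0.0] * width for _ in range(height)]
    # Clamp to the image bounds; both ends of each range are inclusive.
    for y in range(max(int(y0), 0), min(int(y1), height - 1) + 1):
        for x in range(max(int(x0), 0), min(int(x1), width - 1) + 1):
            mask[y][x] = 1.0
    return mask


mask = box_to_mask([[1.0, 1.0], [3.0, 1.0], [3.0, 2.0], [1.0, 2.0]], width=5, height=5)
for row in mask:
    print(row)
```
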
kornia.geometry.bbox.bbox_to_mask3d(boxes, size)

Convert 3D bounding boxes to masks. The volume covered by a box is set to 1.0 and the remainder to 0.0.

Parameters:
  • boxes (Tensor) – a tensor containing the coordinates of the bounding boxes to be extracted. The tensor must have the shape of Bx8x3, where each box is defined in the following clockwise order: front-top-left, front-top-right, front-bottom-right, front-bottom-left, back-top-left, back-top-right, back-bottom-right, back-bottom-left. The coordinates must be in the x, y, z order.

  • size (tuple[int, int, int]) – depth, height and width of the masked image.

Return type:

Tensor

Returns:

the output mask tensor.

Examples

>>> boxes = torch.tensor([[
...     [1., 1., 1.],
...     [2., 1., 1.],
...     [2., 2., 1.],
...     [1., 2., 1.],
...     [1., 1., 2.],
...     [2., 1., 2.],
...     [2., 2., 2.],
...     [1., 2., 2.],
... ]])  # 1x8x3
>>> bbox_to_mask3d(boxes, (4, 5, 5))
tensor([[[[[0., 0., 0., 0., 0.],
           [0., 0., 0., 0., 0.],
           [0., 0., 0., 0., 0.],
           [0., 0., 0., 0., 0.],
           [0., 0., 0., 0., 0.]],

          [[0., 0., 0., 0., 0.],
           [0., 1., 1., 0., 0.],
           [0., 1., 1., 0., 0.],
           [0., 0., 0., 0., 0.],
           [0., 0., 0., 0., 0.]],

          [[0., 0., 0., 0., 0.],
           [0., 1., 1., 0., 0.],
           [0., 1., 1., 0., 0.],
           [0., 0., 0., 0., 0.],
           [0., 0., 0., 0., 0.]],

          [[0., 0., 0., 0., 0.],
           [0., 0., 0., 0., 0.],
           [0., 0., 0., 0., 0.],
           [0., 0., 0., 0., 0.],
           [0., 0., 0., 0., 0.]]]]])
kornia.geometry.bbox.infer_bbox_shape(boxes)

Auto-infer the output sizes for the given 2D bounding boxes.

Parameters:

boxes (Tensor) – a tensor containing the coordinates of the bounding boxes to be extracted. The tensor must have the shape of Bx4x2, where each box is defined in the following clockwise order: top-left, top-right, bottom-right, bottom-left. The coordinates must be in the x, y order.

Return type:

tuple[Tensor, Tensor]

Returns:

  • Bounding box heights, shape of \((B,)\).

  • Bounding box widths, shape of \((B,)\).

Example

>>> boxes = torch.tensor([[
...     [1., 1.],
...     [2., 1.],
...     [2., 2.],
...     [1., 2.],
... ], [
...     [1., 1.],
...     [3., 1.],
...     [3., 2.],
...     [1., 2.],
... ]])  # 2x4x2
>>> infer_bbox_shape(boxes)
(tensor([2., 2.]), tensor([2., 3.]))
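
The numbers in this example follow from treating both corner coordinates as inclusive, so each size is a coordinate difference plus one. The plain-Python sketch below reproduces them under that assumption; `box_shape_2d` is a hypothetical helper, and the return order (height first, then width) mirrors the documented return value.

```python
def box_shape_2d(box):
    """Return (height, width) of one box given as 4 [x, y] corners, clockwise from top-left."""
    (x_tl, y_tl), (x_tr, _), _, (_, y_bl) = box
    # Assumption inferred from the example: sizes count both end pixels, hence the + 1.
    height = y_bl - y_tl + 1
    width = x_tr - x_tl + 1
    return height, width


boxes = [
    [[1.0, 1.0], [2.0, 1.0], [2.0, 2.0], [1.0, 2.0]],
    [[1.0, 1.0], [3.0, 1.0], [3.0, 2.0], [1.0, 2.0]],
]
heights, widths = zip(*(box_shape_2d(b) for b in boxes))
print(heights, widths)
```
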
kornia.geometry.bbox.infer_bbox_shape3d(boxes)

Auto-infer the output sizes for the given 3D bounding boxes.

Parameters:

boxes (Tensor) – a tensor containing the coordinates of the bounding boxes to be extracted. The tensor must have the shape of Bx8x3, where each box is defined in the following clockwise order: front-top-left, front-top-right, front-bottom-right, front-bottom-left, back-top-left, back-top-right, back-bottom-right, back-bottom-left. The coordinates must be in the x, y, z order.

Return type:

tuple[Tensor, Tensor, Tensor]

Returns:

  • Bounding box depths, shape of \((B,)\).

  • Bounding box heights, shape of \((B,)\).

  • Bounding box widths, shape of \((B,)\).

Example

>>> boxes = torch.tensor([[[ 0,  1,  2],
...         [10,  1,  2],
...         [10, 21,  2],
...         [ 0, 21,  2],
...         [ 0,  1, 32],
...         [10,  1, 32],
...         [10, 21, 32],
...         [ 0, 21, 32]],
...        [[ 3,  4,  5],
...         [43,  4,  5],
...         [43, 54,  5],
...         [ 3, 54,  5],
...         [ 3,  4, 65],
...         [43,  4, 65],
...         [43, 54, 65],
...         [ 3, 54, 65]]]) # 2x8x3
>>> infer_bbox_shape3d(boxes)
(tensor([31, 61]), tensor([21, 51]), tensor([11, 41]))
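
Note that the return order is (depth, height, width), the reverse of the x, y, z coordinate order. A minimal plain-Python sketch for a single box, assuming the same inclusive "+ 1" convention visible in the example output; `box_shape_3d` is a hypothetical helper, not a kornia function.

```python
def box_shape_3d(box):
    """Return (depth, height, width) of one box given as 8 [x, y, z] corners."""
    front_top_left, front_top_right, front_bottom_right = box[0], box[1], box[2]
    back_top_left = box[4]
    width = front_top_right[0] - front_top_left[0] + 1       # extent along x
    height = front_bottom_right[1] - front_top_left[1] + 1   # extent along y
    depth = back_top_left[2] - front_top_left[2] + 1         # extent along z
    return depth, height, width


# First box of the example above.
box = [[0, 1, 2], [10, 1, 2], [10, 21, 2], [0, 21, 2],
       [0, 1, 32], [10, 1, 32], [10, 21, 32], [0, 21, 32]]
shape = box_shape_3d(box)
print(shape)
```
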
kornia.geometry.bbox.nms(boxes, scores, iou_threshold)

Perform non-maxima suppression (NMS) on a given tensor of bounding boxes according to the intersection-over-union (IoU).

Parameters:
  • boxes (Tensor) – tensor containing the encoded bounding boxes with shape \((N, 4)\), each row in \((x_1, y_1, x_2, y_2)\) format.

  • scores (Tensor) – tensor containing the scores associated with each bounding box, with shape \((N,)\).

  • iou_threshold (float) – the threshold above which overlapping boxes are discarded.

Return type:

Tensor

Returns:

A tensor with the indices of the boxes (and scores) to keep from the input set.

Example

>>> boxes = torch.tensor([
...     [10., 10., 20., 20.],
...     [15., 5., 15., 25.],
...     [100., 100., 200., 200.],
...     [100., 100., 200., 200.]])
>>> scores = torch.tensor([0.9, 0.8, 0.7, 0.9])
>>> nms(boxes, scores, iou_threshold=0.8)
tensor([0, 3, 1])
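
The result above can be traced by hand with a greedy NMS loop: take the highest-scoring box, drop every remaining box whose IoU with it exceeds the threshold, and repeat. The plain-Python sketch below illustrates that general technique; it is not kornia's code, and `iou` and `greedy_nms` are hypothetical helpers. Ties in score are broken by input order here, which is an assumption.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(min(a[2], b[2]) - max(a[0], b[0]), 0.0)
    inter_h = max(min(a[3], b[3]) - max(a[1], b[1]), 0.0)
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold):
    # Visit boxes from highest to lowest score (ties keep their input order).
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


boxes = [
    [10.0, 10.0, 20.0, 20.0],
    [15.0, 5.0, 15.0, 25.0],
    [100.0, 100.0, 200.0, 200.0],
    [100.0, 100.0, 200.0, 200.0],
]
scores = [0.9, 0.8, 0.7, 0.9]
keep = greedy_nms(boxes, scores, iou_threshold=0.8)
print(keep)
```

Box 2 is dropped because it coincides with the higher-scoring box 3 (IoU of 1.0), while box 1 has zero area and therefore zero overlap with everything else.
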
kornia.geometry.bbox.transform_bbox(trans_mat, boxes, mode='xyxy', restore_coordinates=None)

Apply a transformation matrix to a box or batch of boxes.

Parameters:
  • trans_mat (Tensor) – The transformation matrix to be applied with a shape of \((3, 3)\) or batched as \((B, 3, 3)\).

  • boxes (Tensor) – The boxes to be transformed with a common shape of \((N, 4)\) or batched as \((B, N, 4)\), the polygon shape of \((B, N, 4, 2)\) is also supported.

  • mode (str, optional) – The format in which the boxes are provided. If set to ‘xyxy’, the boxes are assumed to be in the format xmin, ymin, xmax, ymax. If set to ‘xywh’, the boxes are assumed to be in the format xmin, ymin, width, height. Default: "xyxy"

  • restore_coordinates (Optional[bool], optional) – If the transformation flips the boxes, add a post-processing step that restores the coordinates to a valid bounding box. Default: None

Return type:

Tensor

Returns:

The transformed boxes, in the format specified by mode.
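
Since this entry has no example, the following plain-Python sketch illustrates what applying a \((3, 3)\) matrix to an ‘xyxy’ box means, and why a flip can leave the coordinates in an invalid order. It is a conceptual illustration only, not kornia's implementation: `apply_matrix` and `transform_xyxy` are hypothetical helpers, and the `restore` flag merely mirrors the idea described for restore_coordinates.

```python
def apply_matrix(mat, x, y):
    """Apply a 3x3 matrix (nested lists) to the point (x, y) in homogeneous coordinates."""
    xh = mat[0][0] * x + mat[0][1] * y + mat[0][2]
    yh = mat[1][0] * x + mat[1][1] * y + mat[1][2]
    w = mat[2][0] * x + mat[2][1] * y + mat[2][2]
    return xh / w, yh / w


def transform_xyxy(mat, box, restore=True):
    """Transform the two defining corners of an (xmin, ymin, xmax, ymax) box."""
    xa, ya = apply_matrix(mat, box[0], box[1])
    xb, yb = apply_matrix(mat, box[2], box[3])
    if restore:
        # Re-sort so that the result is again (xmin, ymin, xmax, ymax).
        return [min(xa, xb), min(ya, yb), max(xa, xb), max(ya, yb)]
    return [xa, ya, xb, yb]


# Horizontal flip of a 10-unit-wide frame: x' = 10 - x.
hflip = [[-1.0, 0.0, 10.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
flipped_raw = transform_xyxy(hflip, [1.0, 1.0, 3.0, 2.0], restore=False)
flipped_fixed = transform_xyxy(hflip, [1.0, 1.0, 3.0, 2.0], restore=True)
print(flipped_raw, flipped_fixed)
```

Without the restore step the flipped box has xmin greater than xmax; with it, the box is valid again.
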

kornia.geometry.bbox.validate_bbox(boxes)

Validate whether a 2D bounding box is usable. This function checks whether the boxes are rectangular.

Parameters:

boxes (Tensor) – a tensor containing the coordinates of the bounding boxes to be extracted. The tensor must have the shape of Bx4x2, where each box is defined in the following clockwise order: top-left, top-right, bottom-right, bottom-left. The coordinates must be in the x, y order.

Return type:

bool
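
One plausible way to test rectangularity for a single box is to compare opposite sides: the top and bottom widths must match, and so must the left and right heights. The plain-Python sketch below shows that idea; the exact checks kornia performs are not stated here, so both the criterion and the tolerance `atol` are assumptions, and `is_rectangular` is a hypothetical helper.

```python
def is_rectangular(box, atol=1e-4):
    """Check one box given as 4 [x, y] corners, clockwise from top-left."""
    tl, tr, br, bl = box
    # Opposite sides must have equal extents (within the tolerance).
    same_width = abs((tr[0] - tl[0]) - (br[0] - bl[0])) <= atol
    same_height = abs((bl[1] - tl[1]) - (br[1] - tr[1])) <= atol
    return same_width and same_height


ok = is_rectangular([[1.0, 1.0], [3.0, 1.0], [3.0, 2.0], [1.0, 2.0]])
bad = is_rectangular([[1.0, 1.0], [3.0, 1.0], [4.0, 2.0], [1.0, 2.0]])
print(ok, bad)
```

This simplified check only compares side lengths along each axis; it would also accept some skewed shapes, so it should be read as a necessary condition rather than a complete test.
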

kornia.geometry.bbox.validate_bbox3d(boxes)

Validate whether a 3D bounding box is usable. This function checks whether the boxes are cuboids (rectangular boxes).

Parameters:

boxes (Tensor) – a tensor containing the coordinates of the bounding boxes to be extracted. The tensor must have the shape of Bx8x3, where each box is defined in the following clockwise order: front-top-left, front-top-right, front-bottom-right, front-bottom-left, back-top-left, back-top-right, back-bottom-right, back-bottom-left. The coordinates must be in the x, y, z order.

Return type:

bool