kornia.geometry.boxes¶
Module with utilities for manipulating 2D and 3D bounding boxes.
- class kornia.geometry.boxes.Boxes(boxes, raise_if_not_floating_point=True, mode='vertices_plus')[source]¶
2D boxes containing N or BxN boxes.
- Parameters:
boxes (Tensor | list[Tensor]) – 2D boxes, shape of \((N, 4, 2)\), \((B, N, 4, 2)\) or a list of \((N, 4, 2)\). See below for more details.
raise_if_not_floating_point (bool, optional) – flag to control floating point casting behaviour when boxes is not a floating point tensor. True to raise an error when boxes isn’t a floating point tensor, False to cast to float. Default: True
mode (str, optional) – the box format of the input boxes. Default: "vertices_plus"
Note
2D boxes format is defined as a floating data type tensor of shape Nx4x2 or BxNx4x2 where each box is a quadrilateral defined by its 4 vertex coordinates (A, B, C, D). Coordinates must be in x, y order. The height and width of a box are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1. Examples of quadrilaterals are rectangles, rhombuses and trapezoids.
- clamp(topleft=None, botright=None, inplace=False)[source]¶
Clamp every box vertex inside per-image coordinate limits.
Coordinates below topleft are raised to the lower bound and coordinates above botright are lowered to the upper bound. The implementation expects tensor bounds with one (x, y) pair per batch element.
- Parameters:
topleft (Union[Tensor, tuple[int, int], None], optional) – Tensor of shape \((B, 2)\) containing the minimum x and y coordinate allowed for each batch item. Default: None
botright (Union[Tensor, tuple[int, int], None], optional) – Tensor of shape \((B, 2)\) containing the maximum x and y coordinate allowed for each batch item. Default: None
inplace (bool, optional) – If True, clamp this object in place. Otherwise, return a new Boxes object with clamped data. Default: False
- Return type:
- Returns:
Boxes whose vertex coordinates are restricted to the provided bounds.
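The clamping rule described above can be sketched for a single box with plain Python lists. This is only an illustration of the arithmetic, not kornia's implementation; `clamp_vertices` is a hypothetical helper name and the bounds are passed as simple `(x, y)` tuples.

```python
def clamp_vertices(box, topleft, botright):
    """Clamp each (x, y) vertex of one box into [topleft, botright].

    Hypothetical helper: it illustrates the arithmetic only.
    """
    (xmin, ymin), (xmax, ymax) = topleft, botright
    return [
        [min(max(x, xmin), xmax), min(max(y, ymin), ymax)]
        for x, y in box
    ]


# A box that sticks out of a 4x4 region on the left, right and bottom.
box = [[-2.0, 1.0], [6.0, 1.0], [6.0, 9.0], [-2.0, 9.0]]
clamped = clamp_vertices(box, topleft=(0.0, 0.0), botright=(4.0, 4.0))
print(clamped)
```

Coordinates already inside the bounds, such as `y = 1.0` here, are left untouched.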
- property data: Tensor¶
Return the raw quadrilateral coordinate tensor.
- Returns:
Tensor storing four vertices per box in (x, y) order. The common shapes are \((N, 4, 2)\) for unbatched boxes and \((B, N, 4, 2)\) for batched boxes, where \(B\) is batch size and \(N\) is the number of boxes.
- filter_boxes_by_area(min_area=None, max_area=None, inplace=False)[source]¶
Zero out boxes whose polygon area is outside the requested range.
The box area is computed from its four vertices. Boxes smaller than min_area or larger than max_area are not dropped from the tensor; their coordinates are replaced with zeros so the original batch and box dimensions stay unchanged.
- Parameters:
min_area (Optional[float], optional) – Optional lower inclusive area threshold. Boxes with area below this value are zeroed. Default: None
max_area (Optional[float], optional) – Optional upper inclusive area threshold. Boxes with area above this value are zeroed. Default: None
inplace (bool, optional) – If True, update this object in place. Otherwise, return a filtered clone. Default: False
- Return type:
- Returns:
Boxes with the same shape as the input container and out-of-range boxes replaced by zero coordinates.
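A minimal sketch of the zero-out behaviour, assuming the area is the plain polygon area given by the shoelace formula (the exact area definition used by kornia, including any `+ 1` convention, is not stated here and may differ). `quad_area` and `zero_out_by_area` are hypothetical names used for illustration only.

```python
def quad_area(box):
    """Area of a simple quadrilateral from its 4 (x, y) vertices (shoelace formula)."""
    total = 0.0
    for i, (x0, y0) in enumerate(box):
        x1, y1 = box[(i + 1) % len(box)]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def zero_out_by_area(boxes, min_area=None, max_area=None):
    """Return a copy of `boxes` where out-of-range boxes become all zeros."""
    out = []
    for box in boxes:
        area = quad_area(box)
        too_small = min_area is not None and area < min_area
        too_large = max_area is not None and area > max_area
        if too_small or too_large:
            out.append([[0.0, 0.0] for _ in box])  # keep the slot, zero the coordinates
        else:
            out.append([list(vertex) for vertex in box])
    return out


small = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]  # area 1
large = [[0.0, 0.0], [4.0, 0.0], [4.0, 3.0], [0.0, 3.0]]  # area 12
filtered = zero_out_by_area([small, large], min_area=2.0)
print(filtered)
```

The output still holds two boxes: the first is zeroed and the second is unchanged, so the number of boxes is preserved.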
- classmethod from_tensor(boxes, mode='xyxy', validate_boxes=True)[source]¶
Create Boxes from boxes stored in another format.
- Parameters:
boxes (Tensor | list[Tensor]) – 2D boxes, shape of \((N, 4)\), \((B, N, 4)\), \((N, 4, 2)\) or \((B, N, 4, 2)\).
mode (str, optional) – The format in which the boxes are provided. Default: "xyxy". It can be:
'xyxy': boxes are assumed to be in the format xmin, ymin, xmax, ymax where width = xmax - xmin and height = ymax - ymin. With shape \((N, 4)\), \((B, N, 4)\).
'xyxy_plus': similar to 'xyxy' mode but where box width and height are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1. With shape \((N, 4)\), \((B, N, 4)\).
'xywh': boxes are assumed to be in the format xmin, ymin, width, height where width = xmax - xmin and height = ymax - ymin. With shape \((N, 4)\), \((B, N, 4)\).
'vertices': boxes are defined by their vertices in the following clockwise order: top-left, top-right, bottom-right, bottom-left. Vertex coordinates are in (x, y) order. Box width and height are defined as width = xmax - xmin and height = ymax - ymin. With shape \((N, 4, 2)\) or \((B, N, 4, 2)\).
'vertices_plus': similar to 'vertices' mode but where box width and height are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1. With shape \((N, 4, 2)\) or \((B, N, 4, 2)\).
validate_boxes (bool, optional) – check if boxes are valid rectangles or not. Valid rectangles are those with width and height >= 1 (>= 2 when mode ends with '_plus' suffix). Default: True
- Return type:
- Returns:
Boxes class containing the original boxes in the format specified by mode.
Examples
>>> import torch
>>> from kornia.geometry.boxes import Boxes
>>> boxes_xyxy = torch.as_tensor([[0, 3, 1, 4], [5, 1, 8, 4]])
>>> boxes = Boxes.from_tensor(boxes_xyxy, mode='xyxy')
>>> boxes.data  # (2, 4, 2)
tensor([[[0., 3.],
         [0., 3.],
         [0., 3.],
         [0., 3.]],
        [[5., 1.],
         [7., 1.],
         [7., 3.],
         [5., 3.]]])
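The stored vertices in the example above are consistent with converting each `xyxy` box to the `vertices_plus` convention, where the maximum coordinate is one less for the same width and height. The sketch below reproduces that arithmetic with plain Python lists; `xyxy_to_vertices_plus` is a hypothetical helper, and the rule is inferred from the documented output rather than taken from kornia's source.

```python
def xyxy_to_vertices_plus(box):
    """Convert one 'xyxy' box into four (x, y) vertices in 'vertices_plus' form."""
    xmin, ymin, xmax, ymax = (float(v) for v in box)
    # 'xyxy' uses width = xmax - xmin, while '_plus' uses width = xmax - xmin + 1,
    # so the stored maximum is shifted down by one to describe the same box.
    xmax_p, ymax_p = xmax - 1.0, ymax - 1.0
    # Clockwise order: top-left, top-right, bottom-right, bottom-left.
    return [[xmin, ymin], [xmax_p, ymin], [xmax_p, ymax_p], [xmin, ymax_p]]


vertices = [xyxy_to_vertices_plus(b) for b in [[0, 3, 1, 4], [5, 1, 8, 4]]]
for quad in vertices:
    print(quad)
```

The first box has width and height 1 in `xyxy` terms, which is why all four of its stored vertices coincide.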
- get_boxes_shape()[source]¶
Compute boxes heights and widths.
- Return type:
- Returns:
Boxes heights, shape of \((N,)\) or \((B,N)\).
Boxes widths, shape of \((N,)\) or \((B,N)\).
Example
>>> import torch
>>> from kornia.geometry.boxes import Boxes
>>> boxes_xyxy = torch.tensor([[[1,1,2,2],[1,1,3,2]]])
>>> boxes = Boxes.from_tensor(boxes_xyxy)
>>> boxes.get_boxes_shape()
(tensor([[1., 1.]]), tensor([[1., 2.]]))
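As a sketch of how these numbers arise, assume the boxes are held as `vertices_plus` quadrilaterals, so that width and height include the `+ 1` term. `box_height_width` is a hypothetical helper working on plain lists, not a kornia function.

```python
def box_height_width(vertices):
    """Height and width of one box given in 'vertices_plus' form."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    return height, width


# The two 'xyxy' boxes [1, 1, 2, 2] and [1, 1, 3, 2] rewritten as 'vertices_plus'
# (maximum coordinates reduced by one).
box_a = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
box_b = [[1.0, 1.0], [2.0, 1.0], [2.0, 1.0], [1.0, 1.0]]
print(box_height_width(box_a), box_height_width(box_b))
```

Heights come first in the returned pair, matching the order of the documented return values.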
- index_put(indices, values, inplace=False)[source]¶
Write box coordinates at selected tensor indices.
This mirrors torch.Tensor.index_put_() for the internal quadrilateral tensor. It is useful when a subset of boxes in a batch must be replaced while keeping the Boxes wrapper and metadata.
- Parameters:
indices (tuple[Tensor, ...] | list[Tensor]) – Index tuple or list accepted by Tensor.index_put_. The indices address entries in the stored tensor, commonly shaped \((B, N, 4, 2)\) or \((N, 4, 2)\).
values (Tensor | Boxes) – Replacement coordinates. If a Boxes object is passed, its data tensor is used.
inplace (bool, optional) – If True, update this object and return self. If False, clone the current data first and return a new Boxes instance. Default: False
- Return type:
- Returns:
Boxes containing the updated coordinates.
- merge(boxes, inplace=False)[source]¶
Merge boxes.
If the current instance holds \((B, N, 4, 2)\) and the incoming boxes hold \((B, M, 4, 2)\), the merge results in \((B, N + M, 4, 2)\).
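The shape rule above amounts to concatenating the two containers along the box dimension. A minimal sketch with nested lists, assuming both inputs share the same batch size; `merge_boxes` is a hypothetical name and does not reflect how kornia handles padding or modes.

```python
def merge_boxes(current, incoming):
    """Concatenate per-batch box lists: (B, N, 4, 2) + (B, M, 4, 2) -> (B, N + M, 4, 2)."""
    if len(current) != len(incoming):
        raise ValueError("batch sizes must match")
    return [cur + inc for cur, inc in zip(current, incoming)]


unit = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
wide = [[0.0, 0.0], [3.0, 0.0], [3.0, 1.0], [0.0, 1.0]]

current = [[unit], [unit]]               # B=2, N=1
incoming = [[wide, wide], [wide, unit]]  # B=2, M=2
merged = merge_boxes(current, incoming)
print(len(merged), [len(batch) for batch in merged])
```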
- property mode: str¶
Return the box format remembered by this container.
- Returns:
Mode string used as the default by to_tensor(), such as "xyxy", "xywh", "vertices", or their "_plus" variants.
- property shape: tuple[int, ...] | Size¶
Return the tensor shape used to store the boxes.
- Returns:
Shape of data. For unbatched boxes this is usually \((N, 4, 2)\), where \(N\) is the number of boxes, 4 is the number of corner vertices, and 2 stores (x, y). For batched boxes the shape is usually \((B, N, 4, 2)\), where \(B\) is the batch size.
- to_mask(height, width)[source]¶
Convert 2D boxes to masks. Covered area is 1 and the remaining is 0.
- Parameters:
- Return type:
- Returns:
the output mask tensor, shape of \((N, width, height)\) or \((B,N, width, height)\) and dtype of Boxes.dtype() (it can be any floating point dtype).
Note
It is currently non-differentiable.
Examples
>>> import torch
>>> from kornia.geometry.boxes import Boxes
>>> boxes = Boxes(torch.tensor([[  # Equivalent to boxes = Boxes.from_tensor([[1,1,4,3]])
...     [1., 1.],
...     [4., 1.],
...     [4., 3.],
...     [1., 3.],
... ]]))  # 1x4x2
>>> boxes.to_mask(5, 5)
tensor([[[0., 0., 0., 0., 0.],
         [0., 1., 1., 1., 1.],
         [0., 1., 1., 1., 1.],
         [0., 1., 1., 1., 1.],
         [0., 0., 0., 0., 0.]]])
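The mask in the example above can be reproduced with a small pure-Python sketch. It assumes an axis-aligned box with integer coordinates in the `vertices_plus` convention (both extremes inclusive) and a mask indexed as `mask[y][x]`; `box_to_mask` is a hypothetical helper, not kornia's implementation.

```python
def box_to_mask(vertices, height, width):
    """Rasterise one axis-aligned 'vertices_plus' box into a height x width mask."""
    xs = [int(x) for x, _ in vertices]
    ys = [int(y) for _, y in vertices]
    # Clip the inclusive pixel range to the mask bounds.
    x0, x1 = max(min(xs), 0), min(max(xs), width - 1)
    y0, y1 = max(min(ys), 0), min(max(ys), height - 1)
    mask = [[0.0] * width for _ in range(height)]
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            mask[y][x] = 1.0
    return mask


mask = box_to_mask([[1.0, 1.0], [4.0, 1.0], [4.0, 3.0], [1.0, 3.0]], height=5, width=5)
for row in mask:
    print(row)
```

Rows 1 to 3 and columns 1 to 4 are set, which matches the tensor printed in the example.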
- to_tensor(mode=None, as_padded_sequence=False)[source]¶
Cast Boxes to a tensor. mode controls which 2D boxes format should be used to represent boxes in the tensor.
- Parameters:
mode (Optional[str], optional) – the output box format. Default: None. It can be:
'xyxy': boxes are defined as xmin, ymin, xmax, ymax where width = xmax - xmin and height = ymax - ymin.
'xyxy_plus': similar to 'xyxy' mode but where box width and height are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1.
'xywh': boxes are defined as xmin, ymin, width, height where width = xmax - xmin and height = ymax - ymin.
'vertices': boxes are defined by their vertices in the following clockwise order: top-left, top-right, bottom-right, bottom-left. Vertex coordinates are in (x, y) order. Box width and height are defined as width = xmax - xmin and height = ymax - ymin.
'vertices_plus': similar to 'vertices' mode but where box width and height are defined as width = xmax - xmin + 1 and height = ymax - ymin + 1.
as_padded_sequence (bool, optional) – whether to keep the pads for a list of boxes. This parameter is only valid if the boxes were created from a list of boxes with from_tensor. Default: False
- Return type:
- Returns:
Boxes tensor in the mode format. The shape depends on the mode value:
'vertices' or 'vertices_plus': \((N, 4, 2)\) or \((B, N, 4, 2)\).
Any other value: \((N, 4)\) or \((B, N, 4)\).
Examples
>>> import torch
>>> from kornia.geometry.boxes import Boxes
>>> boxes_xyxy = torch.as_tensor([[0, 3, 1, 4], [5, 1, 8, 4]])
>>> boxes = Boxes.from_tensor(boxes_xyxy)
>>> assert (boxes_xyxy == boxes.to_tensor(mode='xyxy')).all()
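The round trip in the example above relies on the output formats being different views of the same width and height. A minimal sketch of the conversion out of `vertices_plus`, using the definitions listed for each mode; `vertices_plus_to` is a hypothetical helper covering only two of the modes.

```python
def vertices_plus_to(vertices, mode):
    """Convert one 'vertices_plus' box to 'xyxy' or 'xywh' (illustrative only)."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    xmin, ymin = min(xs), min(ys)
    # '_plus' convention: both extremes are inclusive.
    width = max(xs) - xmin + 1
    height = max(ys) - ymin + 1
    if mode == "xyxy":
        # 'xyxy' convention: width = xmax - xmin.
        return [xmin, ymin, xmin + width, ymin + height]
    if mode == "xywh":
        return [xmin, ymin, width, height]
    raise ValueError(f"unsupported mode in this sketch: {mode}")


stored = [[5.0, 1.0], [7.0, 1.0], [7.0, 3.0], [5.0, 3.0]]
print(vertices_plus_to(stored, "xyxy"), vertices_plus_to(stored, "xywh"))
```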
- transform_boxes_(M)[source]¶
Inplace version of Boxes.transform_boxes().
- Return type:
- trim(correspondence_preserve=False, inplace=False)[source]¶
Trim out zero padded boxes.
Given box arrangements of shape \((4, 4, Box)\):
– Box – Box – Box – Box –
– 0 – 0 – Box – Box –
– 0 – Box – 0 – 0 –
– 0 – 0 – 0 – 0 –
If correspondence_preserve is True, the positions of the remaining boxes do not change: only layers that are entirely zero are removed, resulting in shape \((4, 3, Box)\):
– Box – Box – Box – Box –
– 0 – 0 – Box – Box –
– 0 – Box – 0 – 0 –
Otherwise, you will get \((4, 2, Box)\):
– Box – Box – Box – Box –
– 0 – Box – Box – Box –
- Return type:
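The two trimming behaviours can be sketched on a small grid of labels, reading each column of the tables above as one batch item and each row as one box slot. This is an interpretation of the tables, not kornia's code; `trim` here is a hypothetical stand-alone function working on nested lists where `grid[b][n]` is either a box label or `0` for padding.

```python
def trim(grid, correspondence_preserve=False):
    """Drop zero padding from grid[b][n]; returns a new nested list."""
    if correspondence_preserve:
        # Keep every slot that holds a box in at least one batch item,
        # so the remaining boxes stay at their original relative positions.
        n_slots = len(grid[0])
        keep = [n for n in range(n_slots) if any(row[n] != 0 for row in grid)]
        return [[row[n] for n in keep] for row in grid]
    # Otherwise pack the boxes of each batch item to the front and
    # pad only up to the largest number of boxes found.
    packed = [[item for item in row if item != 0] for row in grid]
    longest = max(len(row) for row in packed)
    return [row + [0] * (longest - len(row)) for row in packed]


grid = [
    ["Box", 0, 0, 0],
    ["Box", 0, "Box", 0],
    ["Box", "Box", 0, 0],
    ["Box", "Box", 0, 0],
]
preserved = trim(grid, correspondence_preserve=True)
compact = trim(grid)
print(len(preserved[0]), len(compact[0]))
```

With `correspondence_preserve=True` only the last slot, which is empty for every batch item, is removed, leaving 3 slots. Without it each batch item is compacted first, leaving 2 slots.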
- class kornia.geometry.boxes.Boxes3D(boxes, raise_if_not_floating_point=True, mode='xyzxyz_plus')[source]¶
3D boxes containing N or BxN boxes.
- Parameters:
boxes (Tensor) – 3D boxes, shape of \((N,8,3)\) or \((B,N,8,3)\). See below for more details.
raise_if_not_floating_point (bool, optional) – flag to control floating point casting behaviour when boxes is not a floating point tensor. True to raise an error when boxes isn’t a floating point tensor, False to cast to float. Default: True
Note
3D boxes format is defined as a floating data type tensor of shape Nx8x3 or BxNx8x3 where each box is a hexahedron defined by its 8 vertex coordinates. Coordinates must be in x, y, z order. The height, width and depth of a box are defined as width = xmax - xmin + 1, height = ymax - ymin + 1 and depth = zmax - zmin + 1. Examples of hexahedrons are cubes and rhombohedrons.
- property data: Tensor¶
Return the raw 3D corner-coordinate tensor.
- Returns:
Tensor containing eight 3D corner coordinates per box, usually shaped \((N, 8, 3)\) or \((B, N, 8, 3)\).
- classmethod from_tensor(boxes, mode='xyzxyz', validate_boxes=True)[source]¶
Create Boxes3D from 3D boxes stored in another format.
- Parameters:
boxes (Tensor) – 3D boxes, shape of \((N,6)\) or \((B,N,6)\).
mode (str, optional) – The format in which the 3D boxes are provided. Default: "xyzxyz". It can be:
'xyzxyz': boxes are assumed to be in the format xmin, ymin, zmin, xmax, ymax, zmax where width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.
'xyzxyz_plus': similar to 'xyzxyz' mode but where box width, height and depth are defined as width = xmax - xmin + 1, height = ymax - ymin + 1 and depth = zmax - zmin + 1.
'xyzwhd': boxes are assumed to be in the format xmin, ymin, zmin, width, height, depth where width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.
validate_boxes (bool, optional) – check if boxes are valid or not. Valid boxes are those with width, height and depth >= 1 (>= 2 when mode ends with '_plus' suffix). Default: True
- Return type:
- Returns:
Boxes3D class containing the original boxes in the format specified by mode.
Examples
>>> import torch
>>> from kornia.geometry.boxes import Boxes3D
>>> boxes_xyzxyz = torch.as_tensor([[0, 3, 6, 1, 4, 8], [5, 1, 3, 8, 4, 9]])
>>> boxes = Boxes3D.from_tensor(boxes_xyzxyz, mode='xyzxyz')
>>> boxes.data  # (2, 8, 3)
tensor([[[0., 3., 6.],
         [0., 3., 6.],
         [0., 3., 6.],
         [0., 3., 6.],
         [0., 3., 7.],
         [0., 3., 7.],
         [0., 3., 7.],
         [0., 3., 7.]],
        [[5., 1., 3.],
         [7., 1., 3.],
         [7., 3., 3.],
         [5., 3., 3.],
         [5., 1., 8.],
         [7., 1., 8.],
         [7., 3., 8.],
         [5., 3., 8.]]])
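The eight stored corners in the example above follow from the same `+ 1` convention as in 2D, applied to all three axes. The sketch below reproduces them with plain lists; `xyzxyz_to_vertices_plus` is a hypothetical helper, and the vertex order (front face at `zmin`, then back face at `zmax`, each clockwise from top-left) is inferred from the documented output.

```python
def xyzxyz_to_vertices_plus(box):
    """Convert one 'xyzxyz' box into eight (x, y, z) corners in '_plus' form."""
    xmin, ymin, zmin, xmax, ymax, zmax = (float(v) for v in box)
    # Shift the maxima down by one so that size = max - min + 1 is preserved.
    xmax, ymax, zmax = xmax - 1.0, ymax - 1.0, zmax - 1.0
    front = [
        [xmin, ymin, zmin],  # front-top-left
        [xmax, ymin, zmin],  # front-top-right
        [xmax, ymax, zmin],  # front-bottom-right
        [xmin, ymax, zmin],  # front-bottom-left
    ]
    back = [[x, y, zmax] for x, y, _ in front]
    return front + back


corners = xyzxyz_to_vertices_plus([5, 1, 3, 8, 4, 9])
for corner in corners:
    print(corner)
```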
- get_boxes_shape()[source]¶
Compute boxes depths, heights and widths.
- Return type:
- Returns:
Boxes depths, shape of \((N,)\) or \((B,N)\).
Boxes heights, shape of \((N,)\) or \((B,N)\).
Boxes widths, shape of \((N,)\) or \((B,N)\).
Example
>>> import torch
>>> from kornia.geometry.boxes import Boxes3D
>>> boxes_xyzxyz = torch.tensor([[ 0, 1, 2, 10, 21, 32], [3, 4, 5, 43, 54, 65]])
>>> boxes3d = Boxes3D.from_tensor(boxes_xyzxyz)
>>> boxes3d.get_boxes_shape()
(tensor([30., 60.]), tensor([20., 50.]), tensor([10., 40.]))
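The values in the example above can be checked by hand from the `xyzxyz` definitions (`width = xmax - xmin`, `height = ymax - ymin`, `depth = zmax - zmin`). A minimal sketch, with `xyzxyz_shape` as a hypothetical helper that returns the sizes in the same depth, height, width order:

```python
def xyzxyz_shape(box):
    """Return (depth, height, width) of one box given in 'xyzxyz' form."""
    xmin, ymin, zmin, xmax, ymax, zmax = box
    return zmax - zmin, ymax - ymin, xmax - xmin


shapes = [xyzxyz_shape(b) for b in [[0, 1, 2, 10, 21, 32], [3, 4, 5, 43, 54, 65]]]
print(shapes)
```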
- property mode: str¶
Return the 3D box format remembered by this container.
- Returns:
Mode string describing how this container should be interpreted during tensor conversion.
- property shape: tuple[int, ...] | Size¶
Return the tensor shape used to store 3D boxes.
- Returns:
Shape of data. For unbatched boxes this is usually \((N, 8, 3)\), where \(N\) is the number of boxes, 8 is the number of cuboid corners, and 3 stores (x, y, z). For batched boxes the shape is usually \((B, N, 8, 3)\).
- to_mask(depth, height, width)[source]¶
Convert 3D boxes to masks. Covered area is 1 and the remaining is 0.
- Parameters:
- Return type:
- Returns:
the output mask tensor, shape of \((N, depth, width, height)\) or \((B,N, depth, width, height)\) and dtype of Boxes3D.dtype() (it can be any floating point dtype).
Note
It is currently non-differentiable.
Examples
>>> import torch
>>> from kornia.geometry.boxes import Boxes3D
>>> boxes = Boxes3D(torch.tensor([[  # Equivalent to boxes = Boxes3D.from_tensor([[1,1,1,3,3,2]])
...     [1., 1., 1.],
...     [3., 1., 1.],
...     [3., 3., 1.],
...     [1., 3., 1.],
...     [1., 1., 2.],
...     [3., 1., 2.],
...     [3., 3., 2.],
...     [1., 3., 2.],
... ]]))  # 1x8x3
>>> boxes.to_mask(4, 5, 5)
tensor([[[[0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.]],
         [[0., 0., 0., 0., 0.],
          [0., 1., 1., 1., 0.],
          [0., 1., 1., 1., 0.],
          [0., 1., 1., 1., 0.],
          [0., 0., 0., 0., 0.]],
         [[0., 0., 0., 0., 0.],
          [0., 1., 1., 1., 0.],
          [0., 1., 1., 1., 0.],
          [0., 1., 1., 1., 0.],
          [0., 0., 0., 0., 0.]],
         [[0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.]]]])
- to_tensor(mode='xyzxyz')[source]¶
Cast Boxes3D to a tensor. mode controls which 3D boxes format should be used to represent boxes in the tensor.
- Parameters:
mode (str, optional) – The output box format. Default: "xyzxyz". It can be:
'xyzxyz': boxes are defined as xmin, ymin, zmin, xmax, ymax, zmax where width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.
'xyzxyz_plus': similar to 'xyzxyz' mode but where box width, height and depth are defined as width = xmax - xmin + 1, height = ymax - ymin + 1 and depth = zmax - zmin + 1.
'xyzwhd': boxes are defined as xmin, ymin, zmin, width, height, depth where width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.
'vertices': boxes are defined by their vertices in the following clockwise order: front-top-left, front-top-right, front-bottom-right, front-bottom-left, back-top-left, back-top-right, back-bottom-right, back-bottom-left. Vertex coordinates are in (x, y, z) order. Box width, height and depth are defined as width = xmax - xmin, height = ymax - ymin and depth = zmax - zmin.
'vertices_plus': similar to 'vertices' mode but where box width, height and depth are defined as width = xmax - xmin + 1, height = ymax - ymin + 1 and depth = zmax - zmin + 1.
- Return type:
- Returns:
3D Boxes tensor in the mode format. The shape depends on the mode value:
'vertices' or 'vertices_plus': \((N, 8, 3)\) or \((B, N, 8, 3)\).
Any other value: \((N, 6)\) or \((B, N, 6)\).
Note
It is currently non-differentiable due to a bug. See github issue #1304.
Examples
>>> import torch
>>> from kornia.geometry.boxes import Boxes3D
>>> boxes_xyzxyz = torch.as_tensor([[0, 3, 6, 1, 4, 8], [5, 1, 3, 8, 4, 9]])
>>> boxes = Boxes3D.from_tensor(boxes_xyzxyz, mode='xyzxyz')
>>> assert (boxes.to_tensor(mode='xyzxyz') == boxes_xyzxyz).all()
- transform_boxes_(M)[source]¶
Inplace version of Boxes3D.transform_boxes().
- Return type: