Pinhole Camera

This module provides the functions and data structures needed to describe the projection of a 3D scene onto a 2D image plane.

In computer vision, projective geometry maps between the 3D world and a 2D image. This module implements the Pinhole Camera, the simplest model in the family of finite projective cameras.

The Pinhole Camera model is shown in the following figure:

[Figure: the pinhole camera model (_images/pinhole_model.png)]

Using this model, a scene view is formed by projecting 3D points onto the image plane with a perspective transformation:

\[s \; m' = K [R|t] M'\]

or

\[\begin{split}s \begin{bmatrix} u \\ v \\ 1\end{bmatrix} = \begin{bmatrix} f_x & 0 & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}\end{split}\]
where:
  • \(s\) is an arbitrary scale factor of the homogeneous image coordinates.
  • \(M'\) is a 3D point in space with coordinates \([X,Y,Z]^T\) expressed in a Euclidean coordinate system.
  • \(m'\) is the projection of the 3D point \(M'\) onto the image plane with coordinates \([u,v]^T\) expressed in pixel units.
  • \(K\) is the camera calibration matrix, also referred to as the intrinsic parameters matrix.
  • \(C\) is the principal point offset, with coordinates \([u_0, v_0]^T\) relative to the origin of the image plane.
  • \(f_x, f_y\) are the focal lengths expressed in pixel units.

The camera rotation and translation are expressed with respect to a Euclidean coordinate frame, known as the world coordinate system. These terms are usually combined in the joint rotation-translation matrix \([R|t]\), also called the extrinsic parameters matrix. It describes the camera pose relative to a static scene and transforms the coordinates of a 3D point \((X,Y,Z)\) into a coordinate system fixed with respect to the camera.
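The projection equation above can be worked through for a single point. The following is a minimal sketch in plain Python (no torch, so the arithmetic stays visible); the helper name `project_point` and the numeric values are illustrative assumptions, not part of this module.

```python
def project_point(K, Rt, point_world):
    """Project a 3D world point to pixel coordinates via s * m' = K [R|t] M'."""
    X, Y, Z = point_world
    M = (X, Y, Z, 1.0)  # homogeneous world point
    # Camera-frame coordinates: [R|t] M' (3x4 times 4x1).
    cam = [sum(Rt[r][c] * M[c] for c in range(4)) for r in range(3)]
    # Homogeneous image coordinates: K times the camera-frame point.
    img = [sum(K[r][c] * cam[c] for c in range(3)) for r in range(3)]
    s = img[2]  # scale factor; here it equals the depth in the camera frame
    return img[0] / s, img[1] / s


# Illustrative intrinsics: f_x = f_y = 500 px, principal point (320, 240).
K = [[500.0, 0.0, 320.0],
     [0.0, 500.0, 240.0],
     [0.0, 0.0, 1.0]]
# Illustrative extrinsics: identity rotation, translation of 2 units along z.
Rt = [[1.0, 0.0, 0.0, 0.0],
      [0.0, 1.0, 0.0, 0.0],
      [0.0, 0.0, 1.0, 2.0]]

u, v = project_point(K, Rt, (0.5, -0.25, 3.0))
print(u, v)  # 370.0 215.0
```

Dividing by the third homogeneous component is what removes the scale factor \(s\) from the left-hand side of the equation.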

class PinholeCamera(intrinsics: torch.Tensor, extrinsics: torch.Tensor, height: torch.Tensor, width: torch.Tensor)[source]

Class that represents a Pinhole Camera model.

Parameters:
  • intrinsics (torch.Tensor) – tensor with shape \((B, 4, 4)\) containing the full 4x4 camera calibration matrix.
  • extrinsics (torch.Tensor) – tensor with shape \((B, 4, 4)\) containing the full 4x4 rotation-translation matrix.
  • height (torch.Tensor) – tensor with shape \((B)\) containing the image height.
  • width (torch.Tensor) – tensor with shape \((B)\) containing the image width.

Note

We assume that the class attributes are in batch form in order to take advantage of PyTorch parallelism to boost computing performance.
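As a rough illustration of the 4x4 layout the parameters describe, the sketch below embeds a 3x3 calibration matrix and a rotation-translation pair into 4x4 matrices using nested Python lists. The helper names are hypothetical, and the placement of the values (calibration terms in the upper-left 3x3 block, identity elsewhere) is an assumption about the expected layout rather than something this page states.

```python
def intrinsics_4x4(fx, fy, cx, cy):
    """Embed the 3x3 calibration matrix K into a 4x4 matrix (assumed layout)."""
    return [[fx, 0.0, cx, 0.0],
            [0.0, fy, cy, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def extrinsics_4x4(R, t):
    """Stack [R|t] on top of the row [0, 0, 0, 1] to get a 4x4 matrix."""
    rows = [list(R[i]) + [t[i]] for i in range(3)]
    rows.append([0.0, 0.0, 0.0, 1.0])
    return rows


identity_R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# A "batch" of B = 2 cameras is then a list of two 4x4 matrices, i.e. (B, 4, 4).
intrinsics_batch = [intrinsics_4x4(500.0, 500.0, 320.0, 240.0),
                    intrinsics_4x4(250.0, 250.0, 160.0, 120.0)]
extrinsics_batch = [extrinsics_4x4(identity_R, [0.0, 0.0, 2.0]),
                    extrinsics_4x4(identity_R, [0.1, 0.0, 2.0])]
```

With torch, the same nested lists could be passed to `torch.tensor` to obtain tensors of shape \((B, 4, 4)\).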

batch_size

Returns the batch size of the storage.

Returns:scalar with the batch size
Return type:int
camera_matrix

Returns the 3x3 camera matrix containing the intrinsics.

Returns:tensor of shape \((B, 3, 3)\)
Return type:torch.Tensor
clone() → torchgeometry.core.pinhole.PinholeCamera[source]

Returns a deep copy of the current object instance.

cx

Returns the x-coordinate of the principal point.

Returns:tensor of shape \((B)\)
Return type:torch.Tensor
cy

Returns the y-coordinate of the principal point.

Returns:tensor of shape \((B)\)
Return type:torch.Tensor
extrinsics

The full 4x4 extrinsics matrix.

Returns:tensor of shape \((B, 4, 4)\)
Return type:torch.Tensor
fx

Returns the focal length in the x-direction.

Returns:tensor of shape \((B)\)
Return type:torch.Tensor
fy

Returns the focal length in the y-direction.

Returns:tensor of shape \((B)\)
Return type:torch.Tensor
intrinsics

The full 4x4 intrinsics matrix.

Returns:tensor of shape \((B, 4, 4)\)
Return type:torch.Tensor
intrinsics_inverse() → torch.Tensor[source]

Returns the inverse of the 4x4 intrinsics matrix.

Returns:tensor of shape \((B, 4, 4)\)
Return type:torch.Tensor
rotation_matrix

Returns the 3x3 rotation matrix from the extrinsics.

Returns:tensor of shape \((B, 3, 3)\)
Return type:torch.Tensor
rt_matrix

Returns the 3x4 rotation-translation matrix.

Returns:tensor of shape \((B, 3, 4)\)
Return type:torch.Tensor
scale(scale_factor) → torchgeometry.core.pinhole.PinholeCamera[source]

Scales the pinhole model.

Parameters:scale_factor (torch.Tensor) – a tensor with the scale factor. It has to be broadcastable with class members. The expected shape is \((B)\) or \((1)\).
Returns:the camera model with scaled parameters.
Return type:PinholeCamera
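To make the effect of scaling concrete, here is a minimal sketch for a single camera. It assumes that scaling multiplies the focal lengths, the principal point and the image size by the same factor, which is the usual behaviour when an image is resized; this page does not spell out which members are scaled, and the function name is hypothetical.

```python
def scale_camera_params(fx, fy, cx, cy, height, width, scale_factor):
    """Scale pinhole parameters as if the image were resized by scale_factor.

    Assumption: every pixel-unit quantity is multiplied by the same factor.
    """
    return {
        "fx": fx * scale_factor,
        "fy": fy * scale_factor,
        "cx": cx * scale_factor,
        "cy": cy * scale_factor,
        "height": height * scale_factor,
        "width": width * scale_factor,
    }


# Halving a 640x480 camera with f = 500 px and principal point (320, 240).
scaled = scale_camera_params(500.0, 500.0, 320.0, 240.0, 480.0, 640.0, 0.5)
```

The extrinsics are left out of the sketch because resizing an image does not move the camera.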
scale_(scale_factor) → torchgeometry.core.pinhole.PinholeCamera[source]

Scales the pinhole model in-place.

Parameters:scale_factor (torch.Tensor) – a tensor with the scale factor. It has to be broadcastable with class members. The expected shape is \((B)\) or \((1)\).
Returns:the camera model with scaled parameters.
Return type:PinholeCamera
translation_vector

Returns the translation vector from the extrinsics.

Returns:tensor of shape \((B, 3, 1)\)
Return type:torch.Tensor
tx

Returns the x-coordinate of the translation vector.

Returns:tensor of shape \((B)\)
Return type:torch.Tensor
ty

Returns the y-coordinate of the translation vector.

Returns:tensor of shape \((B)\)
Return type:torch.Tensor
tz

Returns the z-coordinate of the translation vector.

Returns:tensor of shape \((B)\)
Return type:torch.Tensor
class PinholeCamerasList(pinholes_list: Iterable[torchgeometry.core.pinhole.PinholeCamera])[source]

Class that represents a list of pinhole cameras.

The class inherits from PinholeCamera, meaning that it keeps the same class properties but with an extra dimension.

Note

The underlying data tensor will be stacked in the first dimension. That is, given a list of two camera instances, the intrinsics tensor will have a shape \((B, N, 4, 4)\) where \(B\) is the batch size and \(N\) is the number of cameras (in this case two).

Parameters:pinholes_list (Iterable[PinholeCamera]) – a Python tuple or list containing a set of PinholeCamera instances.
cam2pixel(cam_coords_src: torch.Tensor, dst_proj_src: torch.Tensor, eps: Optional[float] = 1e-06) → torch.Tensor[source]

Transform coordinates in the camera frame to the pixel frame.

Parameters:
  • cam_coords_src (torch.Tensor) – 3D coordinates defined in the coordinate system of the first camera. Shape must be BxHxWx3.
  • dst_proj_src (torch.Tensor) – the projection matrix between the reference and the non-reference camera frame. Shape must be Bx4x4.
  • eps (Optional[float]) – small value used to avoid division by zero. Default: 1e-06.
Returns:

tensor of coordinates normalized to [-1, 1], with shape BxHxWx2.

Return type:

torch.Tensor
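The core of this transform, for a single point, is a matrix product followed by a perspective division. The sketch below is plain Python and stops at pixel coordinates; it does not perform the final normalization to [-1, 1]. The helper name and the exact placement of `eps` in the denominator are assumptions made for illustration.

```python
def cam_to_pixel(point_cam, dst_proj_src, eps=1e-6):
    """Project one camera-frame point with a 4x4 projection matrix."""
    x, y, z = point_cam
    p = (x, y, z, 1.0)  # homogeneous camera-frame point
    # Only the first three rows are needed for the image coordinates.
    proj = [sum(dst_proj_src[r][c] * p[c] for c in range(4)) for r in range(3)]
    # Perspective division; eps guards against a zero depth (assumed placement).
    u = proj[0] / (proj[2] + eps)
    v = proj[1] / (proj[2] + eps)
    return u, v


# Illustrative projection: intrinsics only (f = 500 px, principal point (320, 240)).
dst_proj_src = [[500.0, 0.0, 320.0, 0.0],
                [0.0, 500.0, 240.0, 0.0],
                [0.0, 0.0, 1.0, 0.0],
                [0.0, 0.0, 0.0, 1.0]]

u, v = cam_to_pixel((0.2, 0.1, 2.0), dst_proj_src)
# u is close to 370 and v is close to 265; eps shifts them very slightly.
```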

pixel2cam(depth: torch.Tensor, intrinsics_inv: torch.Tensor, pixel_coords: torch.Tensor) → torch.Tensor[source]

Transform coordinates in the pixel frame to the camera frame.

Parameters:
  • depth (torch.Tensor) – the source depth maps. Shape must be Bx1xHxW.
  • intrinsics_inv (torch.Tensor) – the inverse intrinsics camera matrix. Shape must be Bx4x4.
  • pixel_coords (torch.Tensor) – the grid with the homogeneous pixel coordinates. Shape must be BxHxWx3.
Returns:

tensor of camera-frame coordinates with shape BxHxWx3.

Return type:

torch.Tensor
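For a single pixel, back-projection amounts to multiplying the homogeneous pixel \((u, v, 1)\) by the inverse intrinsics and scaling the result by the depth. The sketch below uses the closed-form inverse of a zero-skew calibration matrix; the function name and the zero-skew assumption are illustrative and not taken from this module.

```python
def pixel_to_cam(u, v, depth, fx, fy, cx, cy):
    """Back-project one pixel with known depth into the camera frame.

    Equivalent to depth * K^{-1} [u, v, 1]^T for a zero-skew K.
    """
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    z = depth
    return x, y, z


# Illustrative values: f = 500 px, principal point (320, 240), depth of 2 units.
x, y, z = pixel_to_cam(370.0, 265.0, 2.0, 500.0, 500.0, 320.0, 240.0)
# The point lies 0.2 units to the right of and 0.1 units below the optical axis.
```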

normalize_pixel_coordinates(pixel_coordinates: torch.Tensor, height: float, width: float) → torch.Tensor[source]

Normalize pixel coordinates between -1 and 1.

A normalized x-coordinate is -1 at the extreme left (x = 0) and 1 at the extreme right (x = w-1).

Parameters:
  • pixel_coordinates (torch.Tensor) – the grid with pixel coordinates. Shape must be \((B, H, W, 2)\).
  • width (float) – the maximum width in the x-axis.
  • height (float) – the maximum height in the y-axis.
Returns:

the normalized pixel coordinates.

Return type:

torch.Tensor
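The mapping described above is linear, sending x = 0 to -1 and x = w-1 to 1. A minimal sketch for one pixel follows; treating the y-axis the same way (with h-1) is an assumption by symmetry, and the function name is hypothetical.

```python
def normalize_pixel(x, y, height, width):
    """Map pixel coordinates to [-1, 1] so that 0 -> -1 and (size - 1) -> 1."""
    x_norm = 2.0 * x / (width - 1) - 1.0
    y_norm = 2.0 * y / (height - 1) - 1.0  # assumed to mirror the x-axis rule
    return x_norm, y_norm


top_left = normalize_pixel(0.0, 0.0, 480, 640)          # (-1.0, -1.0)
bottom_right = normalize_pixel(639.0, 479.0, 480, 640)  # (1.0, 1.0)
center = normalize_pixel(319.5, 239.5, 480, 640)        # (0.0, 0.0)
```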

homography_i_H_ref(pinhole_i, pinhole_ref)[source]

Homography from the reference pinhole to the ith pinhole.

Note

The pinhole model is represented in a single vector as follows:

\[pinhole = (f_x, f_y, c_x, c_y, height, width, r_x, r_y, r_z, t_x, t_y, t_z)\]
where:

\((r_x, r_y, r_z)\) is the rotation vector in angle-axis convention.

\((t_x, t_y, t_z)\) is the translation vector.

\[H_{ref}^{i} = K_{i} * T_{ref}^{i} * K_{ref}^{-1}\]
Parameters:
  • pinhole_i (Tensor) – tensor with pinhole model for ith frame.
  • pinhole_ref (Tensor) – tensor with pinhole model for reference frame.
Returns:

tensors that convert depth points (u, v, d) from pinhole_ref to pinhole_i.

Return type:

Tensor

Shape:
  • Input: \((N, 12)\) and \((N, 12)\)
  • Output: \((N, 4, 4)\)

Example

>>> pinhole_i = torch.rand(1, 12)    # Nx12
>>> pinhole_ref = torch.rand(1, 12)  # Nx12
>>> i_H_ref = tgm.homography_i_H_ref(pinhole_i, pinhole_ref)  # Nx4x4
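The composition \(H_{ref}^{i} = K_{i} * T_{ref}^{i} * K_{ref}^{-1}\) can also be traced by hand with 4x4 matrices. The sketch below works on explicit matrices rather than on the 12-element pinhole vectors this function accepts; all helper names and numeric values are illustrative assumptions.

```python
def matmul4(A, B):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(A[r][k] * B[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def intrinsics_4x4(fx, fy, cx, cy):
    return [[fx, 0.0, cx, 0.0],
            [0.0, fy, cy, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def intrinsics_inverse_4x4(fx, fy, cx, cy):
    """Closed-form inverse of the zero-skew 4x4 intrinsics above."""
    return [[1.0 / fx, 0.0, -cx / fx, 0.0],
            [0.0, 1.0 / fy, -cy / fy, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


# Both cameras share the same intrinsics; camera i is shifted 0.1 units along x.
K_i = intrinsics_4x4(500.0, 500.0, 320.0, 240.0)
K_ref_inv = intrinsics_inverse_4x4(500.0, 500.0, 320.0, 240.0)
T_i_ref = [[1.0, 0.0, 0.0, 0.1],
           [0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0],
           [0.0, 0.0, 0.0, 1.0]]

# H = K_i * T * K_ref^{-1}, evaluated right to left.
H = matmul4(K_i, matmul4(T_i_ref, K_ref_inv))
```

With identical intrinsics and a pure x-translation, the result is close to the identity except for the entry `H[0][3]`, which is roughly 50, the translation 0.1 multiplied by the focal length of 500 pixels.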
pinhole_matrix(pinholes, eps=1e-06)[source]

Returns the pinhole matrix from a pinhole model.

Note

This method is going to be deprecated in version 0.2 in favour of torchgeometry.PinholeCamera.camera_matrix.

Parameters:pinholes (Tensor) – tensor of pinhole models.
Returns:tensor of pinhole matrices.
Return type:Tensor
Shape:
  • Input: \((N, 12)\)
  • Output: \((N, 4, 4)\)

Example

>>> pinhole = torch.rand(1, 12)    # Nx12
>>> pinhole_matrix = tgm.pinhole_matrix(pinhole)  # Nx4x4
inverse_pinhole_matrix(pinhole, eps=1e-06)[source]

Returns the inverted pinhole matrix from a pinhole model.

Note

This method is going to be deprecated in version 0.2 in favour of torchgeometry.PinholeCamera.intrinsics_inverse().

Parameters:pinhole (Tensor) – tensor with pinhole models.
Returns:tensor of inverted pinhole matrices.
Return type:Tensor
Shape:
  • Input: \((N, 12)\)
  • Output: \((N, 4, 4)\)

Example

>>> pinhole = torch.rand(1, 12)    # Nx12
>>> pinhole_matrix_inv = tgm.inverse_pinhole_matrix(pinhole)  # Nx4x4
scale_pinhole(pinholes, scale)[source]

Scales the pinhole matrix for each pinhole model.

Note

This method is going to be deprecated in version 0.2 in favour of torchgeometry.PinholeCamera.scale().

Parameters:
  • pinholes (Tensor) – tensor with the pinhole model.
  • scale (Tensor) – tensor of scales.
Returns:

tensor of scaled pinholes.

Return type:

Tensor

Shape:
  • Input: \((N, 12)\) and \((N, 1)\)
  • Output: \((N, 12)\)

Example

>>> pinhole_i = torch.rand(1, 12)  # Nx12
>>> scales = 2.0 * torch.ones(1)   # N
>>> pinhole_i_scaled = tgm.scale_pinhole(pinhole_i, scales)  # Nx12
class PinholeMatrix[source]

Creates an object that returns the pinhole matrix from a pinhole model.

Parameters:pinholes (Tensor) – tensor of pinhole models.
Returns:tensor of pinhole matrices.
Return type:Tensor
Shape:
  • Input: \((N, 12)\)
  • Output: \((N, 4, 4)\)

Example

>>> pinhole = torch.rand(1, 12)          # Nx12
>>> transform = tgm.PinholeMatrix()
>>> pinhole_matrix = transform(pinhole)  # Nx4x4
class InversePinholeMatrix[source]

Creates an object that inverts a pinhole matrix from a pinhole model.

Parameters:pinholes (Tensor) – tensor with pinhole models.
Returns:tensor of inverted pinhole matrices.
Return type:Tensor
Shape:
  • Input: \((N, 12)\)
  • Output: \((N, 4, 4)\)

Example

>>> pinhole = torch.rand(1, 12)              # Nx12
>>> transform = tgm.InversePinholeMatrix()
>>> pinhole_matrix_inv = transform(pinhole)  # Nx4x4