Pinhole Camera
This module provides the functions and data structures needed to describe the projection of a 3D scene onto a 2D image plane.
In computer vision, projective geometry lets us map between the 3D world and a 2D image. The module implements the Pinhole Camera, the simplest of the finite projective camera models.
The Pinhole Camera model is shown in the following figure:
[Figure: the Pinhole Camera model]
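Using this model, a scene view is formed by projecting 3D points onto the image plane using a perspective transformation:
\[s \; m' = K [R|t] M'\]
or
\[s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}\]
- where:
\(s\) is the projective scale factor, and \(m'\) and \(M'\) enter the equation in homogeneous form.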
\(M'\) is a 3D point in space with coordinates \([X,Y,Z]^T\) expressed in a Euclidean coordinate system.
\(m'\) is the projection of the 3D point \(M'\) onto the image plane with coordinates \([u,v]^T\) expressed in pixel units.
\(K\) is the camera calibration matrix, also referred to as the intrinsic parameters matrix.
\(C\) is the principal point, with coordinates \([u_0, v_0]^T\) measured from the origin of the image plane.
\(f_x, f_y\) are the focal lengths expressed in pixel units.
The camera rotation and translation are expressed with respect to a Euclidean coordinate frame, also known as the world coordinate system. These terms are usually combined into the joint rotation-translation matrix \([R|t]\), also called the extrinsic parameters matrix. It describes the camera pose relative to a static scene and transforms the coordinates of a 3D point \((X,Y,Z)\) into a coordinate system fixed with respect to the camera.
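The projection described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the library's implementation: `project_point` is a hypothetical helper, nested lists stand in for tensors, and the numeric values are made up for the example.

```python
def project_point(K, R, t, M):
    """Project a 3D world point M = (X, Y, Z) to pixel coordinates (u, v).

    K is a 3x3 calibration matrix, R a 3x3 rotation, t a 3-vector
    (hypothetical helper, plain lists instead of tensors).
    """
    # World frame -> camera frame: M_cam = R @ M + t
    cam = [sum(R[i][j] * M[j] for j in range(3)) + t[i] for i in range(3)]
    # Camera frame -> homogeneous image coordinates: s * [u, v, 1]^T = K @ M_cam
    hom = [sum(K[i][j] * cam[j] for j in range(3)) for i in range(3)]
    s = hom[2]  # projective scale factor
    return hom[0] / s, hom[1] / s


# Illustrative values: 500 px focal lengths, principal point at (320, 240).
K = [[500.0, 0.0, 320.0],
     [0.0, 500.0, 240.0],
     [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]  # camera axes aligned with the world axes
t = [0.0, 0.0, 0.0]    # camera placed at the world origin

u, v = project_point(K, R, t, [0.2, -0.1, 2.0])
print(u, v)  # approximately 370 and 215
```

Dividing by the third homogeneous component is what turns the linear map \(K[R|t]\) into a perspective projection.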
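The PinholeCamera expects the intrinsic parameters matrix and the extrinsic parameters matrix to be of shape (B, 4, 4), such that each intrinsic parameters matrix has the following format:
\[\begin{bmatrix} f_x & 0 & u_0 & 0 \\ 0 & f_y & v_0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\]
And each extrinsic parameters matrix has the following format:
\[\begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \\ 0 & 0 & 0 & 1 \end{bmatrix}\]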
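The two 4x4 layouts can be assembled as in the following sketch. It assumes the conventional homogeneous embedding: the 3x3 calibration matrix padded with an identity row and column, and \([R|t]\) completed with a last row \([0, 0, 0, 1]\). `make_intrinsics` and `make_extrinsics` are hypothetical helpers, and nested lists stand in for tensors of shape (B, 4, 4).

```python
def make_intrinsics(fx, fy, u0, v0):
    """Assumed 4x4 layout: 3x3 calibration matrix padded to homogeneous form."""
    return [[fx, 0.0, u0, 0.0],
            [0.0, fy, v0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def make_extrinsics(R, t):
    """Assumed 4x4 layout: [R|t] on top, [0, 0, 0, 1] as the last row."""
    top = [list(R[i]) + [t[i]] for i in range(3)]
    return top + [[0.0, 0.0, 0.0, 1.0]]


B = 2  # batch size
identity = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]

# One matrix per batch element; the second camera is shifted along z.
intrinsics = [make_intrinsics(500.0, 500.0, 320.0, 240.0) for _ in range(B)]
extrinsics = [make_extrinsics(identity, [0.0, 0.0, float(b)]) for b in range(B)]
```

In practice the nested lists would be converted to tensors before being passed to the class.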
- class PinholeCamera(intrinsics, extrinsics, height, width)
Class that represents a Pinhole Camera model.
- Parameters
intrinsics (Tensor) – tensor with shape \((B, 4, 4)\) containing the full 4x4 camera calibration matrix.
extrinsics (Tensor) – tensor with shape \((B, 4, 4)\) containing the full 4x4 rotation-translation matrix.
height (Tensor) – tensor with shape \((B)\) containing the image height.
width (Tensor) – tensor with shape \((B)\) containing the image width.
Note
We assume that the class attributes are in batch form in order to take advantage of PyTorch parallelism to boost computing performance.
- property batch_size: int
Returns the batch size of the storage.
- Return type
int
- Returns
scalar with the batch size.
- property camera_matrix: torch.Tensor
Returns the 3x3 camera matrix containing the intrinsics.
- Return type
Tensor
- Returns
tensor of shape \((B, 3, 3)\).
- property cx: torch.Tensor
Returns the x-coordinate of the principal point.
- Return type
Tensor
- Returns
tensor of shape \((B)\).
- property cy: torch.Tensor
Returns the y-coordinate of the principal point.
- Return type
Tensor
- Returns
tensor of shape \((B)\).
- property extrinsics: torch.Tensor
The full 4x4 extrinsics matrix.
- Return type
Tensor
- Returns
tensor of shape \((B, 4, 4)\).
- property fx: torch.Tensor
Returns the focal length in the x-direction.
- Return type
Tensor
- Returns
tensor of shape \((B)\).
- property fy: torch.Tensor
Returns the focal length in the y-direction.
- Return type
Tensor
- Returns
tensor of shape \((B)\).
- property intrinsics: torch.Tensor
The full 4x4 intrinsics matrix.
- Return type
Tensor
- Returns
tensor of shape \((B, 4, 4)\).
- intrinsics_inverse()
Returns the inverse of the 4x4 intrinsics matrix.
- Return type
Tensor
- Returns
tensor of shape \((B, 4, 4)\).
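For a zero-skew intrinsics matrix the inverse has a simple closed form, which the sketch below writes out for a single 4x4 matrix. It assumes the homogeneous layout with \(f_x, f_y\) on the diagonal and \(u_0, v_0\) in the third column. `invert_intrinsics` is a hypothetical helper, and whether the library uses this closed form or a general matrix inverse is not stated here.

```python
def invert_intrinsics(K):
    """Closed-form inverse of a zero-skew 4x4 intrinsics matrix (assumed layout)."""
    fx, fy = K[0][0], K[1][1]
    u0, v0 = K[0][2], K[1][2]
    return [[1.0 / fx, 0.0, -u0 / fx, 0.0],
            [0.0, 1.0 / fy, -v0 / fy, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


K = [[500.0, 0.0, 320.0, 0.0],
     [0.0, 500.0, 240.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
K_inv = invert_intrinsics(K)

# Back-project pixel (u, v) to a ray direction at unit depth in the camera frame.
u, v = 370.0, 215.0
ray = [K_inv[0][0] * u + K_inv[0][2],
       K_inv[1][1] * v + K_inv[1][2],
       1.0]
print(ray)  # approximately [0.1, -0.05, 1.0]
```

The inverse intrinsics are typically used this way, to turn pixel coordinates back into rays in the camera frame.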
- property rotation_matrix: torch.Tensor
Returns the 3x3 rotation matrix from the extrinsics.
- Return type
Tensor
- Returns
tensor of shape \((B, 3, 3)\).
- property rt_matrix: torch.Tensor
Returns the 3x4 rotation-translation matrix.
- Return type
Tensor
- Returns
tensor of shape \((B, 3, 4)\).
- scale(scale_factor)
Scales the pinhole model.
- Parameters
scale_factor – a tensor with the scale factor. It has to be broadcastable with class members. The expected shape is \((B)\) or \((1)\).
- Return type
PinholeCamera
- Returns
the camera model with scaled parameters.
- scale_(scale_factor)
Scales the pinhole model in-place.
- Parameters
scale_factor – a tensor with the scale factor. It has to be broadcastable with class members. The expected shape is \((B)\) or \((1)\).
- Return type
PinholeCamera
- Returns
the camera model with scaled parameters.
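The difference between the two scaling methods can be illustrated with the sketch below. It assumes that scaling the model corresponds to resizing the image, so the focal lengths, the principal point and the image size are all multiplied by the factor. That interpretation, the helper names `scale_camera` and `scale_camera_`, and the use of plain lists and floats instead of tensors are assumptions made for this example.

```python
import copy


def scale_camera_(K, height, width, s):
    """Scale a 4x4 intrinsics matrix in place and return it with the scaled image size."""
    K[0][0] *= s  # fx
    K[1][1] *= s  # fy
    K[0][2] *= s  # u0
    K[1][2] *= s  # v0
    return K, height * s, width * s


def scale_camera(K, height, width, s):
    """Same scaling applied to a deep copy, leaving the input matrix untouched."""
    return scale_camera_(copy.deepcopy(K), height, width, s)


K = [[500.0, 0.0, 320.0, 0.0],
     [0.0, 500.0, 240.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]

# Halve the resolution without modifying the original matrix.
K_half, h_half, w_half = scale_camera(K, 480.0, 640.0, 0.5)
print(K_half[0][0], K_half[0][2], h_half, w_half)  # 250.0 160.0 240.0 320.0
```

The trailing underscore follows the PyTorch naming convention for operations that modify their target in place.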
- property translation_vector: torch.Tensor
Returns the translation vector from the extrinsics.
- Return type
Tensor
- Returns
tensor of shape \((B, 3, 1)\).
- property tx: torch.Tensor
Returns the x-coordinate of the translation vector.
- Return type
Tensor
- Returns
tensor of shape \((B)\).
- property ty: torch.Tensor
Returns the y-coordinate of the translation vector.
- Return type
Tensor
- Returns
tensor of shape \((B)\).
- cam2pixel(cam_coords_src, dst_proj_src, eps=1e-12)
Transforms coordinates in the camera frame to the pixel frame.
- Parameters
cam_coords_src (Tensor) – (x, y, z) coordinates defined in the first camera coordinate system. Shape must be BxHxWx3.
dst_proj_src (Tensor) – the projection matrix between the reference and the non-reference camera frame. Shape must be Bx4x4.
eps (float, optional) – small value to avoid division by zero. Default: 1e-12
- Return type
Tensor
- Returns
tensor of shape BxHxWx2 with (u, v) pixel coordinates.
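For a single point, the transformation can be sketched as follows. The sketch assumes that the 4x4 projection matrix is applied to the homogeneous point and that the result is divided by its depth plus `eps`. It also assumes the last row of the matrix is \([0, 0, 0, 1]\), so only the first three rows are evaluated. `cam2pixel_point` is a hypothetical single-point helper and not the batched library function.

```python
def cam2pixel_point(point, proj, eps=1e-12):
    """Map one (x, y, z) camera-frame point to (u, v) pixel coordinates."""
    x, y, z = point
    # Apply the first three rows of the 4x4 projection to [x, y, z, 1]^T.
    px, py, pz = (proj[i][0] * x + proj[i][1] * y + proj[i][2] * z + proj[i][3]
                  for i in range(3))
    # Perspective division; eps guards against a zero depth.
    return px / (pz + eps), py / (pz + eps)


# Illustrative projection: intrinsics only, i.e. an identity camera pose.
proj = [[500.0, 0.0, 320.0, 0.0],
        [0.0, 500.0, 240.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0]]

u, v = cam2pixel_point((0.2, -0.1, 2.0), proj)
print(u, v)  # approximately 370 and 215
```

The batched function applies the same mapping to every one of the HxW points in each batch element.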