kornia.nerf

The functions in this section perform Neural Radiance Fields (NeRF) related operations.

Models

class kornia.nerf.nerf_model.NerfModel(num_ray_points, irregular_ray_sampling=True, num_pos_freqs=10, num_dir_freqs=4, num_units=2, num_unit_layers=4, num_hidden=128, log_space_encoding=True)

Class to represent NeRF model.

Parameters:
  • num_ray_points (int) – Number of points to sample along rays.

  • irregular_ray_sampling (bool, optional) – Whether to sample ray points irregularly. Default: True

  • num_pos_freqs (int, optional) – Number of frequencies for positional encoding. Default: 10

  • num_dir_freqs (int, optional) – Number of frequencies for directional encoding. Default: 4

  • num_units (int, optional) – Number of sub-units. Default: 2

  • num_unit_layers (int, optional) – Number of fully connected layers in each sub-unit. Default: 4

  • num_hidden (int, optional) – Layer hidden dimensions. Default: 128

  • log_space_encoding (bool, optional) – Whether to apply log spacing for encoding. Default: True

forward(origins, directions)

Forward method.

Parameters:
  • origins (Tensor) – Ray origins with shape \((B, 3)\).

  • directions (Tensor) – Ray directions with shape \((B, 3)\).

Return type:

Tensor

Returns:

Rendered image pixels \((B, 3)\).
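
A minimal usage sketch (shapes follow the documentation above; the rays here are arbitrary placeholders rather than rays produced by a sampler):

    import torch
    from kornia.nerf.nerf_model import NerfModel

    model = NerfModel(num_ray_points=32)
    origins = torch.zeros(8, 3)                                # (B, 3) ray origins
    directions = torch.tensor([[0.0, 0.0, 1.0]]).repeat(8, 1)  # (B, 3) ray directions
    rgb = model(origins, directions)                           # (B, 3) rendered pixel colors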

class kornia.nerf.nerf_model.NerfModelRenderer(nerf_model, image_size, device, dtype)

Renders a novel synthesized view of a trained NeRF model for a given camera.

render_view(camera)

Renders a novel synthesized view of a trained NeRF model for a given camera.

Parameters:

camera (PinholeCamera) – camera for image rendering.

Return type:

Tensor

Returns:

Rendered image with shape \((H, W, 3)\).
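
A minimal rendering sketch (the identity camera and the (height, width) interpretation of image_size are illustrative assumptions; in practice the model would be trained first):

    import torch
    from kornia.geometry.camera import PinholeCamera
    from kornia.nerf.nerf_model import NerfModel, NerfModelRenderer

    # A 64x64 pinhole camera with identity intrinsics/extrinsics, for illustration only.
    intrinsics = torch.eye(4)[None]
    extrinsics = torch.eye(4)[None]
    height = torch.tensor([64.0])
    width = torch.tensor([64.0])
    camera = PinholeCamera(intrinsics, extrinsics, height, width)

    model = NerfModel(num_ray_points=32)
    # image_size is assumed here to be a (height, width) pair; check the expected order in your kornia version.
    renderer = NerfModelRenderer(model, image_size=(64, 64), device=torch.device('cpu'), dtype=torch.float32)
    img = renderer.render_view(camera)    # (H, W, 3)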

class kornia.nerf.nerf_model.MLP(num_dims, num_units=2, num_unit_layers=4, num_hidden=128)

Class to represent a multi-layer perceptron.

The MLP is a deep neural network of fully connected layers. The network is built of user-defined sub-units, each with a user-defined number of layers.

Skip connections span between the sub-units. The model follows Ben Mildenhall et al. (2020): https://arxiv.org/abs/2003.08934.

forward(x)

Define the computation performed at every call.

Should be overridden by all subclasses.

Return type:

Tensor

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
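
A minimal sketch of a standalone MLP (the input dimensionality of 63 is an arbitrary illustrative choice, e.g. the size of a positional encoding):

    import torch
    from kornia.nerf.nerf_model import MLP

    mlp = MLP(num_dims=63, num_units=2, num_unit_layers=4, num_hidden=128)
    x = torch.rand(1024, 63)      # batch of 63-dimensional encoded inputs
    features = mlp(x)             # hidden features produced by the final sub-unit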

Solvers

class kornia.nerf.nerf_solver.NerfSolver(device, dtype)

NeRF solver class.

Parameters:
  • device (Union[str, torch.device]) – device for class tensors.

  • dtype (torch.dtype) – dtype for all floating point calculations.

property nerf_model: Module | None

Returns the NeRF model.

run(num_epochs=1)

Runs training epochs.

Parameters:

num_epochs (int, optional) – number of epochs to run. Default: 1.

Return type:

None

setup_solver(cameras, min_depth, max_depth, ndc, imgs, num_img_rays, batch_size, num_ray_points, irregular_ray_sampling=True, log_space_encoding=True, lr=1.0e-3)

Initializes training settings and model.

Parameters:
  • cameras (PinholeCamera) – Scene cameras in the order of input images.

  • min_depth (float) – sampled rays minimal depth from cameras.

  • max_depth (float) – sampled rays maximal depth from cameras.

  • ndc (bool) – convert ray parameters to normalized device coordinates.

  • imgs (Union[List[str], List[Tensor]]) – Scene 2D images (one for each camera).

  • num_img_rays (Tensor | int) – Number of rays to randomly cast from each camera \((B)\).

  • batch_size (int) – Number of rays to sample in a batch.

  • num_ray_points (int) – Number of points to sample along rays.

  • irregular_ray_sampling (bool, optional) – Whether to sample ray points irregularly. Default: True

  • log_space_encoding (bool, optional) – Whether frequency sampling should be log spaced. Default: True

  • lr (float, optional) – Learning rate. Default: 1.0e-3

Return type:

None
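
A minimal end-to-end training sketch on a toy synthetic scene (all values are illustrative; the image tensor layout is an assumption, and image file paths are also accepted per the documentation above):

    import torch
    from kornia.geometry.camera import PinholeCamera
    from kornia.nerf.nerf_solver import NerfSolver

    # A toy scene with two identical 32x32 cameras and random images, for illustration only.
    B = 2
    intrinsics = torch.eye(4)[None].repeat(B, 1, 1)
    extrinsics = torch.eye(4)[None].repeat(B, 1, 1)
    height = torch.tensor([32.0] * B)
    width = torch.tensor([32.0] * B)
    cameras = PinholeCamera(intrinsics, extrinsics, height, width)
    imgs = [torch.randint(0, 256, (32, 32, 3), dtype=torch.uint8) for _ in range(B)]  # assumed (H, W, 3) uint8 layout

    solver = NerfSolver(device=torch.device('cpu'), dtype=torch.float32)
    solver.setup_solver(
        cameras=cameras,
        min_depth=1.0,
        max_depth=4.0,
        ndc=True,
        imgs=imgs,
        num_img_rays=256,      # rays cast per camera
        batch_size=256,
        num_ray_points=16,
        lr=1.0e-3,
    )
    solver.run(num_epochs=1)
    model = solver.nerf_model   # trained NerfModel (or None if training was not set up)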

Renderers

class kornia.nerf.volume_renderer.VolumeRenderer(shift=1)

Base class for volume rendering.

Implementation follows Ben Mildenhall et al. (2020) at https://arxiv.org/abs/2003.08934.

forward(rgbs, densities, points_3d)

Define the computation performed at every call.

Should be overridden by all subclasses.

Return type:

Tensor

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class kornia.nerf.volume_renderer.IrregularRenderer(shift=1)

Renders 3D irregularly sampled points along rays.

forward(rgbs, densities, points_3d)

Renders 3D irregularly sampled points along rays.

Parameters:
  • rgbs (Tensor) – RGB values of points along rays \((*, N, 3)\)

  • densities (Tensor) – Volume densities of points along rays \((*, N)\)

  • points_3d (Tensor) – 3D points along rays \((*, N, 3)\)

Return type:

Tensor

Returns:

Rendered RGB values for each ray \((*, 3)\)
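
A minimal sketch with random inputs that follow the shapes documented above (B rays, N samples per ray):

    import torch
    from kornia.nerf.volume_renderer import IrregularRenderer

    B, N = 8, 32
    rgbs = torch.rand(B, N, 3)        # per-point RGB predictions
    densities = torch.rand(B, N)      # per-point volume densities
    points_3d = torch.rand(B, N, 3)   # irregularly spaced 3D samples along each ray
    renderer = IrregularRenderer()
    rendered = renderer(rgbs, densities, points_3d)    # (B, 3) rendered ray colors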

class kornia.nerf.volume_renderer.RegularRenderer(shift=1)

Renders 3D regularly sampled points along rays.

forward(rgbs, densities, points_3d)

Renders 3D regularly sampled points along rays.

Parameters:
  • rgbs (Tensor) – RGB values of points along rays \((*, N, 3)\)

  • densities (Tensor) – Volume densities of points along rays \((*, N)\)

  • points_3d (Tensor) – 3D points along rays \((*, N, 3)\)

Return type:

Tensor

Returns:

Rendered RGB values for each ray \((*, 3)\)
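
The regular renderer shares the same call signature; a minimal sketch with equally spaced samples along the z axis (all values are illustrative):

    import torch
    from kornia.nerf.volume_renderer import RegularRenderer

    B, N = 8, 32
    points_3d = torch.zeros(B, N, 3)
    points_3d[..., 2] = torch.linspace(1.0, 4.0, N)    # equally spaced depths along each ray
    rendered = RegularRenderer()(torch.rand(B, N, 3), torch.rand(B, N), points_3d)   # (B, 3)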

Samplers

class kornia.nerf.samplers.RaySampler(min_depth, max_depth, ndc, device, dtype)

Class to manage spatial ray sampling.

Parameters:
  • min_depth (float) – sampled rays minimal depth from cameras.

  • max_depth (float) – sampled rays maximal depth from cameras.

  • ndc (bool) – convert ray parameters to normalized device coordinates.

  • device (Union[str, torch.device, None]) – device for ray tensors.

  • dtype (torch.dtype) – dtype for ray tensors.

class Points2D(points_2d, camera_ids)

A class to hold ray 2d pixel coordinates and a camera id for each.

Parameters:
  • points_2d (Tensor) – tensor with ray pixel coordinates (the coordinates in the image plane that correspond to the ray) \((B, 2)\).

  • camera_ids (List[int]) – list of camera ids, one for each pixel coordinate.

class Points2D_FlatTensors

Class to hold x/y pixel coordinates for each ray, and its scene camera id.

transform_ray_params_world_to_ndc(cameras)

Transforms ray parameters from the world coordinate system to normalized device coordinates (NDC).

Parameters:

cameras (PinholeCamera) – scene cameras.

Return type:

Tuple[Tensor, Tensor]
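
A minimal sketch of the Points2D container documented above (Points2D is assumed here to be accessible as a nested class of RaySampler; the coordinates are illustrative):

    import torch
    from kornia.nerf.samplers import RaySampler

    # Two rays, both cast from camera 0; pixel coordinates follow the (B, 2) shape above.
    points_2d = torch.tensor([[10.0, 20.0], [30.0, 40.0]])
    pts = RaySampler.Points2D(points_2d, [0, 0])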

class kornia.nerf.samplers.RandomRaySampler(min_depth, max_depth, ndc, device, dtype)

Class to manage random ray spatial sampling.

Parameters:
  • min_depth (float) – sampled rays minimal depth from cameras.

  • max_depth (float) – sampled rays maximal depth from cameras.

  • ndc (bool) – convert to normalized device coordinates.

  • device (Union[str, torch.device, None]) – device for ray tensors.

  • dtype (torch.dtype) – dtype for ray tensors.

calc_ray_params(cameras, num_img_rays)

Calculates ray parameters (origins and directions). The camera id and pixel coordinates of each ray are also stored.

Parameters:
  • cameras (PinholeCamera) – scene cameras.

  • num_img_rays (Tensor) – tensor that holds the number of rays to randomly cast from each scene camera \((B)\).

Return type:

None

sample_points_2d(heights, widths, num_img_rays)

Randomly sample pixel points in 2d.

Parameters:
  • heights (Tensor) – tensor that holds scene camera image heights (can vary between cameras) \((B)\).

  • widths (Tensor) – tensor that holds scene camera image widths (can vary between cameras) \((B)\).

  • num_img_rays (Tensor) – tensor that holds the number of rays to randomly cast from each scene camera \((B)\).

Return type:

Dict[int, Points2D]

Returns:

Dictionary of Points2D objects holding the 2D pixel coordinates of each ray and the id of the camera it was cast from.
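
A minimal sketch of random ray sampling (the scene sizes are illustrative; computing full ray origins and directions additionally requires a real PinholeCamera object, shown as a commented placeholder):

    import torch
    from kornia.nerf.samplers import RandomRaySampler

    sampler = RandomRaySampler(min_depth=1.0, max_depth=4.0, ndc=True, device='cpu', dtype=torch.float32)

    # Sample 2D pixel locations for two cameras of different sizes.
    heights = torch.tensor([480, 600])
    widths = torch.tensor([640, 800])
    num_img_rays = torch.tensor([1024, 1024])      # rays to cast per camera, shape (B,)
    points = sampler.sample_points_2d(heights, widths, num_img_rays)   # Dict[int, Points2D]

    # With a real PinholeCamera `cameras` object for the scene:
    # sampler.calc_ray_params(cameras, num_img_rays)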

class kornia.nerf.samplers.RandomGridRaySampler(min_depth, max_depth, ndc, device, dtype)

Class to manage random ray spatial sampling.

Sampling is done on a regular grid of pixels by randomizing row and column values and casting rays for all pixels along the selected rows and columns.

Parameters:
  • min_depth (float) – sampled rays minimal depth from cameras.

  • max_depth (float) – sampled rays maximal depth from cameras.

  • ndc (bool) – convert to normalized device coordinates.

  • device (Union[str, torch.device, None]) – device for ray tensors.

  • dtype (torch.dtype) – dtype for ray tensors.

sample_points_2d(heights, widths, num_img_rays)

Randomly sample pixel points in 2d over a regular row-column grid.

Parameters:
  • heights (Tensor) – tensor that holds scene camera image heights (can vary between cameras) \((B)\).

  • widths (Tensor) – tensor that holds scene camera image widths (can vary between cameras) \((B)\).

  • num_img_rays (Tensor) – tensor that holds the number of rays to randomly cast from each scene camera; the number of sampled grid rows and columns is the square root of this value \((B)\).

Return type:

Dict[int, Points2D]

Returns:

Dictionary of Points2D objects holding the 2D pixel coordinates of each ray and the id of the camera it was cast from.
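
A minimal sketch of grid-based random sampling (names follow the documentation above; the image size and ray count are illustrative):

    import torch
    from kornia.nerf.samplers import RandomGridRaySampler

    sampler = RandomGridRaySampler(min_depth=1.0, max_depth=4.0, ndc=True, device='cpu', dtype=torch.float32)
    heights = torch.tensor([480])
    widths = torch.tensor([640])
    num_img_rays = torch.tensor([1024])            # ~32x32 grid of sampled rows/columns per camera
    points = sampler.sample_points_2d(heights, widths, num_img_rays)   # Dict[int, Points2D]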

class kornia.nerf.samplers.UniformRaySampler(min_depth, max_depth, ndc, device, dtype)

Class to manage uniform ray spatial sampling for all camera scene pixels.

Parameters:
  • min_depth (float) – sampled rays minimal depth from cameras.

  • max_depth (float) – sampled rays maximal depth from cameras.

  • ndc (bool) – convert to normalized device coordinates.

  • device (Union[str, torch.device, None]) – device for ray tensors.

  • dtype (torch.dtype) – dtype for ray tensors.

sample_points_2d(heights, widths, sampling_step=1)

Uniformly sample pixel points in 2d for all scene camera pixels.

Parameters:
  • heights (Tensor) – tensor that holds scene camera image heights (can vary between cameras) \((B)\).

  • widths (Tensor) – tensor that holds scene camera image widths (can vary between cameras) \((B)\).

  • sampling_step (int, optional) – uniform stride between sampled rows and columns. Default: 1

Return type:

Dict[int, Points2D]

Returns:

Dictionary of Points2D objects holding the 2D pixel coordinates of each ray and the id of the camera it was cast from.
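
A minimal sketch of uniform sampling over all pixels of a single camera, casting a ray at every other row and column (sizes are illustrative):

    import torch
    from kornia.nerf.samplers import UniformRaySampler

    sampler = UniformRaySampler(min_depth=1.0, max_depth=4.0, ndc=True, device='cpu', dtype=torch.float32)
    points = sampler.sample_points_2d(torch.tensor([480]), torch.tensor([640]), sampling_step=2)  # Dict[int, Points2D]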