kornia.metrics
Module containing metrics for training networks.
Classification
- kornia.metrics.accuracy(pred, target, topk=(1,))
Computes the accuracy over the k top predictions for the specified values of k.
- Parameters:
- Return type:
Example
>>> import torch
>>> from kornia.metrics import accuracy
>>> logits = torch.tensor([[0, 1, 0]])
>>> target = torch.tensor([[1]])
>>> accuracy(logits, target)
[tensor(100.)]
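To make the top-k idea concrete, here is a minimal pure-Python sketch. The function name `topk_accuracy` and the list-based inputs are illustrative assumptions, not kornia's implementation, which operates on tensors.

```python
def topk_accuracy(scores, targets, topk=(1,)):
    """Return one accuracy percentage per k in `topk` (hypothetical helper)."""
    results = []
    for k in topk:
        correct = 0
        for row, label in zip(scores, targets):
            # Indices of the k highest-scoring classes for this sample.
            ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
            if label in ranked:
                correct += 1
        results.append(100.0 * correct / len(targets))
    return results


print(topk_accuracy([[0, 1, 0]], [1]))  # [100.0]
print(topk_accuracy([[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]], [2, 1], topk=(1, 2)))  # [0.0, 100.0]
```

A sample counts as correct for a given k when its true label appears among the k highest scores, so the accuracy can only stay equal or grow as k increases.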
Segmentation
- kornia.metrics.confusion_matrix(pred, target, num_classes, normalized=False)
Compute confusion matrix to evaluate the accuracy of a classification.
- Parameters:
pred (Tensor) – tensor with estimated targets returned by a classifier. The shape can be \((B, *)\) and must contain integer values between 0 and K-1.
target (Tensor) – tensor with ground truth (correct) target values. The shape can be \((B, *)\) and must contain integer values between 0 and K-1, where targets are assumed to be provided as one-hot vectors.
num_classes (int) – total possible number of classes in target.
normalized (bool, optional) – whether to return the confusion matrix normalized. Default: False
- Return type:
- Returns:
a tensor containing the confusion matrix with shape \((B, K, K)\) where K is the number of classes.
Example
>>> import torch
>>> from kornia.metrics import confusion_matrix
>>> logits = torch.tensor([[0, 1, 0]])
>>> target = torch.tensor([[0, 1, 0]])
>>> confusion_matrix(logits, target, num_classes=3)
tensor([[[2., 0., 0.],
         [0., 1., 0.],
         [0., 0., 0.]]])
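The counting behind a confusion matrix can be sketched in a few lines of plain Python for a single flattened sample. The name `confusion_matrix_sketch` is hypothetical, and the layout (rows indexed by the target class, columns by the predicted class) is an assumption chosen to be consistent with the example output above.

```python
def confusion_matrix_sketch(pred, target, num_classes):
    """Count (target, prediction) pairs for one sample (hypothetical helper)."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        # Assumed layout: row = ground-truth class, column = predicted class.
        matrix[t][p] += 1
    return matrix


print(confusion_matrix_sketch([0, 1, 0], [0, 1, 0], num_classes=3))
# [[2, 0, 0], [0, 1, 0], [0, 0, 0]]
```

Correct predictions land on the diagonal; any off-diagonal entry counts elements of one class that were predicted as another.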
- kornia.metrics.mean_iou(pred, target, num_classes, eps=1e-6)
Calculate mean Intersection-Over-Union (mIOU).
The function internally computes the confusion matrix.
- Parameters:
pred (Tensor) – tensor with estimated targets returned by a classifier. The shape can be \((B, *)\) and must contain integer values between 0 and K-1.
target (Tensor) – tensor with ground truth (correct) target values. The shape can be \((B, *)\) and must contain integer values between 0 and K-1, where targets are assumed to be provided as one-hot vectors.
num_classes (int) – total possible number of classes in target.
- Return type:
- Returns:
a tensor representing the mean intersection-over-union with shape \((B, K)\) where K is the number of classes.
Example
>>> import torch
>>> from kornia.metrics import mean_iou
>>> logits = torch.tensor([[0, 1, 0]])
>>> target = torch.tensor([[0, 1, 0]])
>>> mean_iou(logits, target, num_classes=3)
tensor([[1., 1., 1.]])
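Since the function is described as computing the confusion matrix internally, the per-class IoU can be sketched from such a matrix. This is an illustration under assumptions: the helper name `per_class_iou` is hypothetical, and adding `eps` to both numerator and denominator is an assumed convention that happens to reproduce the value 1 for the absent class in the example above.

```python
def per_class_iou(conf, eps=1e-6):
    """IoU per class from a square confusion matrix (hypothetical helper)."""
    n = len(conf)
    ious = []
    for k in range(n):
        true_pos = conf[k][k]
        row_total = sum(conf[k])                       # elements whose target is k
        col_total = sum(conf[r][k] for r in range(n))  # elements predicted as k
        union = row_total + col_total - true_pos
        # eps keeps the ratio defined when a class never appears (assumed convention).
        ious.append((true_pos + eps) / (union + eps))
    return ious


print(per_class_iou([[2, 0, 0], [0, 1, 0], [0, 0, 0]]))  # [1.0, 1.0, 1.0]
```

Note that the result holds one value per class rather than a single averaged number, which matches the documented \((B, K)\) output shape.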
Detection
- kornia.metrics.mean_average_precision(pred_boxes, pred_labels, pred_scores, gt_boxes, gt_labels, n_classes, threshold=0.5)
Calculate the Mean Average Precision (mAP) of detected objects.
Code altered from https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Object-Detection/blob/master/utils.py#L271. Background class (0 index) is excluded.
- Parameters:
pred_boxes (List[Tensor]) – a tensor list of predicted bounding boxes.
pred_labels (List[Tensor]) – a tensor list of predicted labels.
pred_scores (List[Tensor]) – a tensor list of predicted labels’ scores.
gt_boxes (List[Tensor]) – a tensor list of ground truth bounding boxes.
gt_labels (List[Tensor]) – a tensor list of ground truth labels.
n_classes (int) – the number of classes.
threshold (float, optional) – count as a positive if the overlap is greater than the threshold. Default: 0.5
- Return type:
- Returns:
mean average precision (mAP), list of average precisions for each class.
Examples
>>> import torch
>>> from kornia.metrics import mean_average_precision
>>> boxes, labels, scores = torch.tensor([[100, 50, 150, 100.]]), torch.tensor([1]), torch.tensor([.7])
>>> gt_boxes, gt_labels = torch.tensor([[100, 50, 150, 100.]]), torch.tensor([1])
>>> mean_average_precision([boxes], [labels], [scores], [gt_boxes], [gt_labels], 2)
(tensor(1.), {1: 1.0})
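For intuition about the per-class average precision, the sketch below computes one common definition, the 11-point interpolated AP, for a single class. It assumes detections have already been matched against ground truth (for example by the overlap threshold), so each one arrives as a `(score, is_true_positive)` pair. The function name and this input format are hypothetical, and whether kornia uses exactly this interpolation is an assumption.

```python
def average_precision_11pt(detections, n_ground_truth):
    """11-point interpolated AP for one class (hypothetical helper).

    detections: list of (score, is_true_positive) pairs, already matched.
    n_ground_truth: number of ground-truth objects of this class.
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    precisions, recalls = [], []
    tp = fp = 0
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / n_ground_truth)
    total = 0.0
    for i in range(11):
        t = i / 10  # recall thresholds 0.0, 0.1, ..., 1.0
        # Best precision among all operating points reaching at least recall t.
        reachable = [p for p, r in zip(precisions, recalls) if r >= t]
        total += max(reachable) if reachable else 0.0
    return total / 11


print(average_precision_11pt([(0.7, True)], n_ground_truth=1))  # 1.0
```

The mean average precision is then the mean of these per-class values, taken over the non-background classes.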
- kornia.metrics.mean_iou_bbox(boxes_1, boxes_2)
Compute the IoU of the cartesian product of two sets of boxes.
Each box in each set shall be (x1, y1, x2, y2).
- Parameters:
- Return type:
- Returns:
a tensor with dimensions \((B1, B2)\), representing the intersection over union of each of the boxes in set 1 with respect to each of the boxes in set 2.
Example
>>> import torch
>>> from kornia.metrics import mean_iou_bbox
>>> boxes_1 = torch.tensor([[40, 40, 60, 60], [30, 40, 50, 60]])
>>> boxes_2 = torch.tensor([[40, 50, 60, 70], [30, 40, 40, 50]])
>>> mean_iou_bbox(boxes_1, boxes_2)
tensor([[0.3333, 0.0000],
        [0.1429, 0.2500]])
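The numbers in the example can be reproduced by hand with a small pure-Python sketch of pairwise box IoU. The helper name `pairwise_iou` is hypothetical, boxes are assumed to be `(x1, y1, x2, y2)` with positive area, and this is not kornia's tensor implementation.

```python
def pairwise_iou(boxes_1, boxes_2):
    """IoU of every box in boxes_1 against every box in boxes_2 (hypothetical helper)."""

    def area(box):
        return (box[2] - box[0]) * (box[3] - box[1])

    result = []
    for a in boxes_1:
        row = []
        for b in boxes_2:
            # Overlap extent along each axis, clamped at zero for disjoint boxes.
            width = max(0, min(a[2], b[2]) - max(a[0], b[0]))
            height = max(0, min(a[3], b[3]) - max(a[1], b[1]))
            inter = width * height
            union = area(a) + area(b) - inter
            row.append(inter / union)  # assumes non-degenerate boxes (union > 0)
        result.append(row)
    return result


ious = pairwise_iou([[40, 40, 60, 60], [30, 40, 50, 60]],
                    [[40, 50, 60, 70], [30, 40, 40, 50]])
print([[round(v, 4) for v in row] for row in ious])
# [[0.3333, 0.0], [0.1429, 0.25]]
```

For the first pair, the boxes overlap in a 20 x 10 region (area 200) and each has area 400, so the IoU is 200 / (400 + 400 - 200) = 1/3.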
Image Quality
- kornia.metrics.psnr(image, target, max_val)
Calculates the PSNR between two images.
PSNR is the Peak Signal-to-Noise Ratio, which is derived from the mean squared error (MSE). Given an m x n image, the PSNR is:
\[\text{PSNR} = 10 \log_{10} \bigg(\frac{\text{MAX}_I^2}{\text{MSE}(I,T)}\bigg)\]
where
\[\text{MSE}(I,T) = \frac{1}{mn}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1} [I(i,j) - T(i,j)]^2\]
and \(\text{MAX}_I\) is the maximum possible input value (e.g., for floating point images \(\text{MAX}_I=1\)).
- Parameters:
- Return type:
- Returns:
the computed PSNR as a scalar.
Examples
>>> import torch
>>> from kornia.metrics import psnr
>>> ones = torch.ones(1)
>>> psnr(ones, 1.2 * ones, 2.)  # 10 * log(4/((1.2-1)**2)) / log(10)
tensor(20.0000)
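The PSNR formula above can be checked numerically with a short pure-Python sketch. The name `psnr_sketch` and the flat-list inputs are illustrative assumptions; the library function works on tensors.

```python
import math


def psnr_sketch(image, target, max_val):
    """PSNR of two equally sized flat pixel lists (hypothetical helper)."""
    mse = sum((i - t) ** 2 for i, t in zip(image, target)) / len(image)
    # Undefined for identical inputs, where the MSE is zero.
    return 10.0 * math.log10(max_val ** 2 / mse)


print(round(psnr_sketch([1.0], [1.2], max_val=2.0), 4))  # 20.0
```

With a single pixel differing by 0.2, the MSE is 0.04 and the ratio is 4 / 0.04 = 100, giving 10 * log10(100) = 20 dB, in agreement with the documented example.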
- kornia.metrics.ssim(img1, img2, window_size, max_val=1.0, eps=1e-12, padding='same')
Function that computes the Structural Similarity (SSIM) index map between two images.
Measures the (SSIM) index between each element in the input x and target y.
The index can be described as:
\[\text{SSIM}(x, y) = \frac{(2\mu_x\mu_y+c_1)(2\sigma_{xy}+c_2)} {(\mu_x^2+\mu_y^2+c_1)(\sigma_x^2+\sigma_y^2+c_2)}\]
where:
\(c_1=(k_1 L)^2\) and \(c_2=(k_2 L)^2\) are two variables that stabilize the division when the denominator is weak.
\(L\) is the dynamic range of the pixel-values (typically this is \(2^{\#\text{bits per pixel}}-1\)).
- Parameters:
img1 (Tensor) – the first input image with shape \((B, C, H, W)\).
img2 (Tensor) – the second input image with shape \((B, C, H, W)\).
window_size (int) – the size of the gaussian kernel to smooth the images.
max_val (float, optional) – the dynamic range of the images. Default: 1.0
eps (float, optional) – Small value for numerical stability when dividing. Default: 1e-12
padding (str, optional) – 'same' | 'valid'. Whether to only use the “valid” convolution area to compute SSIM to match the MATLAB implementation of the original SSIM paper. Default: "same"
- Return type:
- Returns:
The ssim index map with shape \((B, C, H, W)\).
Examples
>>> import torch
>>> from kornia.metrics import ssim
>>> input1 = torch.rand(1, 4, 5, 5)
>>> input2 = torch.rand(1, 4, 5, 5)
>>> ssim_map = ssim(input1, input2, 5)  # 1x4x5x5
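To see how the terms of the SSIM formula fit together, the sketch below evaluates it once over a single window of pixels. It is a simplification under stated assumptions: all pixels are weighted uniformly instead of with a Gaussian kernel, the constants `k1=0.01` and `k2=0.03` are the values commonly used in the SSIM literature and are not taken from this page, and the name `ssim_single_window` is hypothetical.

```python
def ssim_single_window(x, y, max_val=1.0, k1=0.01, k2=0.03):
    """SSIM over one uniformly weighted window of flat pixel lists (hypothetical helper)."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Stabilizing constants; k1 and k2 are assumed conventional defaults.
    c1 = (k1 * max_val) ** 2
    c2 = (k2 * max_val) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


patch = [0.1, 0.5, 0.9, 0.3]
print(ssim_single_window(patch, patch))                 # approximately 1 for identical windows
print(ssim_single_window(patch, [0.9, 0.5, 0.1, 0.7]))  # lower for dissimilar windows
```

The documented functions return a map rather than a single number because this computation is repeated for a window centered on every pixel.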
- kornia.metrics.ssim3d(img1, img2, window_size, max_val=1.0, eps=1e-12, padding='same')
Function that computes the Structural Similarity (SSIM) index map between two 3D images.
Measures the (SSIM) index between each element in the input x and target y.
The index can be described as:
\[\text{SSIM}(x, y) = \frac{(2\mu_x\mu_y+c_1)(2\sigma_{xy}+c_2)} {(\mu_x^2+\mu_y^2+c_1)(\sigma_x^2+\sigma_y^2+c_2)}\]
where:
\(c_1=(k_1 L)^2\) and \(c_2=(k_2 L)^2\) are two variables that stabilize the division when the denominator is weak.
\(L\) is the dynamic range of the pixel-values (typically this is \(2^{\#\text{bits per pixel}}-1\)).
- Parameters:
img1 (Tensor) – the first input image with shape \((B, C, D, H, W)\).
img2 (Tensor) – the second input image with shape \((B, C, D, H, W)\).
window_size (int) – the size of the gaussian kernel to smooth the images.
max_val (float, optional) – the dynamic range of the images. Default: 1.0
eps (float, optional) – Small value for numerical stability when dividing. Default: 1e-12
padding (str, optional) – 'same' | 'valid'. Whether to only use the “valid” convolution area to compute SSIM to match the MATLAB implementation of the original SSIM paper. Default: "same"
- Return type:
- Returns:
The ssim index map with shape \((B, C, D, H, W)\).
Examples
>>> import torch
>>> from kornia.metrics import ssim3d
>>> input1 = torch.rand(1, 4, 5, 5, 5)
>>> input2 = torch.rand(1, 4, 5, 5, 5)
>>> ssim_map = ssim3d(input1, input2, 5)  # 1x4x5x5x5
- class kornia.metrics.SSIM(window_size, max_val=1.0, eps=1e-12, padding='same')
Create a module that computes the Structural Similarity (SSIM) index between two images.
Measures the (SSIM) index between each element in the input x and target y.
The index can be described as:
\[\text{SSIM}(x, y) = \frac{(2\mu_x\mu_y+c_1)(2\sigma_{xy}+c_2)} {(\mu_x^2+\mu_y^2+c_1)(\sigma_x^2+\sigma_y^2+c_2)}\]
where:
\(c_1=(k_1 L)^2\) and \(c_2=(k_2 L)^2\) are two variables that stabilize the division when the denominator is weak.
\(L\) is the dynamic range of the pixel-values (typically this is \(2^{\#\text{bits per pixel}}-1\)).
- Parameters:
window_size (int) – the size of the gaussian kernel to smooth the images.
max_val (float, optional) – the dynamic range of the images. Default: 1.0
eps (float, optional) – Small value for numerical stability when dividing. Default: 1e-12
padding (str, optional) – 'same' | 'valid'. Whether to only use the “valid” convolution area to compute SSIM to match the MATLAB implementation of the original SSIM paper. Default: "same"
- Shape:
Input: \((B, C, H, W)\).
Target: \((B, C, H, W)\).
Output: \((B, C, H, W)\).
Examples
>>> import torch
>>> from kornia.metrics import SSIM
>>> input1 = torch.rand(1, 4, 5, 5)
>>> input2 = torch.rand(1, 4, 5, 5)
>>> ssim = SSIM(5)
>>> ssim_map = ssim(input1, input2)  # 1x4x5x5
- class kornia.metrics.SSIM3D(window_size, max_val=1.0, eps=1e-12, padding='same')
Create a module that computes the Structural Similarity (SSIM) index between two 3D images.
Measures the (SSIM) index between each element in the input x and target y.
The index can be described as:
\[\text{SSIM}(x, y) = \frac{(2\mu_x\mu_y+c_1)(2\sigma_{xy}+c_2)} {(\mu_x^2+\mu_y^2+c_1)(\sigma_x^2+\sigma_y^2+c_2)}\]
where:
\(c_1=(k_1 L)^2\) and \(c_2=(k_2 L)^2\) are two variables that stabilize the division when the denominator is weak.
\(L\) is the dynamic range of the pixel-values (typically this is \(2^{\#\text{bits per pixel}}-1\)).
- Parameters:
window_size (int) – the size of the gaussian kernel to smooth the images.
max_val (float, optional) – the dynamic range of the images. Default: 1.0
eps (float, optional) – Small value for numerical stability when dividing. Default: 1e-12
padding (str, optional) – 'same' | 'valid'. Whether to only use the “valid” convolution area to compute SSIM to match the MATLAB implementation of the original SSIM paper. Default: "same"
- Shape:
Input: \((B, C, D, H, W)\).
Target: \((B, C, D, H, W)\).
Output: \((B, C, D, H, W)\).
Examples
>>> import torch
>>> from kornia.metrics import SSIM3D
>>> input1 = torch.rand(1, 4, 5, 5, 5)
>>> input2 = torch.rand(1, 4, 5, 5, 5)
>>> ssim = SSIM3D(5)
>>> ssim_map = ssim(input1, input2)  # 1x4x5x5x5
Optical Flow
- kornia.metrics.aepe(input, target, reduction='mean')
Calculates the average endpoint error (AEPE) between two flow maps.
AEPE is the average endpoint error between pairs of 2D vectors (e.g., optical flow). Given an h x w x 2 optical flow map, the AEPE is:
\[\text{AEPE}=\frac{1}{hw}\sum_{i=1, j=1}^{h, w}\sqrt{(I_{i,j,1}-T_{i,j,1})^{2}+(I_{i,j,2}-T_{i,j,2})^{2}}\]
- Parameters:
input (Tensor) – the input flow map with shape \((*, 2)\).
target (Tensor) – the target flow map with shape \((*, 2)\).
reduction (str, optional) – Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied, 'mean': the sum of the output will be divided by the number of elements in the output, 'sum': the output will be summed. Default: "mean"
- Return type:
- Returns:
the computed AEPE as a scalar.
Examples
>>> import torch
>>> from kornia.metrics import aepe
>>> ones = torch.ones(4, 4, 2)
>>> aepe(ones, 1.2 * ones)
tensor(0.2828)
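The AEPE formula and the three reduction modes can be illustrated with a pure-Python sketch over a flat list of 2D flow vectors. The name `aepe_sketch` and the list-of-pairs input are assumptions made for illustration; the library function accepts tensors of shape \((*, 2)\).

```python
import math


def aepe_sketch(flow, target, reduction="mean"):
    """Endpoint error between paired 2D flow vectors (hypothetical helper)."""
    # Euclidean distance between each predicted vector and its target.
    errors = [math.hypot(f[0] - t[0], f[1] - t[1]) for f, t in zip(flow, target)]
    if reduction == "none":
        return errors
    if reduction == "sum":
        return sum(errors)
    if reduction == "mean":
        return sum(errors) / len(errors)
    raise ValueError(f"unknown reduction: {reduction!r}")


flow = [(1.0, 1.0)] * 16    # stands in for a 4 x 4 x 2 map of ones
target = [(1.2, 1.2)] * 16
print(round(aepe_sketch(flow, target), 4))  # 0.2828
```

Every vector is off by 0.2 in both components, so each endpoint error is sqrt(0.04 + 0.04), about 0.2828, and the mean equals that same value.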
- class kornia.metrics.AEPE(reduction='mean')
Computes the average endpoint error (AEPE) between two flow maps.
AEPE is the average endpoint error between pairs of 2D vectors (e.g., optical flow). Given an h x w x 2 optical flow map, the AEPE is:
\[\text{AEPE}=\frac{1}{hw}\sum_{i=1, j=1}^{h, w}\sqrt{(I_{i,j,1}-T_{i,j,1})^{2}+(I_{i,j,2}-T_{i,j,2})^{2}}\]
- Parameters:
reduction (str, optional) – Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied, 'mean': the sum of the output will be divided by the number of elements in the output, 'sum': the output will be summed. Default: "mean"
- Shape:
input: \((*, 2)\).
target: \((*, 2)\).
output: \((1)\).
Examples
>>> import torch
>>> from kornia.metrics import AEPE
>>> input1 = torch.rand(1, 4, 5, 2)
>>> input2 = torch.rand(1, 4, 5, 2)
>>> epe = AEPE(reduction="mean")
>>> epe = epe(input1, input2)
Monitoring
- class kornia.metrics.AverageMeter
Computes and stores the average and current value.
Example
>>> import torch
>>> from kornia.metrics import AverageMeter
>>> stats = AverageMeter()
>>> acc1 = torch.tensor(0.99)  # coming from K.metrics.accuracy
>>> stats.update(acc1, n=1)  # where n is batch size usually
>>> round(stats.avg, 2)
0.99
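The bookkeeping such a meter performs can be sketched as a small class that tracks a weighted running mean. The class name `RunningAverage` and its attribute names other than `avg` are hypothetical; only `update(value, n)` and `avg` are taken from the example above, and the internals of the library class may differ.

```python
class RunningAverage:
    """Tracks the latest value and a count-weighted running mean (hypothetical class)."""

    def __init__(self):
        self.val = 0.0   # most recent value passed to update()
        self.sum = 0.0   # weighted sum of all values seen so far
        self.count = 0   # total weight, typically the number of samples
        self.avg = 0.0

    def update(self, val, n=1):
        # Weight each value by n so batches of different sizes average correctly.
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count


meter = RunningAverage()
meter.update(0.9, n=10)  # batch of 10 samples with accuracy 0.9
meter.update(0.5, n=30)  # batch of 30 samples with accuracy 0.5
print(round(meter.avg, 2))  # 0.6
```

Weighting by the batch size matters when batches differ in size: here the plain mean of 0.9 and 0.5 would be 0.7, while the per-sample average is 24 / 40 = 0.6.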