Classification Report
Module Interface
- class torchmetrics.ClassificationReport(**kwargs)
Compute precision, recall, F-measure and support for each class.
\[
\begin{aligned}
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{Recall} &= \frac{TP}{TP + FN} \\
\text{F1} &= 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \\
\text{Support} &= \sum_{i=1}^{N} 1(y_i = k)
\end{aligned}
\]

Where \(TP\) is true positives, \(FP\) is false positives, \(FN\) is false negatives, \(y\) is a tensor of target values, \(k\) is the class, and \(N\) is the number of samples.
This module is a simple wrapper that dispatches to the task-specific versions of this metric, selected by setting the task argument to either 'binary', 'multiclass' or 'multilabel'. See the documentation of BinaryClassificationReport, MulticlassClassificationReport and MultilabelClassificationReport for the specific details of each argument's influence, and for examples.

- Example (Binary Classification):
>>> from torch import tensor
>>> from torchmetrics.classification import ClassificationReport
>>> target = tensor([0, 1, 0, 1])
>>> preds = tensor([0, 1, 1, 1])
>>> target_names = ['0', '1']
>>> report = ClassificationReport(
...     task="binary",
...     target_names=target_names,
...     digits=2
... )
>>> report.update(preds, target)
>>> print(report.compute())
              precision    recall  f1-score   support

           0       1.00      0.50      0.67         2
           1       0.67      1.00      0.80         2

    accuracy                           0.75         4
   macro avg       0.83      0.75      0.73         4
weighted avg       0.83      0.75      0.73         4
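When output_dict=True, the same metrics are available programmatically instead of as a formatted string. Below is a minimal sketch of consuming the dictionary form; the exact key layout is an assumption here (per-class entries plus aggregate rows such as 'accuracy' and 'macro avg', mirroring scikit-learn's classification_report), so the loop simply prints whatever entries are present:

from torch import tensor
from torchmetrics.classification import ClassificationReport

target = tensor([0, 1, 0, 1])
preds = tensor([0, 1, 1, 1])

# Request a dictionary instead of a formatted string report
report = ClassificationReport(task="binary", output_dict=True)
report.update(preds, target)
result = report.compute()

# Assumed layout: per-class entries plus aggregate rows
for name, metrics in result.items():
    print(name, metrics)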
BinaryClassificationReport
- class torchmetrics.classification.BinaryClassificationReport(threshold=0.5, target_names=None, sample_weight=None, digits=2, output_dict=False, zero_division='warn', ignore_index=None, **kwargs)
Compute precision, recall, F-measure and support for binary classification tasks.
The classification report provides detailed metrics for each class in a binary classification task: precision, recall, F1-score, and support.
\[
\begin{aligned}
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{Recall} &= \frac{TP}{TP + FN} \\
\text{F1} &= 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \\
\text{Support} &= \sum_{i=1}^{N} 1(y_i = k)
\end{aligned}
\]

Where \(TP\) is true positives, \(FP\) is false positives, \(FN\) is false negatives, \(y\) is a tensor of target values, \(k\) is the class, and \(N\) is the number of samples.
As input to forward and update the metric accepts the following input:

- preds (Tensor): A tensor of predictions of shape (N, ...) where N is the batch size. If preds is a floating point tensor with values outside the [0, 1] range, we consider the input to be logits and will auto-apply sigmoid per element. Additionally, we convert to an int tensor by thresholding with the value in threshold.
- target (Tensor): A tensor of targets of shape (N, ...) where N is the batch size.
As output to forward and compute the metric returns either:

- A formatted string report if output_dict=False
- A dictionary of metrics if output_dict=True
- Parameters:
  - threshold (float) – Threshold for transforming probability to binary (0,1) predictions
  - target_names (Optional[Sequence[str]]) – Optional list of names for each class
  - sample_weight (Optional[Tensor]) – Optional weights for each sample
  - digits (int) – Number of decimal places to display in the report
  - output_dict (bool) – If True, return a dict instead of a string report
  - zero_division (Union[str, int]) – Value to use when dividing by zero
  - ignore_index (Optional[int]) – Specifies a target value that is ignored and does not contribute to the metric calculation
Example
>>> from torch import tensor
>>> from torchmetrics.classification.classification_report import binary_classification_report
>>> target = tensor([0, 1, 0, 1])
>>> preds = tensor([0, 1, 1, 1])
>>> target_names = ['0', '1']
>>> report = binary_classification_report(
...     preds=preds,
...     target=target,
...     target_names=target_names,
...     digits=2
... )
>>> print(report)
              precision    recall  f1-score   support

           0       1.00      0.50      0.67         2
           1       0.67      1.00      0.80         2

    accuracy                           0.75         4
   macro avg       0.83      0.75      0.73         4
weighted avg       0.83      0.75      0.73         4
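Since float predictions are thresholded as described above (with sigmoid auto-applied to values outside [0, 1]), the module form can consume probabilities directly. A minimal sketch, with illustrative probability values chosen to reproduce the hard predictions from the example above:

from torch import tensor
from torchmetrics.classification import BinaryClassificationReport

target = tensor([0, 1, 0, 1])
# Probabilities in [0, 1]; entries >= threshold become class 1,
# reproducing preds = [0, 1, 1, 1] from the example above
probs = tensor([0.2, 0.8, 0.6, 0.9])

report = BinaryClassificationReport(threshold=0.5, digits=2)
report.update(probs, target)
print(report.compute())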
MulticlassClassificationReport
- class torchmetrics.classification.MulticlassClassificationReport(num_classes, target_names=None, sample_weight=None, digits=2, output_dict=False, zero_division='warn', ignore_index=None, top_k=1, **kwargs)
Compute precision, recall, F-measure and support for multiclass classification tasks.
The classification report provides detailed metrics for each class in a multiclass classification task: precision, recall, F1-score, and support.
\[
\begin{aligned}
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{Recall} &= \frac{TP}{TP + FN} \\
\text{F1} &= 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \\
\text{Support} &= \sum_{i=1}^{N} 1(y_i = k)
\end{aligned}
\]

Where \(TP\) is true positives, \(FP\) is false positives, \(FN\) is false negatives, \(y\) is a tensor of target values, \(k\) is the class, and \(N\) is the number of samples.
As input to forward and update the metric accepts the following input:

- preds (Tensor): A tensor of predictions of shape (N, ...) containing integer class labels, or of shape (N, C, ...) containing probabilities/logits, where N is the batch size and C is the number of classes.
- target (Tensor): A tensor of targets of shape (N, ...) where N is the batch size.

As output to forward and compute the metric returns either:

- A formatted string report if output_dict=False
- A dictionary of metrics if output_dict=True
- Parameters:
  - num_classes (int) – Number of classes
  - target_names (Optional[Sequence[str]]) – Optional list of names for each class
  - sample_weight (Optional[Tensor]) – Optional weights for each sample
  - digits (int) – Number of decimal places to display in the report
  - output_dict (bool) – If True, return a dict instead of a string report
  - zero_division (Union[str, int]) – Value to use when dividing by zero
  - ignore_index (Optional[int]) – Specifies a target value that is ignored and does not contribute to the metric calculation
  - top_k (int) – Number of highest probability or logit score predictions considered to find the correct label. Only works when preds contain probabilities/logits.
Example
>>> from torch import tensor
>>> from torchmetrics.classification.classification_report import multiclass_classification_report
>>> target = tensor([0, 1, 2, 2, 2])
>>> preds = tensor([0, 0, 2, 2, 1])
>>> target_names = ["class 0", "class 1", "class 2"]
>>> report = multiclass_classification_report(
...     preds=preds,
...     target=target,
...     num_classes=3,
...     target_names=target_names,
...     digits=2
... )
>>> print(report)
              precision    recall  f1-score   support

     class 0       0.50      1.00      0.67         1
     class 1       0.00      0.00      0.00         1
     class 2       1.00      0.67      0.80         3

    accuracy                           0.60         5
   macro avg       0.50      0.56      0.49         5
weighted avg       0.70      0.60      0.61         5
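A short sketch of the top_k behaviour with probability inputs; the probability values are illustrative only. With top_k=2, a sample counts as correctly labeled if the target class is among the two highest-scoring classes:

from torch import tensor
from torchmetrics.classification import MulticlassClassificationReport

target = tensor([0, 1, 2, 2, 2])
# Per-class probabilities of shape (N, C); values are illustrative
probs = tensor([
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.1, 0.2, 0.7],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
])

# A sample is treated as correct if the target class is among
# the two highest-scoring classes
report = MulticlassClassificationReport(num_classes=3, top_k=2, digits=2)
report.update(probs, target)
print(report.compute())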
MultilabelClassificationReport
- class torchmetrics.classification.MultilabelClassificationReport(num_labels, target_names=None, threshold=0.5, sample_weight=None, digits=2, output_dict=False, zero_division='warn', ignore_index=None, **kwargs)
Compute precision, recall, F-measure and support for multilabel classification tasks.
The classification report provides detailed metrics for each class in a multilabel classification task: precision, recall, F1-score, and support.
\[
\begin{aligned}
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{Recall} &= \frac{TP}{TP + FN} \\
\text{F1} &= 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \\
\text{Support} &= \sum_{i=1}^{N} 1(y_i = k)
\end{aligned}
\]

Where \(TP\) is true positives, \(FP\) is false positives, \(FN\) is false negatives, \(y\) is a tensor of target values, \(k\) is the class, and \(N\) is the number of samples.
As input to forward and update the metric accepts the following input:

- preds (Tensor): A tensor of predictions of shape (N, C) where N is the batch size and C is the number of labels. If preds is a floating point tensor with values outside the [0, 1] range, we consider the input to be logits and will auto-apply sigmoid per element. Additionally, we convert to an int tensor by thresholding with the value in threshold.
- target (Tensor): A tensor of targets of shape (N, C) where N is the batch size and C is the number of labels.
As output to forward and compute the metric returns either:

- A formatted string report if output_dict=False
- A dictionary of metrics if output_dict=True
- Parameters:
  - num_labels (int) – Number of labels
  - target_names (Optional[Sequence[str]]) – Optional list of names for each label
  - threshold (float) – Threshold for transforming probability to binary (0,1) predictions
  - sample_weight (Optional[Tensor]) – Optional weights for each sample
  - digits (int) – Number of decimal places to display in the report
  - output_dict (bool) – If True, return a dict instead of a string report
  - zero_division (Union[str, int]) – Value to use when dividing by zero
  - ignore_index (Optional[int]) – Specifies a target value that is ignored and does not contribute to the metric calculation
Example
>>> from torch import tensor
>>> from torchmetrics.classification.classification_report import multilabel_classification_report
>>> target = tensor([[1, 0, 1], [0, 1, 0], [1, 1, 0]])
>>> preds = tensor([[1, 0, 1], [0, 1, 1], [1, 0, 0]])
>>> target_names = ["Label A", "Label B", "Label C"]
>>> report = multilabel_classification_report(
...     preds=preds,
...     target=target,
...     num_labels=len(target_names),
...     target_names=target_names,
...     digits=2,
... )
>>> print(report)
              precision    recall  f1-score   support

     Label A       1.00      1.00      1.00         2
     Label B       1.00      0.50      0.67         2
     Label C       0.50      1.00      0.67         1

   micro avg       0.80      0.80      0.80         5
   macro avg       0.83      0.83      0.78         5
weighted avg       0.90      0.80      0.80         5
 samples avg       0.83      0.83      0.78         5
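As in the binary case, float inputs are thresholded per label, so per-label probabilities can be passed directly. A minimal sketch, with illustrative probabilities that threshold to the hard predictions used above:

from torch import tensor
from torchmetrics.classification import MultilabelClassificationReport

target = tensor([[1, 0, 1], [0, 1, 0], [1, 1, 0]])
# Per-label probabilities; entries >= threshold become 1, giving
# the same hard predictions as the example above
probs = tensor([[0.9, 0.1, 0.8], [0.2, 0.7, 0.6], [0.8, 0.3, 0.1]])

report = MultilabelClassificationReport(num_labels=3, threshold=0.5, digits=2)
report.update(probs, target)
print(report.compute())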
Functional Interface
- torchmetrics.functional.classification.classification_report(preds, target, task, threshold=0.5, num_classes=None, num_labels=None, target_names=None, digits=2, output_dict=False, zero_division=0.0, ignore_index=None, validate_args=True, labels=None, top_k=1)
Compute a classification report for various classification tasks.
The classification report shows the precision, recall, F1 score, and support for each class/label.
- Parameters:
  - preds (Tensor) – Tensor with predictions
  - target (Tensor) – Tensor with ground truth labels
  - task (Literal['binary', 'multiclass', 'multilabel']) – The classification task, either 'binary', 'multiclass', or 'multilabel'
  - threshold (float) – Threshold for converting probabilities to binary predictions (for binary and multilabel tasks)
  - num_classes (Optional[int]) – Number of classes (for multiclass tasks)
  - num_labels (Optional[int]) – Number of labels (for multilabel tasks)
  - target_names (Optional[List[str]]) – Optional list of names for the classes/labels
  - digits (int) – Number of decimal places to display in the report
  - output_dict (bool) – If True, return a dict instead of a string report
  - zero_division (Union[str, float]) – Value to use when dividing by zero
  - ignore_index (Optional[int]) – Optional index to ignore in the target (for multiclass tasks)
  - validate_args (bool) – Bool indicating if input arguments and tensors should be validated for correctness
  - labels (Optional[List[int]]) – Optional list of label indices to include in the report (for multiclass tasks)
  - top_k (int) – Number of highest probability or logit score predictions considered to find the correct label. Only works when preds contain probabilities/logits and task is 'multiclass'.
- Return type:
  Union[str, Dict[str, Union[float, Dict[str, Union[float, int]]]]]
- Returns:
  If output_dict=True, a dictionary with the classification report data. Otherwise, a formatted string with the classification report.
Examples
>>> from torch import tensor
>>> from torchmetrics.functional.classification.classification_report import classification_report
>>>
>>> # Binary classification example
>>> binary_target = tensor([0, 1, 0, 1])
>>> binary_preds = tensor([0, 1, 1, 1])
>>> binary_report = classification_report(
...     preds=binary_preds,
...     target=binary_target,
...     task="binary",
...     target_names=['Class 0', 'Class 1'],
...     digits=2
... )
>>> print(binary_report)
              precision    recall  f1-score   support

     Class 0       1.00      0.50      0.67         2
     Class 1       0.67      1.00      0.80         2

    accuracy                           0.75         4
   macro avg       0.83      0.75      0.73         4
weighted avg       0.83      0.75      0.73         4
>>>
>>> # Multiclass classification example
>>> multiclass_target = tensor([0, 1, 2, 2, 2])
>>> multiclass_preds = tensor([0, 0, 2, 2, 1])
>>> multiclass_report = classification_report(
...     preds=multiclass_preds,
...     target=multiclass_target,
...     task="multiclass",
...     num_classes=3,
...     target_names=["Class 0", "Class 1", "Class 2"],
...     digits=2
... )
>>> print(multiclass_report)
              precision    recall  f1-score   support

     Class 0       0.50      1.00      0.67         1
     Class 1       0.00      0.00      0.00         1
     Class 2       1.00      0.67      0.80         3

    accuracy                           0.60         5
   macro avg       0.50      0.56      0.49         5
weighted avg       0.70      0.60      0.61         5
>>>
>>> # Multilabel classification example
>>> multilabel_target = tensor([[1, 0, 1], [0, 1, 0], [1, 1, 0]])
>>> multilabel_preds = tensor([[1, 0, 1], [0, 1, 1], [1, 0, 0]])
>>> multilabel_report = classification_report(
...     preds=multilabel_preds,
...     target=multilabel_target,
...     task="multilabel",
...     num_labels=3,
...     target_names=["Label A", "Label B", "Label C"],
...     digits=2
... )
>>> print(multilabel_report)
              precision    recall  f1-score   support

     Label A       1.00      1.00      1.00         2
     Label B       1.00      0.50      0.67         2
     Label C       0.50      1.00      0.67         1

   micro avg       0.80      0.80      0.80         5
   macro avg       0.83      0.83      0.78         5
weighted avg       0.90      0.80      0.80         5
 samples avg       0.83      0.83      0.78         5
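The functional form accepts the same output_dict switch. A minimal sketch; as with the module interface, the assumed dict layout (per-class entries plus aggregate rows, mirroring scikit-learn's classification_report) is printed generically rather than indexed by hard-coded keys:

from torch import tensor
from torchmetrics.functional.classification import classification_report

target = tensor([0, 1, 0, 1])
preds = tensor([0, 1, 1, 1])

result = classification_report(
    preds=preds,
    target=target,
    task="binary",
    output_dict=True,
)
# Assumed layout: per-class dicts plus aggregate entries
for name, metrics in result.items():
    print(name, metrics)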
binary_classification_report
- torchmetrics.functional.classification.binary_classification_report(preds, target, threshold=0.5, target_names=None, digits=2, output_dict=False, zero_division=0.0, ignore_index=None, validate_args=True)
Compute a classification report for binary classification tasks.
The classification report shows the precision, recall, F1 score, and support for each class.
- Parameters:
  - preds (Tensor) – Tensor with predictions
  - target (Tensor) – Tensor with ground truth labels
  - threshold (float) – Threshold for converting probabilities to binary predictions
  - target_names (Optional[List[str]]) – Optional list of names for the classes
  - digits (int) – Number of decimal places to display in the report
  - output_dict (bool) – If True, return a dict instead of a string report
  - zero_division (Union[str, float]) – Value to use when dividing by zero
  - ignore_index (Optional[int]) – Specifies a target value that is ignored and does not contribute to the metric calculation
  - validate_args (bool) – Bool indicating if input arguments and tensors should be validated for correctness
- Return type:
  Union[str, Dict[str, Union[float, Dict[str, Union[float, int]]]]]
- Returns:
  If output_dict=True, a dictionary with the classification report data. Otherwise, a formatted string with the classification report.
Example
>>> from torch import tensor
>>> from torchmetrics.functional.classification.classification_report import binary_classification_report
>>> target = tensor([0, 1, 0, 1])
>>> preds = tensor([0, 1, 1, 1])
>>> target_names = ['0', '1']
>>> report = binary_classification_report(
...     preds=preds,
...     target=target,
...     target_names=target_names,
...     digits=2
... )
>>> print(report)
              precision    recall  f1-score   support

           0       1.00      0.50      0.67         2
           1       0.67      1.00      0.80         2

    accuracy                           0.75         4
   macro avg       0.83      0.75      0.73         4
weighted avg       0.83      0.75      0.73         4
multiclass_classification_report
- torchmetrics.functional.classification.multiclass_classification_report(preds, target, num_classes, target_names=None, digits=2, output_dict=False, zero_division=0.0, ignore_index=None, validate_args=True, labels=None, top_k=1)
Compute a classification report for multiclass classification tasks.
The classification report shows the precision, recall, F1 score, and support for each class.
- Parameters:
  - preds (Tensor) – Tensor with predictions of shape (N, ...) or (N, C, ...) where C is the number of classes
  - target (Tensor) – Tensor with ground truth labels of shape (N, ...)
  - num_classes (int) – Number of classes
  - target_names (Optional[List[str]]) – Optional list of names for the classes
  - digits (int) – Number of decimal places to display in the report
  - output_dict (bool) – If True, return a dict instead of a string report
  - zero_division (Union[str, float]) – Value to use when dividing by zero
  - ignore_index (Optional[int]) – Optional index to ignore in the target
  - validate_args (bool) – Bool indicating if input arguments and tensors should be validated for correctness
  - labels (Optional[List[int]]) – Optional list of label indices to include in the report
  - top_k (int) – Number of highest probability or logit score predictions considered to find the correct label. Only works when preds contain probabilities/logits.
- Return type:
  Union[str, Dict[str, Union[float, Dict[str, Union[float, int]]]]]
- Returns:
  If output_dict=True, a dictionary with the classification report data. Otherwise, a formatted string with the classification report.
Example
>>> from torch import tensor
>>> from torchmetrics.functional.classification.classification_report import multiclass_classification_report
>>> target = tensor([0, 1, 2, 2, 2])
>>> preds = tensor([0, 0, 2, 2, 1])
>>> target_names = ["class 0", "class 1", "class 2"]
>>> report = multiclass_classification_report(
...     preds=preds,
...     target=target,
...     num_classes=3,
...     target_names=target_names,
...     digits=2
... )
>>> print(report)
              precision    recall  f1-score   support

     class 0       0.50      1.00      0.67         1
     class 1       0.00      0.00      0.00         1
     class 2       1.00      0.67      0.80         3

    accuracy                           0.60         5
   macro avg       0.50      0.56      0.49         5
weighted avg       0.70      0.60      0.61         5
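A short sketch of ignore_index, which excludes matching target entries from all statistics; here -1 is an assumed sentinel value for unlabeled samples:

from torch import tensor
from torchmetrics.functional.classification import multiclass_classification_report

# -1 marks positions that should not contribute to the report
target = tensor([0, 1, 2, 2, -1])
preds = tensor([0, 0, 2, 2, 1])

report = multiclass_classification_report(
    preds=preds,
    target=target,
    num_classes=3,
    ignore_index=-1,
    digits=2,
)
print(report)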
multilabel_classification_report
- torchmetrics.functional.classification.multilabel_classification_report(preds, target, num_labels, threshold=0.5, target_names=None, digits=2, output_dict=False, zero_division=0.0, ignore_index=None, validate_args=True)
Compute a classification report for multilabel classification tasks.
The classification report shows the precision, recall, F1 score, and support for each label.
- Parameters:
  - preds (Tensor) – Tensor with predictions of shape (N, L, ...) where L is the number of labels
  - target (Tensor) – Tensor with ground truth labels of shape (N, L, ...)
  - num_labels (int) – Number of labels
  - threshold (float) – Threshold for converting probabilities to binary predictions
  - target_names (Optional[List[str]]) – Optional list of names for the labels
  - digits (int) – Number of decimal places to display in the report
  - output_dict (bool) – If True, return a dict instead of a string report
  - zero_division (Union[str, float]) – Value to use when dividing by zero
  - ignore_index (Optional[int]) – Specifies a target value that is ignored and does not contribute to the metric calculation
  - validate_args (bool) – Bool indicating if input arguments and tensors should be validated for correctness
- Return type:
  Union[str, Dict[str, Union[float, Dict[str, Union[float, int]]]]]
- Returns:
  If output_dict=True, a dictionary with the classification report data. Otherwise, a formatted string with the classification report.
Example
>>> from torch import tensor
>>> from torchmetrics.functional.classification.classification_report import multilabel_classification_report
>>> target = tensor([[1, 0, 1], [0, 1, 0], [1, 1, 0]])
>>> preds = tensor([[1, 0, 1], [0, 1, 1], [1, 0, 0]])
>>> target_names = ["Label A", "Label B", "Label C"]
>>> report = multilabel_classification_report(
...     preds=preds,
...     target=target,
...     num_labels=len(target_names),
...     target_names=target_names,
...     digits=2,
... )
>>> print(report)
              precision    recall  f1-score   support

     Label A       1.00      1.00      1.00         2
     Label B       1.00      0.50      0.67         2
     Label C       0.50      1.00      0.67         1

   micro avg       0.80      0.80      0.80         5
   macro avg       0.83      0.83      0.78         5
weighted avg       0.90      0.80      0.80         5
 samples avg       0.83      0.83      0.78         5