
April 4, 2024 - last updated

Understanding Precision: The fine art of accurate positive predictions

Or Jacobi

Or is a software engineer at Aporia and an avid gaming enthusiast: "All I need is a cold brew and the controller in my hand, and I'm good to go."

5 min read Aug 20, 2023

Today, we’ll take a look at the intricacies of the machine learning evaluation metric, Precision. In this practical guide, we aim to equip you with the knowledge of how precision is calculated, its relevance, its trade-offs, and how to use it to evaluate ML performance.

What is Precision?

Precision is one of the cornerstone metrics used in the evaluation of machine learning models, particularly in classification problems. It’s crucial in scenarios where the cost of False Positives is high.

Essentially, Precision is the fraction of positive predictions that are actually correct: the number of true positives (TP) divided by the sum of true positives and false positives (FP). It is mathematically expressed as:

Precision = TP / (TP + FP)

For example, consider a model designed to predict whether a bank transaction is fraudulent. A false positive here is an ordinary transaction flagged as fraudulent, an outcome we’d like to minimize. Precision tells us how often a fraud flag is actually correct.

def calculate_precision(predictions, actuals):
    """Return TP / (TP + FP) for binary labels, where 1 is the positive class."""
    if len(predictions) != len(actuals):
        raise ValueError("The length of predictions and actuals lists must be equal.")

    # Count predicted positives that are correct (TP) and incorrect (FP).
    true_positives = sum(p == 1 and a == 1 for p, a in zip(predictions, actuals))
    false_positives = sum(p == 1 and a == 0 for p, a in zip(predictions, actuals))

    # With no positive predictions, precision is undefined; return 0.0 by convention.
    if true_positives + false_positives == 0:
        return 0.0
    return true_positives / (true_positives + false_positives)

predictions = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1] # Sample prediction values
actuals = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0] # Sample actual values

print(calculate_precision(predictions, actuals)) # 4 TP, 2 FP -> 0.6666666666666666

Note: In this example, predictions and actuals are lists with binary values, where 1 represents a fraudulent transaction and 0 represents a normal transaction.

Also, this is a simple way to calculate precision, and it ignores many nuances and edge cases that can arise in a real-world scenario. For production-level code, you would probably use an established library, such as scikit-learn, which has built-in functions to compute precision, recall, and many other metrics.

The Interplay of Precision, Recall, and Accuracy

Precision is often used alongside other metrics, such as recall (also known as sensitivity) and accuracy.

  • Recall measures the proportion of actual positives that were correctly classified: TP / (TP + FN), where FN is the number of false negatives. In other words, it quantifies how many of the relevant items are selected.
  • Accuracy is the ratio of correctly predicted observations, positive and negative, to the total number of observations. It summarizes overall performance in a single number.

To illustrate the difference between precision and recall, let’s use a practical example with code. We will use the sklearn library to calculate these metrics:

from sklearn.metrics import precision_score, recall_score

# Actual values
y_true = [0, 1, 1, 0, 1, 1, 0, 0, 0, 1]

# Predicted values
y_pred = [0, 0, 1, 1, 1, 1, 0, 1, 0, 0]

precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)

print(f'Precision: {precision}')
print(f'Recall: {recall}')

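In this example, both metrics come out to 0.6, but for different reasons. The model makes five positive predictions, of which three are correct, so precision is 3/5. There are five actual positives, of which the model finds three, so recall is also 3/5. The two values coincide here only because the counts of false positives and false negatives happen to match (two each). In general, precision measures how trustworthy the positive predictions are, while recall measures the fraction of actual positives the model identifies.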
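The two metrics diverge as soon as the false positive and false negative counts differ. The following is a minimal sketch in plain Python with made-up labels; the helper name `confusion_counts` and the data are illustrative assumptions, not part of the scikit-learn example.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

# Hypothetical data: 4 actual positives, 3 positive predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)  # 2, 1, 2, 5

precision = tp / (tp + fp)           # correct share of positive predictions
recall = tp / (tp + fn)              # share of actual positives found
accuracy = (tp + tn) / len(y_true)   # share of all predictions that are correct

print(f"Precision: {precision:.2f}")  # 0.67
print(f"Recall: {recall:.2f}")        # 0.50
print(f"Accuracy: {accuracy:.2f}")    # 0.70
```

Here one false positive against two false negatives is enough to pull the three metrics apart.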

Role of Precision in Model Monitoring and Evaluation

Precision holds a significant role in model monitoring and evaluation. It provides an essential measure of your model’s performance, particularly in applications where false positives have substantial consequences.

For instance, in a fraud detection system where blocking a legitimate transaction (a false positive) is considered more costly than letting a fraudulent payment slip through the cracks (a false negative), precision is the metric to watch. Tracking it over time shows how often flagged transactions turn out to be legitimate, which helps guide refinements to the model.

However, precision alone doesn’t provide the complete picture. It must be balanced with other metrics like recall, especially when the cost of missing a positive instance (false negative) is high.
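A small sketch may help show why precision needs a companion metric. It assumes a hypothetical batch of 100 transactions and an overly cautious model that flags only one of them; the numbers are invented for illustration.

```python
# Hypothetical batch: 100 transactions, the first 20 are fraudulent (label 1).
actuals = [1] * 20 + [0] * 80

# An overly cautious model flags only the single most obvious fraud.
predictions = [1] + [0] * 99

pairs = list(zip(predictions, actuals))
tp = sum(1 for p, a in pairs if p == 1 and a == 1)
fp = sum(1 for p, a in pairs if p == 1 and a == 0)
fn = sum(1 for p, a in pairs if p == 0 and a == 1)

# Guard against empty denominators, returning 0.0 by convention.
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0

print(f"Precision: {precision:.2f}")  # 1.00
print(f"Recall: {recall:.2f}")        # 0.05
```

Precision is perfect because the only flag raised was correct, yet 19 of the 20 fraudulent transactions go undetected. Only recall reveals that.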

Practical tips for improving Precision

  • Feature engineering: Creating relevant features can help improve Precision.
  • Hyperparameter tuning: Adjusting parameters of your algorithm might lead to better results.
  • Ensemble methods: Combining predictions from multiple models often improves overall performance.
  • Choose the threshold carefully: Raising the classification threshold makes the model flag only its most confident cases, which usually increases Precision, typically at the cost of recall.
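The threshold tip can be sketched as follows. This is a minimal illustration in plain Python; the function name `precision_recall_at`, the scores, and the labels are assumptions made up for the example, and precision is not guaranteed to rise at every step on real data.

```python
def precision_recall_at(threshold, scores, labels):
    """Precision and recall when scores >= threshold are predicted positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical predicted probabilities for the positive class, with true labels.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]

for threshold in (0.3, 0.5, 0.75):
    p, r = precision_recall_at(threshold, scores, labels)
    print(f"threshold={threshold:.2f}  precision={p:.3f}  recall={r:.3f}")
# threshold=0.30  precision=0.625  recall=1.000
# threshold=0.50  precision=0.667  recall=0.800
# threshold=0.75  precision=1.000  recall=0.600
```

On this sample, each increase in the threshold trades some recall for higher precision, which is the balance to weigh against your priorities.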

Let’s look at some limitations of Precision: 

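  • Risks of over-optimizing Precision: Pushing Precision ever higher usually means the model predicts the positive class less often, so it misses more true positives. This makes the model less useful, especially if the cost of false negatives is high.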
  • Understanding business context: Sometimes other metrics may be more critical.
  • Importance of human evaluation: Particularly in critical applications, human evaluation and common sense should accompany model decisions.

Pros and Cons of Precision

While precision is a helpful metric, it has its trade-offs.

Pros:

  • Precision is valuable in situations where false positives are more harmful than false negatives.
  • It provides insight into the accuracy of positive predictions.

Cons:

  • It doesn’t take false negatives into account. So, it’s not useful when false negatives carry high costs.
  • In cases where the negative class dominates, precision might not be a good indicator of overall performance.
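To make the point about a dominant negative class concrete, here is a short sketch using invented confusion-matrix counts for a batch of 1,000 samples with only 10 positives. The counts are assumptions chosen for illustration.

```python
# Hypothetical counts for 1,000 samples: 10 actual positives, 990 actual negatives.
tp, fp, fn, tn = 5, 15, 5, 975

total = tp + fp + fn + tn
accuracy = (tp + tn) / total   # dominated by the many true negatives
precision = tp / (tp + fp)     # only looks at the 20 positive predictions
recall = tp / (tp + fn)        # only looks at the 10 actual positives

print(f"Accuracy: {accuracy:.2f}")    # 0.98
print(f"Precision: {precision:.2f}")  # 0.25
print(f"Recall: {recall:.2f}")        # 0.50
```

Precision describes only the positive predictions, so on its own it says nothing about how the model handles the 990 negatives. Accuracy, in turn, looks excellent while hiding that three out of four flags are wrong. Neither number alone summarizes overall performance here.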

Wrapping up Precision

Understanding precision, its merits, and trade-offs is crucial when evaluating machine learning models. While precision might not always be the best standalone metric, it’s often used in conjunction with others to provide a holistic view of model performance.

Always remember, the choice of metric should align with the problem at hand and the cost associated with the type of error. In the next posts, we will cover other important metrics in detail, ensuring you’re well-versed in all facets of ML model evaluation. Stay tuned!

