How To Reduce Your False Positive Rate
Anomaly detection software is an integral part of modern business monitoring, fraud prevention, and cybersecurity. But to dependably alert you to anomalies, it relies on being able to correctly identify unusual behavior.
Not identifying an issue when there is one, a false negative, obviously has a serious impact. When problems go unrecognized, the whole system is at risk.
However, false positives are also a problem. False positives occur when the detection system raises an alert for unusual behavior that is, in fact, completely harmless. They are like the boy who cried wolf, raising the alarm for no reason.
At best, these false positives are inconvenient, wasting staff time and causing unnecessary panic. But, at worst, they can have a significant impact on income, preventing genuine customers from making a purchase.
Too many false positives render your detection system useless because you are overwhelmed with alerts. You’ve heard ‘wolf’ too many times and might end up missing genuine issues when they get lost in a sea of false positives.
Unfortunately, false positives can be a frequent occurrence with automated anomaly detection systems. The trick is to reduce them without taking the risk of losing your genuine alerts at the same time.
Attributes in Anomaly Detection
If you want to be able to detect anomalies in your data, you first need to identify what normality looks like. Using long-term trends in your time-series data, your detection system establishes the likely range of values for the given metric. Outliers that fall significantly outside this range can then be identified as anomalies.
The best anomaly detection models allow for cyclical and seasonal changes to reduce false positives. You don’t, for example, want to be sent an alert that sales are unusually high on Black Friday because it is an expected spike.
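As a rough illustration of a normal range that allows for a weekly cycle, the sketch below groups historical values by day of the week and derives a band from each group's mean and standard deviation. The function names, the default `width` of three standard deviations, and the sample figures are all assumptions made for illustration, not how any particular product works.

```python
from collections import defaultdict
from statistics import mean, stdev

def weekly_ranges(history, width=3.0):
    """Return {weekday: (low, high)} from (weekday, value) observations.

    weekday is 0-6; width is the number of standard deviations
    treated as normal (an assumed, tunable choice).
    """
    by_day = defaultdict(list)
    for weekday, value in history:
        by_day[weekday].append(value)
    ranges = {}
    for weekday, values in by_day.items():
        if len(values) < 2:
            continue  # not enough history to estimate the spread
        centre, spread = mean(values), stdev(values)
        ranges[weekday] = (centre - width * spread, centre + width * spread)
    return ranges

def is_outlier(ranges, weekday, value):
    """True when value falls outside the expected range for that weekday."""
    low, high = ranges[weekday]
    return value < low or value > high

# Hypothetical sales history: Mondays (0) are quiet, Saturdays (5) are busy.
history = [(0, 100), (0, 104), (0, 96), (0, 100),
           (5, 300), (5, 310), (5, 290), (5, 300)]
ranges = weekly_ranges(history)
```

With this sample data, a value of 305 sits inside the Saturday range but well outside the Monday range, so the same figure is an expected spike on one day and an outlier on another.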
There are three main attributes that you can use to determine which anomalies are worthy of your attention. These are the direction of the anomaly, the delta, and the duration.
1. Anomaly Direction
One step in reducing false positives is identifying which direction your data should deviate from the normal range to be cause for concern. For example, if you are an eCommerce site, a sudden drop in successful payments might indicate an issue with your payment gateway. In this case, the anomaly direction that concerns you is down.
But if you are involved with fraud prevention, an increase in transactions from a particular location might be a more telling indicator that there is an issue. So, your direction of concern would be up.
By considering the anomaly direction as you set up your alerts, you reduce the number of false positives you get from metrics deviating in a way that does not require action.
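A direction filter can be sketched in a few lines. The function name and the "up", "down", and "both" labels are assumptions chosen for this example.

```python
def concerning_direction(value, low, high, direction):
    """Return True only if value leaves [low, high] in the watched direction.

    direction is "up", "down", or "both" (assumed labels).
    """
    if direction not in ("up", "down", "both"):
        raise ValueError(f"unknown direction: {direction!r}")
    above = value > high
    below = value < low
    if direction == "up":
        return above
    if direction == "down":
        return below
    return above or below

# Hypothetical normal range for successful payments: 150 to 300 per hour.
drop_flagged = concerning_direction(120, 150, 300, "down")
spike_flagged = concerning_direction(350, 150, 300, "down")
```

Here the drop to 120 is flagged, while the spike to 350 is ignored because only downward deviations are being watched for this metric.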
2. Anomaly Delta
Small peaks and troughs above and below your expected range may not be cause for concern. This is where the delta is important. It is a measure of how far the unexpected values deviate from the normal range for each metric.
The delta is expressed as both a percentage and an absolute value. For example, if you look at website visits, you might consider a difference of 20% below your expected range noteworthy. If you typically see between 150 and 300 visitors a day, 20% of your minimum of 150 is 30, so you’d want to be alerted when you get at least 30 fewer visitors than your minimum, that is, 120 or fewer.
In this case, the delta threshold is both the percentage change, 20%, and the absolute change, 30.
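The website visits example can be worked through in code. This is a minimal sketch: the function names are hypothetical, the percentage is measured against whichever bound was crossed, and requiring both thresholds to be met is an assumption rather than a rule stated above.

```python
def delta_from_range(value, low, high):
    """Return (absolute_delta, percent_delta) relative to the bound crossed.

    Both are 0.0 when value lies inside [low, high].
    """
    if value < low:
        bound = low
    elif value > high:
        bound = high
    else:
        return 0.0, 0.0
    absolute = float(abs(value - bound))
    percent = 100.0 * absolute / abs(bound) if bound else float("inf")
    return absolute, percent

def exceeds_delta(value, low, high, min_absolute, min_percent):
    """True when the deviation meets both thresholds (an assumed policy)."""
    absolute, percent = delta_from_range(value, low, high)
    return absolute >= min_absolute and percent >= min_percent

# Normal range of 150 to 300 visitors a day, as in the example above.
absolute, percent = delta_from_range(120, 150, 300)
```

For 120 visitors this gives an absolute delta of 30 and a percentage delta of 20, so `exceeds_delta(120, 150, 300, 30, 20)` is true, whereas a day with 140 visitors would not qualify.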
3. Anomaly Duration
Just as small deviations from your normal range might not warrant an alert, momentary blips may not be a sign that something is wrong. So, the third vital attribute of an anomaly is how long the metric stays outside the expected range.
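One simple way to ignore momentary blips is to require a run of consecutive out-of-range readings before treating the deviation as an anomaly. The sketch below assumes one boolean flag per monitoring interval; the function name and parameter are hypothetical.

```python
def sustained_anomaly(flags, min_consecutive):
    """True if flags contains a run of at least min_consecutive True values.

    flags holds one boolean per interval (True = outside the range).
    """
    if min_consecutive < 1:
        raise ValueError("min_consecutive must be at least 1")
    run = 0
    for outside in flags:
        run = run + 1 if outside else 0
        if run >= min_consecutive:
            return True
    return False

# Two isolated blips versus three out-of-range intervals in a row.
blips = sustained_anomaly([False, True, False, True, False], 2)
persistent = sustained_anomaly([False, True, True, True], 3)
```

The isolated blips never form a long enough run, so only the second series counts as a sustained anomaly.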
These three attributes are used together to define which outliers are worthy of your attention and which are an acceptable variation from usual activity. A large part of minimizing false positives is checking that these parameters are set up correctly before an alert is triggered.
Establishing the Right Conditions to Trigger an Anomaly
Determining the correct conditions to identify an anomaly is vital if you want to reduce the number of false positives you get. You’ll want to set rules based on the three attributes above, as well as an understanding of how other factors influence whether unusual values for a particular metric need attention or not.
The first step is to decide which metrics to use to detect anomalies. Which will indicate that there is an issue in your system? And will you use the raw value, or do you need to apply some normalization before it becomes valuable?
For example, the number of failed payments might be an indicator that there is an issue with your payment gateway. Or it could be due to your marketing team’s new campaign driving extra traffic to your website. In this case, the failure rate is more important than the number of failures. But it requires you to calculate it from the total number of attempted payments.
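The difference between the raw count and the normalized rate is easy to see with a few made-up numbers. The function name and the figures below are purely illustrative.

```python
def failure_rate(failed, attempted):
    """Failed payments as a fraction of attempts; None if nothing was attempted."""
    if attempted == 0:
        return None
    return failed / attempted

# Hypothetical figures: failures double during a campaign, but so do attempts.
normal_day = failure_rate(failed=20, attempted=1000)
campaign_day = failure_rate(failed=40, attempted=2000)
# Same number of failures as the campaign day, but on normal traffic.
gateway_issue = failure_rate(failed=40, attempted=1000)
```

The campaign day has twice as many failures as the normal day but an identical failure rate, while the third case shows the rate itself doubling, which is the signal worth alerting on.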
You’ll also need to decide how granular you want your alerts to be. Will you monitor overall payment success, for example, or will you break it down by payment type, location, device, or a combination of these? The more granular your conditions, the easier it will be to isolate issues. But you’ll also spend more time investigating each individual anomaly.
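To make the granularity trade-off concrete, the sketch below computes a failure rate overall or per segment, depending on which keys you pass. The record layout (dictionaries with `type`, `device`, and a boolean `failed` field) and the function name are assumptions for this example.

```python
from collections import defaultdict

def failure_rate_by(payments, *keys):
    """Failure rate per segment; with no keys, one overall segment ()."""
    totals = defaultdict(lambda: [0, 0])  # segment -> [failed, attempted]
    for payment in payments:
        segment = tuple(payment[key] for key in keys)
        totals[segment][0] += int(payment["failed"])
        totals[segment][1] += 1
    return {segment: failed / attempted
            for segment, (failed, attempted) in totals.items()}

# Hypothetical payment records.
payments = [
    {"type": "card", "device": "mobile", "failed": False},
    {"type": "card", "device": "desktop", "failed": False},
    {"type": "wallet", "device": "mobile", "failed": True},
    {"type": "wallet", "device": "mobile", "failed": True},
]
overall = failure_rate_by(payments)
by_type = failure_rate_by(payments, "type")
```

The overall rate of 50% says something is wrong, but only the breakdown by payment type shows that every failure comes from wallet payments. The cost is that each extra key multiplies the number of series you have to monitor.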
Once your metrics are chosen, you need to decide for each one:
- What direction indicates an anomaly that is a concern – up or down?
- How long should the anomaly be present before it triggers an alert?
- How far should the metric deviate from the normal range to be considered an issue?
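These three decisions can be captured as a small per-metric configuration. The sketch below is one possible shape, not a standard one: the `AlertCondition` class, its field names, and the `should_alert` helper are hypothetical, and it assumes a fixed, positive normal range and a `min_intervals` of at least 1.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AlertCondition:
    metric: str
    direction: str      # "up", "down", or "both"
    min_percent: float  # deviation required, as % of the bound crossed
    min_intervals: int  # consecutive intervals the deviation must last

def should_alert(condition, readings, low, high):
    """True when the latest min_intervals readings all deviate from
    [low, high] in the watched direction by at least min_percent."""
    recent = readings[-condition.min_intervals:]
    if len(recent) < condition.min_intervals:
        return False  # not enough data yet
    for value in recent:
        if value > high and condition.direction in ("up", "both"):
            percent = 100.0 * (value - high) / high
        elif value < low and condition.direction in ("down", "both"):
            percent = 100.0 * (low - value) / low
        else:
            return False  # in range, or deviating the other way
        if percent < condition.min_percent:
            return False
    return True

condition = AlertCondition("successful_payments", "down", 20.0, 3)
sustained_drop = should_alert(condition, [200, 110, 115, 100], 150, 300)
brief_recovery = should_alert(condition, [200, 110, 160, 100], 150, 300)
```

In the first series the last three readings are all more than 20% below the minimum of 150, so the condition is met. In the second, one reading returns to the normal range and the condition is not met.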
What Warrants the Triggering of an Alert?
Once you have set your conditions to identify anomalies, you need to decide when an alert will be triggered.
In many ways, this relates more to how your business operates than to outliers in your data. At what point do anomalies require action?
The answer will vary for each metric you monitor. And it may even have two answers since, for many of your metrics, both a dip and a peak might be worthy of note.
A sudden drop in contact form submissions, for example, might suggest an issue with your website. A sudden increase might just indicate plenty of new leads. But it could relate to unusual activity from spambots. The drop in submissions will likely be a concern earlier than the increase is, but both could require action.
When an alert is sent will depend on how urgent the issue is, whether it has persisted for a long time, and how large the anomaly is. It might also depend on the interaction between different metrics. For example, you would expect a correlation between website sessions and purchases. If these increase in tandem, it might not be worth an alert.
But if one increases while the other drops, the behavior is unusual and you would want to know that it is happening since it could indicate an issue that needs your attention.
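A check for two normally correlated metrics moving apart might look like the following. It is only a sketch: the function names and the 10% minimum change are assumptions, and a production system would more likely learn the expected relationship from data than rely on a fixed threshold.

```python
def percent_change(previous, current):
    """Percentage change from previous to current (previous must be non-zero)."""
    return 100.0 * (current - previous) / previous

def diverging(sessions, purchases, min_change=10.0):
    """True when the two metrics move in opposite directions by at
    least min_change percent each. Arguments are (previous, current) pairs."""
    session_change = percent_change(*sessions)
    purchase_change = percent_change(*purchases)
    opposite = (session_change > 0) != (purchase_change > 0)
    return (opposite
            and abs(session_change) >= min_change
            and abs(purchase_change) >= min_change)

# Hypothetical figures: sessions rise 30% in both cases.
in_tandem = diverging(sessions=(1000, 1300), purchases=(50, 65))
moving_apart = diverging(sessions=(1000, 1300), purchases=(50, 35))
```

When purchases rise along with sessions, nothing is flagged. When sessions rise 30% while purchases fall 30%, the pair is flagged.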
The Role of Machine Learning in Anomaly Detection
As we’ve seen, there is a lot to consider in setting up an anomaly detection system that doesn’t send too many false positives.
Configuring such systems manually requires an expert with deep knowledge of your data and a good understanding of your business’s priorities and operation.
Even then, the problem with manually built anomaly detection systems is that they don’t adapt to changing circumstances. The person who built the system must go back in regularly to make changes so that it stays up to date and keeps performing correctly. If those changes aren’t made, your number of false positives will increase as your system fails to account for the new conditions.
In contrast, systems that use machine learning are constantly adapting to incorporate new information. They automatically update their anomaly detection criteria according to changing data. That new information may include feedback, such as alerts that have been marked as false positives.
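The idea of adapting to new data and to false-positive feedback can be illustrated with a deliberately simple stand-in. The class below is not a machine learning model: it is a rolling statistical baseline, and its name, window size, and threshold are assumptions made for this sketch.

```python
from collections import deque
from statistics import mean, stdev

class RollingBaseline:
    """Keeps a sliding window of recent normal values as the baseline."""

    def __init__(self, window=30, width=3.0):
        self.values = deque(maxlen=window)  # oldest values drop out
        self.width = width                  # allowed standard deviations

    def is_anomaly(self, value):
        if len(self.values) < 2:
            return False  # too little history to judge
        centre, spread = mean(self.values), stdev(self.values)
        return abs(value - centre) > self.width * spread

    def observe(self, value):
        """Score the value, then learn from it only if it looked normal."""
        flagged = self.is_anomaly(value)
        if not flagged:
            self.values.append(value)
        return flagged

    def mark_false_positive(self, value):
        """Feedback: the flagged value was fine, so add it to the baseline."""
        self.values.append(value)

baseline = RollingBaseline(window=5)
for reading in [100, 102, 98, 101, 99]:
    baseline.observe(reading)
first_alert = baseline.observe(200)
baseline.mark_false_positive(200)
second_alert = baseline.observe(200)
```

The first reading of 200 is flagged. Once it has been marked as a false positive and absorbed into the baseline, the same value no longer triggers an alert, which mirrors in miniature how feedback can shift the detection criteria.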
There are two main types of machine learning methods for anomaly detection. The first is supervised. A predictive model is built based on a labeled training set, which includes both expected and unusual results.
The other is unsupervised, where the training data isn’t manually labeled. Instead, it assumes that most of the points fall within the normal range and uses infrequent occurrences to identify anomalies.
Some systems combine supervised and unsupervised methods, typically training on a small labeled set alongside a larger body of unlabeled data. This is called semi-supervised learning.
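The unsupervised assumption, that most points are normal and rare values stand out, can be shown without any labels. The sketch below uses a modified z-score based on the median absolute deviation, a simple statistical technique swapped in here for illustration rather than a full machine learning model. The function name and sample data are hypothetical; the 0.6745 constant and 3.5 cutoff follow the commonly cited convention for this score.

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds threshold.

    No labels are needed: the median and the median absolute
    deviation (MAD) describe what most of the data looks like.
    """
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        return []  # no spread to measure against
    return [v for v in values
            if 0.6745 * abs(v - centre) / mad > threshold]

# Hypothetical daily counts with one unusual day.
outliers = mad_outliers([10, 11, 9, 10, 12, 10, 11, 50])
```

Only the value 50 is returned. A supervised approach would instead need examples of past anomalies labeled in advance.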
How Machine Learning Can Reduce Your False Positive Rate
Machine learning automatically identifies patterns in your data and uses them to update its predictive modeling. This creates an anomaly detection system with the flexibility to respond to changing circumstances and integrate new information.
Machine learning is designed to cope with large datasets and can spot patterns that human analysts may miss. The accuracy of its anomaly detection should improve as it continues to assess and evaluate your data, reducing false positives.
The interplay between different metrics is a vital part of anomaly detection and another area where machine learning can increase the accuracy of your alerts. By combining and analyzing datasets, the machine learning model works to understand which correlations are expected between trends in different metrics.
This ability to adapt and learn gives the anomaly detection system the flexibility to modify rule-based alerts according to insights it gleans from your data. That reduces the number of false positives, so you and your team can concentrate on investigating genuine anomalies.