What is Anomaly Detection?

Defining anomaly detection

Anomaly detection, often called outlier detection, is the identification of data points that deviate significantly from expected behaviour.

These expectations are set against a baseline of “normal” performance, which external factors can occasionally influence strongly.

Anomaly detection depends on having data of sufficiently high quality, and rests on 2 basic assumptions:

  • Anomalies are rare within the data
  • One of the measurements within the data contains sufficient information to reveal the anomaly, such that a human operator, given sufficient time and skill, would be able to unearth it

What is an anomaly?

An anomaly is a data point that falls outside the range of usual behaviour.  Anomalies can be classified broadly into 2 types:

Expected Anomalies

These are generally well understood by those reviewing the results of anomaly detection. Good examples include a spike in sales due to Black Friday, or a fall in footfall at a shopping centre on Christmas Day.  Importantly, they do not generate an “Aha!” moment when they are revealed to a human, as a change in performance is already expected.

Unknown Anomalies

These are generally not well understood by the audience – an example of this could be a sudden drop in orders from a retailer’s website, or a spike in the number of customer complaints for a video streaming platform.

These are, by far, the more interesting anomalies – they represent an unexpected and unknown change in performance, which the human had not anticipated, and generate an “Aha!” moment when revealed.
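The distinction between the two types can be sketched in a few lines of Python. This is only an illustration, assuming anomalies have already been flagged by some detector: the `KNOWN_EVENTS` calendar and the `classify_anomaly` helper are hypothetical names, and the dates are examples rather than real business data.

```python
from datetime import date

# Hypothetical calendar of events the business already expects to move metrics.
KNOWN_EVENTS = {
    date(2023, 11, 24): "Black Friday",
    date(2023, 12, 25): "Christmas Day",
}

def classify_anomaly(day: date) -> str:
    """Label an already-flagged anomaly as 'expected' or 'unknown'."""
    return "expected" if day in KNOWN_EVENTS else "unknown"

# Two days that a detector (not shown) has flagged as anomalous.
flagged_days = [date(2023, 11, 24), date(2023, 12, 5)]
labels = {day.isoformat(): classify_anomaly(day) for day in flagged_days}
print(labels)
```

Only the days labelled "unknown" would need a human's attention; the expected ones can be acknowledged and filtered out.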

Why is anomaly detection important?

It is critical for humans to be able to identify changing performance and take actions on that insight.

A shift in a metric could be innocuous, or it could represent a detrimental event happening within the business, or a positive opportunity for growth.

By being alerted to these instances via anomaly detection, users can discern between insignificant changes and those that are truly unusual, driving insight and action.

A well-constructed anomaly detection model that learns from a specific company, for specific metrics, frees humans from manually monitoring for changes around the clock. They can leave it to a system to tell the signal from the noise, and focus on what really matters.

The challenge of anomaly detection and machine learning

There are some great machine learning techniques out there, generally broken down into 2 camps:

Supervised learning

In this context, anomalies would be labelled as such in the historical data, which could take significant time & effort to implement.  However, with labelled data, it is much easier to determine what constitutes an anomaly going forward into the future!
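As a rough sketch of the supervised idea, the snippet below learns a single cut-off from labelled history and applies it to new values. It assumes anomalies show up as unusually low values; the `fit_threshold` function and the order counts are made up for illustration, and real systems would normally use a proper classifier rather than one threshold.

```python
def fit_threshold(values, labels):
    """Pick the cut-off that best separates labelled anomalies (low values) from normal points."""
    best_threshold, best_errors = None, None
    for candidate in sorted(set(values)):
        # Predict "anomaly" for anything at or below the candidate cut-off,
        # then count how many labelled points that prediction gets wrong.
        errors = sum((v <= candidate) != flagged for v, flagged in zip(values, labels))
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold

# Illustrative daily order counts, with anomalies labelled by a human.
history = [120, 118, 125, 40, 122, 119, 35, 121]
labels = [False, False, False, True, False, False, True, False]

threshold = fit_threshold(history, labels)

def is_anomaly(value):
    """Apply the learned cut-off to a new data point."""
    return value <= threshold

print(threshold, is_anomaly(38), is_anomaly(117))
```

The labels do the hard work here: once a human has marked past anomalies, deciding on future points becomes a simple comparison.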

Unsupervised learning

In this context, anomalies would not be labelled in historical data, more closely reflecting the state of affairs for real, operational systems.  In this case, the system has to intelligently assess which data points are likely to be historical anomalies and return them to the user, alongside determining this for future data points.
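A minimal sketch of the unsupervised case, with no labels at all, is shown below. It uses a modified z-score built from the median and the median absolute deviation (MAD), which is one common robust choice among many; the `find_anomalies` name, the 3.5 cut-off and the sample data are assumptions for illustration.

```python
from statistics import median

def find_anomalies(values, cutoff=3.5):
    """Return indices whose modified z-score (median/MAD based) exceeds the cutoff."""
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        return []  # no spread to measure against
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [i for i, v in enumerate(values) if 0.6745 * abs(v - centre) / mad > cutoff]

# Illustrative, unlabelled daily order counts.
orders = [120, 118, 125, 122, 119, 35, 121, 123, 117, 124]
print(find_anomalies(orders))
```

Because the median and MAD are barely affected by a handful of extreme values, the historical anomalies do not distort the very baseline they are measured against.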

What is time series data anomaly detection?

Time-series anomaly detection introduces a key variable to the mix – time.  In many applications, this is vital for making sure the expected range of normal behaviour is correctly set.

As an example, a retailer typically expects high sales at the weekend (Sat, Sun), but lower sales mid-week (Tues, Weds).  Time-series anomaly detection takes this context into account, and sets the normal range of behaviour accordingly, so that a moderate drop in sales on Saturday is accurately detected as an anomaly.

Without the time-series component, the anomaly detection system would expect the same behaviour for all days of the week, and may very well have missed that specific Saturday’s fall!
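The Saturday example can be made concrete with a small sketch. It compares one check that ignores the day of the week with one that only looks at previous Saturdays. The sales figures are made up, and the simple mean and standard deviation rule in `is_unusual` is an assumption chosen for brevity, not a recommendation for production use.

```python
from statistics import mean, stdev

def is_unusual(value, history, cutoff=3.0):
    """True if value lies more than `cutoff` standard deviations from the history mean."""
    mu, sigma = mean(history), stdev(history)
    return sigma > 0 and abs(value - mu) / sigma > cutoff

# Four illustrative weeks of daily sales, Monday to Sunday (made-up numbers).
weeks = [
    [100, 90, 92, 101, 120, 200, 190],
    [102, 88, 90, 99, 118, 204, 188],
    [98, 91, 89, 100, 122, 198, 192],
    [101, 89, 91, 102, 121, 202, 189],
]
saturday_today = 150  # a moderate drop for a Saturday

all_days = [sale for week in weeks for sale in week]  # ignores the day of week
saturdays = [week[5] for week in weeks]               # Saturdays only

flagged_without_time = is_unusual(saturday_today, all_days)
flagged_with_time = is_unusual(saturday_today, saturdays)
print(flagged_without_time, flagged_with_time)
```

Against all days pooled together, 150 sits comfortably inside the usual range, so nothing is flagged. Against previous Saturdays alone, the same value stands out clearly.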

Why does your company need anomaly detection?

1. Anomaly detection for marketing performance

For marketers, every dollar spent, impression, click and conversion is precious.  Traditional approaches to reviewing and improving marketing performance take days or weeks to react to issues, leaving your business spending money on marketing channels that don’t generate maximum returns, and leaving revenue on the table that could otherwise be captured.

Often, performance and analytics teams look at the previous week’s (or month’s, or quarter’s) performance to understand which campaigns were able to drive conversions, clicks and impressions.  Traditionally delivered using dashboards, these static analyses often arrive too late for the information to be actionable.

As an example – if a campaign suddenly stops generating conversions, would you like to know within hours, or at the end of the working week?  In the latter case, at least several days of conversions (and their associated revenue) have already been lost.

Anomaly detection solves this by highlighting the problem as soon as it occurs, so humans can take action.  For example, if the drop is traced back to a configuration change (e.g. tags, website), the relevant teams can be contacted to rectify it.  If it’s related to a paid campaign, the team can focus on the specific campaigns affected to ensure the right audiences are targeted.

2. Anomaly detection for sales performance

Website checkout funnel

In eCommerce, customers have come to expect a smooth flow from visit to purchase completion – but problems can occur in every step of the journey in between.  Failing to address these issues as they (often inevitably) arise costs the business revenue.

Reports on eCommerce funnel activity are often shared for review weekly – but an interruption of even a day can be extremely costly in terms of a drop in sales – especially if the problem is across multiple areas of a website.

As an example – if, after an application update, a certain payment gateway stops working for users from specific countries and this is not picked up for a week, it could result in tens of thousands of dollars in lost revenue.

Detecting this type of issue early on with anomaly detection and activating the response team will significantly limit the impact of these events, and return cash to the business rapidly.

3. Anomaly detection for user experience

Customer complaints

For consumer-facing businesses, an error-free experience is crucial – be it in content streaming, service provision (think Gmail), or social media.  Users rightly feel out of pocket if the service they’ve paid for doesn’t function as expected!

As an example, a content-streaming website focusing on live sports depends on subscriptions to its web, mobile and tablet applications, where users view sports events as they happen.  An issue with the sign-in module that prevented users from logging in grew from a small problem to an extremely significant one as a major soccer match was streamed, resulting in hundreds of thousands of complaints and an avalanche of subscription cancellations & refunds.

With anomaly detection, leading indicators would have highlighted the unusual behaviour before it became a critical (P1) issue, allowing the team to make changes ahead of the important event and retain the confidence of its customers that it can deliver on its core offering (and their subscription revenue with it, too!).

Understanding the different anomaly detection methods

In data-poor businesses, anomaly detection is generally manual and performed solely by humans. With only a small number of high-level metrics to track, these are managed by members of the analytics team using data extracts from operational systems.

As businesses grow and become data-rich, a new problem arises in the form of scalability.  Tracking needs to be more granular – for example, tracking “overall” sales used to be sufficient, but monitoring sales split by category and country is now considered essential, reflecting the organisation’s success.

With hundreds, thousands or even millions of metrics to keep on top of, humans struggle to deal with the deluge of information, and dealing with this new complexity becomes both time-consuming and expensive. After all, there’s only so much budget for analysts!

Business Intelligence tools have tried to satisfy this need with dashboards – but businesses find themselves with hundreds of dashboards, each with reams of graphs, and still reliant on a human operator to check through and assess them all. This is simply not feasible!

An automated anomaly detection method differs from these historical approaches in three key ways:

  • Processing happens in the background to check for unusual behaviours – a task that a computer is uniquely well-adapted for
  • Only (metrics with) anomalies are shown to users – reducing the cognitive load and allowing them to tell the signal from the noise
  • Anomalies (and the insights associated with them) are easy to share with other users – rallying teams around an event, whether good or bad!
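To illustrate the idea of checking every metric in the background and surfacing only the ones that need attention, here is a small sketch. The metric names, the numbers and both helper functions are hypothetical, and the mean and standard deviation rule stands in for whatever detection model is actually used.

```python
from statistics import mean, stdev

def latest_is_anomalous(series, cutoff=3.0):
    """Compare the latest point with the mean and spread of the earlier points."""
    *history, latest = series
    sigma = stdev(history)
    return sigma > 0 and abs(latest - mean(history)) / sigma > cutoff

def anomalous_metrics(metrics):
    """Return only the names of metrics whose latest value looks unusual."""
    return [name for name, series in metrics.items() if latest_is_anomalous(series)]

# Hypothetical granular metrics: sales split by country and category.
metrics = {
    "sales_uk_shoes": [50, 52, 49, 51, 50, 48, 51],
    "sales_uk_hats": [20, 21, 19, 20, 22, 20, 5],
    "sales_fr_shoes": [30, 31, 29, 30, 32, 31, 30],
}

alerts = anomalous_metrics(metrics)
print(alerts)
```

With three metrics the saving is trivial, but the same loop runs unchanged over thousands of series, and the user still only sees the short list in `alerts`.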

Conclusion: Getting started with anomaly detection

This overview has provided a high-level explanation of:

  • What anomaly detection is
  • How it works, and some of the underlying techniques
  • Where it might be applicable within your business
  • Why it’s important for data-rich businesses, and how it compares with alternative approaches

With many companies having embarked on a journey of centralising and collating their data, now is a great time to use that data to gain insights that will improve your business outcomes.

If you’re ready to try out anomaly detection on your data, sign up for a free demo/trial!

