From the outset, digital advertising was positioned as the one advertising channel that would give marketers and agencies accuracy. Marketers didn’t necessarily know who saw their ad on TV, in a magazine, or in a newspaper, or how many people did, but with digital advertising they had an indication of who clicked, viewed, and engaged with their ads, and how many.

This knowledge was supposed to make digital advertising more accurate.

And that’s probably why we tend to use the terms ‘accurate’ and ‘accuracy’ so much in digital advertising. In fact, many companies with probabilistic solutions for identifying users across their devices talk about the accuracy of their solutions.

Cross-device identification is really a classification problem when you get right down to it. In this classification problem, we try to classify each pair of devices as either a ‘match’ or a ‘no match’, so that we can identify whether the two devices actually belong to the same person (‘match’) or not (‘no match’).

To explain how Accuracy is calculated, let me take you through the following two examples.

(Click here to see all of the formulas described in this article.)

For the sake of the examples, let’s limit ourselves to desktop to mobile device matches. We’ll say that we have 500 people and each has two devices, one desktop and one mobile. Therefore, there are 1000 devices in the set of devices belonging to these 500 people.

The number of total possible desktop-to-mobile pairs is 500 × 500 = 250,000 (each desktop paired with each mobile). Exactly 500 of those pairs are true matches. The number of pairs that are not a match is the total minus the matches: 250,000 − 500 = 249,500.
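The pair arithmetic above can be sketched in a few lines of code (this is just the setup described in the example, assuming 500 people with exactly one desktop and one mobile device each):

```python
n_people = 500

total_pairs = n_people * n_people       # every desktop paired with every mobile
true_matches = n_people                 # one correct desktop-mobile pair per person
non_matches = total_pairs - true_matches

print(total_pairs)    # 250000
print(true_matches)   # 500
print(non_matches)    # 249500
```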

Algorithm 1

Classifies one desktop-to-mobile pair as a ‘match’ and all remaining pairs as ‘not matched’. We’ll assume it is correct about that one pair: the two devices really do belong to the same person.

Algorithm 2

Classifies 501 desktop-to-mobile pairs as a ‘match’ and all remaining pairs as ‘not matched’. We’ll assume it is correct about 500 of those pairs and wrong about the one remaining pair.

According to the formula for Accuracy, both of these algorithms are over 99% accurate (99.8% for Algorithm 1 and nearly 100% for Algorithm 2), even though Algorithm 1 identified only 1 of the 500 matches correctly while Algorithm 2 identified all 500.
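The standard Accuracy formula is (correctly classified pairs) / (all pairs), i.e. true positives plus true negatives over the total. A minimal sketch of that calculation for the two hypothetical algorithms above:

```python
def accuracy(tp, tn, total_pairs):
    """Accuracy = correctly classified pairs / all possible pairs."""
    return (tp + tn) / total_pairs

TOTAL = 500 * 500  # 250,000 possible desktop-to-mobile pairs

# Algorithm 1: 1 correct 'match', 249,500 correct 'no match'
# (it misses the other 499 real matches).
acc1 = accuracy(tp=1, tn=249_500, total_pairs=TOTAL)

# Algorithm 2: 500 correct 'match', 1 wrong 'match',
# 249,499 correct 'no match'.
acc2 = accuracy(tp=500, tn=249_499, total_pairs=TOTAL)

print(f"{acc1:.2%}")  # 99.80%
print(f"{acc2:.2%}")  # 100.00%
```

Both scores round to essentially the same number, which is exactly the problem discussed below.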

Why?

Because the formula for Accuracy weights matches and non-matches the same, yet when running cross-device campaigns, we’re only interested in the actual matches. This is why Accuracy isn’t a reliable formula for determining the effectiveness of probabilistic cross-device solutions.

Precision and Recall

Two metrics better suited for measuring cross-device identification solutions are Precision and Recall. When calculated in tandem, we’re able to reliably measure the effectiveness of our probabilistic cross-device identification solution.

While the formula for Accuracy included all possible pairs (250,000 in the above examples), the formulas for Precision and Recall eliminate one sub-set of pairs: those which have been correctly identified as ‘non-matched’ (known as ‘true negatives’). By eliminating the sub-set which accounts for more than 99.8% of all pairs in the two aforementioned examples, we’re able to provide measurements for our cross-device solution which are far more meaningful.

As the name suggests, the formula for Precision shows how precise our sub-set of matched pairs is.

For Algorithm 1, the Precision of the matched pairs is 100%: every pair it classified as a ‘match’ was correct. For Algorithm 2, the Precision of the matched pairs is 99.8% (500 correct out of 501). Thinking about this for a moment, it is obvious that the second algorithm performs much better, but this is not directly reflected in the Precision. For that to become apparent, we also need to look at the Recall.

The Recall for Algorithm 1 (one ‘matched’ desktop to mobile device pair out of 500 pairs) is 0.2%, while the Recall for Algorithm 2 (500 ‘matched’ pairs out of 500 pairs) is 100%.

Think of Recall as “out of all the real ‘matched’ pairs that exist, what percentage were correctly classified as ‘matched’”, and Precision as “out of all the pairs that were classified as ‘matched’, what percentage were classified correctly”.
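The two definitions above can be sketched directly, using the counts from the two example algorithms (true positives, false positives, and missed matches are taken from the scenarios described earlier):

```python
def precision(tp, fp):
    """Of all pairs classified as 'match', what share were real matches?"""
    return tp / (tp + fp)

def recall(tp, fn):
    """Of all real matches, what share were classified as 'match'?"""
    return tp / (tp + fn)

# Algorithm 1: 1 true positive, 0 false positives, 499 missed matches.
print(f"{precision(1, 0):.1%}")    # 100.0%
print(f"{recall(1, 499):.1%}")     # 0.2%

# Algorithm 2: 500 true positives, 1 false positive, 0 missed matches.
print(f"{precision(500, 1):.1%}")  # 99.8%
print(f"{recall(500, 0):.1%}")     # 100.0%
```

Note that neither formula contains the 249,500-odd true negatives; that is what makes these metrics informative here.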

Now we can see that Algorithm 1, while having very high Precision, has very low Recall, rendering it completely useless. At the same time, Algorithm 2, which has comparable Precision, has very high Recall, making it an ideal solution.

So why do cross-device identification data providers focus on Accuracy instead of Precision and Recall? I think providers of ad-related data have always sought out Accuracy as the Holy Grail, so with the rise of cross-device identification, they stuck with the same nomenclature. Perhaps some were concerned that using the term ‘Recall’ might cause confusion with ‘Ad Recall’, a commonly used term in advertising research.

Regardless of the reason, with the growing significance of cross-device identification, it’s important that we correctly define the metrics that impact our industry. With a range of new devices that might be included in future ad campaigns – like a smart watch or an Internet of Things-enabled appliance – it’s critical that we establish precise methodology for cross-device identification today.
