ADOTAS – Working in research, I often catch myself quoting stereotypes, but only because they end up being true. Minivans are more likely to be driven by moms; Neiman Marcus shoppers are higher-income. And in the world of research, survey-takers are more likely to be women and more likely to be older.
In research-speak, we refer to it as non-response bias or self-selection bias. Essentially, certain segments of the population (e.g., men) are more likely to be non-responders to surveys.
It’s the reality of what happens when you invite someone to fill out a survey and, typically, it’s not a big deal. If I’m looking for a sample of 50 men and 50 women, I could take the proactive step of inviting more men than women to fill out my survey, understanding that fewer men will be willing to take the survey, and at the end the numbers will even out.
Type A Problems
However, taking that proactive step is not always possible. Let’s assume we don’t know the gender of the people we’re inviting to our survey – something I’ll call a type A problem. In this case I just have to invite a bunch of people and look at the gender distribution after I collect the data. And since women are more likely to fill out surveys than men, I might look at my results and find I have 60 women and 40 men. Not exactly my 50/50 split. Well, research affords another tool to turn this situation into the results we want, namely weighting: I can weight down the women and weight up the men. If each man’s opinion counts 1.25 times (50/40), I’ll have the equivalent of 50 men’s opinions, and if each woman’s opinion counts 0.83 times (50/60), I’ll have the equivalent of 50 women’s opinions. So in essence, type A problems are easily solved.
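The arithmetic here is simple post-stratification: each segment’s weight is its target count divided by its observed count. A minimal sketch of that calculation (the function name and numbers are illustrative, not from any particular research tool):

```python
# Hypothetical sketch of the type A correction described above:
# we observed 60 women and 40 men, but want each gender to count
# as 50 responses.

def segment_weights(observed, target):
    """Return a per-segment weight = target count / observed count."""
    return {seg: target[seg] / observed[seg] for seg in observed}

observed = {"men": 40, "women": 60}  # what came back from the field
target = {"men": 50, "women": 50}    # the 50/50 split we wanted

weights = segment_weights(observed, target)
# men: 50/40 = 1.25, women: 50/60 ≈ 0.83

# The weighted "effective" counts land back on the target:
effective = {seg: observed[seg] * weights[seg] for seg in observed}
# men: 40 * 1.25 = 50, women: 60 * 0.83... = 50
```

The key point is that the correction is only possible because we know both the observed mix and the target mix.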
Type B Problems
But what happens if I don’t know the gender breakdown of the people I’m sampling, and I don’t know how many should be men and how many should be women – a type B problem? Let’s walk through that scenario. I invite my sample to take my survey; 70 women and 30 men respond. In this case, I don’t know if the final result should be 50/50 or 60/40 or 70/30 — that data is just not available. Based on the realities of non-response, I can’t really trust the 100 responses I’ve collected, but, then again, I’ve no way to correct the data. This is the challenge with type B problems: They can’t be solved and they don’t produce sound research results. Yet clients are out there paying good money for this kind of data on a daily basis.
Online Ad Effectiveness = Type B Problem
Let me explain that last point in more detail. Your typical ad effectiveness study is run using a pop-up recruitment method. These invitations are triggered randomly at the time of exposure to the advertising. That’s great because it’s random, but it also means I have no advance insight into who’s going to receive the invitation, and no visibility into who actually filled out the survey until the study is complete.
As I previously mentioned, this shouldn’t be a big deal, because we should be able to weight the data after the fact to account for any non-response bias. But wait: in an ad campaign, I don’t know who saw my ads in the first place, so I don’t know how to weight the data. I don’t know if I have too many men, too many women, and so on. Essentially, if I have no visibility into the sample prior to the campaign and no idea how the campaign was distributed across demographic audiences, then I’m completely in the dark as to the validity of my results. If you’re running studies using these kinds of invites (e.g., Vizu, SafeCount), then you have this problem whether you’re aware of it or not.
Let’s not forget that targeted advertising exacerbates the issue. If I have a campaign that targets men, yet most of my ad effectiveness results are coming from women, I can come to one of two conclusions: 1) my targeting didn’t work and I actually reached more women than men, or 2) women are more likely than men to fill out online surveys. Now that we’re all up to speed on non-response, it’s obvious that the answer is number 2 – women are more likely to fill out surveys than men.
What many online ad effectiveness solutions lack is any understanding of the actual audience being reached by the campaign and, in some cases, any information about who is responding to their surveys. Without those critical pieces of information, a researcher has no tools to adjust for biases in the data, and when biases exist in a dataset, you can’t trust the conclusions you draw from it.
Solving the Type B Problem
We do things a little differently at my company, InsightExpress. First and foremost, we leverage our Ignite Network as a preferred approach to online ad measurement. Since it’s panel-based, we’re able to determine a significant amount of information about the audience being reached by advertising. In fact, prior to sending any surveys, we can determine who’s being reached by the campaign. So in those cases where an advertiser might be targeting men ages 18 to 34, we can quantify with the panel how many men 18 to 34 the advertiser is actually reaching. This fantastic bit of information serves as the necessary data to combat the issue of non-response. Even if our sample has some non-response biases (which all surveys have), we can correct those biases by balancing the survey results against our Ignite Network responses. Voila, correct data!
Non-Response and Short-Form Research
Think about it: this is another reason why short-form, one-question approaches generate bad results. How would you correct for non-response in your data if you have no demographic information to enable that correction? Short-form vendors face the biggest challenge of all: they know nothing about the people they sample before sampling them, and they still know nothing about those people afterward. Even if you had a calibration source for one-question questionnaires, you have no demos on the respondents to correct for non-response bias. These surveys are engineered to produce bad data.
Of course, the counter-argument you’ll hear from short-form vendors (specifically Vizu) is that their response rate is so high it makes weighting irrelevant. While this line of reasoning might sound great, let me point out two fundamental flaws in the logic. First, even if your response rate is several times the average, you’re still at less than a 5 percent response rate, which is far from representative (check out my previous post for more background on response rates). Second, non-response is non-response: the people who don’t respond still aren’t responding. Women will still be more likely to take surveys than men, older people will still be more inclined to take surveys than younger people, and so on. No matter what any company does, there will always be segments of the population willing to take surveys and segments that are not. Nothing can change that fact, and the only tool you have as a researcher is data with which to adjust for and correct that bias. But when you run your surveys as one-question questionnaires, you’ve engineered yourself into a corner: you can’t correct your data because you have nothing to correct it with.
When in Doubt, Ask
If you’re buying ad effectiveness research from a vendor, make sure you ask whether they correct for non-response biases by weighting the in-tab results against an audience delivery report. If they don’t, you can’t necessarily trust the data. And even if they do adjust their results based on an audience delivery report, make sure you understand the source of that report, because it could suffer from a non-response bias of its own, depending on how the data is collected.