Last week at The Quirk’s Event Virtual Global, Quest took the General session stage to present our initial findings on the impact of Length of Interview (LOI) on data degradation. Given the interest among attendees and viewers of our session, Quest is pleased to share the details of our case study design and our findings in depth.
As the video above outlines, this can clearly take us in many different directions and toward many conclusions, but for now we will simply focus on our initial step at this stage:
Establish acceptable factoring around data degradation as it relates to respondent engagement at various lengths of online interviews.
To be clear, at Quest, we are not attacking long surveys. Quite the contrary. Long surveys have a place. They are not going away. They have been around since someone first decided to survey someone else, and they will be around as long as someone is still willing to survey someone else. What we want to do is help researchers realize, and be able to advise their clients, what the effects are on data quality, accuracy, and reliability as a given question appears later in a survey. Less "short surveys are better than long," and more "here's how that question's placement, at different spots in a 'longer' survey, could affect the results for the information gained." That could mean a factor or an effect as simple as: data collected at the eight-to-ten-minute interval is 4% less accurate than data collected at the six-to-eight-minute interval, and so on. Or it could end up being something entirely more complicated and different. The purpose of this factoring is to enable researchers to present data with levels of confidence throughout, analyzing margin of error at intervals rather than across the survey as a whole.
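To make the "margin of error at intervals" idea concrete, here is a minimal sketch in Python of what interval-level confidence reporting could look like. The interval labels, sample sizes, and proportions below are hypothetical placeholders for illustration, not figures from our study.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Margin of error for a proportion p with sample size n at ~95% confidence."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical per-interval data: (interval label, respondents still engaged,
# observed proportion for some question). Real factoring would also adjust for
# the accuracy degradation measured at each interval.
intervals = [
    ("0-2 min",  1000, 0.50),
    ("6-8 min",   950, 0.50),
    ("8-10 min",  900, 0.50),
]

for label, n, p in intervals:
    print(f"{label}: ±{margin_of_error(p, n):.1%} at 95% confidence (n={n})")
```

The point of the sketch is simply that each interval gets its own confidence statement, rather than quoting one margin of error for the whole survey.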
We decided on an approach to ask the same set of measured questions (with different text), separated by “filler” sections, to compare engagement metrics at various survey lengths. The primary goal was to focus on how engaged the respondents were – the amount of time spent for the overall section, and per question as well as the number of words and content for open ends as respondents progressed through the survey.
We wanted a topic which was commonly understood by consumers. Something “everyone” knows. Easing demographic targeting and providing a suitable basis for a respondent group that wouldn’t need to be screened for behavioral or attitudinal aspects to participate. We settled on grocery shopping. Who doesn’t know what you do in a grocery store? Of course, COVID may have changed the experience to some extent recently. But what is important to a shopper, what they look for, how they buy, what they buy – that endures.
So, feeling that this topic was about as “neutral” as we could find, we went with this general spec:
Data Degradation Sections:
Results and Scott Worthge’s comments:
Average Section Time:
Respondents are clearly more engaged at the 2-minute mark.
Overall, the time spent on the entire metric section, the four questions asked, was telling. At two minutes into the survey, respondents spent just over three and a half minutes completing the first data degradation section. When the next data degradation section was presented at the 10-minute mark, the time spent answering dropped to just under two minutes, and stayed there for the third section at 15 minutes. That's 90 seconds less spent answering the same 4 questions. You might argue that familiarity with the question format the second time around leads to less time needed. But we don’t think that’s it – we’re asking people what kind of bread products they prefer, and whether they like apples better than bananas. Not the kind of questions where a familiarity bias would account for that large a difference in time, in our opinion. ~SW
Evidence of stronger engagement again in the early sections, with some variance by question type.
For the multi-select question, the initial engagement time was 40 seconds. When the same question came up later, the average time to answer was 27 seconds, and the third time around it was about the same, 29 seconds. Given the semi-famous line that the average attention span of a squirrel is about 7 seconds (when compared favorably to that of many modern consumers), we got about a squirrel and a half’s worth of difference. But seriously, we see this as an important metric – a LOT less time was spent on the very same question, presented the same way, later in the survey.
This was much more pronounced with our open-end – the first time we asked an O/E in the data degradation section, the average time spent answering was 80 seconds – good, thoughtful answers, right? O/E time spent dropped to 47 seconds in the second data degradation section, and further to 37 seconds by the third section. Less than half the time spent on an O/E answer by the time we got to 15 minutes.
Our results for the ranking and rating questions, however, were not what we expected. The average time was 50 seconds for question 2 and 35 seconds for question 3 during the first data degradation section. Question 2 increased to 59 seconds in the second section, then dropped to 43 seconds the third time around – up and then down. Question 3 dropped from 35 to 25 seconds, then increased to 44 seconds across the three data degradation sections – down and then up. What does this all mean? For these two of the four questions measured, we did not see a consistent pattern of “later in the survey means less time spent on a given question.” When we looked at the data, our labels for the products we asked respondents to rate and rank varied in length – some comparisons, such as “apples” and “oranges,” were much faster to comprehend and compare, we postulated (and these questions took less time to rate and rank). Compare that to “Artisanal bread (delivered fresh daily)” and “Specialty bagels (seeded, asiago, etc.)” – we figure a lot more words in the item description need to be read before a comparison can be made (and that is what we saw in the data – longer descriptors equaled longer time to complete the questions). That is our hypothesis, and we will be testing it further as we dive deeper into this subject. ~SW
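For readers who want to replay the comparison, the percent-change arithmetic behind these observations can be sketched in a few lines of Python, using the per-question averages quoted above (the question labels are ours):

```python
def pct_change(first: float, later: float) -> float:
    """Percent change in average answer time relative to the first section."""
    return 100 * (later - first) / first

# Average answer times (seconds) across the three data degradation sections,
# as reported above.
times = {
    "multi-select": [40, 27, 29],
    "open-end":     [80, 47, 37],
    "ranking (Q2)": [50, 59, 43],
    "rating (Q3)":  [35, 25, 44],
}

for question, (first, second, third) in times.items():
    print(f"{question}: {pct_change(first, second):+.0f}% at section 2, "
          f"{pct_change(first, third):+.0f}% at section 3")
```

Laid out this way, the multi-select and open-end show steady declines, while the ranking and rating questions move in opposite directions between sections, which is what led us to the item-label-length hypothesis.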
Thank you for watching the presentation, reviewing our outline here and following us as we continue this research. For more updates, please follow us:
~ ‘Data Degradation’ was first presented at the Quirk's Virtual Conference on Feb 23rd, 2021. A downloadable copy of the presentation video can be found here. For further information, slides or any questions, please do contact any of the following:
Greg Matheson (Managing Partner) firstname.lastname@example.org
Scott Worthge (Senior Director, Research Solutions) email@example.com
Moneeza Ali (Director, Marketing) firstname.lastname@example.org