Grading the Grids: What Works and What Doesn’t Using Paradata to Assess Response Quality and Usability

Abstract

Grids (also called matrices or tables) are commonly used in self-administered surveys. To optimize online surveys for smartphones, grid designs targeting small-screen devices are emerging. In this study we investigate four research questions regarding the effectiveness and drawbacks of different grid designs. Specifically, does grid design affect: (1) data quality, as indicated by breakoffs, satisficing behaviors, and response errors; (2) response time; (3) response distributions; and (4) inter-relationships among questions? We conducted two experiments. The first experiment was conducted in April 2016 in Brazil, the US, and Germany. We tested a progressive grid, a responsive grid, and a collapsible grid. Results were analyzed for desktops/laptops only because few respondents took the study on smartphones. The collapsible grid elicited the highest number of error prompts for item nonresponse. The second experiment was fielded in August 2016 and tested grid designs on three types of answer scales: a 7-point fully labeled rating scale, a 5-point fully labeled rating scale, and a 6-point fully labeled frequency scale. Respondents to an online survey in the US and Japan were randomly assigned to one of three conditions: (a) no grid, where each question was presented on a separate screen; (b) responsive grid, where a grid was shown on large screens and a single-column vertical table on small screens (with the question stem fixed as a header); (c) progressive grid, where grouped questions were presented screen by screen with the question stem and sub-questions (stubs) fixed on top. Quotas were enforced so that half of the respondents completed the survey on large-screen devices (desktop/tablet computers) and the other half on smartphones. Each grid condition had 600 respondents per screen size per country. Findings showed that the progressive grid had less straightlining and fewer response errors, whereas the responsive grid had fewer breakoffs.
Differences were also found between grid designs in terms of response time and response distributions; however, patterns varied by country, screen size, and answer scale. Further analysis will explore the effect of grid design on question inter-relationships. While visual and interactive features affect the utility of grid designs, we found that the effects may vary by question type, screen size, and country. More experiments are needed to explore designs truly optimized for online surveys.