Assessing data quality in a Big convenience sample of work wellbeing
Wellbeing Research Centre / Nuffield Foundation
William Fleming, George Ward and Jan-Emmanuel De Neve
Abstract
Survey research faces a multitude of challenges to its validity, especially in the study of labour and organisations. Online surveys with non-probability, convenience samples are seen simultaneously as part of the problem and as a promising solution. The methodological literature argues that researchers should think of the data quality of online surveys not in terms of ‘good’ and ‘bad’ but in degrees, and recommendations for assessing and managing data limitations are scattered across disciplines. We present a case study of a Big, multi-level, online, convenience sample of subjective work wellbeing, the Indeed Work Wellbeing Score survey (IWWS). IWWS is an ongoing international survey of subjective work wellbeing, with over 20,000,000 responses and growing. In this study we evaluate the UK subsample collected by October 2023 (N = 1,463,503). While the survey is prima facie a valuable source of data, its data generation process raises concerns about selection bias and inattentive responding. We evaluate the extent of bias, variation in bias, response rates, internal consistency and employer cluster-level reliability. We then consider the types of research questions a researcher may want to answer with the data, especially comparisons between units at different levels of the survey and inter-item relationships. Overall, we suggest that at the individual employee level the survey suffers from selection and binary bias in responses, but that at the employer level IWWS offers a valuable resource to supplement existing random probability surveys of work and wellbeing. In our conclusions we offer practical methodological recommendations for others using Big, online convenience samples. Finally, we provide commentary on the strengths and limitations of the IWWS for ongoing and future research, as well as its value for businesses, jobseekers and policy-makers.