Machine learning algorithms can predict human wellbeing better than traditional econometric models, according to new research.
The significant findings, published in the journal Scientific Reports, are the first of their kind and could dramatically change the way we measure, study, and consider human wellbeing.
Researchers from the Wellbeing Research Centre at the University of Oxford formed part of an interdisciplinary team which pitted two different ‘tree-based’ machine learning algorithms against a variety of standard econometric models, using nationally representative samples from Germany, the UK, and the United States.
Traditionally, economists and other researchers have relied on conventional linear models to capture the variables – or ‘drivers’ – that positively or negatively affect individuals’ self-reported subjective wellbeing. These drivers include measures such as age, income, and household size.
Such traditional techniques allow only a limited number of variables to be tested at one time. And because the variables to be tested must be selected by a human researcher, the results may be subject to unintentional bias, like any other human-run experiment.
In this study, by contrast, each machine learning algorithm was fed data on hundreds of different variables and tasked with assessing the relative importance of each variable to self-reported wellbeing scores at a population level.
This novel approach, when compared to conventional human-run linear models, allowed the researchers to better identify trends across time and – in particular – explore the interactions between different variables in their impact upon wellbeing.
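For readers who want a feel for this kind of comparison, the following is a minimal sketch, not the study's actual pipeline. It uses synthetic data rather than the survey samples, and the variable names, effect sizes, and choice of scikit-learn's random forest are all assumptions made for illustration. It fits a linear model and a tree-based model to the same data, compares how much variation each explains on held-out observations, and then ranks the variables by permutation importance.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for survey data (hypothetical variables, not the study's).
rng = np.random.default_rng(0)
n = 3000
names = ["health", "relationships", "income", "age", "noise_1", "noise_2"]
X = rng.normal(size=(n, len(names)))
health, relationships, income, age = X[:, 0], X[:, 1], X[:, 2], X[:, 3]

# Assumed data-generating process: an interaction (health x relationships)
# and a non-linear age effect, neither of which a plain linear model captures.
y = (
    0.5 * health
    + 0.8 * health * relationships
    - 0.4 * age**2
    + 0.2 * income
    + rng.normal(scale=1.0, size=n)
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Conventional linear model versus a tree-based model on the same variables.
linear = LinearRegression().fit(X_train, y_train)
forest = RandomForestRegressor(
    n_estimators=200, min_samples_leaf=10, random_state=0
).fit(X_train, y_train)

# Share of held-out variation explained by each model (R squared).
r2_linear = linear.score(X_test, y_test)
r2_forest = forest.score(X_test, y_test)
print(f"linear R^2: {r2_linear:.2f}, forest R^2: {r2_forest:.2f}")

# Relative importance: how much the forest's held-out score drops
# when each variable is shuffled in turn.
result = permutation_importance(
    forest, X_test, y_test, n_repeats=5, random_state=0
)
importance = dict(zip(names, result.importances_mean))
for name, value in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name:>14}: {value:.3f}")
```

In this toy setup the tree-based model explains more of the held-out variation because it can pick up the interaction and the curved age effect without being told about them, while the pure noise variables rank near the bottom. Real survey data would be far messier, and the gap between the two approaches need not be this large.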
Dr Ekaterina Oparina, a research economist at the London School of Economics and Political Science and joint first author of the study, said: “It is exciting to leverage machine learning in this context: it helps us better understand what makes people happy and allows us to test earlier knowledge on the subject. We can now confirm that factors identified as important in earlier works, like interpersonal relationships and health, continue to matter in this more nuanced setting.”
Dr Caspar Kaiser, Assistant Professor in the Behavioural Science Group at Warwick Business School, Research Fellow at the Wellbeing Research Centre, and joint first author for the study, said: “Perhaps counterintuitively, the thing I am most excited about is our finding that when using all available data in surveys, and when using the most flexible algorithms available, we can explain about 30% of people’s wellbeing. This means that a large share of people’s wellbeing remains unexplored. What is this wellbeing ‘dark matter’? Presumably, only moving beyond traditional surveys will allow us to uncover this – and that’s something I really look forward to.”
Dr Niccolò Gentile, a research data scientist at the University of Luxembourg and joint first author of the study, said: “It’s safe to say that Natural Language Processing – and Large Language Modelling (LLM) in particular – are increasingly capturing the scientific community’s interest. Most of the best known LLMs rely on the Transformer architecture, first published in 2017, well before the start of our research. GPT-1 and BERT, similarly, were first presented in 2018. Progress is instead found in innovative ways of how to combine and optimize them.
“While some literature applying LLMs to tabular data is slowly emerging, at the time of writing, no LLM-related technique has been found to consistently outperform the types of ‘traditional’ machine learning techniques we used in this study. Considering that tabular data still represents a large chunk of what researchers and companies work with, it remains crucial not to underestimate any solution just because it’s ‘old’.”
‘Machine learning in the prediction of human wellbeing’ is published in Scientific Reports.