Updating 'The Future of Coding': Qualitative Coding with Generative Large Language Models

In Sociological Methods & Research

Abstract

Over the past decade, social scientists have adapted computational methods for qualitative text analysis, with the hope that they can match the accuracy and reliability of hand coding. The emergence of GPT and open-source generative large language models (LLMs) has transformed this process by shifting from programming to engaging with models using natural language, potentially mimicking the in-depth, inductive, and/or iterative process of qualitative analysis. We test the ability of generative LLMs to replicate and augment traditional qualitative coding, experimenting with multiple prompt structures across four closed- and open-source generative LLMs and proposing a workflow for conducting qualitative coding with generative LLMs. We find that LLMs can perform nearly as well as prior supervised machine learning models in accurately matching hand-coding output. Moreover, using generative LLMs as a natural language interlocutor closely replicates traditional qualitative methods, indicating their potential to transform the qualitative research process, despite ongoing challenges.

Laura K. Nelson
Associate Professor of Sociology

I use computational methods to study social movements, culture, gender, institutions, and the history of feminism. I’m particularly interested in developing transparent and reproducible text analysis methods for sociology using open-source tools.