There are many different Qualitative-AI tools – some very new, others that have been around a while – and just as many different views about their appropriateness and implications. The previous posts in this series touched on the history of AI in qualitative software programs (CAQDAS-packages), and reviewed three implementations of generative-AI released in 2023: ATLAS.ti's AI "open-coding", MAXQDA's AI Assist, and CoLoop's AI copilot.
Rather than jumping on the AI bandwagon, or dismissing these tools out-of-hand, consider the middle-ground by exploring what AI actually does in the context of what you need to do with qualitative materials.
In evaluating the different opinions on these AI tools in this post, I consider:
Different types of qualitative research
which leads to the relevant units of data for generating insight
which in turn suggests the importance of considering when assistance happens
so we can then assess the kind of ‘intelligence’ we need in qualitative analysis.
Then, depending on how transparent AI tools are
we can consider the ethics and potential dangers of AI in qualitative analysis
What type of qualitative research are you doing?
Qualitative research and qualitative data analysis are not homogenous. There’s a lot of diversity in terms of the purposes of studies, materials analysed, techniques employed to analyse them, and the outcomes required. These characteristics are usefully considered in relation to the methodological spectrum.
On one end are the more purist qualitative projects where interpretation and reflexivity is the name of the game. On the other end are qualitative projects based on a more positivist position, where larger bodies of data are involved, and generalisability, measurement and replicability are prized. In the middle are a range of pluralist approaches that see the value of all these things, thus mixing methods with the aim of generating holistic, balanced perspectives. For more on this spectrum in the context of qualitative methods and tools check out this webinar I did on the impact of the use of CAQDAS-packages on reflective practice and reflexivity.
Qualitative research often means academic studies and applied projects towards the “purist” end of the continuum. But qualitative data is used right across the methodological spectrum, and analytic methods and tools also span the spectrum. It’s natural to think that AI tools are only appropriate towards the positivist end of the spectrum, because computer-assistance has an essential role with large sets of qualitative materials and when the focus is on quantification, measurement, validity, generalisation etc.
A powerful example of the impact of where you sit on this spectrum is coding in team-based projects. For projects at the more positivist end of the spectrum there is a need for a high degree of inter-coder reliability, often expressed as an agreement statistic such as Cohen's kappa. Many CAQDAS packages include functions to calculate such a statistic, usually based on two or more humans coding the same data sources using a pre-agreed framework of codes, in order to identify and then eliminate disagreement. At the other end of the spectrum, researchers tend to collaborate not because they want to avoid bias but because they want to incorporate a range of perspectives; they not only expect variation in how humans code, but value the differences, discussing and reflecting on them as part of the process.
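To make the reliability idea concrete, here is a minimal sketch in plain Python of the kind of chance-corrected agreement statistic (Cohen's kappa) that such functions compute. The code frame and coders are invented for illustration; CAQDAS packages implement this internally rather than exposing code like this.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Agreement between two coders, corrected for chance agreement."""
    assert len(coder_a) == len(coder_b)
    n = len(coder_a)
    # Proportion of data units the two coders coded identically
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's code frequencies
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Two coders apply a pre-agreed code frame to ten data segments
a = ["barrier", "enabler", "barrier", "barrier", "enabler",
     "barrier", "enabler", "enabler", "barrier", "enabler"]
b = ["barrier", "enabler", "barrier", "enabler", "enabler",
     "barrier", "enabler", "barrier", "barrier", "enabler"]
print(round(cohens_kappa(a, b), 2))  # → 0.6
```

A kappa of 1.0 means perfect agreement, while 0 means agreement no better than chance – which is exactly why, at the interpretive end of the spectrum, a "low" score may be a prompt for discussion rather than a problem to eliminate.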
This illustrates that where researchers sit with regard to this and many other debates has a huge impact on how qualitative software is incorporated into analytic practice and reported on.
Now, with generative-AI coming into play, these spectrum-issues are even more important to consider and discuss.
Different perspectives are valid
If you’re doing a project that sits more towards the “purist” end you may therefore be more concerned about the potential implications of generative Qualitative-AI for your analysis (more on this below) than if your project sits more towards the “positivist” end, where you may naturally be super excited about the possibilities.
Both perspectives are valid, as are those in between.
When software designed to facilitate the analysis of qualitative data first appeared (in the mid-1980s), a debate began about the role of computers in what was then considered a ‘craft’. Over the years the debate has ebbed and flowed, and only recently has there been more widespread acceptance of the role of software in assisting analytic practice in all types of qualitative projects – perhaps due to the increasing normalisation of technology use in all areas of professional and personal life. There have always been – and continue to be – advocates and critics of CAQDAS use, including a bunch (too many IMO) of students and researchers who just don’t know about the available tech.
This long experience of debates over CAQDAS-use is a good lesson for doing a better job of evaluating the appropriate use of the new tech of generative-AI across the methodological spectrum.
Keep your mind open
I urge those who work in the more “purist” space, as well as anyone who is cautious about Qualitative-AI, to keep your minds open… there are things that qualitative software generally, and AI-tools specifically, can help you do that are either tricky to manage or impossible to accomplish when working with paper-and-pen methods or using non-dedicated software tools (e.g. MS Word, Excel, etc.).
These tools don’t take away your control over the interpretive process, whatever you may have read that suggests they do. Neither do they homogenise methods.
Proceed with a critical eye
On the other hand, if you’re super excited about the possibilities that generative Qualitative-AI brings, proceed with creativity and caution in equal measure. And don’t be sucked in by marketing that claims Qualitative-AI will solve all your analytic problems and cut the time it takes to do qualitative analysis dramatically. This was not the case with CAQDAS when it first became available, and will not be the case for AI tools if your aim is high-quality interpretation.
The use of CAQDAS can speed up certain tasks (those that are laborious, time-consuming or impossible to do manually), and it can help us be more rigorous in our analysis (because we can access what we’ve done at the click of a button). But qualitative analysis is still time-consuming. It will be the same with Qualitative-AI tools.
Suggestion: Make sure you check what software tools, particularly generative-AI, are actually doing. Be sure to choose tools that allow you to have an appropriate level of input to the process.
What is a relevant unit of data to generate insight from?
Units of data are important in any qualitative analysis, and when discussing how to harness CAQDAS-packages for different analytic needs they’re especially important because they drive the appropriate choice of software component to accomplish each analytic task (Woolf & Silver, 2018). In considering the use of generative Qualitative-AI tools that do qualitative coding, data units are paramount because the program needs a cohesive chunk of data (aka a data unit) to look at in determining ‘what’s going on’. The codes the AI generates need to be attached to something; a sentence or paragraph being the typical choice.
The size of the data unit most appropriate to code – word, sentence, paragraph etc. – varies by data type and analytic focus, whether you’re using AI or not. For example, a word or phrase might be appropriate in a classic quantitative content analysis, whereas a paragraph might be appropriate for free-text responses to open-ended survey questions, and a sentence or paragraph may be more appropriate in more narrative qualitative data. Indeed, the appropriate size may vary within each item of data (transcript, report, field-note, etc.).
But it’s worth noting that Dr Stu Shulman, who’s been harnessing AI in qualitative software tools for circa 20 years, is absolutely clear that AI tools are best suited to “small, coherent data units”. Stu is the developer of DiscoverText, a software tool that has offered collaborative text analytics for humans and machine-learning for two decades (see “it’s not new folks”). This is why he advocates the use of DiscoverText for qualitative data such as responses to open-ended survey questions, or social media posts, such as Tweets. For example, in my conversation with him for my #CAQDASchat podcast, he explicitly states that although researchers do use his tools for the analysis of interview transcripts, they are not best suited for that purpose, having not been designed for the task.
Similarly, for some types of qualitative material, the data units to which generative-AI applies its codes are not in themselves meaningful, so such tools will not be appropriate.
For example, a key issue for generative qualitative-AI tools applied to the transcripts of conversations in interviews or focus-group discussions is that the appropriate unit varies throughout the transcribed encounter: sometimes a sentence may be appropriate, other times a paragraph, and so on. A blanket choice is unlikely to be appropriate across such materials, and AI tools currently cannot make these critical distinctions. This issue needs to be resolved if generative-AI coding tools are to contribute anything more than data familiarisation for those working towards the more “purist” end of the spectrum.
When does the assistance happen?
When the assistance happens is as important as what the assistance comprises. Here are three scenarios:
ATLAS.ti and CoLoop initially implemented generative-AI with the assistance happening first: the program does its thing (whether it’s generating codes or summaries) and then we, the human interpreter, look at what it’s done, and adjust it as required according to our needs. So the computer assistance comes first, and the human correction comes after.
In contrast, MAXQDA initially implemented its AI Assist tool with the human coding happening first, and then the AI summarises what has been coded.
In DiscoverText the sequencing oscillates. Teams of humans code, and their coding is adjudicated. Then the machine does its thing, looking at the coding that’s been done by the humans, and using that to code further data. This is then reviewed by humans, adjudicated, and sent back to the machine, which learns from the humans, and so on. This is more in keeping with the use of AI in other contexts, from chess playing to business applications: human + AI consistently outperforms a human alone or an AI alone.
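The oscillating sequence can be sketched in a few lines of Python. This is a hypothetical toy, not DiscoverText’s actual machinery: a trivially simple word-count “classifier” stands in for the machine-learning component, and the human adjudication step is reduced to a comment.

```python
from collections import Counter, defaultdict

def train(labelled):
    """Learn per-code word frequencies from human-coded items."""
    model = defaultdict(Counter)
    for text, code in labelled:
        model[code].update(text.lower().split())
    return model

def predict(model, text):
    """Suggest the code whose known vocabulary best matches the text."""
    words = text.lower().split()
    scores = {code: sum(freq[w] for w in words) for code, freq in model.items()}
    return max(scores, key=scores.get)

# Round 1: humans code a small batch and adjudicate it
human_coded = [("delivery was late again", "complaint"),
               ("really quick friendly service", "praise")]
model = train(human_coded)

# Machine codes further data; humans review, adjudicate, and the
# corrected coding is fed back so the machine learns, and so on
for text in ["late order and no apology", "friendly staff, quick response"]:
    suggestion = predict(model, text)
    confirmed = suggestion  # in practice a human accepts or corrects here
    human_coded.append((text, confirmed))
    model = train(human_coded)  # retrain on the adjudicated coding
```

The point of the sketch is the sequencing: the machine never codes unseen data except on the basis of coding humans have already done and reviewed, and every machine pass is itself subject to human adjudication before it feeds the next round.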
In reflecting on the appropriateness of Qualitative-AI tools for a given project, the sequencing, as well as the extent to which the machine learns, is fundamental.
What kind of ‘intelligence’ do we want?
The clue is in the name - the ‘intelligence’ is artificial, and at the current stage of development a bit of a misnomer. Do these technologies actually provide ‘intelligence’? Various alternative names have been floated for recent AI developments, such as “applied statistics” for chatbots like GPT-4. But “AI” seems too established a label to change, so we will keep talking about “intelligence”.
What kind of “intelligence” do we want as qualitative researchers? The head teacher at my daughter’s school recently talked about how AI might impact education and how school teachers could best respond to these technologies. What resonated most in his talk was that at the forefront of the school’s approach is encouraging “emotional intelligence” in the pupils. This led to discussion about how to equip schoolchildren with appropriate ways of engaging with AI (rather than banning it, a fruitless exercise). For information on developing AI-literacy in university students and staff it’s worth checking out the recently published principles on the use of AI in education by Russell Group universities in the UK (4th July 2023).
So what kind of ‘intelligence’ do you want as a qualitative researcher?
That’s a question we can usefully discuss much more elsewhere, but for now let’s assume it’s ‘interpretive intelligence’, which IMO is relevant all the way along the methodological spectrum, not just at the ‘purist’ end, because quantitative researchers are making interpretations all the time as well. Interpretation is not the sole preserve of the qualitative researcher (a topic for another day).
The idea of ‘interpretive intelligence’ implies many things in terms of what we need and want out of computer-assistance. And it’s worth knowing that there is ongoing debate about whether AI will be able to think, understand, interpret – which has huge implications for qualitative analysis - see this piece by David Brooks in the New York Times: Humans are soon going to be eclipsed.
Look under the hood: how transparent are the tools?
If as qualitative researchers we want ‘interpretive intelligence’, we must ask what’s under the hood of these technologies. Being aware of how AI has been implemented in qualitative software, and how it works, is fundamental before using these tools in earnest for a real project. Otherwise how can you evaluate the usefulness of the results, let alone rely on them?
Some questions to ask yourself:
Are the Qual-AI tools you’re considering third-party or developed specifically for qualitative data analysis purposes?
How open are the developers about how their Qual-AI tools work? Do they explain this in their documentation in a way that you can understand? If not, can you justify their use?
How much input do you have on the set-up (before the AI does its thing), in influencing how the tool continues doing its thing (this is the learning bit – can you teach the tool what YOU need assistance with for this project…?), and in adjusting the output that it generates?
To what extent can your analytic objectives and research questions be used to focus what the AI is doing for you? If you are doing an empirical study based on primary qualitative materials, you designed the data collection tools. If you’re doing a study drawing on secondary materials (e.g. archival materials, reports and documents, internet-harvested data), you decided the criteria for what to include in the study. Does the AI you’re considering using allow you to drive the analytic focus in the same way, or is it just looking at what’s there and telling you a bunch of things unrelated to your analytic focus?
The ethics and potential dangers of AI in qualitative analysis
Ethics is crucial to how we go about all aspects of qualitative projects, and when AI comes into the picture, there are some additional considerations.
How are the language models that underlie AI developed?
The large language models (LLMs) behind generative-AI have to be trained, and it’s important to be aware of the nature of the work being done by millions of humans to “make tech seem human” in the AI factory. In the media we’ve heard about the biases embedded within these models, but we should also understand how the training happens. It is a massive task that, as discussed in Josh Dzieza’s article Inside the AI Factory: the humans that make tech seem human, is being undertaken by a “vast underclass” of very low-paid annotators who sort and tag enormous quantities of data so the AI can learn.
How does this sit with your values as a qualitative researcher? Does your Qualitative-AI tool use technologies developed in this way?
If we want ‘interpretive intelligence’, then is training text classifiers in relation to our research topics and analytic needs more appropriate? Something else to check out for the tools you’re considering using.
Model collapse and the complete loss of interpretation
Then there’s the question of what happens to these models when there’s so much AI-generated content out there that they are training on that content rather than on human-generated content: AI researchers are already talking about ‘model collapse’ arising from this AI feedback loop.
This has huge implications for the value of these technologies for qualitative analysis, but it also brings into focus another major potential issue: if used inappropriately, AI technologies infiltrating all phases of the qualitative research process risk leaving no human interpretation at all.
Machines taking over? The risk of qualitative-deepfake
It’s feasible that qualitative research recordings could be transcribed by AI and not checked for accuracy by a human, then put into a generative-AI tool and ‘analysed’ without a human checking the accuracy of the output, and the resulting ‘research report’ then summarised by AI for humans who don’t want to read the whole thing themselves.
This would be qualitative deepfake.
When CAQDAS-packages first became available there were concerns that computers would take over. Up to the release of generative-AI that fear was a fallacy, because the tools have never been able to do the analysis for us; it has only ever been us, the humans, operating the tools and doing the interpretation.
There’s no doubt that we’re in a different place now in terms of the potential for the machine to take over. But it remains us humans who make the decision as to whether to use these tools, how to use them, and how to make use of what they generate.
Do you have the right to upload participant data?
Then there’s the question of what you’re doing with data when you use AI tools. Generative-AI tools that employ OpenAI technology – including those embedded within existing CAQDAS-packages at the moment – require you to upload the research data in order for the AI to do its thing. Do you have the ethical clearance from your research participants and your ethics board to do so? If not, you shouldn’t be going anywhere near these tools. Tools that use AI technology within their own platform are different, as your data remains where you first uploaded it, and those tools have very clear ethics statements about data storage.
Our responsibilities to our research participants
In addition, there are our responsibilities to those who participated in our research studies. How would they feel if they knew we were using AI to analyse their experiences and opinions? We have a responsibility to treat their accounts appropriately. Be sure you can justify that using AI tools is appropriate in this context as well.
Change happens fast
Of course, what’s possible in the Qualitative-AI space will continue to change, and probably quite fast. Since generative AI technology became widely available it has already developed significantly. It will continue to do so.
I’m most interested in how these tools will develop in ways that harness what computers can do best, in combination with what we as humans can do best. This is work that has already been underway for decades – so if you’re also interested in how humans and computers can work together, to balance their affordances (and they do both have affordances, let’s be honest) then you should be not only watching the new developments, but also engaging with the existing ones.
To that end, check out the second post in this series, where I give an overview of CAQDAS-tools that provided forms of AI long before the newer tools discussed in subsequent posts – in particular, those that incorporate supervised machine-learning capabilities.
Wherever you naturally sit on the methodological spectrum and whatever the characteristics and needs of your research project, it makes sense to look into the existing and newly emerging AI tools for qualitative data analysis before you either dismiss or embrace them. As qualitative research is not a homogenous practice, we want a variety of things, depending on where along the methodological spectrum our research project is located. So for market researchers and others who typically have a very short turnaround time to produce findings from qualitative data, and/or do not require a very in-depth level of analysis or insight on an interpretive level, these new generative-AI tools may have their place.
In other types of qualitative research project, located more towards the ‘purist’ end of the scale – projects located within interpretivism, constructivism, etc. – their place is less obvious, and they are likely to be less readily embraced. These projects need the ‘intelligence’ to be ‘interpretive’, and as yet most of these researchers will probably conclude – maybe without even trying the tools out – that these tools do not and cannot provide that.
Research, reflect and experiment before you dismiss or embrace Qual AI
This all just illustrates the sense in which these innovations, like their predecessors, are tools that need to be used appropriately. Just like you pick and choose the appropriate tools from your garden shed according to the gardening task at hand, you pick and choose software tools appropriate to the analytic task at hand.
That involves considering the same things we have always done when choosing tools: what are the objectives of the research study, what are the specific needs of this task, what kinds and amounts of data are we working with, what are the outcome requirements of the task, who is the audience that will consume the findings, and so on.
Don’t believe all the hype. Check it out properly
Instead, look behind the marketing slogans and the claims being made, and find out how these tools actually work so you can make an informed choice as to whether they are appropriate for your project.
To what extent is interpretive intelligence at the heart of the tools you’re considering using, as opposed to simple speed of process? And if information about how these tools function is opaque, call the developers out.
Finally, make sure you can justify the use of Qualitative-AI tools ethically as well as methodologically, and make sure you document how you are using them.