AI-Assist beta from MAXQDA - what's it good for?

Updated: Jul 7, 2023

UPDATES: new features have been released in AI-Assist by MAXQDA since this post was first published - these are discussed at the bottom

In my previous post, I discussed my reflections on the recently released beta of generative-AI tools by ATLAS.ti, and my next post discusses one of the new players in the field, CoLoop. Here I share thoughts on MAXQDA's beta of its AI-Assist tool. See also the intro post to this series, AI in QDA: Hoo-Ha or Step-Change? and the first post: What's afoot in the Qualitative-AI space?

MAXQDA AI-Assist

The first implementation of generative-AI in MAXQDA came in version 2022.6 in the form of a “virtual research assistant” that automatically summarizes text segments that have already been coded within the software. It was released at the end of April 2023 as an add-on to the software for MAXQDA subscription users and student license holders.

UPDATE: version 2022.7 (June 2023) added the ability to summarize selected text passages that haven't already been coded, and an option to have sub-codes suggested. These new developments are discussed towards the end of this post.

MAXQDA AI Assist is powered by OpenAI. In addition to the newly implemented sub-code suggestion feature (see below), it can currently be used to summarize text segments in three ways:

via the Summary Grid tool (for document-based summaries of coded text)
via Code Memos (for code-based summaries across documents)
via selecting portions of text within Documents (for text-based summaries within data files)

AI generated summaries via the Summary Grid

Summary Grids in MAXQDA are tabular displays of selected codes by selected data files, offering spaces to summarise coded-segments document-by-document – i.e. each data file in turn. Those summaries can then be displayed as Summary Tables so that you can compare them, thus abstracting a level away from the original data. This way of working is a classic form of data reduction, as described by Miles and Huberman (1994), and facilitates matrix-based summarisation methods, such as Framework Analysis (developed in the 1980s by researchers at NatCen – see Ritchie and Spencer, 1994).

Generating automatic summaries using MAXQDA AI Assist does the summarising for you, based on the coding you have already done. My experimentations show that the summaries generated at this level (i.e. from the coded-segments for each data file separately, one-by-one) are pretty accurate and therefore useful as descriptive summaries at the level of a data file. A data file could represent an individual respondent if working with interview-type data, or another unit of analysis depending on what the documents contain: a group if working with focus-group transcripts, or a piece of literature if working with journal articles, etc.

For more information about the Summary Grid/Table functionality in MAXQDA see the Spotlight Session I gave at the recent MAXdays23 conference (before MAXQDA AI Assist was released).

AI generated summaries via Code Memos

The other way MAXQDA AI Assist is currently available is from Code Memos. Running it from here will summarise the coded text segments for the chosen Code across all data files. So the summaries are concept/theme/topic-based rather than document-based, as they are from the Summary Tables.
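To make the difference between the two groupings concrete, here is a minimal sketch in Python. It is not MAXQDA's API or internal logic: the CodedSegment type, the function names and the three example segments are all invented for illustration. It only shows which segments would be pooled together before summarising in each case.

```python
from collections import defaultdict
from typing import NamedTuple


class CodedSegment(NamedTuple):
    """Hypothetical stand-in for one coded segment (not a MAXQDA type)."""
    document: str
    code: str
    text: str


def group_for_summary_grid(segments):
    """Pool segments per (document, code) cell: the document-based view."""
    cells = defaultdict(list)
    for seg in segments:
        cells[(seg.document, seg.code)].append(seg.text)
    return dict(cells)


def group_for_code_memo(segments):
    """Pool segments per code across all documents: the code-based view."""
    by_code = defaultdict(list)
    for seg in segments:
        by_code[seg.code].append(seg.text)
    return dict(by_code)


# Invented example data, for illustration only.
segments = [
    CodedSegment("Interview 1", "Workload", "I rarely finish before seven."),
    CodedSegment("Interview 1", "Support", "My manager checks in weekly."),
    CodedSegment("Interview 2", "Workload", "The hours are manageable for me."),
]

grid_cells = group_for_summary_grid(segments)
memo_inputs = group_for_code_memo(segments)

# Each grid cell holds one document's segments for one code ...
print(grid_cells[("Interview 1", "Workload")])
# ... whereas the code-based pool mixes segments from every document.
print(memo_inputs["Workload"])
```

In the document-based grouping each summary draws on one data file at a time, whereas the code-based grouping asks a single summary to cover contrasting material from several files.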

Are code-based AI-generated summaries actually useful?

The summaries generated from here are descriptively useful, but how useful depends on what the code represents. For example, for Codes that represent a concept or theme that occurs across a diverse set of data files, the summaries do not always capture the nuance contained within the data segments collected at the code. I found that some things I would have considered important, had I summarised the coded-segments myself, were not captured in the AI-generated summaries. Having gone to the interpretive effort of doing the coding myself first, that level of nuance is something I would want to maintain.

So why would I do it?

Which leads me to wonder why I would actually ever do this sort of automated summarisation? When I’ve gone to the time and effort to code segments myself, I know the data pretty well, very well actually. Therefore, if I need summaries on this level, I’d likely be best placed to do them myself. Unless I’d run out of time, I guess…

Topic, question & speaker based summaries are perhaps more useful

In my experimentations, I found the summaries generated using MAXQDA’s AI Assist tool more useful where codes represent topics, questions or speakers.

When participants have been asked the same questions – such as in a structured interview project, or a set of free-text responses to an open-ended survey question – a code captures all responses to each specific question. In these instances it is more likely that responses warrant direct comparison because design effort has been put into the way the questions are asked. Where responses to such structured questions are fairly short, as is often the case with open-ended survey questions, the level of interpretation that happens is quite different than in more narrative forms of qualitative data (such as in-depth interviews or field notes etc.). I was pretty impressed with the AI-generated summaries for both the structured interviews and the open-ended survey data I trialled this on.

Similarly, in more semi-structured data, where a code may be used to capture much longer sections of discussion around a broader but still relatively cohesive topic, the summaries were also more useful than I found them to be for segments that I had coded interpretively.

Finally, I tested MAXQDA AI Assist on focus-group speaker sections – to generate quick overviews of the contributions of each speaker in a set of group discussions. This worked reasonably well to understand the general topics discussed by each speaker, and I can clearly see the benefit of doing this as a form of data familiarisation for lengthy transcripts where all participants have made significant contributions. However, the functionality only works when there is ‘sufficient’ text at a speaker code, so summaries will not be automatically generated where speakers contribute only once, or very little. That makes sense, technically, but if I were to rely on AI-generated summaries for some speakers but not others, that would be methodologically questionable, so this is an important point to bear in mind when considering auto-summarising by speaker section.

Summary options: lengths and language

For both types of AI-generated summaries (via Summary Tables and via Code Memos) you can choose the length of summary: Standard (generally a paragraph made up of a few sentences); Shorter (a paragraph of fewer sentences, typically 2); or Text in bullet points (generally 3 or 4 bullets). You can also choose for the summaries to be generated in one of several languages (English, German, Spanish, Turkish, French, Italian, Portuguese, Japanese, Chinese, Russian or Arabic).

Useful for quick summarisation at a descriptive level

Overall I’m pretty impressed with the way MAXQDA have implemented AI Assist. Methodologically and practically it makes sense to me that I might employ the assistance of the computer at the summarisation stage of my work – especially if I need to quickly develop an overview of the coding so-far achieved, perhaps for an interim report or as the basis of discussing with colleagues the next stage of more in-depth interpretive work.

My initial experimentations lead me to the conclusion that the AI-generated summaries are more useful at the document level (i.e. via Summary Grids) than at the code level (i.e. via Code Memos). This is because the summaries generated of coded-segments across the dataset (via Code Memos) tend to be at a level that misses some of the important nuance in the data. In contrast, because the summaries generated via the Summary Grids are based on each individual data file, their focus is specific to the unit of analysis that each data file represents.

Make informed decisions: try it out for yourself

It's always best to make your own decisions about what tools work best for you. It's great to hear other researchers' thoughts and experiences, and hopefully this post is useful in that respect, but your project is yours, and you are the expert in what is needed, whether it's data, methods or tools. The best way to decide whether a new development is appropriate is to try it out for yourself. That's the only way any of us can make truly informed decisions.

UPDATE: AI generated summaries via selected text passages

UPDATE: this functionality was added after this post was first published

It's now also possible to ask AI Assist to summarize selected text passages within individual Documents. This can happen independently of any coding that may have been undertaken already. The user can select a segment of any size to have summarized (although very short segments will be too short for the tool to be able to summarize).

As such, whole transcripts, reports, articles and so on can be summarized by selecting the whole content of a data file. The summarized text is inserted within a memo linked within the Document.

UPDATE: AI generated summaries of coded segments

UPDATE: this functionality was added after this post was first published

Similar to summarizing selected text passages, it is now also possible to summarize individual selected coded-segments. Previously, as discussed earlier in this post, coded-segments could only be summarized together, via the code they were linked to. With this added functionality, the summaries can be presented as sentences or as lists of topics contained within the coded-segment. They are connected to the coded-segment via a comment, and displayed alongside the transcript.

UPDATE: suggested sub-codes via AI Assist

UPDATE: this functionality was added after this post was first published

It's now also possible to ask AI Assist to suggest sub-codes. You start from a code that already contains coded data, such as a broad concept, topic or theme code, and ask AI Assist to suggest potential sub-codes. These are listed in the linked Memo. In addition to a simple list, based on the content of the segments coded, you can ask for examples from the data for each suggested sub-code. This makes it easier to assess whether the sub-code suggestions are useful.

What are these new updates good for?

As with the initial implementation, I'm pretty impressed with these tools. They're pretty accurate in descriptive terms, and there are a number of situations when they can be useful. In particular, I was pleased to see the sub-code suggestions are just that - AI Assist is staying true to its name; it's providing assistance rather than doing any coding automatically. This keeps a good balance between the role of the AI and that of you, the human interpreter. I like that.

It's also worth noting that if you run AI Assist multiple times, you're likely to get different results, as is usually the case with these technologies. Because the summaries and code suggestions are saved in memos/comments it's easy to amalgamate the suggestions into your analytic toolbox.

Next up

I'm reporting on my experimentations with the AI copilot from CoLoop...

References

Miles, M. B. and Huberman, A. (1994) Qualitative Data Analysis: An Expanded Sourcebook. London: Sage Publications.

Ritchie, J. and Spencer, L. (1994) Qualitative data analysis for applied policy research. In: Bryman, A. and Burgess, R. G. (eds.) Analyzing Qualitative Data. London: Routledge, pp. 173–94.