
Pull up your socks, CAQDAS developers - we need finer tools for visual analysis

Updated: Jan 4, 2021


I was chuffed to be asked to present at the Social Research Association (SRA)’s Summer Event "New ways of seeing: Social research in a digital, visual age" on 3rd July 2019, alongside Tim Highfield and Helen Lomax. My talk was called ‘Digital tools for visual analysis’. This post gives an overview of some aspects of my talk. My slides - along with Tim and Helen's - are available from the SRA website.


I was last to present. Tim and Helen had already discussed the theoretical and methodological aspects of collecting and analysing images, in terms of role, representation, cultural contexts and meanings. I shifted the focus to the more practical – discussing how we can actually go about systematically analysing visual materials, using dedicated software designed for the purpose.


My talk draws on an article I co-authored with Jennifer Patashnick back in 2011 - Finding Fidelity: Advancing Audiovisual Analysis using Software - in which we critiqued the then-available tools for visual analysis.


Many digital tools

There are many digital tools available that can be harnessed to facilitate the analysis of visual materials (still and moving images). Choices, therefore, need to be made concerning which program(s) to use – and which tools within them are appropriate for particular analytic tasks. As discussed in detail in our publications on the Five-Level QDA® method, these choices are driven by the needs of each analysis. I focused on the digital tools collectively referred to as Computer Assisted Qualitative Data AnalysiS (CAQDAS) packages. These are not the only options available, but they were specifically developed to support the analysis of qualitative data (of which visual materials are one form), and therefore promote themselves as dedicated solutions.


My intention with this talk was not to provide a comprehensive comparison of the tools available in CAQDAS-packages for the analysis of visual materials – I’d need a lot longer than the allocated 30 minutes to do that! So I used selected examples from five CAQDAS-packages (ATLAS.ti, MAXQDA, QDA Miner, NVivo, and Transana) that have contrasting tools, to make…


three key points…


  1. There are big differences in how CAQDAS-packages facilitate visual analysis

  2. There is a continued mismatch between the analytic needs of visual data and the tools available

  3. Visual analysts have responsibilities to foster change so analytic needs are better served by digital tools 


Managing, analysing and representing visual materials

In thinking about the many tools available, it’s useful to make broad distinctions between tools for the management of visual materials, tools for the analysis of visual materials, and tools for representing connections between and within visual materials. Most CAQDAS-packages that handle visual materials provide dozens of tools that can be harnessed for these three purposes, and usually individual tools within software programs can be used for several purposes – such is the flexibility of dedicated tools.


For the purposes of this talk, I focussed on three sets of related tools – transcribing and synchronicity, textual and visual annotation, and linking, mapping and outputting. These are not the only tools available for visual analysis (and I do not provide a systematic comparison in this post), but they highlight key general differences and illustrate one of my arguments in this presentation - that there is a mismatch between analytic needs of visual data and available tools.


Transcribing and synchronicity (video data)

  • Transcription is the process of creating a written representation of an interaction, event or setting for analytic purposes.

  • Synchronising refers to how multiple representations of the research encounter can be associated with one another for concurrent analysis.


CAQDAS-packages that handle video data vary quite significantly in how transcripts can be created, are formatted and displayed, can be synchronised with associated video files, and can be flexibly worked with as analysis proceeds. For example…


  • ATLAS.ti allows one transcript to be synchronised with each video file, and the two are displayed adjacently. Transcripts are generated outside of ATLAS.ti and synchronised with their associated video upon import via timestamps inserted in the transcript. It is not currently possible to edit the transcript once it has been imported.

  • MAXQDA allows one transcript to be synchronised with each video file. The video and the transcript are synchronised via timestamps but each is displayed in a separate window. Transcripts can be created within MAXQDA and synchronised as part of the transcription process, or generated outside and synchronised when imported via timestamps in the transcript. The transcript can be edited and additional timestamps inserted at any time.

  • NVivo allows one written transcript to be associated with each video file, which is displayed in tabular format adjacent to the video. Transcripts can be created within NVivo and synchronised as part of the transcription process, or generated outside and synchronised when imported via timestamps in the transcript. Each row represents a user-defined duration of time and is the means of associating the transcript with the video. The duration represented by each row need not be continuous, such that analytically meaningful segments can overlap. Rows can be retrospectively altered in terms of the time duration they represent. The number and title of the columns in the transcript can be specified, enabling different types of commentary to be displayed adjacently. However, specifying columns is a global process – meaning the same columns are present for all video files in each NVivo-project.

  • Transana allows up to five separate transcripts to be synchronised with each video file, again via inserted timestamps. Transcripts can be created within Transana and synchronised as part of the transcription process, or generated outside and synchronised when imported via timestamps in the transcript. Transcripts and the placement of timestamps can be edited at any time. Transana includes short-cut tools for inserting Jeffersonian transcript notations such as rising or falling intonation, audible breath, etc. All the transcripts created can be viewed together and, when synchronised with the video, played back concurrently. Alternatively, individual transcripts can be hidden when there is a need to focus on one aspect at a time. The placement of the transcripts in relation to the video on the screen can be altered as required. In addition, up to four video files can be synchronised with each other – enabling several perspectives of the same event to be viewed and analysed together.
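
Under the hood, all four packages rely on the same basic mechanism: timestamps inserted into the transcript map written rows onto positions in the media file. As a purely illustrative sketch - none of these packages exposes an API like this, and all the names below are my own - synchronised display essentially reduces to looking up the transcript row whose timestamp most recently passed:

```python
# Illustrative only: a minimal model of timestamp synchronisation,
# not any CAQDAS package's actual data model or API.
from bisect import bisect_right
from dataclasses import dataclass

@dataclass
class TranscriptRow:
    start_secs: float  # timestamp inserted during or after transcription
    text: str          # written representation of that stretch of the video

def row_at(transcript: list[TranscriptRow], playback_secs: float) -> TranscriptRow:
    """Return the row that is 'current' at a given playback position.

    This is what synchronised display amounts to: as the video plays,
    the row whose timestamp most recently passed is highlighted.
    """
    starts = [row.start_secs for row in transcript]
    index = bisect_right(starts, playback_secs) - 1
    return transcript[max(index, 0)]

# A hypothetical two-row transcript of a short clip:
transcript = [
    TranscriptRow(0.0, "INT: So, tell me about the photograph."),
    TranscriptRow(12.5, "P1: Well, it was taken the summer we moved..."),
]
print(row_at(transcript, 15.0).text)  # -> the participant's turn
```

Seen this way, the differences between the packages are less about the mechanism and more about how many of these row-streams can be attached to one video (one in ATLAS.ti, MAXQDA and NVivo; up to five in Transana) and whether the rows remain editable after import.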


Capturing meaning directly and indirectly

Transcription is an important analytic act when working with visual materials, in that decisions about what to transcribe and how to format transcriptions have significant impacts on how meaningful aspects of data are captured, interrogated, interpreted and represented. Generating synchronised transcripts thereby becomes the vehicle through which the visual is analysed.


But ATLAS.ti, MAXQDA, NVivo and Transana also allow video (and audio) data to be analysed directly – i.e. without the need for a synchronised transcript. Direct analysis involves capturing what is interpretively meaningful straight onto the visual material itself – for example by coding, annotating and linking. CAQDAS packages vary in the tools they provide for direct and indirect analysis of still and moving images; therefore, in choosing the appropriate tool, the needs of the analysis must be foregrounded. In this talk, I discussed annotation and linking (not coding).


Textual and visual annotation

  • Textual annotation is the process of selecting a segment of interesting data and making a written note of how it is meaningful in the context of the analysis.

  • Visual annotation is the process of marking data using shapes and colours to indicate an aspect of interest.

Annotating visual data is enabled very differently across CAQDAS-packages, both for still versus moving images and for textual versus visual annotation.

  • ATLAS.ti allows still and moving images to be textually annotated in the same way. A rectangular portion of a still image or a clip of video can be selected and captured as a ‘quotation’ and then commented upon. Image and video quotations can be retrieved and visualised along with the comments. It is not currently possible to visually annotate any form of data within ATLAS.ti.

  • MAXQDA allows still and moving images to be textually annotated in two ways – either by creating an in-document memo adjacent to an image or video or by commenting on a coded-segment. Both these ways of textually annotating visual materials are indirect. It is not currently possible to visually annotate any form of data within MAXQDA.

  • QDA Miner allows still images to be textually and visually annotated indirectly – via the coding process. Having coded a rectangular segment of an image, that coded-segment can be commented upon. Comments can be given a colour attribute which is visualised on the coded segment in the margin view. Visual annotation of images is thus tied to coding. QDA Miner does not currently handle video data.

  • NVivo allows still images to be textually annotated in two ways. Rectangular selections of a still image can be made and directly commented on using an annotation. Alternatively, a log can be created alongside the image (displayed in a similar way to a transcript associated with a video) and associated with rectangular sections of the image. Video data can be directly textually annotated by selecting a clip of video and writing a comment about it in an annotation, which displays at the bottom of the screen (a bit like a footnote). It is not currently possible to visually annotate any form of data within NVivo.

  • Transana allows still and moving images to be indirectly textually annotated by linking notes to segments of a transcript, which are thereby linked to the corresponding video clip. Still images can be visually annotated using a variety of shapes and colours - by creating a series of Snapshots from the video, drawing shapes on them, and dropping those coded Snapshots into a visual transcript time-coded to the video. Thus, as the video plays, you see a coded frame from that video alongside it in the transcript. It is not currently possible to visually annotate video itself (unless Snapshots have been created, in which case each becomes a still image).
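
Stripped of each product's interface, the two annotation types compared above reduce to a simple data model: a selected segment plus free text, or a selected segment plus a drawn shape and colour. The following sketch is hypothetical - the class names are mine, not any package's API:

```python
# Illustrative only: a minimal data model for the two annotation types,
# not any CAQDAS package's actual implementation.
from dataclasses import dataclass
from typing import Union

@dataclass
class ImageRegion:
    left: int
    top: int
    width: int
    height: int  # a rectangular selection, in pixels

@dataclass
class VideoClip:
    start_secs: float
    end_secs: float  # a span of the media file

Segment = Union[ImageRegion, VideoClip]

@dataclass
class TextualAnnotation:
    segment: Segment  # what was selected
    note: str         # why it is meaningful in the context of the analysis

@dataclass
class VisualAnnotation:
    segment: ImageRegion  # per the comparison above, currently stills only
    shape: str            # e.g. "rectangle", "arrow", "ellipse"
    colour: str           # e.g. "#ff0000", marking the aspect of interest

# e.g. flagging a region of a participant-generated photograph:
flagged = VisualAnnotation(ImageRegion(40, 60, 200, 150), "ellipse", "#ff0000")
```

The comparison above is then largely about which of these records each package lets you create, and whether the segment can be selected directly on the material itself or only indirectly via a code or transcript.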


Textual and visual annotation are particularly important both when analysing and when presenting visual data, and both are areas that require finer tools. In these CAQDAS-packages still and moving images can also be directly coded, but this was not a focus of this talk.


Linking, mapping and outputting

  • Linking refers to making and labelling associations between segments of data.

  • Mapping refers to graphically visualising those associations.

  • Outputting refers to exporting analytic connections from CAQDAS-packages in order to disseminate them.

Linking, mapping and outputting are ways in which visual materials and the analysis undertaken on them can be represented - but these tools are generally less developed in CAQDAS-packages than transcription and annotation (apart from in ATLAS.ti – see below). In my talk at the SRA Summer Event (and therefore also in this post) I focus on linking, mapping and outputting in relation to working directly with segments of visual materials - i.e. not in relation to how they may be qualitatively coded (although it should be noted that coded visual materials can be outputted in various ways).


  • ATLAS.ti allows any number of quotations to be linked to one another directly via user-specified named relations (e.g. ‘associated with’, ‘contradicts’, ‘results in’, etc.). Hyperlinked quotations are visible in the margin view and can be visualised, mapped and manipulated in the Network view. An individual quotation can have any number of named links to any other number of quotations of any type (text, image, audio, video etc.) and can be retrieved and worked with independently from other processes (such as coding). Networks can also be exported.

  • MAXQDA allows rectangular segments of images and clips in video files to be linked to one another directly – but only in pairs. Hovering the cursor over one of the linked segments brings the other up in a pop-up window, but it is not possible to view or work with linked segments independently.

  • QDA Miner allows rectangular segments of images to be linked to one another indirectly – via previously coded segments – but only in pairs. Linked coded segments are indicated in the code margin view and can be navigated from there, but it is not possible to view or work with linked coded-segments independently.

  • NVivo allows selections within images and/or videos to be directly linked together via ‘see also links’, but only in pairs. The links are listed at the bottom of the data file and can be navigated from there, but it is not possible to work with linked segments independently.
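
The analytic difference between ATLAS.ti and the other three packages here is essentially the difference between a labelled graph and isolated two-node pairs. A rough sketch of the n-to-n model (my own names, not the product's API):

```python
# Illustrative only: named, directed links between quotations form a graph,
# loosely modelled on ATLAS.ti's hyperlinked quotations.
from collections import defaultdict

links = defaultdict(list)  # quotation id -> [(relation, other quotation id)]

def link(source: str, relation: str, target: str) -> None:
    links[source].append((relation, target))

# Hypothetical quotations drawn from mixed data types:
link("video:clip_03", "contradicts", "interview:quote_12")
link("video:clip_03", "associated with", "image:region_07")
link("image:region_07", "results in", "fieldnote:para_2")

# Because links are n-to-n, a quotation's whole neighbourhood can be
# retrieved independently of coding - unlike the pairwise links in
# MAXQDA, QDA Miner and NVivo described above:
for relation, target in links["video:clip_03"]:
    print(f"video:clip_03 --{relation}--> {target}")
```

It is this graph structure that makes mapping (the Network view) and outputting (exporting networks) possible; pairwise links offer nothing comparable to traverse or export.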

Blunt tools for fine purposes?

The summary of tools for visual data provided here is not exhaustive - the CAQDAS-packages I have discussed each have additional tools that can be harnessed for visual analysis, and other CAQDAS-packages also have tools for visual analysis. The tools I discussed in this talk were chosen to highlight some key differences and deficiencies. The point is that dedicated CAQDAS-packages vary significantly in this regard, and in many cases the tools available are too blunt for the fine purposes of visual analysis.


Mismatch between fidelity of visual materials and tools

As is often discussed in the methodological literature, visual materials are complex and multidimensional - in other words, they have high fidelity. In our 2011 article, Jennifer Patashnick and I framed fidelity of data in terms of quality, temporality, emotional tone and loyalty of reproduction, and described a sliding scale of fidelity in qualitative data, rising from text through still images and audio to video.


But are tools for managing, analysing and representing visual data adequate - in other words, are they sufficiently fine for the high-fidelity nature of visual data? In our 2011 article, Jennifer and I argued not. Unfortunately, in most CAQDAS packages the tools for the analysis of visual materials have not been developed in the eight years since (the notable exception being the ability now in Transana to visually annotate still images - one of the things we identified a need for in our article).


What's going on, CAQDAS developers?

There have been many updates to your programs since 2011, so why so few for the specific purposes of visual analysis? More and more researchers are using visual materials, and that will only continue as interest increases and it becomes ever easier to access high-quality visual materials (whether generated by participants or researchers, or harvested from the internet). It's perplexing to me why this area remains relatively underdeveloped in most CAQDAS packages.


But my critique is not a blanket one - some of the tools described earlier in this post are very good: fine enough for high-fidelity visual materials and the complex and multidimensional purposes of visual analysis. But many are not.


I often reflect on how the developmental impetus of software impacts on the tools provided - it's no surprise to me, for example, that programs developed specifically for the transcription and analysis of video (like Transana) have finer tools. Programs such as MAXQDA, QDA Miner and NVivo, in contrast, were originally developed for the analysis of textual data, adding the ability to handle visual data later on. Within these programs, some tools designed for the analysis of text have thereby been extended to visual materials rather than bespoke tools being developed for the specific purposes of managing, analysing and representing visual materials. 


There is something to be said for having consistency in tools across all data types - programs where this is the case may be easier to learn because the logic of tools and their operation is the same or similar whether working with video, audio, still images or text.

But do the affordances of consistency in software architecture and the way tools operate outweigh the need for bespoke tools for different data with different levels of fidelity? I think not. It should be possible to develop user-friendly, intuitive software that is easy to learn and operate while also providing sufficiently fine tools for the needs of all types of qualitative data (if a program sells itself as handling those data types).



Shared responsibilities

But if we want the tools available for visual analysis to improve, then it's not enough to compare and critique them - we also have a responsibility to highlight what we need, and why we need it. As a community of practice, qualitative researchers haven't been very good at concretely documenting the processes of analysing visual materials and the requirements they have for digital tools. In my opinion, this isn't good enough.


Software developers need use-case examples in order to develop finer tools, so visual analysts need to get better at systematically evaluating tools and illustratively highlighting their needs using real-world examples. Researchers using particular products must feed back to software developers and the research community which tools they find useful, which they do not, and what new tools they need for their analysis.


If we want better tools, we need to ask for them and be explicit about our needs.
