An illustration of how CAQDAS tools can bring us closer to data

By Christina Silver on Nov 14, 2016 at 10:49 AM in CAQDAS commentary

In my previous post I argued that using dedicated CAQDAS packages for analysis could bring us closer to our data, rather than distance us from it, as some critics suggest. Here I illustrate this by outlining how different CAQDAS tools can be used to fulfil a specific analytic task, thus bringing us closer to data.

Let's imagine we are doing a project in which we need to generate an interpretation that is data-driven rather than theory-driven. It could involve one of a number of analytic methods, for example inductive thematic analysis, narrative analysis, grounded theory analysis or interpretive phenomenological analysis. Whatever the strategy, an early analytic task may be to familiarize ourselves with the transcripts in order to identify potential concepts. There are several different ways we could go about fulfilling this analytic task using dedicated CAQDAS packages. Here I discuss three.




We might decide to read through our transcripts, one by one, and mark them up and annotate them with the insights we have whilst reading. If working manually we would likely print out our transcripts and, using different coloured highlighter pens, mark segments which strike us as interesting, note in the margin why they are interesting, and indicate the potential concept we see in each segment. If we were using a word processing application, we might use formatting features, like bold, italics or font highlighting, and commenting tools for the same purpose. Or we might put our transcripts in a table and write our annotations in a specially allocated column.

That’s all fine, and can work well for the purpose of data familiarization. But what happens next? If working manually or using non-dedicated software, we have a problem managing these insights, collating them and building on them, because they’re all on separate pieces of paper or in separate computer files, and we do not have tools designed to pull things together.

But if we did this in a dedicated CAQDAS package, using specially designed annotation tools, we would be able to retrieve those insights in various ways. For example,

  • We could retrieve them on the level of individual transcripts - asking to see within the program (or as output into a word processing or spreadsheet program) all the comments we’d made within a particular transcript (or any data source).
  • We could also ask to retrieve all our annotations across all transcripts of a certain type (e.g. just the focus-group transcripts or just the interview transcripts, or just the field-notes).
  • We could also ask to retrieve all our annotations from data files that shared a particular characteristic (e.g. just the data that had been provided to us by the male participants, not the females, etc.). 

Being able to retrieve our initial insights in these sorts of ways means we can reflect on them systematically, in order to start building on them. In addition, these insights remain linked to the data segments that prompted them, so we remain always grounded in the data, close to it.
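To make these kinds of retrieval concrete, here is a minimal sketch in Python. It is not how any particular CAQDAS package works internally; the `Annotation` class, the `retrieve` function, the file names, the attributes and the comments are all invented for illustration. The point it shows is simply that each comment stays attached to its data segment and its source, so it can be pulled back by transcript, by type of data, or by a characteristic of the source.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    source: str   # data file the comment belongs to
    segment: str  # the passage that prompted the comment
    comment: str  # the insight noted while reading


# Hypothetical attributes recorded for each data file.
SOURCES = {
    "interview_01": {"kind": "interview", "gender": "male"},
    "interview_02": {"kind": "interview", "gender": "female"},
    "focus_group_01": {"kind": "focus group", "gender": "mixed"},
}

# Invented annotations, each still tied to the segment that prompted it.
ANNOTATIONS = [
    Annotation("interview_01", "nobody asked what I wanted", "being heard?"),
    Annotation("interview_01", "I just got on with it", "self-reliance?"),
    Annotation("interview_02", "they explained every step", "being informed?"),
    Annotation("focus_group_01", "we all felt the same", "shared experience?"),
]


def retrieve(annotations, sources, source=None, **criteria):
    """Return annotations from one source and/or from sources whose
    attributes match every keyword criterion."""
    hits = []
    for note in annotations:
        if source is not None and note.source != source:
            continue
        attributes = sources.get(note.source, {})
        if all(attributes.get(key) == value for key, value in criteria.items()):
            hits.append(note)
    return hits


# All comments made on interviews provided by male participants.
for note in retrieve(ANNOTATIONS, SOURCES, kind="interview", gender="male"):
    print(f"{note.source}: '{note.segment}' -> {note.comment}")
```

Calling `retrieve(ANNOTATIONS, SOURCES, source="interview_01")` would instead list everything noted within a single transcript.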



If we were interested in the language our participants use to discuss topics, instead of, or as well as, interpreting what they say, we may decide to explore their use of individual words. Most dedicated CAQDAS packages provide word frequency tools which will quickly generate frequency lists based on individual words and collections of words, for example using stems (e.g. ‘nurs*’ would retrieve nurse, nurses, nursing, nursed).
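As a rough illustration of what a word frequency tool does, the sketch below counts words in a short invented passage and then gathers the words matching a stem pattern. It is a simplification under stated assumptions: the sample text, the function names `word_frequencies` and `stem_hits`, and the tokenising rule (lower-cased runs of letters, with an optional apostrophe) are all mine, and real packages handle stop words, stemming and languages in more sophisticated ways.

```python
import re
from collections import Counter
from fnmatch import fnmatchcase

# Invented sample text standing in for a transcript.
SAMPLE = (
    "The nurse said the nurses were busy. I nursed him at home, "
    "and nursing him was hard. Nobody told the family anything."
)


def word_frequencies(text):
    """Count each lower-cased word in the text."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(words)


def stem_hits(frequencies, pattern):
    """Return the counted words that match a wildcard pattern such as 'nurs*'."""
    return {
        word: count
        for word, count in frequencies.items()
        if fnmatchcase(word, pattern)
    }


frequencies = word_frequencies(SAMPLE)
print(frequencies.most_common(3))        # the most frequently used words
print(stem_hits(frequencies, "nurs*"))   # every word sharing the stem
```

Sorting the same counter in the other direction would list the least frequently used words.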

Some qualitative researchers initially resist the use of word frequency tools, rightly suggesting that these sorts of searches miss places where participants discuss certain topics without using particular words. That’s certainly true. But let’s remind ourselves of the objective we started with: to familiarize ourselves with the transcripts in order to identify potential concepts, so as to generate an interpretation that is data-driven rather than theory-driven. Focusing on individual words, and considering the least as well as the most frequently used words, is one way we can do this. It need not be the only thing we do, but it offers a way into our data that would be practically unmanageable if we were working with hard-copy print-outs. Just think about how long it would take to manually identify and count the occurrences of individual words in print-outs…


Of course, we might not actually be interested in the frequency count itself; that would depend on our analytic strategy. But many CAQDAS packages incorporate Key Word In Context (KWIC) functionality which allows us to access the context within which the ‘hits’ occur, pretty much instantaneously. So we can use this high-level tool to quickly and reliably access qualitative content, and then do whatever we want (e.g. code, link, annotate/comment, write memos…). In my projects I often run a word frequency query as one of my initial tasks, just to get an overview of what’s going on at the level of individual words. Sometimes I’ll code context around hits, but often I don’t. I'll just use this as a way of getting an overview, viewing the context in which words occur and writing about what I see, as a means of identifying areas that are potentially valuable to focus upon in more qualitative depth, and then writing about the potential concepts.

If my analytic strategy involves conceptualising my data through coding, then this helps to ensure that coding is purposeful, and driven by the actual content of the data.
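The idea behind Key Word In Context can be sketched in a few lines of Python. This is only an illustration of the principle, not the way any particular package implements it: the function name `kwic`, the `width` parameter and the sample sentence are assumptions of mine, and the context here is a fixed number of words on each side, which can run across sentence boundaries.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left context, hit, right context) for each occurrence
    of keyword, with up to `width` words on either side."""
    tokens = re.findall(r"\w+(?:'\w+)?", text)
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines


# Invented sample text standing in for a transcript.
SAMPLE = (
    "I felt supported by the nurse on the ward. "
    "The nurse explained everything clearly."
)

for left, hit, right in kwic(SAMPLE, "nurse", width=2):
    print(f"{left:>20} [{hit}] {right}")
```

Each row puts the hit in the middle with its surrounding words on either side, which is what makes it quick to scan many occurrences and decide which ones deserve a closer qualitative look.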



Sometimes we know at the outset what some of the concepts are that we need to focus on in an analysis, and we can be fairly confident that certain keywords and phrases will be present in our data and that their very presence is indicative of the concepts we are interested in. This is often the case when undertaking a literature review, doing a project where a lot is already known about the topic area, and when undertaking analysis which prioritizes the use of language, for example forms of discourse analysis and content analysis. 


In these situations generating an overview using word frequency tools, which only retrieve individual words, may be insufficient, as we may know that collections of words and longer phrases are indicative of the concepts we are interested in. Most CAQDAS packages include text search options which enable us to decide what to look for. This gives us more flexibility and control over a high-level content-based exploration.

The results of these text searches can be viewed in context, in the same ways as outlined above, in relation to word frequency tools. We don’t have to use the retrievals these tools provide us as the basis of coding, but they can inform our thinking, and we can systematically explore the presence and absence of concepts based on the words and phrases contained in the materials. In addition, viewing the context of the use of particular words and phrases can highlight possible synonyms or antonyms that may be useful to explore. 
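A simple way to picture a text search for phrases, and the systematic look at presence and absence it allows, is the sketch below. Everything in it is hypothetical: the document names, the phrases and the `search_phrases` function are invented for illustration, and the matching is a plain case-insensitive whole-phrase search rather than the richer options (wildcards, proximity, synonyms) that CAQDAS packages typically offer.

```python
import re


def search_phrases(documents, phrases):
    """Map each phrase to the names of the documents that contain it."""
    found = {}
    for phrase in phrases:
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        found[phrase] = [
            name for name, text in documents.items() if pattern.search(text)
        ]
    return found


# Invented snippets standing in for full transcripts.
DOCUMENTS = {
    "interview_01": "I had no work life balance at all that year.",
    "interview_02": "Working from home helped, but the hours were long.",
    "interview_03": "My work life balance improved once I changed teams.",
}

PHRASES = ["work life balance", "working from home", "burnout"]

for phrase, names in search_phrases(DOCUMENTS, PHRASES).items():
    print(f"{phrase}: {names if names else 'not present'}")
```

A phrase that returns no documents is as informative as one that returns many, since it may suggest a synonym worth searching for instead.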

Way, way closer...


These three tools are examples of ways in which dedicated CAQDAS packages, designed as they are to facilitate the types of analytic activities that qualitative and mixed methods researchers undertake, can bring us closer to our data. It’s not just about retrieval, though. Everything we do in a CAQDAS package can be saved, so that we can easily review what we did and thought earlier, and crucially we have pretty much instantaneous access back to the source material that prompted those thoughts. Everything we do can be linked, which means we can view, reflect on and build on our insights from different angles. And this brings us way, way closer to our data.