Kath McNiff's post on the NVivo Blog about classifying data in NVivo has prompted me to write about how I deal with this teaching challenge. For me, teaching students to choose between the available tools for classifying data, and to harness them appropriately, revolves around units.
For years I've experimented with different ways of teaching how to harness the NVivo tools for classifying factual characteristics of data and respondents - for example the socio-demographics of participants or the metadata about documentary evidence. One of the great things about NVivo is that it offers several different ways of doing this, making it a very flexible tool.
But this flexibility can cause confusion. Classifications and Cases are what many new users struggle most to get their heads around. What are Classifications for? What’s the difference between Source Classifications and Case Classifications? Are they both needed? What should they be used for?
Start from the analytic task
The best way to work this all out is to start from the needs of your project – your analytic strategies.
Not all projects need Source Classifications, not all projects need Case Classifications, and some projects don't need any type of Classification. Starting out by looking around at all the features NVivo provides that could be used for classification purposes and trying to work out how to use each one is inefficient and confusing - because it requires you to learn to understand what each feature is doing and simultaneously work out how they may be applied to your own work. Too much cognitive load.
But starting from the other direction, by focusing on the needs of your analytic tasks, encourages the mindset of prioritising analytic strategies over software tactics. Focusing first on the needs of your specific analytic tasks will be most efficient because you won’t need all the features at your disposal anyway.
Classifying data is all about units
To help learners understand the potential power of NVivo Classifications, and at the same time decide whether they need either or both Source and Case classifications, as well as work out how to go about assigning the relevant Attribute-Values, I focus on units.
The concept of units is central to research. Quantitative researchers tend to be clear about units because they are the basis of identifying the variables they use to calculate statistics.
But units are equally important to qualitative research design even though qualitative researchers do not always think about them explicitly. Each project has many different units of different types. Some are known at the outset, others emerge as the analysis proceeds. Units are therefore central to the Five-Level QDA method, and in our forthcoming books on the method we unpack the idea of units in much more detail.
It’s not surprising then that units are central to my teaching about Classifications and Cases and how I use them myself.
Source or Case Classifications?
Deciding whether to use Source or Case Classifications is driven by asking the following question: is the unit I want to classify equivalent to a Source or not?
Let's take a literature review project as an example. We may be working directly with full-text articles that have been imported as Internal Sources. Or we may be working indirectly with literature files - perhaps writing appraisals in Memos or using External Sources that are linked to the full-text articles outside of NVivo. Or we may be using a combination of these tactics. Either way, let's assume the analytic task is to classify the literature according to the factual characteristics we wish to use to later interrogate patterns and relationships. These might include the Date of Publication, the Publication Name, the First Author and the Focus of each piece of literature (e.g. whether Methodological, Theoretical, Substantive or a combination).
Answering the question "Is the unit I want to classify equivalent to a Source or not?" requires us first to identify the unit we are classifying. In this example the unit is an individual piece of literature (whether a journal article, conference presentation, book, or book chapter, etc.). If we have one Source in NVivo containing each piece of literature then the answer to our question is yes - because the unit we want to classify is equivalent to an individual Source (whether it's an Internal, an External or a Memo). We can therefore use Source Classifications because the whole of each Source can have the relevant Attribute-Values associated with it. In other words, the whole of each Source is, e.g., published in 2002, in the British Medical Journal, and written by Joe Blogs, etc.
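To make the idea of "the whole Source carries the Attribute-Values" concrete, here is a minimal Python sketch. It is not NVivo's interface, just a plain illustration of the logic; the class name, the helper function and all the example values other than the 2002 British Medical Journal entry are made up for the purpose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LiteratureSource:
    """Hypothetical stand-in for one Source holding one piece of literature."""
    name: str
    year: int            # Date of Publication
    publication: str     # Publication Name
    first_author: str    # First Author
    focus: str           # Methodological, Theoretical or Substantive


# Illustrative values only; one record per Source, so unit == Source.
sources = [
    LiteratureSource("Article A", 2002, "British Medical Journal", "Joe Blogs", "Substantive"),
    LiteratureSource("Article B", 2010, "Example Journal", "A. Author", "Methodological"),
    LiteratureSource("Article C", 2002, "Example Journal", "B. Author", "Theoretical"),
]


def sources_where(items, **attribute_values):
    """Return the names of Sources whose attributes match every given value."""
    return [
        s.name
        for s in items
        if all(getattr(s, key) == value for key, value in attribute_values.items())
    ]


published_2002 = sources_where(sources, year=2002)
theoretical_2002 = sources_where(sources, year=2002, focus="Theoretical")
```

Because each attribute describes the whole Source, filtering by Attribute-Values returns whole Sources. That is exactly what makes Source Classifications sufficient for this kind of literature review.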
When might we need Cases?
If the answer to our question is no, we need to create Cases, and use a Case Classification and associated Attribute-Values to classify the Cases.
The units that we may want to classify will not be equivalent to a Source either when we have multiple units within one Source or when data that represents or belongs to a single unit is contained within multiple Sources.
When there are multiple units within one Source
The first situation might arise when working with focus-group data, when a single transcript contains the whole discussion between a number of different individuals. If the different speakers are identified within the Source and we know some socio-demographic characteristics about them that we want to use to interrogate the data, then we couldn't use Source Classifications because the unit we want to classify - the participants - is not equivalent to the Source – the focus-group transcript. In other words, we couldn't tell NVivo that the whole Source is, for example, male if both men and women took part in the discussion.
When the units are at a higher level
However, your research design may not require a Case for each speaker, so it is not always appropriate to use Cases and Case Classifications when working with focus-group data. Often, for example, researchers undertake focus groups to understand the general views of groups of people with similar characteristics or experiences and are not interested in the individuals' views. For example, we might have conducted one focus group with doctors and another with nurses in order to understand the general view of doctors in comparison to nurses, rather than the views of individual doctors or nurses.
In this situation we may not collect socio-demographic information about the individual doctors and nurses because the units we need to represent are at a higher level - the group level. Therefore it might be more appropriate to use Source Classifications – and apply Attribute-Values such as ‘Santa Barbara Hospital’ to a whole Source (if all the doctors in a focus-group discussion worked in the same hospital).
When data about a single unit is contained within multiple Sources
Another example of when the unit we want to classify is not equivalent to a Source is when the same respondent has contributed data to the project in more than one form - perhaps by responding to an online survey and then being interviewed, or being observed in a natural situation and then being interviewed. In this situation the answer to our question is also no - the unit we want to classify is not equivalent to an NVivo Source - so we could create a Case for each respondent and link all the data they have contributed from each of the Sources to it, and then apply the relevant Attribute-Values.
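The question running through these examples can be sketched as a small decision rule. This is only an illustration in plain Python, not anything NVivo provides: the function name and the example mappings are hypothetical, and the rule assumes you can list which Sources hold data for each unit you want to classify.

```python
def choose_classification(unit_to_sources):
    """Suggest a Classification type from a unit -> set-of-Sources mapping.

    A unit is 'equivalent to a Source' only when each unit's data sits in
    exactly one Source and each Source holds data for exactly one unit.
    """
    # Invert the mapping to see how many units draw on each Source.
    units_per_source = {}
    for unit, source_names in unit_to_sources.items():
        for source_name in source_names:
            units_per_source.setdefault(source_name, set()).add(unit)

    one_source_per_unit = all(len(s) == 1 for s in unit_to_sources.values())
    one_unit_per_source = all(len(u) == 1 for u in units_per_source.values())

    if one_source_per_unit and one_unit_per_source:
        return "Source Classification"
    return "Case Classification"


# One Source per piece of literature.
literature = {"Article 1": {"article_1"}, "Article 2": {"article_2"}}
# Several speakers inside one focus-group transcript.
focus_group_speakers = {"Speaker 1": {"focus_group_1"}, "Speaker 2": {"focus_group_1"}}
# The group itself is the unit, one transcript per group.
focus_group_level = {"Doctors": {"fg_doctors"}, "Nurses": {"fg_nurses"}}
# One respondent contributing a survey response and an interview.
multi_source_respondent = {"Respondent 1": {"survey", "interview_1"}}
```

Note that the same focus-group transcript gets a different answer depending on whether the unit is the individual speaker or the group as a whole. The data have not changed; the unit has.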
What about more complex research designs?
Many projects are more complex than these examples, for example longitudinal projects where data have been collected from the same respondents several times over a specified time-period. In this situation it might not be appropriate to have one Case per participant because we would then not be able to track the changes in their socio-demographic characteristics over time. This doesn't mean we cannot handle longitudinal data with NVivo, just that we have to work out how best to organise this sort of material for our research objectives. One solution would be to have a Case per 'moment of contact' with each respondent, so that we could separate out their contributions at each time phase. An Attribute for Respondent ID, as well as the relevant socio-demographic Attribute-Values, would then enable us to do both within-case and across-case analysis.
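The 'Case per moment of contact' idea can be sketched as a simple data structure. Again this is plain Python rather than NVivo itself, and the class, the `wave` and `employment` attributes, and the respondent values are all assumptions chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ContactCase:
    """Hypothetical Case representing one moment of contact with a respondent."""
    respondent_id: str   # the Respondent ID attribute linking Cases together
    wave: int            # which time phase this contact belongs to
    employment: str      # an example attribute that can change over time


cases = [
    ContactCase("R01", 1, "Student"),
    ContactCase("R01", 2, "Employed"),
    ContactCase("R02", 1, "Employed"),
    ContactCase("R02", 2, "Employed"),
]


def within_case(items, respondent_id):
    """Follow one respondent over time, ordered by wave."""
    matching = (c for c in items if c.respondent_id == respondent_id)
    return sorted(matching, key=lambda c: c.wave)


def across_cases(items, wave):
    """Compare respondents at one time phase, grouped by attribute value."""
    groups = defaultdict(list)
    for c in items:
        if c.wave == wave:
            groups[c.employment].append(c.respondent_id)
    return dict(groups)


r01_history = [c.employment for c in within_case(cases, "R01")]
wave_1_groups = across_cases(cases, 1)
wave_2_groups = across_cases(cases, 2)
```

With one Case per respondent, R01 could only be recorded as either a student or employed. Splitting Cases by moment of contact keeps both values, and the shared Respondent ID is what lets us reassemble each person's trajectory.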
The flexibility of units
I've talked mainly in this post about units being used to represent individual respondents in research projects. This is one classic use of what are usually called units of analysis. But a research project often has several units, existing at different levels, and these can also be represented and worked with in NVivo in different ways.