Concordances on paper | Nathan Dykes

In September, I taught two workshop sessions at the Data, Gender and Society summer school in Erlangen. The event gathered around 30 participants, including students, PhD researchers and others, from digital humanities and neighbouring fields, for a week of keynotes, workshops, and interdisciplinary exchange.

My sessions were about concordance analysis: what it is, what you can find with it, and how we approach it in the RC21 project. The challenge was that almost none of the participants were linguists. At the same time, several of them were more technically skilled than the students I usually teach — comfortable with software, used to working with data, just not linguistic data.

My first session was methodological. Most of the audience came from fields other than linguistics, and concordances are arguably somewhat undervalued even in corpus-based workflows. So after covering the basics with some Sinclair examples, we turned to two questions: what happens when you look at concordances in different ways, and why should we care about research documentation? The session ran right after the day’s keynote, so there was already some momentum.

To make the methodological challenge accessible, I decided to start with as little technology as possible: a printed paper concordance. Participants were handed a physical version of selected lines from the 19C corpus of English novels, showing uses of “her cheeks”. Two groups of participants received the same sample of lines and were asked to discuss what patterns they noticed. However, the lines were sorted differently: one on the left context, the other on the right. The idea was based on our ongoing work in the project and followed a simple premise: sorting guides the analyst’s attention towards different kinds of repetitions.

And indeed, people who had looked at the right-sorted concordance pointed out that cheeks are described in terms of colour changes: her cheeks “flushed”, “glowed”, and “grew paler”. A similar pattern, but in a different subset of these same lines, emerged from the left context: colour rising “into her cheeks”, along with crying, as in “the tears rolled down her cheeks”.
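The effect of sorting can be sketched in a few lines of Python. This is only an illustration of the premise, not how FlexiConc or CLiC work: the sample sentences are invented for the example (they are not lines from the 19C corpus), and the function names `kwic`, `sort_right` and `sort_left` are hypothetical.

```python
import re

# Invented sample sentences, NOT taken from the 19C corpus.
SAMPLE = [
    "The tears rolled down her cheeks in silence",
    "Her cheeks flushed at the remark",
    "The colour rose into her cheeks again",
    "Her cheeks grew paler every day",
    "Slow tears ran down her cheeks that night",
    "Her cheeks glowed in the firelight",
]


def kwic(sentences, node, width=3):
    """Return one (left, node, right) tuple of token lists per hit."""
    node_tokens = node.lower().split()
    n = len(node_tokens)
    lines = []
    for sentence in sentences:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == node_tokens:
                left = tokens[max(0, i - width):i]
                right = tokens[i + n:i + n + width]
                lines.append((left, node_tokens, right))
    return lines


def sort_right(lines):
    """Sort on the right context, first word after the node first."""
    return sorted(lines, key=lambda line: line[2])


def sort_left(lines):
    """Sort on the left context, word nearest to the node first."""
    return sorted(lines, key=lambda line: line[0][::-1])


def show(lines):
    """Print the lines with the node aligned in a central column."""
    for left, node, right in lines:
        print(f"{' '.join(left):>22}  {' '.join(node)}  {' '.join(right)}")


lines = kwic(SAMPLE, "her cheeks")
print("Right-sorted:")
show(sort_right(lines))
print("Left-sorted:")
show(sort_left(lines))
```

In the right-sorted output, the verbs that follow the node end up next to each other, while the left-sorted output groups the lines ending in “down” and “into” before the node. The lines are the same in both cases; only the ordering changes which repetitions catch the eye.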

The second session, on the following day, moved to the screen, using FlexiConc in CLiC to look at gendered patterns more systematically, with a running example of “his/her hands” in the same corpus as the day before. This was closer to what we normally do in RC21, but with a more diverse participant group. What stood out was how quickly the participants moved past the technical side and into the content. They picked up the conceptual discussion from the paper-based concordance almost seamlessly and integrated their observations from the corpus tools into this broader literary context. In other words, they were less interested in how to arrange concordance lines than in what the results meant across mediums and linguistic patterns: whether, for instance, the descriptions of male hands carried more emotional intensity than those of female hands.

This is not a question I get from linguists very often, at least not as a first instinct. It was a good reminder that the same method looks different depending on who is looking at the data.
