June 2012

How to render conversation analysis style transcriptions in LaTeX

UPDATE: I’ve now found there is a better way to do this, which I’ve documented here.

A large part of my research is going to involve conversation analysis, which has a rather beautiful transcription style developed by the late Gail Jefferson to indicate pauses, overlaps, and prosodic features of speech in text.

There are a few LaTeX packages out there for transcription, notably Gareth Walker’s ‘convtran’ LaTeX styles. However, they’re not specifically developed for CA-style transcription, and don’t feel flexible enough for the idiosyncrasies of many CA practitioners.

So, without knowing a great deal about LaTeX (or CA for that matter), I spent some time working through a transcript from Pomerantz, A. (1984). Agreeing and disagreeing with assessments: Some features of preferred/dispreferred turn shapes. In J. M. Atkinson & J. Heritage (Eds.), Structures of social action: Studies in Conversation Analysis (pp. 57-102). Cambridge: Cambridge University Press.

Here’s an image version from page 78:

Here’s how I rendered that in LaTeX:

     & D: &  0:h (I k-)= \\
     & A: &  =Dz  that  make any sense  to  you?  \\
     & C: &  Mn mh. I don' even know who she is.  \\
     & A: &  She's that's, the Sister Kerrida, \hspace{.3mm} who, \\
     & D: &  \hspace{76mm}\raisebox{0pt}[0pt][0pt]{ \raisebox{2.5mm}{[}}'hhh  \\
     & D: &  Oh \underline{that's} the one you to:ld me you bou:ght.= \\
     & C: &  \hspace{2mm}\raisebox{0pt}[0pt][0pt]{ \raisebox{2.5mm}{[}} Oh-- \hspace{42mm}\raisebox{0pt}[0pt][0pt]{ \raisebox{2mm}{$\lceil$}} \\
     & A: &  \hspace{60.2mm}\raisebox{0pt}[0pt][0pt]{ \raisebox{3.1mm}{$\lfloor$}}\underline{Ye:h} \\
\caption{ Evaluation of a new artwork from (JS:I. -1) \cite[p.78]{Pomerantz1984} .}

Here’s the result, which I think is perfectly adequate for my needs. Now that I know how to do it, it shouldn’t take too long to replicate for other transcriptions:

I had to make a few changes to the document environment to get this to work, including:

  • \usepackage[T1]{fontenc}

    to make sure that the double dashes (--) were interpreted as a long dash while in the texttt environment.

  • I also had to do

    to rename the “Table” to “Datum” – because I’m only using the table for formatting (shades of 1990s-style HTML positioning).

  • \usepackage{caption}

    to suppress caption printing where I wanted the datum printed without a legend (using


    instead of



The above example is designed to break out of a two-column article layout into a full-page, centre-positioned spread, so those directives are probably not relevant if you are setting a transcript in the flow of text or within a single column. I went for the full-width spread because the fixed-width (texttt) font, whose evenly spaced letters seem to make it easier to read the transcription as a timed movement from left to right, was too large to fit into one column without shrinking it until it was unreadable.
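For anyone trying to reproduce the setup, the preamble changes listed above might be collected into a skeleton like the one below. This is only a sketch under assumptions, not the original document: the `twocolumn` class option, the `table*` float, the `lll` column specification, redefining `\tablename` to get the “Datum” label, and the use of `\caption*` for an unlabelled caption are all my guesses at one plausible arrangement. Only the two `\usepackage` lines and the transcript rows come from the post itself.

```latex
\documentclass[twocolumn]{article}
\usepackage[T1]{fontenc}  % from the post: T1 font encoding
\usepackage{caption}      % from the post: finer control over captions

% Assumption: the float label is renamed by redefining \tablename.
\renewcommand{\tablename}{Datum}

\begin{document}

% Assumption: table* (rather than table) is used so that the float
% spans both columns of the two-column page.
\begin{table*}
  \centering
  {\ttfamily
  \begin{tabular}{lll}
     & D: & 0:h (I k-)= \\
     & A: & =Dz that make any sense to you? \\
     & C: & Mn mh. I don' even know who she is. \\
  \end{tabular}}
  % Assumption: \caption* (provided by the caption package) would print
  % the text without the "Datum 1:" label; plain \caption keeps the label.
  \caption{Evaluation of a new artwork.}
\end{table*}

\end{document}
```

In a single-column document, a plain `table` float (or no float at all) would probably do the same job.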

I hope this is useful to someone. If I find a better way of doing this (with matrices and avm as I’ve been advised), I’ll update this post. Any pointers are also much appreciated as I think I’m going to be doing a lot more of this in the next few years.

There are other horrors in here, and it was a really annoying way to spend a day, but this method seems to get me as far as I need to go right now.
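One of those horrors is the repeated `\hspace` plus nested `\raisebox` incantation used to place each overlap bracket. A small macro could cut down the repetition. The sketch below is a hypothetical helper, not something from the original document: the name `\overlapmark` and its three arguments are invented for illustration, and the offsets would still need tuning by eye for each transcript.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}

% Hypothetical helper (not from the original post):
%   #1 = horizontal offset, #2 = how far to raise the symbol, #3 = the symbol.
% The outer \raisebox with [0pt][0pt] hides the symbol's height and depth,
% so the raised bracket does not change the row spacing.
\newcommand{\overlapmark}[3]{%
  \hspace{#1}\raisebox{0pt}[0pt][0pt]{\raisebox{#2}{#3}}}

\begin{document}
\ttfamily
\begin{tabular}{lll}
  & A: & She's that's, the Sister Kerrida, \hspace{.3mm} who, \\
  & D: & \overlapmark{76mm}{2.5mm}{[}'hhh \\
  & A: & \overlapmark{60.2mm}{3.1mm}{$\lfloor$}\underline{Ye:h} \\
\end{tabular}
\end{document}
```

Note that `\lfloor` and `\lceil` are math-mode symbols, so they are wrapped in `$...$` when passed to the macro.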

Many thanks to Chris Howes for holding my hand through this.

The Audio BNC

I’m sitting in my office overlooking Mile End listening to a 20-year-old recording of two people sitting in their kitchen, chatting over the sound of BBC Radio 3 about their friends, their weekend, what’s on TV, and about how prim and proper Swiss people are.

It feels like being magically transported back to 1993, when the British National Corpus recruited 124 men and women of balanced ages and demographically assigned social classes and asked them to carry around tape recorders to capture the conversations they had with friends, family, neighbours and co-workers every day.

Over 700 hours of conversation were recorded and painstakingly transcribed and annotated to enable researchers to analyse an immense corpus of naturalistic language data (still representing only 10% of the total data in the BNC, which is mostly composed of written books and journals and transcribed broadcasts). All this data has been used as a primary resource by computational linguists, natural language researchers, sociologists and all kinds of researchers measuring their models of language learning and production against the empirical evidence.

However, for the most part, only the text transcriptions of this data, rather than the audio itself, have been easily accessible to researchers until very recently. In the last year, the Oxford Phonetics Lab has produced a British National Corpus Spoken Audio Sampler, after digitising, cataloguing and analysing the mountain of audio cassettes that were hidden away in the British Library Sound Archive. They are soon going to make the entire “Audio BNC” available online to anyone who wants to listen to the original recordings on which so much research has been based, and the director, Professor John Coleman, kindly made selected recordings available to me as a beta tester.

Using Matthew Purver’s SCoRE BNC search tool, I’ve been able to do a full-text search of the Audio BNC, and find naturalistic examples of conversations on specific topics (I’m looking for people talking about art, design, fashion, architecture, or otherwise engaging in aesthetic discussions), and then just dip into their lives at those specific moments. It is fascinating. The sense of omnipotence is almost intoxicating, especially because sitting here, listening and reading along with the original 1990s transcriptions, I get a strong sense of how much has changed in terms of the knowledge production tools available to researchers since then.

The text transcriptions I’m reading are full of instances in which the transcriber marks the speech as <unclear>, and where references and names of things being referred to are omitted. Especially as I’m looking for people talking about art, I’ve found that almost all of the names of artists, musicians or other cultural references made by people in conversation are labelled <unclear> – understandably, as how can the transcriber be expected to have a familiarity with relatively obscure painters from Swiss art history? With just a few contextual references, Google and Wikipedia make it trivial to identify about 90% of these <unclear> instances. Similarly, by pressing my Android phone to my headphone speakers and running Shazam, I’m able to identify what music they’re listening to on the radio in their kitchen while they chat.

One of the most powerful things about the Audio BNC being released today is the opportunity to apply contemporary search and analysis tools to finding instances of naturalistic conversation from a huge range of contexts and situations involving different professional, social, demographic and cultural groups, and then drop in and listen to what’s going on. Having pored over the transcriptions of these people’s speech, it’s a fantastic revelation to hear their accents and intonations, and to get a sense of the detail of how they do the work of ‘being ordinary’ in the privacy of their homes and intimate relationships, then in public, then at the office.

It’s the ultimate fly-on-the-wall experience, and it feels like sitting in front of a new telescope, suddenly able to inspect in great detail specific areas of a previously vague and undifferentiated view of a distant galaxy.