Conversation Analytic Transcription with CLAN

I have been looking for software tools and a sensible workflow for making Conversation Analytic style transcriptions, and I haven’t found any really useful resources that weigh up the pros and cons of different approaches.

Lorenza Mondada’s very useful presentation on using ELAN for transcription does the most concise job of summarising the main choice point in this decision:

Transcription and representation of the flow of talk and multimodal conducts:

Transposition from time to space

Representation of time is crucial

Two formats exist :

The list format (ex. CLAN, Transana,…)

The partition format (ex. Praat, ELAN, ANVIL,…) –> based on an infinite timeline

For a CA perspective on talk, the list format is more adequate for the representation of sequentiality; however, for a multimodal analysis of various simultaneous lines of action, the partition format is very useful

These formats have analytical implications

So I began looking at various list-format transcriber options: CLAN, Transana, and Transcriber were the ones I checked out.

Transana didn’t seem to work under Linux at all, so that was a non-starter. There were Unix Python sources available, but they looked more or less abandoned to me.

Transcriber was actually in my apt repository, which was a nice surprise. I installed it and got it up and running in minutes. Unfortunately, it looked terrible, used ancient audio devices in Linux, and felt very awkward to use.

I decided to use CLAN for the following reasons:

It saves human-readable text files I can munge and edit in vim (or any other text editor)
It uses key-commands for almost everything (little mouse-work necessary)
It has a clean, stable and simple interface and media player integration
It’s highly modular, separating a windowed transcription system from command-line-centric analytical tools

Basically, it has a very unixy philosophy to it (specialised tools, loosely coupled) and it’s a joy to use.

Here’s my workflow:

Currently I am enhancing some existing transcriptions from the BNC using the original audio from the Audio BNC, which I wrote about in more detail here.

First I search for the rough transcription I’m after using Matthew Purver’s SCoRE tool. Using my favourite text editor, I munge this into a text file with one turn per line, and no turn numbering.
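If you would rather script that munging step than do it by hand, here is a minimal Python sketch. It assumes the copied text has one turn per line with a leading turn number; that layout, the `TURN_NUMBER` pattern and the `strip_turn_numbers` name are my own illustration, not a description of what SCoRE actually outputs, so adjust the pattern to whatever you really get.

```python
import re

# Assumed (hypothetical) input layout: each turn starts with a turn number,
# optionally followed by ".", ")" or ":", then whitespace.
TURN_NUMBER = re.compile(r"^\s*\d+[.):]?\s+")


def strip_turn_numbers(text: str) -> str:
    """Return the text with one turn per line, dropping blank lines
    and any leading turn number on each line."""
    turns = []
    for line in text.splitlines():
        line = TURN_NUMBER.sub("", line).strip()
        if line:  # skip lines that are empty after cleaning
            turns.append(line)
    return "\n".join(turns)


if __name__ == "__main__":
    # Made-up example turns, not real BNC data.
    raw = "1 A: hi\n\n2. B: hello\n3) A: how are you"
    print(strip_turn_numbers(raw))
```

One caveat: a turn whose talk itself begins with a number would lose that number too, so it is worth eyeballing the result before pasting it into CLAN.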

Then I copy and paste this into CLAN’s text editor, which I’m running under WINE – there isn’t a unix version yet. The image above shows a partially complete transcription, along with the audio track below. In order to show just how useful this system is for both transcription and for enhancing existing text transcriptions, I’ve made a short screencast:

Finally, I run the ‘indent’ tool on the resulting .cha file which aligns all the overlap markers and other semi-diagrammatic elements of a CA transcription. For more information on the various utilities included with CLAN, check out the CLAN user manual.

The resulting annotation looks pretty good in CLAN, and it is editable and searchable, with timed viewing and adjusting of the linked media, either using text editing or CLAN’s integrated media browser/editor. The output (a .cex file) can just be copied and pasted into a word/libreoffice document:

Of course before publication, the CA-transcription style will still need to be painstakingly rendered in LaTeX, which is no fun at all. I guess a LaTeX export option is my only feature request for the very impressive CLAN toolset.

Ketil Thorgersen

March 17, 2014 at 7:38 pm

Thanks for the post!
I have actually made Transana work under Ubuntu. I have described the process here:
http://rytmisk.net/index.php/2014/03/11/making-transana-work-on-ubuntu/

But I’m curious about Elan and Clan and it’s a pity that it is such a pain to make things work on Linux!

All the best
Ketil

November 29, 2014 at 11:51 am

Hi,
Wonderful post. Could you help me figure out how to run CLAN using Wine? Do you have a link or something? Thanks

saul

November 29, 2014 at 12:06 pm

Hi DV,

There seems to be a problem with the latest version of CLAN under WINE. If you are using Unix, I recommend you install the latest version of the CLAN Unix-tools (which still work fine) and then CLAN v.11 (August 2013) or earlier. There are links to archived copies on the CLAN page.

Jacob Davidsen

January 10, 2016 at 5:20 pm

Hi Saul,

Do you by any chance have Lorenza’s presentation – the link seems to be dead. It would be great if you could send me a copy.

/Jacob

January 10, 2016 at 5:27 pm

Hi Jacob,

Thankfully it’s still on archive.org (but I’ve downloaded a copy too!)

https://web.archive.org/web/20131101205841/http://web.sdu.dk/multimodality/pdf/Mondada_ELAN.pdf

Cheers,

Saul.

Cross

July 23, 2016 at 9:31 pm

I’m doing a study on conversation analysis language documentation. I’m using ELAN as a starting point to create a CA style transcript, which allows me to add gloss and translation in another tier. Then I stumbled upon CLAN as I read about corpus linguistics methods for handling conversations. Now I’m trying to learn CLAN so I can compare them. Is CLAN recommended when handling conversational text where you need to interlinearize it?

July 24, 2016 at 9:04 am

Hi Cross,

The short answer is yes: you can use CLAN when handling conversational text where you need to interlinearize a translation tier, a gesture tier, or a morphological tier. In CLAN, these ‘%’ prefaced tiers are called ‘dependent tiers’, and are placed following each speaker-tier line of a transcript.
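Because the transcript is plain text, that layout is also easy to read from a script. Below is a rough Python sketch of how the speaker-tier / dependent-tier structure could be parsed. It only assumes that header lines start with ‘@’, speaker tiers with ‘*’, dependent tiers with ‘%’, and that wrapped lines start with whitespace; the `parse_chat` function, the `Utterance` class and the sample transcript are made up for illustration and are not part of CLAN, and the tier codes in the sample are just examples.

```python
from dataclasses import dataclass, field


@dataclass
class Utterance:
    speaker: str
    text: str
    tiers: dict = field(default_factory=dict)  # dependent tiers, keyed by code


def parse_chat(text: str) -> list:
    """Group each '*' speaker line with the '%' dependent tiers that follow it."""
    utterances = []
    last = None  # "main" or a tier code, so wrapped lines can be appended
    for line in text.splitlines():
        if not line.strip() or line.startswith("@"):
            last = None  # blank lines and @ headers are skipped
        elif line.startswith("*"):
            # Split on the first colon only: CA transcripts use ':' in the talk too.
            speaker, _, body = line[1:].partition(":")
            utterances.append(Utterance(speaker.strip(), body.strip()))
            last = "main"
        elif line.startswith("%") and utterances:
            code, _, body = line[1:].partition(":")
            last = code.strip()
            utterances[-1].tiers[last] = body.strip()
        elif line[0] in "\t " and utterances and last:
            # Continuation of the previous (wrapped) line.
            if last == "main":
                utterances[-1].text += " " + line.strip()
            else:
                utterances[-1].tiers[last] += " " + line.strip()
    return utterances


if __name__ == "__main__":
    # Invented sample; the tier codes are illustrative only.
    sample = "\n".join([
        "@Begin",
        "*ANA:\tbuenos dias .",
        "%eng:\tgood morning .",
        "*BEA:\thola .",
        "%eng:\thello .",
        "%gpx:\twaves",
        "@End",
    ])
    for utt in parse_chat(sample):
        print(utt.speaker, "|", utt.text, "|", utt.tiers)
```

This is only meant to show how the tiers hang together. For real analysis, CLAN’s own command-line tools are the better choice.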

For ease of use, check section 6.8 of the handbook for how to show and hide certain tiers while using the CLAN editor: http://childes.talkbank.org/manuals/CLAN.pdf

Depending on your use-case, you can define your own tier codes – see section 6.13 of the manual for how to use coder mode (you can create your own per-utterance coding templates, then use them on your corpus).

In general I highly recommend working through the CLAN tutorial (especially the CHAT-CA section) which really helps to understand how CLAN deals with these concepts of tiers, codes and the difference between CHAT and CHAT-CA etc.: http://childes.talkbank.org/clan/tutorial.zip – it’s a really well-made and enjoyable tutorial, and should only take a couple of hours to work through fully.
