
How to use CLAN and ELAN together for EM/CA transcription

I’ve been teaching EM/CA at Berklee School of Music, which has been a delight, and the students are often transcribing situations that involve music, dance, performance, composition and other settings where it’s equally important to transcribe talk and bodily action. What I’ve tried to show them is how to produce transcripts that allow them to shift between turn-by-turn action and simultaneous, multi-activity interactions. Lorenza Mondada’s presentation on the differences between types of transcription software has a great explanation of the two basic transcription paradigms (partition-based, horizontally scrolling editors and turn-by-turn, vertically scrolling editors) and what they are most useful for.

I wanted to teach my students to use both paradigms and to be able to switch between them, so I made this little how-to video. To follow along with this tutorial you’ll need two files: a small clip from “Game Night” (many thanks to Prof. Cecelia Ford for letting me use her often-cited CA data for this tutorial), and Game_Night.cha – the accompanying CLAN transcription file. They’re both zipped up together in this downloadable folder. You only get a tiny clip of the whole Game_Night.mov video file here. The idea of this tutorial is that you can use it as a starting point and then replace the video with your own data.

Here’s the YouTube video. It was recorded for my Berklee students, so not all parts of it (especially where I refer to the data you’ll have available to you) are relevant to this blog tutorial.


The Data Session

The ‘data session’ has become my favourite research activity since starting to work with ethnomethodology (EM) and conversation analysis (CA). However, this crucial bit of analytic trade-craft seems poorly documented as a research process – with minimal references scattered throughout textbooks, articles and course materials. This post pulls together some of the descriptions and tips I’ve found relating to the practical activity of doing data sessions, followed by a short account of why I am so fond of this wonderful research practice.

Early CA work

I can’t find any direct references to data session practices in any of the early CA literature from Sacks, Schegloff or Jefferson. However, the following two papers contain some very interesting methodological discussions that provide insight into how data was prepared and collected prior to a principled collection being established:

I can only assume that the trade craft of CA was being established at this time, so the practice of the data session was not yet at a point where it was stable, well understood and ready to be written up for instructional purposes. I’ve heard stories (but can’t find any write-ups) of how Gail Jefferson was particularly involved in its development as a pedagogical/analytic practice. I would be very interested in reading these stories – and particularly learning about any rules / procedures for doing data sessions that may have been established in these early days.

First instructional descriptions

Paul ten Have’s “Doing Conversation Analysis”, first published in 1999, provides one of the earliest instructional descriptions of the data session that I can find, and lays out its essentials very clearly:

“The data session can be seen both as a kind of playground to mutually inspire one’s understanding of the data, and as an environment that requires a rather specific ‘discipline’. A ‘data session’ is an informal get-together of researchers in order to discuss some ‘data’ – recordings and transcripts. The group may consist of a more or less permanent coalition of people working together on a project or in related projects, or an ad hoc meeting of independent researchers. The basic procedure is that one member brings in the data, for the session as a whole or for a substantial part of it.”

He then provides – as far as I can find – the first description of the actual practical activity in the data session, and how it functions as a pedagogical as well as an analytic practice:

“This often involves playing (a part of) a tape recording and distributing a transcript, or sometimes only giving a transcript. The session starts with a period of seeing/hearing and/or reading the data, sometimes preceded by the provision of some background information by the ‘owner’ of the data. Then the participants are invited to proffer some observations on the data, to select an episode which they find ‘interesting’ for whatever reason, and formulate their understanding, or puzzlement, regarding that episode. Then anyone can come in to react to these remarks, offering alternative, raising doubts, or whatever. What is most important in these discussions is that the participants are, on the one hand, free to bring in anything they like, but, on the other hand, required to ground their observations in the data at hand, although they may also support them with reference to their own data-based findings or those published in the literature. One often gets, then, a kind of mixture, or coming together, of substantial observations, methodological discussions, and also theoretical points. Data sessions are an excellent setting for learning the craft of CA, as when novices, after having mastered some of the basic methodological and theoretical ideas, can participate in data sessions with more experienced CA researchers. I would probably never have become a CA practitioner if I had not had the opportunity to participate in data sessions with Manny Schegloff and Gail Jefferson.”

ten Have, P. (2007). Doing Conversation Analysis: A Practical Guide (2nd ed.). London: Sage Publications. pp. 140–141.

He also mentions that these sessions are poorly documented, writing (in both the 1999 and 2007 editions of his book) that he can only find one real description, in Jordan & Henderson (1995), quoted below. They also note that the data session – which they call the “Interaction Analysis Laboratory” – is both vitally important and difficult to describe in formal/procedural terms:

“Group work is also essential for incorporating novices because Interaction Analysis is difficult to describe and is best learned by doing. Much in the manner of apprentices, newcomers are gradually socialized into an ongoing community of practice in which they increasingly participate in the work of analysis, theorizing, and constructing appropriate representations of the activities studied.”

They also provide a great description of the actual mechanics of presenting data, and of how specific heuristics in the organisation of the data session can guard against rambling, ungrounded theoretical speculation:

“The tape is played with one person, usually the owner, at the controls. It is stopped whenever a participant finds something worthy of remark. Group members propose observations and hypotheses about the activity on the tape, searching for specific distinguishing practices within a particular domain or for identifiable regularities in the interactions observed. Proposed hypotheses must be of the kind for which the tape in question (or some related tape) could provide confirming or disconfirming evidence. The idea is to ground assertions about what is happening on the tape in the materials at hand. To escape the ever-present temptation to engage in ungrounded speculation, some groups have imposed a rule that a tape cannot be stopped for more than 5 min. This means in practice that rambling group discussions are discouraged and that no single participant can speculate for very long without being called upon to ground her or his argument in the empirical evidence, that is to say, in renewed recourse to the tape.”

Jordan, B., & Henderson, A. (1995). Interaction analysis: Foundations and practice. The Journal of the Learning Sciences, 4(1), 39–103.

Recent accounts and empirical work

More recent instructional publications have included tips on data sessions. For example, in Heath, Hindmarsh & Luff (2010), there is one tip section, and one appendix on data sessions. They introduce the basic idea, and also mention its pedagogic function, as well as highlighting the opportunity for its use in interdisciplinary / workplace studies that involve practitioners from other fields:

“[Data sessions] can also be used to introduce new members of a team or research group into a particular project and can be very important for training students in video based analysis. On occasions it can also be helpful to have ‘practitioners’, personnel from the research domain, participate in data sessions, as they can provide distinctive insights, and can often help to clarify events that have proved difficult to understand. In data sessions it is important to avoid overwhelming participants with too much material. A small number of brief extracts of, say, no more than 20 seconds or so is fine. It is also helpful to provide transcripts of the talk as well as any other materials that may be useful for understanding the extracts in question.”

They also add a number of key points about the distinct benefits and caveats of running data sessions, paraphrased here:

  • Identifying candidate phenomena for more detailed study.
  • Enforcing evidential demonstration of analytic claims.
  • Revealing issues/challenges in demonstrating analytic findings.
  • Eliciting alternative/complementary perspectives.
  • Generating new analytic ideas/issues and suggesting improvements for future data collection.
  • Keeping one’s ‘hand in’, i.e. practising analysis on other people’s data to maintain a fresh eye/ear for your own research.

Then they say something explicit about the data session as a collaborative practice which I haven’t seen anyone else mention, but which seems absolutely crucial to me. The fact that this almost never comes up also reinforces my sense that EM/CA and its research practices in general are much less fraught by this particular problem than many other research contexts, which speaks extremely well for it as a community and for its empirical/epistemic commitments in general:

“Data sessions are a collegial activity and are based on mutual trust. They should be treated as such and discussions of intellectual property and the like should be avoided. It is up to individual participants to reveal or withhold ideas that they have, if they do or do not want others to use those ideas in future analytic work.”

The appendix on data sessions (pp. 156–157) contains more very useful practical advice. To paraphrase:

  • Limit the numbers to no more than 20 or so.
  • Presenters should select 3-6 clips, ideally under 30s each.
  • Do bring transcripts – even rough ones are helpful.
  • Bring any supplementary material that is relevant/necessary for understanding the action.
  • Look at one fragment of data at a time – as an approximate ratio, spend 20–30 minutes on each 5 seconds of recording.
  • Don’t cheat by looking ahead, or by relying on the analyst’s information exogenous to the clip itself.
  • When it’s done, sum up, take notes and get general reflections.

Heath, C., Hindmarsh, J., & Luff, P. (2010). Video in Qualitative Research: Analysing Social Interaction in Everyday Life. Sage Publications. pp. 102–103.

There is a wonderfully reflexive EM/CA analysis of a data session in a chapter by Harris, Theobald, Danby, Reynolds & Rintel (2012), in a volume on postgraduate pedagogical practices, that presents the analysis of a data session by the authors and data session participants themselves. They focus on the collaborative / peer pedagogical aspects of the session, and highlight the “fluidity of ownership of ‘noticing’”, with clear evidence of how these noticings are accomplished.

Harris, J., Theobald, M. A., Danby, S. J., Reynolds, E., & Rintel, S. (2012). “What’s going on here?” The pedagogy of a data analysis session. In A. Lee & S. J. Danby (Eds.), Reshaping doctoral education: International Approaches and Pedagogies (pp. 83–96). London: Routledge.

A participant-observer account

Finally, my favourite description of work practices in the data session was written by John Hindmarsh in the affectionate and humorous Festschrift publication he and his colleagues edited for Christian Heath. In an uncharacteristically participant-observer style, he nonetheless describes the detail of both pedagogical and analytic processes of “Heath’s natural habitat: The data session” very vividly. He includes:

  • Delicate interrogations: where researchers are subtly probed as to why they selected specific clips.
  • Occasioned exclamations: in which the seasoned analyst will hoot with infectious laughter or joy at a clip – infectious partly because it can leave less experienced researchers either shamefully nonplussed or scrambling to find a grounding for the source of the laughter.
  • Transcription timings: opportunities to (delicately) rectify transcription errors.
  • Re-characterisations: moments where a banal, if well-targeted observation is picked up and re-packaged as an elegant and insightful analysis – a form of agreement with some extra pedagogical/analytic impetus.
  • Troubled re-characterisations: same as above, but done as an (initially veiled) disagreement, demonstrating poor targeting or a flawed analysis – again, always analytically useful and instructive, but less pleasantly so.

Hindmarsh, J. (2012). Heath’s natural habitat: The data session. In P. Luff, J. Hindmarsh, D. vom Lehn, & B. Schnettler (Eds.), Work, Interaction and Technology: A Festschrift for Christian Heath. (pp. 21–23). London: Dept. of Management, Kings College London.

Finally – some of my own reflections on the data session – and why it constitutes such an important methodological and pedagogical practice.

Why I love data sessions and why you should too

The last description – of the trade craft of a particular researcher’s data session – is my favourite because it shows what an excellent apprenticeship situation this is. In instructional environments where empirical data is less straightforwardly ready-to-hand, there is a latency between the teaching moment and the understanding moment that is frustratingly difficult to bridge. In the data session, the data is really doing the teaching, but the skilled analyst elicits both the observation and its pedagogical thrust from the same few seconds of interaction that have been in plain sight all along.

Furthermore, this public availability of the data as a mutually assessable resource for the group provides a constant check on authoritative hubris. More than once I’ve seen a junior analyst grasp and hold onto a powerful observation that provides irrefutable counter-evidence to a more experienced analyst’s position on some piece of data. Honesty and accountability flow in both directions in the data session, which is what makes it such a wonderful occasion for learning, analysis and – literally – serious fun.

I also like Jon Hindmarsh’s description because it really captures what it’s like to attend data sessions with different people who love the practice. I’m new to it, but thanks to the generosity of my supervisor Pat Healey and his enthusiasm for this work I’ve had the great pleasure of analysing data with pros such as Steven Clayman, Chuck Goodwin, Christian Heath, John Heritage, Yuri Hosoda, Shimako Iwasaki, Celia Kitzinger, Dirk vom Lehn, Gene Lerner, Rose McCabe, Tanya Stivers, Liz Stokoe and Sandy Thompson, not to mention my fellow students in these sessions from whom – given the peer pedagogical structure of the data session – I was able to learn just as much.

My experience has been that everyone approaches data very differently, and each person has a very distinctive style, analytic focus and approach. Nonetheless, the dynamics and epistemic arrangements of the situation allow for an amazingly rich exchange of ideas and empirical observations between disciplines, across interactional contexts, cultures, languages and focal phenomena. I am convinced that it is one of the most crucial factors in how EM/CA projects have made such robust findings in studies of interaction, language and culture, and that there is a great deal more to be understood and appreciated about how they function.

I am also convinced that the data session has a very important place in disseminating EM/CA findings and practices beyond its traditional sociological/anthropological/linguistic contexts of study. There are sure to be ways of adapting some of its pedagogical/analytical dynamism to working with other contexts and types of recorded materials – although it’s debatable whether this form of analysis would really work with anything other than interactional data. In any case, as I mentioned at the beginning, I am very curious about other data session practices and would like to know more about the similarities and differences in how people run theirs, so I would be very grateful if you would send me your data session experiences, tips, formats and training materials.


How to prepare for an EM/CA data session

Participants at the first EMCA DN Meeting

At the inaugural EMCA Doctoral Network meeting (my write-up here), where there was a mix of researchers with different levels of familiarity with EM/CA, I realised that the process of preparing for a data session – one of the most productive and essential tools of interaction analysis – is really poorly documented. There are some guidelines provided in textbooks and on websites that usually cover how to do the analysis/transcription itself, but nowhere is there a simple guide to actually getting ready to contribute your data to a data session. This short primer is intended to fulfil that function, and to invite others to contribute their own tips and best practices.

I was more or less in this situation (having crammed my head full of CA literature without having had a great deal of hands-on data session practice) last year when I went to the Centre for Language Interaction and Culture at UCLA. There I had the chance to witness and participate in four separate weekly data sessions run by Steven Clayman, Chuck Goodwin, John Heritage, Gene Lerner, Tanya Stivers and Sandy Thompson and their students. It was a bit of a baptism of fire, but I learned a lot from it.

Each of the pros had interestingly different approaches to preparing data for the sessions, all useful in slightly different ways, so I decided to write up my synthesis of best practices for preparing your data for a data session. Feedback and comments are very welcome!

What/who this guide is for:

This guide is intended for researchers interested in participating in data sessions in the tradition of Ethnomethodology and Conversation Analysis (EM/CA), who already have data and want to figure out how to present it.

This is not about gathering data, specific analytic approaches or about actually doing detailed analysis or any of the meat and potatoes of EM/CA work which is amply covered in many books and articles including:

This guide is intended to help researchers who may not have had much experience of data sessions to prepare their data in such a way that the session will be fun and analytically useful for them and everyone else who attends.

This is also not intended to be a primer in the use of specific audio/video/text editing or transcription software – there are so many out there; I will recommend some that are freely available, but pretty much any will do. I do plan to write that kind of guide, but that’s not what this article is for.

Selecting from your data for the session

Doing a data session obviously requires that some kind of data selection has to be made, so it helps to have a focal phenomenon of some sort. Since the data session is exploratory rather than about findings, it doesn’t really matter what the phenomenon is.

That’s the great thing about naturally occurring data – you might not find what you’re looking for, but you will find something analytically interesting. Negative findings about your focal phenomenon are also useful: you might discover that the assumptions behind how you selected your clips as related cases are not borne out by the interaction analysis. That is still a useful finding and will make for a fun and interesting data session.

Example phenomena for a rough data-session-like collection of extracts might focus on any one or on a combination of lexical, gestural, sequential, pragmatic, contextual, topical etc. features. E.g.:

  • Different sequential or pragmatic uses of specific objects such as ‘Oh’, ‘wow’ or ‘maybe’.
  • Body orientation shifts or specific patterns of these shifts during face-to-face interaction.
  • Word repeats by speaker and/or recipient at different sequential locations in talk.
  • Extracts from interactions in a particular physical location or within a specific institutional context.
  • Extracts of talk-in-interaction where speakers topicalize something specific (e.g.: doctors/teapots/religion/traffic).

At this stage your data doesn’t have to be organised into a principled ‘collection’ as such. Having cases that are ostensibly the same or similar, and then finding out how they are different is a tried and tested way of finding out what phenomenon you are actually dealing with in EM/CA terms.

There are wonderful accounts of this data-selection / phenomenon discovery process with notes and caveats about some of the practical and theoretical consequences of data selection in these two papers:

Pre-data session selection: how to focus on a specific phenomenon

You can bring any natural data at all to a session and it will be useful as long as it’s prepared reasonably well. However, if you want the session to focus on something relevant to your overall project, it is helpful to think about what kind of analysis will be taking place in the session in relation to your candidate phenomenon, and to select clips accordingly.

There are proper descriptions of how to actually do this detailed interaction analysis in the references linked above. However, here is a paraphrase of some of the simple tips on data analysis that Gene Lerner and Sandy Thompson give when introducing an interdisciplinary data session where many people are doing it for the first time in their wonderful Language and the Body course:

  1. Describe the occasion and current situation being observed (where/when/sequence/location etc.).
  2. Limit your observations to those things you can actually point to on the screen/transcript.
  3. Then, pick out for data analysis those features/occasions that are demonstrably oriented to by the participants themselves.
  4. That is your ‘target’, then zoom in to line-by-line, action-by-action sequences and describe each.
  5. Select a few targets where you can specify what is being done as the sequence of action unfolds.

Then in the data session itself, you and other researchers can look at how all interactional resources (bodily movements / prosody / speech / environmental factors) etc. are involved in these processes and make observations about how these things are being done.

Providing a transcript

I find it very hard to focus on analysis without having a printed transcript, but there are a few different approaches, each with different advantages and disadvantages. Chuck Goodwin, for example, recommends putting Jeffersonian transcription subtitles directly onto the video/audio clips so you don’t have to split focus between screen and page. However, most researchers produce a transcript using Jeffersonian transcription and play their clips separately.

Advantages of printed transcriptions

  • You and other participants have something convenient to write notes on.
  • You can capture errors or issues in the transcription easily.
  • Participants can refer to line numbers that are off-screen when they make their observations.

Advantages of subtitles on-screen

  • You don’t miss the action looking up and down between page and screen.
  • Generally easier to understand immediately than a multi-line transcript when presenting data in a language your session participants might not understand.
  • You can present this data in environments where you don’t have the opportunity to print out and distribute paper transcripts.

In either case you will need to take the time to do a Jeffersonian transcription, so why not do both?

Jeffersonian transcription

There are lots of resources for learning Jeffersonian transcription; here are some especially useful ones:

Visual/graphical transcripts

Chuck and Candy Goodwin often also present carefully designed illustrations alongside their final analyses. Some people also present their data, usually at a later stage of research, with detailed multi-modal transcripts incorporating drawings, animations, film-strip-like representations, etc. (see Eric Laurier’s paper for a great recent overview):

  • Laurier, E. (2014). The Graphic Transcript: Poaching Comic Book Grammar for Inscribing the Visual, Spatial and Temporal Aspects of Action. Geography Compass, 8(4), 235–248.

How much work you want to do on your transcript before a data session is up to you, but it is probably premature to work on illustrations etc. until you have some analytic findings to illustrate.

Transcription issues vs. errors

It’s inevitable that other people will hear things differently, so the data session is a legitimate environment for improving a transcript-in-progress. In fact, analytic findings may often hinge on how something is heard, and then on how it is transcribed – this is a useful thing to discuss in a data session and will be instructive for everyone. However, it is important to capture the obvious stuff as accurately as possible, so as to provide people with a basic resource for doing analysis together without getting hung up on simple transcription errors rather than the interesting transcription questions.

Introducing your data

It is useful to give people some background to your data before you present it. This does not have to be a full background to all your research and the study you are undertaking. In fact, it’s useful to omit most of this kind of information, because the resource you have access to in the data session is fresh eyes and ears that aren’t yet contaminated by assumptions about what is going on.

In terms of introducing your study as a whole, it’s useful to have a mini presentation (5 mins max for a 1.5h data session) prepared about your study with two or three key points that can give people an insight into where/what you are studying. Once you’ve made one of these for each study you can re-use it in multiple data sessions.

In terms of introducing each clip, have a look at Schegloff’s (2007) descriptions of his data extracts. They have a brilliantly pithy clarity that provides just enough information to understand what is going on without giving away any spoilers or showing any bias.

Preparing your audio/video data

Assuming you already have naturalistic audio/video data of some kind, make some short clips using your favourite piece of audio/video editing software. The shorter the better (under 30s ideally) – longer clips, especially complex/busy ones, may need to be broken down into smaller chunks for analysis.

It can be time-consuming to search through longer clips for specific sections, so I recommend making clips that correspond precisely to your transcript, while noting down where in the larger video/audio file each clip is located, in case someone wants to see what happens next or previously.
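If you are comfortable on the command line, this cutting-and-labelling step can be scripted with a tool like ffmpeg (my example, not something the pros above prescribe). Here is a minimal sketch – the filenames and timecodes are placeholders to replace with your own data – that encodes the source timecode into the clip’s filename, so you can always relocate the surrounding context in the full recording:

```shell
# Placeholder source file and timecodes -- substitute your own.
SRC="Game_Night.mov"   # the full recording
START="00:03:15"       # where the clip starts in the source file
DUR="00:00:25"         # keep clips under ~30s where possible

# Build a clip name that records where it came from,
# e.g. GameNight_00-03-15_00-00-25.mp4
CLIP="GameNight_$(echo "$START" | tr ':' '-')_$(echo "$DUR" | tr ':' '-').mp4"

# Putting -ss before -i makes ffmpeg seek quickly; re-encoding
# (rather than "-c copy") keeps the cut frame-accurate.
if command -v ffmpeg >/dev/null 2>&1 && [ -f "$SRC" ]; then
  ffmpeg -ss "$START" -i "$SRC" -t "$DUR" "$CLIP"
fi
```

A naming scheme like this is just one convention; any scheme works as long as the clip’s position in the source recording stays recoverable.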

Copy these clips into a separate folder on your computer specifically for this data session – finding them buried in your file system can waste time.

If possible, test the audio and video projection/display equipment in the room where you’re running the data session, to make sure that your clips are audible and visible without headphones and on other screens. If in doubt, use audio editing software (such as Audacity) to make the audio in your files as loud as possible without clipping. You can always turn a loud sound system down – but if your data and playback system are too quiet, you’re stuck in the data session without being able to hear anything.
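Alongside normalising in Audacity, ffmpeg’s loudnorm filter offers a scriptable way to do this loudness step; a sketch, with placeholder filenames and commonly used (not mandatory) target values:

```shell
IN="clip.mp4"          # placeholder input clip
OUT="clip_loud.mp4"    # normalised output

# One-pass EBU R128 loudness normalisation: -16 LUFS integrated
# loudness, true peak capped at -1.5 dBTP so the audio won't clip.
FILTER="loudnorm=I=-16:TP=-1.5:LRA=11"

# -c:v copy leaves the video stream untouched; only audio is re-encoded.
if command -v ffmpeg >/dev/null 2>&1 && [ -f "$IN" ]; then
  ffmpeg -i "$IN" -af "$FILTER" -c:v copy "$OUT"
fi
```

Running every clip through the same targets also makes levels consistent across clips, which helps when switching between extracts mid-session.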

There are many more useful tips about sound and lighting etc. in data collection in Heath, Hindmarsh & Luff (2010).

The mechanics of showing your data

I find it useful to think of this as a kind of presentation – just like at a conference or workshop – so I recommend using presentation software to cue up and organise your clips for display, rather than struggling through files and folders with different names mid-session.

Make sure each clip/slide is clearly named and/or numbered to correspond with sections of a transcript, so that people can follow it easily, and make sure you can get the clip to play – and pause/rewind/control it – with a minimum of fuss.

The data session is probably the most useful analytic resource you have after the data itself, so make sure you use every second of it.

Feedback / comments / comparisons very welcome

I hope this blog-post-grade guide is useful for those just getting into EM/CA, and while I know that data session conventions vary widely, I hope it represents a sufficiently general set of recommendations to make sense in most contexts.

In general I am very interested in different data session conventions and would very much welcome tips, advice, recommendations and descriptions of specialized data session practices from other researchers/groups.

More very useful tips (thanks!):

Dr Jo Meredith adds: “because data sessions can be a bit terrifying the temptation’s to take some data you can talk about in an intelligent way, best data sessions i’ve been to have been with new pieces of data, and I’ve got inspired by other people’s observations”
