July 2011 – Saul Albert

Conversational Annotation

July 29, 2011

Annotation of a conversation would usually be a post-hoc chore undertaken by someone charged with watching a documentary or ethnographic video and ‘making sense’ of the diffuse multifariousness of a conversation. Heckle’s approach, that each visual/textual interjection might be used as an annotation, attempts to turn annotation into an augmentation of the experience of the conversation. Because it is concurrent and live, the participants who heckle may notice and incorporate all kinds of contextual markers outside the view of the video camera, as well as bring their own diverse interpretations and experiences into the heckled conversation.

Most crucially, the variety of ways that people use the Heckle system mirrors the diversity of people’s verbal and non-verbal contributions to the live conversation. The stream of images, video, text and links that result can be seen as a parallel conversation that ‘annotates’ the conversation around the Talkaoke table, but also interacts with it in real time: the representation of the conversation itself becomes conversational.

Research Strategy

There are so many questions to be asked about this approach: about the user interfaces, about whether and how Heckle really does ‘augment’ the experience or lead to further engagement, and about how it influences people’s interpretation and behaviour. However, with the time and resources available at this stage, the goals will have to be very limited and specific.

My research task at hand is to enquire about this ‘conversational metadata’: what is it? To what extent can it be considered ‘metadata’? What objects does this metadata relate to: the conversation around the Talkaoke table, or the people heckling? And to what extent does it correlate (or not) with other forms of annotation and representation of these objects?

Asking this question will involve re-purposing the Heckle system to create scenarios in which this correlation can be measured.

To be specific, rather than using Heckle to annotate a live conversation around the Talkaoke table, I will be using it to annotate a group of people watching Dr. Who Season 4 Episode 1, ‘Partners in Crime’.

This is an opportunistic choice of programme, suggested to me by Pat Healey because he happens to have supervised my MAT colleague Toby Harris on the BBC Stories project with Paul Rissen and Michael Jewell, which annotated this episode of Dr. Who in an exemplary ‘top-down’ fashion, developing and then using the BBC Stories ontology.

The plan, then, is to gather a group of people to sit and watch TV together, and to provide them with The People Speak’s Heckle system as a means of interacting with each other, layered on top of the Dr Who video. The resulting conversational metadata can then be compared to the detailed, semantic annotations provided by the BBC Stories project.

Evaluation Strategy

There are a number of possible methods to use to make this comparison, although it will be hard to tell which to use before being able to look at the data.

It may be useful to simply look at mentions of characters, plot developments, and other elements in the BBC Stories ontology, and see whether they appear at equivalent moments in the BBC Stories annotation and the heckled conversation. A basic measurement of correlation could be gathered from that kind of comparison, and would indicate whether the two forms of metadata are describing the same thing.
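One way to sketch that comparison, assuming both annotation sets can be reduced to (timecode in seconds, entity label) pairs: bucket each stream into fixed time windows and measure, per window, how far the two sets of mentioned entities overlap. The window size, the sample labels and the choice of Jaccard overlap as the measure are all illustrative assumptions, not something the BBC Stories data or Heckle prescribes.

```python
from collections import defaultdict


def bucket_mentions(mentions, window_seconds=60):
    """Group (timecode_seconds, entity) pairs into fixed-size time windows."""
    buckets = defaultdict(set)
    for timecode, entity in mentions:
        buckets[int(timecode // window_seconds)].add(entity.lower())
    return buckets


def windowed_overlap(stream_a, stream_b, window_seconds=60):
    """Return {window_index: Jaccard overlap} for each window with any mention."""
    a = bucket_mentions(stream_a, window_seconds)
    b = bucket_mentions(stream_b, window_seconds)
    scores = {}
    for window in sorted(set(a) | set(b)):
        union = a[window] | b[window]
        shared = a[window] & b[window]
        scores[window] = len(shared) / len(union)
    return scores


# Hypothetical sample data: (seconds into the episode, entity mentioned).
stories_annotation = [(30, "Donna"), (45, "The Doctor"), (130, "Adipose")]
heckles = [(40, "donna"), (50, "sofa"), (140, "adipose"), (200, "pizza")]

scores = windowed_overlap(stories_annotation, heckles)
for window, score in scores.items():
    print(f"minute {window}: overlap {score:.2f}")
```

A window that scores zero is not necessarily noise: in this sample the ‘pizza’ heckle has no counterpart in the top-down annotation, which is exactly the kind of contextual reference the conversational metadata is expected to contain.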

Similarly, it might be useful to demonstrate the differences between the conversational metadata and the BBC Stories version by looking for conversational annotations that relate to the specific context of the experience of watching the episode: the space, the food, the sofa. These would (of course) be absent from the BBC Stories annotation.

However, there is another strategy, which Toby Harris and I concocted while playing with LUpedia, a semantic ‘enrichment’ service that takes free text, attempts to match it with semantic web resources such as DBPedia, and returns structured metadata.

If it is possible to feed LUpedia the BBC Stories ontology, and then the episode of Dr Who in question as a dataset, it should be possible to submit people’s heckles to it and see whether LUpedia returns relevant structured data.

If LUpedia can enrich people’s Heckles with metadata from the BBC Stories dataset, that should indicate that the heckles are pertinent to the same object (in this case, the episode of Dr Who), and might therefore be seen as conversational metadata for it{{1}}.
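LUpedia’s actual interface is not described here, so the following is only a local stand-in for the idea, not a call to the service. A small dictionary of labels mapped to resource identifiers plays the role of the dataset, and a heckle counts as ‘enriched’ if any label appears in its text as whole words. The labels, the `example.org` URIs and the function names are all hypothetical.

```python
import re

# Hypothetical stand-in for an ontology-backed dataset: label -> resource URI.
EPISODE_RESOURCES = {
    "donna": "http://example.org/stories/character/donna",
    "the doctor": "http://example.org/stories/character/the-doctor",
    "adipose": "http://example.org/stories/species/adipose",
}


def enrich(heckle_text, resources=EPISODE_RESOURCES):
    """Return the sorted URIs of every known label found as whole words."""
    text = heckle_text.lower()
    matches = set()
    for label, uri in resources.items():
        if re.search(r"\b" + re.escape(label) + r"\b", text):
            matches.add(uri)
    return sorted(matches)


def pertinence(heckles, resources=EPISODE_RESOURCES):
    """Fraction of heckles matching at least one resource (0.0 if none given)."""
    if not heckles:
        return 0.0
    enriched = sum(1 for heckle in heckles if enrich(heckle, resources))
    return enriched / len(heckles)


# Hypothetical heckles posted while watching.
sample_heckles = [
    "Donna is brilliant",
    "pass the crisps",
    "are the Adipose made of fat?",
    "this sofa is too small",
]
print(pertinence(sample_heckles))
```

Plain string matching of this kind is far cruder than a semantic enrichment service, but the resulting fraction gives a rough first indication of how many heckles refer to the episode at all.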

[[1]]My conversational metadata will probably also describe the interactional experience of watching the show, and other contextual references that will be absent from the BBC Stories annotation. However, it is important to show that the two types of metadata relate to at least one of the same objects. If this is not demonstrable, it does create some ‘fun’ philosophical problems for my research, such as what conversational metadata *does not* refer to. That one might be harder to answer.[[1]]


Heckle

Since 2007, our art collective The People Speak have been working on ways of trying to make the 13+ years of conversational oral history we have on archive public and searchable.

The conversations between people who meet around the Talkaoke table{{1}}, on street corners, at festivals, schools, or conferences have been recorded and archived on every format going from digi-beta to hi-8, RealMedia (oh God, the 90’s), and miniDV. For the last two years, we have finally moved to digital only, but the archival backlog is intimidating.

As challenging as the digitisation and archival issues are, the real problem is figuring out what people are talking about in this mountain of data. All the conversations facilitated by The People Speak are spontaneous, off the cuff, and open to people changing tack at any point. This has made it almost impossible to provide a thematically structured archive.

And this problem is not unique to this rather specialised context. Aren’t all conversations, question and answer sessions, and in fact pretty much anything that involves people interacting with each other on video, subject to the same contingencies of meaning?

If my early-stages training in Conversation Analysis has shown me anything, it’s that the apparent ‘content’ of a conversation is impossible to represent in any way other than through further conversations, and observations of how people work to repair their misunderstandings.

The Heckle System

The People Speak’s response to this problem has been the ‘Heckle’ system.

Using ‘Heckle’, an operator or multiple participants in a conversation may search for and post Google images, videos, web links, Wikipedia articles or 140 characters of text, which then appear overlaid on a projected live video of the conversation.

Here is a picture of Heckle in use at the National Theatre, after a performance of Greenland.

As you can see, the people sitting around the Talkaoke table aren’t focused on the screens on which the camera view is projected live. The aim of the Heckle system is not to compete with the live conversation as such – but to be a backchannel, throwing up images, text and contextual explanations on the screen that enable new participants to understand what’s going on and join in the conversation.

The Heckle system also has a ‘cloud’ mode, in which it displays a linear representation of the entire conversation so far, including snapshots from the video at the moment that a heckle was created, alongside images, keywords, ‘chapter headings’ and video.

This representation of the conversation is often used as part of a rhetorical device by the Talkaoke host to review the conversation so far for the benefit of people who have just sat down to talk. A ‘Heckle operator’ can temporarily bring it up on a projection or other nearby display and the host then verbally summarises what has happened so far.

It also often functions as a modifier for what is being said. Someone is talking about a subject, and another participant or viewer posts an image which may contradict or ridicule their statement; someone notices and laughs, everyone’s attention is drawn to the screen momentarily, then returns to the conversation with this new interjection in mind. Some people use the Heckle system because they are too shy to take the microphone and speak. It may illustrate and reinforce or undermine and satirize. Some ‘heckles’ are made in reply to another heckle, some in reply to something said aloud, and vice versa.

If keywords are mentioned in the chat, those keywords can be matched to a timecode in the video. In effect, the heckled conversation becomes an index for the video-recorded conversation: the conversation annotates the video{{2}}.
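A minimal sketch of that indexing idea, assuming each heckle is stored with the number of seconds elapsed in the recording when it was posted. The `Heckle` record, the stop-word list and the function names are hypothetical and do not reflect how the Heckle system actually stores its data.

```python
import re
from collections import defaultdict
from dataclasses import dataclass

# Illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "is", "of", "the", "to"}


@dataclass
class Heckle:
    seconds: float  # offset into the video when the heckle was posted
    text: str       # the posted text, or the search terms used to find an image


def build_index(heckles):
    """Map each keyword to the sorted list of timecodes at which it was heckled."""
    index = defaultdict(list)
    for heckle in heckles:
        words = set(re.findall(r"[a-z0-9']+", heckle.text.lower()))
        for word in words - STOP_WORDS:
            index[word].append(heckle.seconds)
    return {word: sorted(times) for word, times in index.items()}


def format_timecode(seconds):
    """Render seconds as HH:MM:SS for display next to the video."""
    total = int(seconds)
    return f"{total // 3600:02d}:{(total % 3600) // 60:02d}:{total % 60:02d}"


# Hypothetical heckles from one recorded conversation.
recorded = [
    Heckle(75, "climate change"),
    Heckle(310, "the price of bread"),
    Heckle(912, "Climate sceptics"),
]
index = build_index(recorded)
print([format_timecode(t) for t in index["climate"]])
```

Looking up a keyword then returns the moments in the recording worth jumping to, which is what lets the heckled conversation act as an index for the video.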

[[1]] Talkaoke, if you’ve never seen it before, is a pop-up talk-show invented by Mikey Weinkove of The People Speak in 1997. It involves a doughnut-shaped table, with a host sitting in the middle on a swivelly chair, passing the microphone around to anyone who comes and sits around the edge to talk. Check out the Talkaoke website if you’re curious.[[1]]

[[2]] People don’t just post keywords. It’s quite important that they can post images and video too. The search terms they use to find these resources can also be recorded and used as keywords to annotate the video. A further possibility for annotation is that a corpus of pre-annotated images, such as those catalogued using the ESP Game, could be used to annotate the video. This would then provide a second level of annotation: the annotations of the images used could be considered to be ‘nested’ annotations of the Talkaoke conversation. [[2]]