Conversational Metadata

In my last post about the Social TV research context I explained that I decided to focus on generating and evaluating conversational video metadata by eliciting mediated conversations through SocialTV.

Before drilling down into that choice to distill a set of research questions, however, there’s a more basic question about the context of research into SocialTV metadata: why gather SocialTV metadata at all?

A slide showing a graphic from one of the Notube.tv project’s presentations

The Notube project has a diagram on this issue, which shows the progression of a piece of TV content along a timeline from pre-broadcast to media archive. The assertion is that having more metadata about user preferences enhances the value of TV content because it provides many more opportunities to recommend programmes. Although the Notube project adds a great deal to existing research on recommendation systems, its central aim follows most existing research in developing more accurate and sophisticated user profiles (in this case through aggregating media consumption habits from heterogeneous sources on the web).

The way the diagram is re-used in my slide above emphasises the corollary of the point that it is intended to illustrate: that having more TV programme metadata would also enhance the frequency and accuracy of recommendation for content (and thereby its value) throughout its lifecycle {{2}}.

If metadata (about profiles or programme content) can enhance the value of a TV show and facilitate production, discovery and delivery of TV programmes because it increases the likelihood of that programme being recommended, then it should follow that the greater the richness and referential diversity of that metadata, the more ‘recommendable’ the programme becomes.

This suggestion presupposes that the metadata in question is relevant, rather than a random spamming of references, intended to maximise recommendation in all possible contexts. So the question then becomes: what determines relevance in this context, and, even more importantly, relevance to what?

It may seem self-evident that the metadata about users should be relevant to their consumption habits, and that these habits, recorded and aggregated, should constitute their preferences or dispreferences. Even if this widespread assumption holds true, what should programme metadata relate to? Are broadcasters’ assertions about their programmes necessarily relevant to the way those programmes are discovered and interpreted? And crucially, in SocialTV, which almost by definition is about the interaction between viewers brokered via a networked TV infrastructure, how do those metadata correlate with the way TV programmes are used as a prop, touch-point, or stimulus to conversation between viewers{{1}}?

Current content discovery heuristics tend to rely on user profiles and broadcaster-provided metadata, without necessarily taking into account the context and quality of interactions between users.

The hypothesis of this research project is that metadata derived from conversations between viewers via SocialTV can provide a crucial additional component to support the interactional possibilities of SocialTV.

To test this, an experimental scenario will be developed to involve concurrent, co-present viewers of a TV programme in a public multimedia chat system, designed to elicit metadata from their conversation. The transcript of their interactions will then undergo Conversation Analysis. This analysis will provide a baseline for evaluating the extent to which conversations around a SocialTV experience can be correlated with a detailed and highly granular ‘top down’ semantic annotation of a TV programme.
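To make the idea of correlating conversation with annotation a little more concrete, here is a minimal sketch in Python. It is not Conversation Analysis, which is a qualitative method, and it is not a design I have settled on. It is only a crude quantitative proxy, assuming that chat messages are timestamped against the programme and that the ‘top down’ annotation takes the form of time-coded segments with a set of terms each. The names `Segment`, `Message`, `tokens` and `overlap_by_segment`, and the sample data, are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A hypothetical 'top down' annotation for one stretch of the programme."""
    start: float  # seconds from programme start
    end: float
    terms: set


@dataclass
class Message:
    """A hypothetical chat message, timestamped against the programme."""
    time: float
    text: str


def tokens(text):
    """Split on whitespace, strip basic punctuation, and lowercase."""
    return {w.strip(".,!?;:'\"").lower() for w in text.split()} - {""}


def overlap_by_segment(segments, messages):
    """Jaccard overlap between each segment's annotation terms and the
    words used in messages sent while that segment was on screen."""
    scores = []
    for seg in segments:
        spoken = set()
        for msg in messages:
            if seg.start <= msg.time < seg.end:
                spoken |= tokens(msg.text)
        union = seg.terms | spoken
        scores.append(len(seg.terms & spoken) / len(union) if union else 0.0)
    return scores


# Invented sample data, purely for illustration.
segments = [
    Segment(0, 60, {"detective", "london"}),
    Segment(60, 120, {"car", "chase"}),
]
messages = [
    Message(10, "Is that London?"),
    Message(70, "This reminds me of my holiday"),
]
scores = overlap_by_segment(segments, messages)
print(scores)
```

A low score for a segment would suggest that viewers were talking about something other than what the annotation describes, which is the kind of divergence this project is interested in. Simple word overlap ignores synonyms, sarcasm and turn-taking, so it could only ever complement the qualitative analysis.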

An analysis of the data may also be used to test several related hypotheses:

that conversational metadata are likely to have more divergent subject matter and more external reference than a priori programme data about actors, characters and plot developments.
that conversational metadata are likely to be more responsive to ways in which the context of the conversation changes{{3}}.
Social TV Research Context – constrained to areas relevant to this project

So in terms of my earlier exploration of SocialTV as a research context, I can start to narrow my focus onto a few areas.

To achieve the study outlined above, I am proposing to deploy a multimedia discussion interface that will allow co-present concurrent viewers of a TV programme to interact and converse as freely as possible. Although it may have significant design issues, the state of the art in this context is definitely the ‘hot topic’ of Second Screen/Companion Devices. This is not the central point of the study, however, so I will be approaching this part of the project as a design process – building on prior art – which is abundant at the moment – and iterating out a customised version as quickly as possible to find something basic that works well enough for me to get the conversational data I’m looking for.

The other research objectives in the slide above are already in order of priority: eliciting and capturing discussion is the most pressing need for the system. Other functions and features are interesting, but probably out of scope for the time being. However, I might well post some ideas to this blog about how conversational metadata might underpin new approaches to searching, filtering, annotating and segmenting video.

[[1]] There’s still the thorny issue of what determines ‘relevant’ in this context. There may be perceptual studies comparing recommendation engines or other ways of trying to determine relevance of metadata statistically. However, for the purpose of this study, the question is moot. Relevance, especially in terms of conversational metadata, is subjective unless based on evidence of how it is used to broker or support an interpersonal interaction via SocialTV.[[1]]

[[2]] This is also core to Notube’s aims and methods: using highly granular metadata about programmes to create ‘serendipitous’ trails through content that can break through the sameness of personalised recommendation systems, which tend to channel users into relatively static, homogeneous clusters of content.[[2]]

[[3]] For example, a programme might be broadcast 50 times over 15 years. Conversational metadata associated with the programme might change significantly over time and in the different contexts in which it is watched, accumulating qualified layers of annotations, whereas broadcaster-provided metadata is likely to remain static.[[3]]
