Msc

3 Representations of Dr Who

Three representations of Dr Who? Script, RDF and Chat

I have three representations of Dr Who S4E1 sitting in front of me:

  1. A semantic annotation of the episode based on the BBC Stories Ontology, both by Michael O. Jewell, Paul Rissen, and Toby Harris{{1}}.
  2. The script for the episode, by Russell T Davies.
  3. A transcript of a couple of very rowdy screenings of the episode I organised at The People Speak HQ during which people heckled at the screen using short messages, images and video.

What’s hurting my brain at the moment is a question of representation. In this triple, if ‘represents’ is the predicate, which is the subject and which is the object?

  • Is the Semantic annotation a representation of Dr Who S4E1: Partners in Crime the TV show, or is it a representation of the experience and interpretation of the person watching and annotating it? Or both?
  • In the same way, is the transcript of the conversation a representation of people’s experience of watching the episode and making social sense of it together, but with a lot more context?
  • Is the episode itself a representation of the shooting script?

Which philosophical texts can I turn to to help me make sense of this?

But most crucially (for my purposes), how can I best understand the similarities and differences between 1 (the semantic annotation) and 3 (the conversational transcript)?

I had a few ideas about this, mostly based on text-mining the conversation transcript via concept-extraction services such as LUpedia or Alchemy API to see if snatches of conversation can identify related entities within the annotation’s timeline, but feedback from the wonderful Jo Walsh was sceptical of this approach.

Basically, her critique was that

  1. Using text-mining/concept extraction privileges text, whereas the heckle stream is very visual, and that seems important.
  2. Entity-recognition/tagging services will yield a very variable quality of metadata. They’re designed to look for something specific in text and match it, and tend to require quite a bit of context (more than 140 characters of text)
  3. Asking the question “to what extent can this be considered metadata” will get very inconclusive answers, which will question the point of asking the question in the first place.

I think I agree with point 3 – which questions the point of this blog post, but I think I still need some kind of bottom-up analysis of the relatedness of the data, and although I’d like to just disregard the slightly solipsistic question of what is representing what, it would be nice to be able to attribute any philosophical assertions to someone other than myself!

[[1]] Here’s the OWL for the episode. Here’s the n3 formatted annotation of the episode [[1]]

Conversational Scenario Design


This scenario is designed to elicit and capture conversation between a group of people who are watching a specific episode of Dr. Who together.

The aim is to be able to compare existing formal metadata for this episode with this speculative ‘conversational metadata’, and evaluate it as an alternative representation of the same media object: Dr Who, Season 4, Episode 1, Partners in Crime.

The Setup

Two groups of eight people are invited to watch an episode of Dr Who together on a large screen, during which they use their laptops and a simple text/image/video annotation interface to type short messages or send images onto the screen, where they are visible as an overlay on top of the video of Dr Who.

The room is laid out in a ‘living room’ arrangement to support co-present viewing and interaction between participants, with comfortable seating arranged in a broad semi-circle, oriented towards a large projected video screen about ten feet away. Each participant is asked to bring their own laptop, tablet PC, or other wifi-enabled device with a web browser.

After making sure that all participants are on the network, there is an introductory briefing where they are given a presentation explaining the aims of the project and that they are free to walk around, use their laptops or just talk, and help themselves to food and drink during the screening.

The Annotation Tool

The system that the participants are using on their laptops/tablets or mobile phones has a simple web-based client, enabling viewers to choose a colour to identify themselves on the screen, and then type in 140 characters of text or search for images and video, before sending them to the main screen.

Users are asked to choose a colour
The ‘red’ user’s annotation interface with image search
Search results for ‘knitted adipose’ before posting to screen

The Display Screen

The video of Dr Who is projected on a ‘main’ screen, alongside text, images and video clips sent by viewers in a fullscreen browser window. The images and videos sent by users have a coloured outline, and text-bubbles are coloured to indicate who posted them.

Dr Who layered with text, image and video annotations.

Images and videos run underneath the video in a ‘media bar’, while text bubbles posted by users drop onto the screen in random positions, but can be re-arranged on the screen or deleted by a ‘facilitator’.

Rationale

This ‘conversational scenario’ is a hybrid of various methods in which researchers have contrived situations to elicit data from participants. Before making any claims about the data gathered, some clarification of the purpose and methods of the scenario is necessary.

Ethnographic Studies of Social TV have tended to use audiovisual recordings of TV viewers in naturalistic settings as their primary source, and analytical methods such as Conversation Analysis and participant observation have been used to deepen their understanding of how people use existing TV devices and infrastructures in a social context.

HCI approaches to designing Social TV systems have built novel systems and undertaken user testing and competitive analysis of existing systems in order to better understand the relationship between people’s social behaviours around TV, and the heuristics of speculative Social TV{{1}} devices and services.

Semantic Web researchers have opportunistically found ways to ‘harvest’ and analyse communications activity from the Social Web, as well as new Social TV network services that track users’ TV viewing activity as a basis for content recommendations and social communication.

All of these approaches will be extremely useful in developing better conversational annotation systems, and improving understanding and design of Social TV for usability, and for making better recommendations.

Although the conversational scenario described borrows from each of these methods, its primary objective is to gather data from people’s mediated conversations around a TV in order to build a case for seeing and using it as metadata.

System design, usability, viewer behaviour, user profiles, choices of video material, and the effect those issues have on the quality and nature of the captured metadata are a secondary concern to this first step in ascertaining whether conversations can be captured and treated as metadata pertaining to the video in the first place.

[[1]]I am using the term Social TV, following one of the earliest papers to coin the phrase, by Oehlberg et al. (2006), to refer to Interactive TV systems that concentrate on the opportunities for viewer-to-viewer interaction afforded by the convergence of telecoms and broadcast infrastructures. Oehlberg, L., Ducheneaut, N., Thornton, J. D., Moore, R. J., & Nickell, E. (2006). Social TV: Designing for distributed, sociable television viewing. Proc. EuroITV (Vol. 2006, pp. 25–26). Retrieved from http://best.berkeley.edu/~lora/Publications/SocialTV_EuroITV06.pdf [[1]]

Conversational Annotation

Annotation of a conversation would usually be a post-hoc chore undertaken by someone charged with watching a documentary or ethnographic video and ‘making sense’ of the diffuse multifariousness of a conversation. Heckle’s approach, in which each visual/textual interjection might be used as an annotation, attempts to turn annotation into an augmentation of the experience of the conversation. Because it is concurrent and live, the participants who heckle may notice and incorporate all kinds of contextual markers outside the view of the video camera, as well as bring their own diverse interpretations and experiences into the heckled conversation.

Most crucially, the variety of ways that people use the Heckle system mirrors the diversity of people’s verbal and non-verbal contributions to the live conversation. The stream of images, video, text and links that result can be seen as a parallel conversation that ‘annotates’ the conversation around the Talkaoke table, but also interacts with it in real time: the representation of the conversation itself becomes conversational.

Research Strategy

There are so many questions to be asked about this approach: about the user interfaces, about how and whether Heckle does really ‘augment’ the experience, lead to further engagement, and how it influences people’s interpretation and behaviour. However, with the time and resources available at this stage, the goals will have to be very limited and specific.

My research task at hand is to enquire about this ‘conversational metadata’: what is it? To what extent can it be considered ‘metadata’? What objects does this metadata relate to: the conversation around the Talkaoke table, or the people Heckling? And to what extent does it correlate (or not) with other forms of annotation and representation of these objects?

Asking this question will involve re-purposing the Heckle system to create scenarios in which this correlation can be measured.

To be specific, rather than using Heckle to annotate a live conversation around the Talkaoke table, I will be using it to annotate a group of people watching Dr. Who Season 4 Episode 1, ‘Partners in Crime’.

This is an opportunistic choice of programme, suggested to me by Pat Healey because he happens to have supervised my MAT colleague Toby Harris on the BBC Stories project, in which Toby worked with Paul Rissen and Michael Jewell to annotate this episode of Dr. Who in an exemplary ‘top-down’ fashion, developing and then using the BBC Stories ontology.

The plan, then, is to gather a group of people to sit and watch TV together, and to provide them with The People Speak’s Heckle system as a means of interacting with each other, layered on top of the Dr Who video. The resulting conversational metadata can then be compared to the detailed, semantic annotations provided by the BBC Stories project.

Evaluation Strategy

There are a number of possible methods to use to make this comparison, although it will be hard to tell which to use before being able to look at the data.

It may be useful to simply look at mentions of characters, plot developments, and other elements in the BBC Stories ontology, and see whether they appear at equivalent moments in the BBC Stories annotation and the heckled conversation. A basic measurement of correlation could be gathered from that kind of comparison, and would indicate whether the two forms of metadata are describing the same thing.
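
As a rough illustration of this kind of comparison, here is a minimal sketch in Python. It assumes both data sets have already been reduced to timestamped entity mentions; the `Mention` type, the 30-second window and the example data are all invented for illustration, and the score is a crude overlap measure rather than a statistical correlation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    time: float   # seconds from the start of the episode
    entity: str   # normalised label, e.g. "adipose" (hypothetical labels)


def matched_fraction(heckles, annotations, window=30.0):
    """Fraction of heckle mentions that have an annotation of the same
    entity within `window` seconds. The window size is an assumption."""
    if not heckles:
        return 0.0
    hits = 0
    for h in heckles:
        if any(a.entity == h.entity and abs(a.time - h.time) <= window
               for a in annotations):
            hits += 1
    return hits / len(heckles)


# Invented example data, not taken from the real annotation or transcript.
annotations = [Mention(60, "donna"), Mention(300, "adipose"), Mention(900, "doctor")]
heckles = [Mention(70, "donna"), Mention(310, "adipose"),
           Mention(500, "sofa"), Mention(1500, "doctor")]

score = matched_fraction(heckles, annotations)
print(score)
```

Heckles that never match (such as the invented "sofa" mention here) are not noise: they are candidates for the context-specific annotations discussed next.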

Similarly, it might be useful to demonstrate the differences between the conversational metadata and the BBC Stories version by looking for conversational annotations that relate to the specific context of the experience of watching the episode: the space, the food, the sofa. These would (of course) be absent from the BBC Stories annotation.

However, there is another strategy, which Toby Harris and I concocted while playing with LUpedia, a semantic ‘enrichment’ service that takes free text, attempts to match it with semantic web resources such as DBpedia, and returns structured metadata.

If it were possible to feed LUpedia the BBC Stories ontology, and then feed it the annotation of the episode of Dr Who in question as a dataset, it should be possible to submit people’s heckles to it, and see if LUpedia returns relevant structured data.

If LUpedia can enrich people’s Heckles with metadata from the BBC Stories dataset, that should indicate that the heckles are pertinent to the same object (in this case, the episode of Dr Who), and might therefore be seen as conversational metadata for it{{1}}.
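
To make the idea concrete without relying on any particular service, here is a toy stand-in for the enrichment step: a plain dictionary lookup of labels in heckle text. This is not LUpedia's API or matching method; the `LABELS` table and the `ex:` identifiers are hypothetical placeholders for whatever labels and URIs the BBC Stories dataset actually contains.

```python
import re

# Hypothetical label -> identifier table. In practice this would be
# derived from the labels in the episode's annotation data.
LABELS = {
    "donna noble": "ex:DonnaNoble",
    "donna": "ex:DonnaNoble",
    "adipose": "ex:Adipose",
    "the doctor": "ex:TheDoctor",
}


def enrich(heckle, labels=LABELS):
    """Return the sorted identifiers whose labels appear as whole words
    or phrases in the heckle text (case-insensitive)."""
    text = heckle.lower()
    found = set()
    for label, identifier in labels.items():
        if re.search(r"\b" + re.escape(label) + r"\b", text):
            found.add(identifier)
    return sorted(found)


print(enrich("Knitted Adipose! Donna would love one"))
print(enrich("pass the crisps"))
```

A heckle that yields no identifiers is not necessarily irrelevant: as Jo Walsh's critique above suggests, short texts and images will often defeat this kind of matching.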

[[1]]My conversational metadata will probably also describe the interactional experience of watching the show, and other contextual references that will be absent from the BBC Stories annotation. However, it is important to show that the two types of metadata relate to at least one of the same objects. If this is not demonstrable, it does create some ‘fun’ philosophical problems for my research, such as what conversational metadata *does not* refer to. That one might be harder to answer.[[1]]

 

Conversational Metadata

In my last post about the Social TV research context I explained that I decided to focus on generating and evaluating conversational video metadata by eliciting mediated conversations through SocialTV.

Before drilling down into that choice to distill a set of research questions, however, there’s a more basic question about the context of research into SocialTV metadata: why gather SocialTV metadata at all?

A slide showing a graphic from one of the Notube.tv project’s presentations

The Notube project has a diagram on this issue, which shows the progression of a piece of TV content along a timeline from pre-broadcast to media archive. The assertion is that having more metadata about user preferences enhances the value of TV content because it provides many more opportunities to recommend programmes. Although the Notube project adds a great deal to existing research on recommendation systems, Notube’s central focus follows most existing research in focusing on developing more accurate and sophisticated user profiles (in this case through aggregating media consumption habits from heterogeneous sources on the web).

The way the diagram is re-used in my slide above emphasises the corollary of the point that it is intended to illustrate: that having more TV programme metadata would also enhance the frequency and accuracy of recommendation for content (and thereby its value) throughout its lifecycle {{2}}.

If metadata (about profiles or programme content) can enhance the value of a TV show and facilitate production, discovery and delivery of TV programmes because it increases the likelihood of that programme being recommended, then it should follow that the greater the richness and referential diversity of that metadata, the more ‘recommendable’ the programme becomes.

This suggestion presupposes that the metadata in question is relevant, rather than a random spamming of references, intended to maximise recommendation in all possible contexts. So the question then becomes: what determines relevance in this context, and, even more importantly, relevance to what?

It may seem self-evident that the metadata about users should be relevant to their consumption habits, and that these habits, recorded and aggregated, should constitute their preferences or dispreferences. Even if this widespread assumption holds true, what should programme metadata relate to? Are broadcasters’ assertions about their programmes necessarily relevant to the way those programmes are discovered and interpreted? And crucially, in SocialTV – which almost by definition is about the interaction between viewers brokered via a networked TV infrastructure – how do those metadata correlate with the way TV programmes are used as a prop, touch-point, or stimulus to conversation between viewers{{1}}?

Current content discovery heuristics tend to rely on user profiles and broadcaster-provided metadata, without necessarily taking into account the context and quality of interactions between users.

The hypothesis of this research project is that metadata derived from conversations between viewers via SocialTV can provide a crucial additional component to support the interactional possibilities of SocialTV.

To test this, an experimental scenario will be developed to involve concurrent, co-present viewers of a TV programme in a public multimedia chat system, designed to elicit metadata from their conversation. The transcript of their interactions will then undergo Conversation Analysis. This analysis will provide a baseline for evaluating the extent to which conversations around a SocialTV experience can be correlated with a detailed and highly granular ‘top down’ semantic annotation of a TV programme.

An analysis of the data may also be used to test several related hypotheses:

  • that conversational metadata are likely to have more divergent subject matter and more external reference than a-priori programme data about actors, characters and plot developments
  • that conversational metadata are likely to be more responsive to ways in which the context of the conversation changes{{3}}.

Social TV Research Context – constrained to areas relevant to this project

So in terms of my earlier exploration of SocialTV as a research context, I can start to narrow my focus onto a few areas.

To achieve the study outlined above, I am proposing to deploy a multimedia discussion interface that will allow co-present concurrent viewers of a TV programme to interact and converse as freely as possible. Although it may have significant design issues, the state of the art in this context is definitely the ‘hot topic’ of Second Screen/Companion Devices. This is not the central point of the study, however, so I will be approaching this part of the project as a design process – building on prior art – which is abundant at the moment – and iterating out a customised version as quickly as possible to find something basic that works well enough for me to get the conversational data I’m looking for.

The other research objectives in the slide above are already in order of priority: eliciting and capturing discussion is the most pressing need for the system. Other functions and features are interesting, but probably out of scope for the time being. However, I might well post some ideas to this blog about how conversational metadata might underpin new approaches to searching, filtering, annotating and segmenting video.

[[1]] There’s still the thorny issue of what determines ‘relevant’ in this context. There may be perceptual studies comparing recommendation engines or other ways of trying to determine relevance of metadata statistically. However, for the purpose of this study, the question is moot. Relevance, especially in terms of conversational metadata, is subjective unless based on evidence of how it is used to broker or support an interpersonal interaction via SocialTV.[[1]] [[2]] This is also core to Notube’s aims and methods: using highly granular metadata about programmes to create ‘serendipitous’ trails through content that can break through the sameness of personalisation recommendation systems that can tend to channel users into relatively static, homogenous clusters of content. [[2]] [[3]]For example, a programme might be broadcast 50 times over 15 years. Conversational metadata associated with the programme might change significantly over time and in the different contexts in which it is watched, accumulating qualified layers of annotations, whereas broadcaster-provided metadata is likely to remain static.[[3]]

How relevant are user profiles in SocialTV?

An illustrative slide of The Beancounter, by the Notube Project

As Libby Miller’s presentation of the Notube project’s ‘Beancounter’ profiling engine points out, there are considerable privacy concerns with user profiling in SocialTV. To what extent will users be willing to share data on their personalised viewing in order to benefit from IPTV services?

Despite privacy concerns, much of the research into SocialTV focusses on
developing and using user profiles to provide more personalised TV services.

You don’t tend to login when watching TV

Many iTV systems and research projects take their cue from (mostly)
single-user devices such as smartphones / tablets and computers, and focus
their innovative edge on providing services to users via ‘user profiles’.
However, because of social conventions of TV use (a shared room, a shared piece
of equipment, without a ‘login’ paradigm for use), it is a significant challenge
to find out which user in a household is watching TV, and provide appropriate
recommendations and other services (Yu, Zhou, Hao, Gu, 2006) {{1}}.

This presents practical and theoretical issues for providing single-user and
group-centric services via iTV, for which a variety of approaches are being
adopted, including household and group profile aggregation (Yu et al., 2006),
(Shin & Woo, 2009){{2}} and context profiling (identifying who is likely to watch
at different times of day/days of week) (Vildjiounaite, Kyllanen, Hannula &
Alahuhta, 2008) {{3}}.
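
To give a sense of what ‘profile aggregation’ can mean at its simplest, here is a sketch that averages per-genre weights across the viewers in a household. This is a deliberately naive illustration and not the merging method of any of the papers cited above; the profile format (genre name mapped to a weight) and the example values are assumptions.

```python
from collections import defaultdict


def merge_profiles(profiles):
    """Average per-genre weights across viewers.
    A genre missing from a viewer's profile counts as weight 0."""
    if not profiles:
        return {}
    totals = defaultdict(float)
    for profile in profiles:
        for genre, weight in profile.items():
            totals[genre] += weight
    return {genre: total / len(profiles) for genre, total in totals.items()}


# Invented example profiles for two viewers sharing one TV.
household = merge_profiles([
    {"sci-fi": 1.0, "news": 0.5},
    {"sci-fi": 0.5, "drama": 1.0},
])
print(household)
```

Even this toy version shows the problem raised below: the merged profile describes nobody in particular, and it says nothing about who is actually in the room or what they are doing together.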

Recent research has developed more complex and multi-dimensional methods for
collecting very detailed logging data in order to identify groups and group
behaviour (for example, channel hopping) and characterise group dynamics (as
homogeneous, e.g. a group of friends, or heterogeneous, e.g. a family) before
applying recommender systems (Sotelo, Blanco-Fernandez, Lopez-Nores,
Gil-Solla, Pazos-Arias, 2009) {{4}}.

But just as the social dynamic of a group may have a huge impact on the kinds
of recommendations and services that are appropriate in different contexts,
individuals may be just as complex and variable in their tastes depending on
the context and particularly on the interactions they are engaged in at the
time. Furthermore, by dint of their interactions (whether co-present or remote
via social networks or chat), viewers of iTV may be seen as a constantly
remotely connected group or series of groups – both heterogeneous and
homogeneous – that viewers drop into and out of depending on their
communications activity.

In this case, where the make-up of groups and individuals is constantly
shifting, the pre-selection of content by user-profiles may become an obstacle
to the fluidity and fluency of the ways people deploy and share iTV as a means
of interacting with each other.

The research aim of this project is to investigate how people deploy
the components of Social TV in order to interact. Therefore, rather
than looking at users’ choices and behaviours as a way of understanding and
profiling them, this project looks at how users interact with each other as a
way of understanding the media they deploy and the systems they use and adapt
to do so.

From this perspective, the user profile as a way of understanding the user becomes a secondary concern because the user does not ‘deploy’ their profile: it is built up around them, based on data gathered from their interactions with content, networks, services and other users, using predetermined a-priori heuristics.

If the question of interactive TV is about how users interact with each other, rather than how they interact with the TV, then the crucial elements to understand are the fluency and subtlety of the interfaces they have to each other, and how accessible and readily usable the components of iTV can be in their conversations. How readily can viewers find and manipulate media that they want to use to participate in an interaction? How specific can they be about a piece of media or a sub-section of a piece of media that they’re sharing or commenting on? How are their conversations brokered? And what can their interaction tell us about the media they’re using in order to express themselves and interact?

[[1]] Yu, Z., Zhou, X., Hao, Y., & Gu, J. (2006). TV Program Recommendation for Multiple Viewers Based on user Profile Merging. User Modeling and User-Adapted Interaction, 16(1), 63-82. doi: 10.1007/s11257-006-9005-6. [[1]] [[2]]Shin, C., & Woo, W. (2009). Socially aware tv program recommender for multiple viewers. IEEE Transactions on Consumer Electronics, 55(2), 927-932. doi: 10.1109/TCE.2009.5174476.[[2]] [[3]]Vildjiounaite, E., Kyllanen, V., Hannula, T., & Alahuhta, P. (2008). Unobtrusive Dynamic Modelling of TV Program Preferences in a Household, 82-91. [[3]] [[4]] Sotelo, R., Blanco-Fernandez, Y., Lopez-Nores, M., Gil-Solla, A., & Pazos-Arias, J. (2009). TV program recommendation for groups based on multidimensional TV-anytime classifications. IEEE Transactions on Consumer Electronics, 55(1), 248-256. doi: 10.1109/TCE.2009.4814442.[[4]]

Social TV Research Context

Outline of the Social TV research context

iTV has had a huge impact on all areas of television research, from production and delivery to discovery and enrichment of television content and communications.

The delivery of TV media, as discussed in an earlier post, is no longer limited to one, static device in the home. Set top boxes (STBs) from Sky or BT, Over The Top (OTT) video / media services like YouTube or Vimeo, and the increasing use of smartphones, PSPs and other mobile devices have fragmented the experience of TV viewing into a multitude of discrete media delivery contexts.

For example, this video by the German design firm syzygy shows a consumer-centric overview of the technologies and interaction scenarios currently available in iTV. Simply put: the user controls the TV with a fancy remote (based on an iPad or tablet), which synchronises content, games and communications between ‘friends’, with product placement and advertising.

Another significant area of research in iTV is the use of network communications technology to enhance the usability of Electronic Programme Guides (EPGs), which, faced with ever increasing amounts of available content, present significant challenges to broadcasters and consumers to highlight and recommend appropriate content. More sophisticated, browser-like iTV interfaces and devices have opened up the possibility of using techniques for content navigation and selection derived from social networks and other web services.

Vidque – an example of an innovative iTV content discovery service

For example, vidque.com uses a twitter-like user interaction model of users’ personal ‘streams’ of content (in twitter, 140 characters of text), to share videos. By watching and posting videos, each person’s viewing habits become a channel, which other users can watch and ‘follow’, providing a more personalised and network-centric alternative to the a-priori genre or time-based list format of most EPGs. Vidque incites users to ‘curate’ their streams of videos, expressing something about their tastes and interests, creating another mechanism for them to project their identity into the social network through their choice of media.

iTV has only recently started to suggest new business models for how media production is funded. For example, independent producers such as Robert Greenwald, whose politically charged documentaries have found it hard to gain funding from traditional studios or broadcasters, have used the web to connect directly with media consumers who are happy to ‘pre-buy’ the film on DVD to fund its production. Many more of these ‘crowd funded’ projects are being produced in this way via services such as Kickstarter.com.

Resonant Object (ARG) and Where are the Joneses (UGC)

As well as funding, iTV has also enabled other, more participatory ways for viewers to take part in TV production. For example, Resonant Object is a new science fiction TV series that involves viewers in an expanded narrative via an Alternate Reality Game (ARG), played out across various channels. Their involvement may extend from sitting in front of a TV to participating in a mobile phone-based game or even attending live events and contributing audiovisual content to what becomes part of the TV show.

Where are the Joneses was a pioneering project by David Bausola at Imagination, which got a brand (Ford in this case) to fund a soap opera which was scripted collaboratively by viewers of the show, participating on a public wiki. Their scripts were then produced by a professional crew and distributed via YouTube.

The fact that a brand like Ford funded Where are the Joneses demonstrates that this is a viable and potentially very disruptive business model. Rather than paying an advertising agency to do product placement in an established TV show or film, Ford directly commissioned the use of existing web platforms (wikidot, wordpress and youtube) to enable the voluntary user-generated scripting of the show, and funded the professionals to produce it, demonstrating how the advertising and broadcasting industry could be disintermediated by this kind of TV production process.

More generally, the infrastructure of iTV lends itself more to a network paradigm of communication than traditional broadcast. Opportunities abound for the enrichment of an expanded TV experience, in which viewers, broadcasters and other interest groups can participate in ongoing conversations about media. Opening up the mass of these conversations about TV (via social media and other networked communications channels), which have always been an important part of the viewer experience, drives the changes to how TV and media are being produced, discovered and delivered.

Soundcloud – a great example of enriching content through user interaction

A recent example of how facilitating people’s interactions with each other can enrich understanding of media is the Berlin-based start-up soundcloud. It was started by music professionals who had been used to sending each other audio files by email, and were dissatisfied with the inability to discuss and enrich the media they were sending each other in any level of detail.

In the slide above you can see that comments on the audio track are embedded in the waveform of the audio. These comments actually segment the track into sections that can be seen to relate to each comment, giving a much more granular set of annotations than is possible via email or on other media sharing platforms such as youtube, where comments are attached to entire media files.
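
One simple way to picture this segmentation is sketched below. It assumes each comment carries a timestamp and that a comment's section runs until the next comment (or the end of the track); that rule is my own simplification for illustration, not a description of how soundcloud actually stores or displays comments, and the example comments are invented.

```python
def segment_by_comments(comments, duration):
    """Turn timed comments into (start, end, text) sections.

    comments: list of (time_in_seconds, text) pairs, in any order.
    duration: total length of the track in seconds.
    Assumption: each section lasts until the next comment begins.
    """
    ordered = sorted(comments)
    segments = []
    for i, (start, text) in enumerate(ordered):
        end = ordered[i + 1][0] if i + 1 < len(ordered) else duration
        segments.append((start, end, text))
    return segments


# Invented example comments on a three-minute track.
sections = segment_by_comments(
    [(95, "love this drop"), (12, "nice intro")],
    duration=180,
)
print(sections)
```

The same structure would apply to heckles on a video timeline: each interjection implicitly marks out a stretch of the programme that it is ‘about’.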

A quick comparison of the quality and tone of comments on soundcloud and youtube could suggest that having to take the time to comment on a specific segment of an audio file encourages more context-relevant and incisive discussion (or discourages other kinds of comments and flames). This ‘in-content commentary’ mechanism also provides much more metadata about what the commenter is talking about, because it relates to a specific segment within the media file, which may have a very different meaning or reference than the rest of the track.

Soundcloud is a great example of how people can be enabled to mobilise networked media in order to interact with each other, and how that interaction can be harnessed to enrich the experience and annotate the media being discussed.

So much of the effort in iTV research seems focused on trying to understand and profile TV viewers to then try to guess at and satisfy their established tastes. Soundcloud, instead, focuses on stimulating people’s interactions, using media (text/comments), via another piece of media (an audio file), as a way of understanding that media, and enhancing that media in terms of how it can be indexed, discovered, shared and re-used for further interactions.

This approach, which seems comparatively under-researched, highlights another possibility: that of investigating people’s conversations by looking at how they deploy, share and interject text, images, videos etc. in their interactions around other media (e.g. TV).

That is the aim of this research project: to investigate mechanisms and heuristics for harnessing discussion between viewers of iTV, as a way of understanding the media they are experiencing and interacting around, and at the same time, looking at how they are deploying media to interact with one another as a way of understanding their conversations.

In short: understanding media through how it’s used in conversation, and conversation through how it uses media.

Social Film Club

Social Film Club project introduction

I’m working at British Telecom’s research centre at Adastral Park on a project called ‘Social Film Club’, with a fantastic group in the Future Applications team, who are all developing new ideas for services and products made possible by the merging of television and telecommunications infrastructures.

‘Social Film Club’ refers to a whole set of ideas and technologies that BT have already been working with. It builds on work they’ve already been doing with ‘Social TV’, a catch-all term to describe Interactive TV (IPTV) systems that support and extend the sociable aspects of TV viewing {{1}} – both co-present and remotely.

Since both media and communications now use the same IP network infrastructure, television viewing itself is no longer a clearly distinct activity with a specific domestic device and context. When people say they are ‘Watching TV’ they may mean that they are using a specific piece of hardware (‘The TV’ may refer to the largest screen in the house, although the same video content may be accessible on smartphones, PCs and tablets). They may be talking about engaging in a set of relationships defined by rights and contracts (for example, a TV subscription to a specific company with rights to broadcast live sports events){{2}}, and the act itself may involve the interaction of a variety of communications devices and services beyond the time-honoured combination of TV, remote and sofa.

This lack of clarity as to the status and activity of television in a networked media environment coincides with an often-cited ‘fragmentation’ of the ‘traditional’ media environment {{3}}, in which the mass viewership used to be served by a tightly integrated media industry where infrastructure and content production, promotion, delivery and evaluation were all orchestrated by a limited number of organisations.

Cultural ‘reception studies’ of TV soap operas prior to this so-called fragmentation highlighted the social function of TV, such as soap operas being used as a touch point for interpersonal conversations {{4}}, or as a means of social group formation and identification (or counter-identification) {{5}}.

The loss of a constitutive, binding ‘sameness’ in what people watch, which once provided a foil for collective conversation and identity, may have contributed to the kinds of social and cultural isolation Putnam discusses in ‘Bowling Alone’. But one of the promises of Social TV is that IP communications have created new ways for people to interact with each other through and with television.

So the broad subject of my research is how the infrastructure of IPTV can be mobilised for interpersonal and group interaction.

[[1]] For early examples of use of the term, see Coppens, T., Trappeniers, L., Godon, M.: AmigoTV: towards a social TV experience. In: Proc. EuroITV 2004, U. of Brighton (2004), and Oehlberg, L., Ducheneaut, N., Thornton, J.D., Moore, R.J., Nickell, E.: Social TV: Designing for Distributed, Sociable Television Viewing. In: Proc. EuroITV 2006, Athens U. of Economics and Business (2006) 251–259 [[1]]
[[2]] For example, using the BBC iPlayer to watch ‘catch-up’ TV does not require a ‘TV licence’, whereas watching a simultaneous broadcast requires that viewers buy one or face a £1000 fine. (see http://iplayerhelp.external.bbc.co.uk/help/playing_tv_progs/tvlicence) [[2]]
[[3]] See Putnam, Robert. 2000. Bowling Alone. New York: Simon and Schuster. [[3]]
[[4]] For example, in Sonia Livingstone’s viewer surveys, something like 40% of participants reported that their soap watching was to provide a regular topic of conversation with friends: Livingstone S M (1988) Why People Watch Soap Operas: An Analysis of the Explanations of British Viewers. European Journal of Communication Vol 3 #1. London: Sage [[4]]
[[5]] Ien Ang’s famous studies of Dutch watchers of the American soap ‘Dallas’ suggested that rather than identifying with the glamorous lifestyle of Texas oil billionaires depicted in the soap, Dutch viewers actually watched while engaging in a form of collective critical distancing. Ang I (1985) Watching Dallas: Soap opera and the melodramatic imagination. London: Methuen [[5]]

Can you have a conversation with your TV?

In the Internet of Things, what kind of thing is your TV? And what kind of thing are you?

Researchers designing ‘Social TV’ used conversation analysis to look at how people interact while watching TV (Oehlberg, Ducheneaut, Thornton, Moore, Nickell, 2006){{1}}. They found that viewers integrated the TV into the conversation, making space for it to ‘take turns’ in much the same way as any other conversationalist.

Although the TV is traditionally quite a selfish conversationalist, speaking more than listening (unless you explicitly shut it up with the remote), the prospect of it becoming a more accommodating conversational participant as a networked device is intriguing. For example, when your Boxee remote app on your iPhone detects an incoming phone call, it will pause your TV. That’s not an especially complex interaction, but it points towards the possibility of the TV at least brokering conversation in a more fluid way by letting other devices (your iPhone in this case) participate in turn-taking.

It would be interesting to see whether a TV that politely paused itself when it detected a sufficiently high level of chatter in the room would encourage or discourage further communication. I suspect the former – like a schoolteacher adopting patient silence with pursed lips while the children settle down. It would be a fun thing to experiment with though.

But the real promise of networked TV is that it becomes more than just a turn-taker, and begins to participate more actively in the conversation.

Perhaps not quite as actively as that, but there are some amusing parallels between some of the proposals for future TV services and the TV-becoming-flesh in Cronenberg’s Videodrome.

For example, some of the most interesting social TV research I’ve found so far, namely the notube project, has looked at ways of leveraging network technologies, linked data and Semantic Web strategies for enriching TV viewers’ experience.

Their ‘beancounter’ application uses a variety of sources, including social network platforms, to aggregate data about what you’re watching and create a detailed, machine-readable profile of your habits. This profile can then be used to generate better recommendations, or even help to inform and improve your experience of viewing and discussing media. The really difficult bits of this problem, like trying to figure out what is actually being talked about, are dealt with gracefully, using ‘good enough’ systems that evaluate multi-lingual natural language text strings and suggest concepts that they may refer to. Ontotext’s LUpedia service provides this entity recognition function for notube.{{2}}

But in a home, where the TV is in a shared space, does the TV learn from each member of the family separately? From listening to discussions at BT, I’ve learned that the ‘problem’ of knowing who is watching the TV, in order to recommend relevant and appropriate content, is not going to be solved by having each watcher cumbersomely log in to the TV. Nor is it going to be entirely solved by logged-in or sensed second-screen companion devices (not everyone will have one). BT’s immediate strategy will apparently involve a more complex watershed, where what people watch at certain times of day will inform assumptions about who is watching, and what kinds of content to recommend. Communications companies just don’t have this level of access to monitoring our individual behaviours within the home, and there are probably serious privacy and consent implications that will be significant barriers to granting it to them.
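As a rough illustration of the ‘more complex watershed’ idea, the following Python sketch records what gets watched in each part of the day and uses that history to guess what to recommend at a given hour. It is only a sketch under stated assumptions, not BT’s actual strategy: the `DaypartProfile` class, its method names and the day-part boundaries (06:00, 09:00, 16:00, 21:00) are all invented for illustration.

```python
from collections import Counter, defaultdict


class DaypartProfile:
    """Per-household viewing history, bucketed by part of the day."""

    def __init__(self, boundaries=(6, 9, 16, 21)):
        # Hours at which a new day-part begins (hypothetical values).
        self.boundaries = sorted(boundaries)
        self.counts = defaultdict(Counter)

    def daypart(self, hour):
        """Return the index of the day-part that contains `hour` (0-23)."""
        passed = sum(1 for b in self.boundaries if hour >= b)
        # Hours before the first boundary belong to the overnight band,
        # which is the same band that starts at the last boundary.
        return (passed - 1) % len(self.boundaries)

    def record(self, hour, genre):
        """Note that something of this genre was watched at this hour."""
        self.counts[self.daypart(hour)][genre] += 1

    def likely_genres(self, hour, n=3):
        """Most-watched genres in the day-part containing `hour`."""
        return [g for g, _ in self.counts[self.daypart(hour)].most_common(n)]
```

The point of the sketch is that nobody has to log in: the assumption about who is watching is carried entirely by the time of day and the accumulated viewing history, which is also exactly why it can only ever be a guess.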

And anyway, I’m more interested in what kind of device the TV becomes when it learns from our collective viewing habits, aggregate viewing behaviours and networked discussions. Does the networked TV begin to develop a compound user profile of its own? A unique combination of a household’s various proclivities? Is it like the family dog, which everyone interacts with individually and collectively, and is then seen as having a personality, to some extent nurtured through this process?{{3}}

This brings me back to the idea of the TV as a conversational participant. If it can develop a profile, and start to build a model of the various areas of interest and domains of knowledge that a user is interested in, can it participate in a conversation in a more complex way than turn-taking? One of the key ideas of conversation analysis is the notion of ‘repair’, in which the contingent meanings of utterances between conversational partners are narrowed down and cross-checked for mutual comprehension through all kinds of gestural or verbal cues and repetitions.

Can the TV begin to engage in this level of conversation? Can its profile of established interests be used as a source of recommendations that might clarify a misunderstanding of something that has just been said, for example, correcting the misidentification of an actor by people chatting about what they’re watching together on Facebook?{{4}} Or could it relieve the co-watcher’s burden of responding to annoying whispered questions during films (‘why is she holding that chainsaw?’) by delving into more complex layers of in-programme dramaturgic metadata and providing some suggested explanations?{{5}}

Can the profiles of individual viewers and their shared TVs then be evaluated quantitatively for similarities over specific periods of time as a measure of the effectiveness of this kind of conversational grounding with various types of content and TV format?
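One simple way to picture such a quantitative comparison is cosine similarity between two interest profiles, sketched below in Python. This is an assumption on my part rather than a method taken from the research cited here: it supposes that a profile can be reduced to a mapping from topic to weight, and the function name and example topics are purely illustrative.

```python
import math


def cosine_similarity(profile_a, profile_b):
    """Cosine similarity between two {topic: weight} interest profiles.

    Returns 0.0 if either profile is empty or has only zero weights.
    """
    shared = set(profile_a) & set(profile_b)
    dot = sum(profile_a[t] * profile_b[t] for t in shared)
    norm_a = math.sqrt(sum(w * w for w in profile_a.values()))
    norm_b = math.sqrt(sum(w * w for w in profile_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical profiles for one viewer and the shared household TV.
viewer = {"sci-fi": 3.0, "cookery": 1.0}
household_tv = {"sci-fi": 2.0, "football": 4.0}
score = cosine_similarity(viewer, household_tv)
```

Computing a score like this for snapshots of the same profiles taken before and after a period of viewing would give one crude measure of whether viewer and TV are converging, though it says nothing about whether that convergence comes from anything resembling conversational grounding.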

And at what point do we reach the threshold of complexity, fluency and multi-valency beyond which these kinds of interactions with your TV can be thought of as a conversation?

[[1]] Oehlberg, L., Ducheneaut, N., Thornton, J. D., Moore, R. J., & Nickell, E. (2006). Social TV: Designing for Distributed, Sociable Television Viewing. In: Proc. EuroITV 2006. [[1]]
[[2]] I think of this as graceful because it addresses a hugely complex set of contingencies in a simple and contingent way, by issuing a query to a good enough service via a standard API, that does something useful, and assumes that in the future, when there is more linked, semantically enriched data, and more advanced inference services available, the API can just be plugged into those. [[2]]
[[3]] I’m aware that this idea that pets do not have Disney-like anthropomorphic personalities is not popular, especially with British people, and I’m not backing it up with anything other than my own supposition that this is the case, and that your animals would eat you in a second if they were hungry enough and you were incapable of defending yourself. [[3]]
[[4]] In the lingo of conversation analysis this would be called ‘self-initiated self-repair’ [[4]]
[[5]] Other-initiated self-repair [[5]]

Mystery Science Research Lab 2011

It’s my second day at BT’s Future Applications research lab, and I’m sat at my cool 1990s-style ovoid lozenge office desk, doing a broad web-survey of Social TV applications. I’ve just been given my hall pass that allows me in and out of the automatic security doors, each one bearing a sign (in the italic variety of the ubiquitous BT font) saying:

“Please keep noise to a minimum as you walk through this office”.

I’m snorting, wheezing and squeaking with suppressed laughter. The researcher sitting opposite me is giving me worried glances, and I can sense heads, peripherally visible over partitions and banks of computer screens, turning and peering in my direction.

While looking at some prototypes of ‘Avatar Party Mode’, a kind of ‘Virtual Sofa’ service available on the Sky Player for Xbox Live, I had made the mistake of putting on my headphones and calling up some episodes of one of my favourite late-20th-century late-night TV shows, Mystery Science Theater 3000.

The premise of Joel Hodgson’s MST3K, for the uninitiated, is that as part of an evil scientific experiment, ‘Joel Robinson’ and his robot sidekicks are trapped on a deep-space asteroid and forced to watch B-movies, where, represented as cinema-seated silhouettes at the bottom of the screen, they heckle and wisecrack their way through the film.

I am certainly not the first to make this association. Social TV researchers have turned to MST3K’s presentation style as a model for an acceptable user interface to indicate the co-presence of remote viewers and even enable chat-partner selection on the bottom third of a TV screen (Ducheneaut, Moore, Oehlberg, Thornton & Nickell, 2008){{1}}, or as an example of potential applications/services that invite viewers to overlay user generated content and republish personalised video streams over IP (Banerjee et al., 2002){{2}}. In cultural critique, film and television studies, MST3K has also been invoked as a perfect example of a ‘meta-show’, and used to illustrate how ironic re-appropriation of pop-cultural artefacts can express aesthetic dissent (King, 2007){{3}}.

The CollaboraTV project implemented this kind of interface as the premise for their ground-breaking Social TV application. They even did some viewer experience research contrasting this kind of ‘virtual audience’ interface with traditional text-chat underneath video playback, and found that the virtual audience increased audience engagement and enjoyment{{4}}.

In terms of interface, ‘user experience’, and its choice of B-movie ‘sociable’ media{{5}}, MST3K seems to offer a useful set of guidelines for Social TV design and research, primarily because the acceptance of its visual design and irreverent tone was established when it became a popular cult TV programme. But on a more abstract level, MST3K offers inspirational design patterns for Social TV because of how it constantly shifts focus between the viewer and the viewed, opening up endless imaginative, performative and conversational opportunities.

This shifting of focus foregrounds an aspect of television viewing that is often passed over by cultural critique of the ‘dumbing down’ of TV audiences (Bourdieu & Ferguson, 1998){{6}}. Ien Ang wrote about this in her often-cited book ‘Watching Dallas’{{7}}, where she argues that part of the enjoyment of watching the show for global (in her case Dutch) audiences, far from aspirational identification with the camped-up millionaire Texans, is a smug awareness that ‘other people’ are watching the show in earnest, but that for ‘us’, the show’s tastes and values are an object of collective ridicule. Bad TV, in this way, can be seen as a possible object of counter-identification, forming social groups of collective dislike.

Perhaps a successful design for Social TV could start with a ‘dislike’ button, and build its sociality on the collective activities of booing, heckling and throwing things at the screen.

[[1]] Ducheneaut, N., Moore, R., Oehlberg, L., Thornton, J., & Nickell, E. (2008). Social TV: Designing for Distributed, Sociable Television Viewing. International Journal of Human-Computer Interaction, 24(2), 136-154. doi: 10.1080/10447310701821426. [[1]]
[[2]] Banerjee, S., Brassil, J., Dalal, A., Lee, S.-J., et al. (2002). CDNs for personal broadcasting and individualized reception. In Proceedings of WCW. Citeseer. Retrieved March 29, 2011, from http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.7.7964. [[2]]
[[3]] King, J. (2007). Mystery Science Theater 3000, Media Consciousness, and the Postmodern Allegory of the Captive Audience. Journal of Film and Video, 59(4), 37-53. University of Illinois Press. [[3]]
[[4]] Harrison, C., & Amento, B. (2007). CollaboraTV: Using asynchronous communication to make TV social again. Adjunct Proceedings of EuroITV2007, 218–222. [[4]]
[[5]] In their paper cited above, Ducheneaut, Moore et al. also point to MST3K as a reference point for their observation that some types of content (such as a B-movie) tend to free up people’s attention for more discussion and interaction. [[5]]
[[6]] Bourdieu, P., & Ferguson, P. P. (1998). On television and journalism. Pluto Press. [[6]]
[[7]] Ang, I. (1985). Watching Dallas: Soap opera and the melodramatic imagination (p. 148). Routledge Kegan & Paul. [[7]]