blog

Can you have a conversation with your TV?

In the Internet of Things, what kind of thing is your TV? And what kind of thing are you?

Researchers designing ‘Social TV’ used conversation analysis to look at how people interact while watching TV (Oehlberg, Ducheneaut, Thornton, Moore, Nickell, 2006){{1}}. They found that viewers integrated the TV into the conversation, making space for it to ‘take turns’ in much the same way as any other conversationalist.

Although the TV is traditionally quite a selfish conversationalist, speaking more than listening (unless you explicitly shut it up with the remote), the prospect of it becoming a more accommodating conversational participant as a networked device is intriguing. For example, when the Boxee remote app on your iPhone detects an incoming phone call, it will pause your TV. That’s not an especially complex interaction, but it points towards the possibility of the TV at least brokering conversation in a more fluid way by letting other devices (your iPhone in this case) participate in turn-taking.

It would be interesting to see whether a TV that politely paused itself when it detected a sufficiently high level of chatter in the room would encourage or discourage further communication. I suspect the former – like a schoolteacher adopting patient silence with pursed lips while the children settle down. It would be a fun thing to experiment with though.
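As a rough sketch of how that experiment might be wired up (all names and thresholds here are my own assumptions, not any real TV or Boxee API): feed the TV a stream of room loudness samples between 0 and 1, pause once the chatter has stayed loud for a few samples in a row, and resume only after it has stayed quiet for a few more. The two separate thresholds stop the TV flapping between states when the noise hovers around a single cut-off.

```python
from dataclasses import dataclass


@dataclass
class ChatterPauser:
    """Pause playback when room chatter stays loud; resume after sustained quiet."""

    pause_above: float = 0.6   # hypothetical loudness (0..1) that counts as chatter
    resume_below: float = 0.3  # hypothetical loudness that counts as quiet again
    hold: int = 3              # consecutive samples needed before switching state
    paused: bool = False
    _streak: int = 0

    def update(self, level: float) -> bool:
        """Feed one loudness sample; return True while the TV should be paused."""
        if self.paused:
            crossing = level < self.resume_below
        else:
            crossing = level > self.pause_above
        # Count consecutive samples on the far side of the relevant threshold.
        self._streak = self._streak + 1 if crossing else 0
        if self._streak >= self.hold:
            self.paused = not self.paused
            self._streak = 0
        return self.paused


def simulate(levels):
    """Run a fresh pauser over a list of loudness samples."""
    pauser = ChatterPauser()
    return [pauser.update(level) for level in levels]
```

Logging the loudness samples that follow each pause would give a first, crude answer to the encourage-or-discourage question.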

But the real promise of networked TV is that it becomes more than just a turn-taker, and begins to participate more actively in the conversation.

Perhaps not quite as actively as that, but there are some amusing parallels between some of the proposals for future TV services and the TV-becoming-flesh in Cronenberg’s Videodrome.

For example, some of the most interesting social TV research I’ve found so far, namely the notube project, has looked at ways of leveraging network technologies, linked data and Semantic Web strategies to enrich the TV viewer’s experience.

Their ‘beancounter’ application uses a variety of sources, including social network platforms, to aggregate data about what you’re watching and create a detailed, machine-readable profile of your habits. This profile can then be used to generate better recommendations, or even help to inform and improve your experience of viewing and discussing media. The really difficult bits of this problem, like trying to figure out what is actually being talked about, are dealt with gracefully, using ‘good enough’ systems that evaluate multi-lingual natural language text strings and suggest concepts that they may refer to. Ontotext’s LUpedia service provides this entity recognition function for notube.{{2}}

But in a home, where the TV is in a shared space, does the TV learn from each member of the family separately? From listening to discussions at BT, I’ve learned that the ‘problem’ of knowing who is watching the TV, in order to recommend relevant and appropriate content, is not going to be solved by having each watcher cumbersomely log in to the TV. Nor is it going to be entirely solved by logged-in or sensed second-screen companion devices (not everyone will have one). BT’s immediate strategy will apparently involve a more complex watershed, where what people watch at certain times of day will inform assumptions about who is watching, and what kinds of content to recommend. Communications companies just don’t have this level of access to monitoring our individual behaviours within the home, and there are probably serious privacy and consent implications that will be significant barriers to granting it to them.
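To make the ‘complex watershed’ idea concrete, here is a minimal sketch of how I imagine it could work. It is my own guess, not BT’s actual approach: the day-part names and boundaries are hypothetical, and the viewing history is assumed to be a simple list of (hour, genre) pairs. The TV never identifies anyone; it just counts what gets watched in each part of the day and recommends accordingly.

```python
from collections import Counter, defaultdict

# Hypothetical day-parts (start hour, label); boundaries are illustrative only.
DAY_PARTS = [(6, "morning"), (9, "daytime"), (15, "after-school"),
             (18, "evening"), (21, "late")]


def day_part(hour: int) -> str:
    """Map an hour (0-23) to a day-part; hours before 06:00 count as 'late'."""
    name = "late"
    for start, label in DAY_PARTS:
        if hour >= start:
            name = label
    return name


def build_watershed(history):
    """Count genres per day-part from an iterable of (hour, genre) pairs."""
    counts = defaultdict(Counter)
    for hour, genre in history:
        counts[day_part(hour)][genre] += 1
    return counts


def recommend_genres(counts, hour, n=2):
    """Return the n genres most often watched in the day-part containing hour."""
    return [genre for genre, _ in counts[day_part(hour)].most_common(n)]
```

Given a history dominated by cartoons at 4pm and horror after 9pm, `recommend_genres(counts, 16)` would lead with cartoons without the TV ever knowing who is on the sofa.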

And anyway, I’m more interested in what kind of device the TV becomes when it learns from our collective viewing habits, aggregate viewing behaviours and networked discussions. Does the networked TV begin to develop a compound user profile of its own? A unique combination of a household’s various proclivities? Is it like the family dog, which everyone interacts with individually and collectively, and is then seen as having a personality, to some extent nurtured through this process?{{3}}

This brings me back to the idea of the TV as a conversational participant. If it can develop a profile, and start to build a model of the various areas of interest and domains of knowledge that a user is interested in, can it participate in a conversation in a more complex way than turn-taking? One of the key ideas of conversation analysis is the notion of ‘repair’, in which the contingent meanings of utterances between conversational partners are narrowed down and cross-checked for mutual comprehension through all kinds of gestural or verbal cues and repetitions.

Can the TV begin to engage in this level of conversation? Can its profile of established interests be used as a source of recommendations that might clarify a misunderstanding of something that has just been said, for example, correcting the misidentification of an actor by people chatting on Facebook about what they’re watching together?{{4}} Or could it relieve the co-watcher’s burden of responding to annoying whispered questions during films (‘why is she holding that chainsaw?’) by delving into more complex layers of in-programme dramaturgic metadata and providing some suggested explanations?{{5}}

Can the profiles of individual viewers and their shared TVs then be evaluated quantitatively for similarities over specific periods of time as a measure of the effectiveness of this kind of conversational grounding with various types of content and TV format?
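One simple way to put a number on that, sketched here under my own assumptions: treat each profile as a dictionary of interest weights (the format is hypothetical, not beancounter’s), sum the individual profiles into a compound household profile, and compare any two profiles with cosine similarity. Taking snapshots at the start and end of a viewing period and comparing the scores would show whether viewer and TV are converging.

```python
import math


def household_profile(profiles):
    """Sum individual interest profiles into one compound household profile."""
    combined = {}
    for profile in profiles:
        for topic, weight in profile.items():
            combined[topic] = combined.get(topic, 0.0) + weight
    return combined


def cosine_similarity(a, b):
    """Cosine similarity of two topic->weight dicts; 0.0 if either is empty."""
    dot = sum(weight * b.get(topic, 0.0) for topic, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

For non-negative weights, a score near 1 means the two profiles point at the same mix of interests, and 0 means they share nothing. Cosine similarity ignores how much someone watches and looks only at the proportions, which seems the right choice when comparing one person to a whole household.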

And at what point do we reach the threshold of complexity, fluency and multi-valency beyond which these kinds of interactions with your TV can be thought of as a conversation?

[[1]] Oehlberg, L., Ducheneaut, N., Thornton, J. D., Moore, R. J., & Nickell, E. (2006). Social TV: Designing for Distributed, Sociable Television Viewing. Theater. [[1]]
[[2]] I think of this as graceful because it addresses a hugely complex set of contingencies in a simple and contingent way, by issuing a query to a good enough service via a standard API, that does something useful, and assumes that in the future, when there is more linked, semantically enriched data, and more advanced inference services available, the API can just be plugged into those. [[2]]
[[3]] I’m aware that this idea that pets do not have Disney-like anthropomorphic personalities is not popular, especially with British people, and I’m not backing it up with anything other than my own supposition that this is the case, and that your animals would eat you in a second if they were hungry enough and you were incapable of defending yourself. [[3]]
[[4]] In the lingo of conversation analysis this would be called ‘self-initiated self-repair’. [[4]]
[[5]] Other-initiated self-repair. [[5]]

Prototypes vs. ‘Demonstrators’

Before starting research at BT, I’d never heard the term ‘demonstrators’ used to describe the deliverables of a research project. Getting to grips with the idea has given me a valuable insight into the differences between ‘intrapreneurial’ innovation in corporate contexts, and the gung-ho prototyping I’ve been involved with in many arts/technology/entrepreneurial ventures.

I got a chance to see a variety of demonstrators at a recent BT Innovate and Design ‘Deep Dive’ event, where all the geeks put on suits, polish up their wares, and come and present them to a mix of management/strategy types and other geeks.

by http://www.flickr.com/people/splashman/

For the hardware/infrastructure demonstrators, there were some big green BT link cabinets (such as you sometimes see hanging open, with vast spaghetti tangles of copper twisted pair bulging out), updated with the latest in fibre coupling systems. Then there were cross-sectional fibreglass models of brick walls, showing how domestic fibre channels could be installed. Those kinds of demonstrators felt a bit like playing in the Launchpad gallery at the Science Museum.

For software demonstrators, the killer apps seemed to be Flash and PowerPoint, because they seem to be the most effective ways to illustrate the concepts and proposed functionality of consumer technologies to a broad base of management/strategy people with varied levels of technical knowledge.{{1}}

The product our team demonstrated in this context was built in Adobe AIR/Flash, with a Processing back-end server, and was effective at demonstrating the possibilities of a second-screen application for playing, pausing, text chatting and audio-messaging between users of a Social TV system.

When I had first seen it, and read the user test reports, I hadn’t immediately understood what could be learned from the results. The demonstrator literally only did four things: pause, play, send text message and send audio message. I was interested in how users behaved, and how they misused or adapted the technology, which wasn’t really possible to observe with such a limited device and such a prescribed mode of use.

Then yesterday I saw this:

GOAB. A TV Experience Concept from Syzygy on Vimeo.

Which basically compiles all the Social TV functionalities and features I’ve seen built or speculated about so far. It’s really very slick, and (as far as I can tell) entirely vaporware-enabled, which is a very good way of communicating to lots of people and testing a market without the complicated and horrible job of actually building any software.

So what I was missing is that ‘demonstrators’ should be seen as mostly non-functional props, designed to make it easier to communicate the possibilities and marketability of certain technologies. Whether they *actually* work or not is immaterial.

Another gem (or mine full of gems) from recent research was the notube project, which involves my much-admired semweb veteran friends Dan and Libby, and which has built all kinds of prototypes but not, explicitly, any products. Libby wrote about that very interestingly, after being badgered by product-hungry trade-fair attendees. Her post, which brings up lots of very useful user-centric patterns for Social TV from her previous work at Joost, helped me to understand the different applications of prototypes and ‘demonstrators’.

Dan also pointed me to this beautiful demonstration of one of notube’s prototypes, using the new Universal Controller for MythTV built by Steven Jolly, James Barrett and Matt Hammond at BBC R&D. This is the kind of stick-figure prototype I really enjoy: plain and uncomplicated on the outside, but actually rather elegant and complex on the inside.

So, before diving into making, I’m glad I clarified this for myself:

You build a prototype to learn something about how users and systems behave. You build a demonstrator to learn about how people and organisations react to your product.

In terms of my research, I’m more interested in finding out how users and systems behave, so I’ll be building some prototypes, and not worrying too much about making them look slick.

[[1]] This wasn’t the case for some internally-focused applications, which were basically fully functional systems intended for use by BT and its partners for cost-saving/efficiency systems, or the Saturn Project aggregator/classifier for threat detection and analysis from fuzzy data. [[1]]

Mystery Science Research Lab 2011

It’s my second day at BT’s Future Applications research lab, and I’m sat at my cool 1990s-style ovoid lozenge office desk, doing a broad web-survey of Social TV applications. I’ve just been given my hall pass that allows me in and out of the automatic security doors, each one bearing a sign (in the italic variety of the ubiquitous BT font) saying:

“Please keep noise to a minimum as you walk through this office”.

I’m snorting, wheezing and squeaking with suppressed laughter. The researcher sitting opposite me is giving me worried glances, and I can sense heads, peripherally visible over partitions and banks of computer screens, turning and peering in my direction.

While looking at some prototypes of ‘Avatar Party Mode’, a kind of ‘Virtual Sofa’ service available on the Sky Player for Xbox Live, I had made the mistake of putting on my headphones and calling up some episodes of one of my favourite late-20th-century late-night TV shows, Mystery Science Theater 3000.

The premise of Joel Hodgson’s MST3K, for the uninitiated, is that as part of an evil scientific experiment, ‘Joel Robinson’ and his robot sidekicks are trapped on a deep-space asteroid and forced to watch B-movies, where, represented as cinema-seated silhouettes at the bottom of the screen, they heckle and wisecrack their way through the film.

I am certainly not the first to make this association. Social TV researchers have turned to MST3K’s presentation style as a model for an acceptable user interface to indicate the co-presence of remote viewers and even enable chat-partner selection on the bottom third of a TV screen (Ducheneaut, Moore, Oehlberg, Thornton & Nickell, 2008){{1}}, or as an example of potential applications/services that invite viewers to overlay user-generated content and republish personalised video streams over IP (Banerjee et al., 2002){{2}}. In cultural critique, film and television studies, MST3K has also been invoked as a perfect example of a ‘meta-show’, and used to illustrate how ironic re-appropriation of pop-cultural artefacts can express aesthetic dissent (King, 2007){{3}}.

The CollaboraTV project implemented this kind of interface as the premise for their ground-breaking Social TV application. They even did some viewer experience research contrasting this kind of ‘virtual audience’ interface with traditional text-chat underneath video playback, and found that the virtual audience increased audience engagement and enjoyment{{4}}.

In terms of interface, ‘user experience’, and its choice of B-movie ‘sociable’ media{{5}}, MST3K seems to offer a useful set of guidelines for Social TV design and research, primarily because the acceptance of its visual design and irreverent tone was established when it became a popular cult TV programme. But on a more abstract level, MST3K offers inspirational design patterns for Social TV because of how it constantly shifts focus between the viewer and the viewed, opening up endless imaginative, performative and conversational opportunities.

This shifting of focus foregrounds an aspect of television viewing that is often passed over by cultural critique of the ‘dumbing down’ of TV audiences (Bourdieu & Ferguson, 1998){{6}}. Ien Ang wrote about this in her often-cited book ‘Watching Dallas’{{7}}, where she argues that part of the enjoyment of watching the show for global (in her case Dutch) audiences, far from aspirational identification with the camped-up millionaire Texans, is a smug awareness that ‘other people’ are watching the show in earnest, but that for ‘us’, the show’s tastes and values are an object of collective ridicule. Bad TV, in this way, can be seen as a possible object of counter-identification, forming social groups of collective dislike.

Perhaps a successful design for Social TV could start with a ‘dislike’ button, and build its sociality on the collective activities of booing, heckling and throwing things at the screen.

[[1]] Ducheneaut, N., Moore, R., Oehlberg, L., Thornton, J., & Nickell, E. (2008). Social TV: Designing for Distributed, Sociable Television Viewing. International Journal of Human-Computer Interaction, 24(2), 136-154. doi: 10.1080/10447310701821426. [[1]]
[[2]] Banerjee, S., Brassil, J., Dalal, A., Lee, S.-J., et al. (2002). CDNs for personal broadcasting and individualized reception. In Proceedings of WCW. Citeseer. Retrieved March 29, 2011, from http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.7.7964. [[2]]
[[3]] King, J. (2007). Mystery Science Theater 3000, Media Consciousness, and the Postmodern Allegory of the Captive Audience. Journal of Film and Video, 59(4), 37-53. [[3]]
[[4]] Harrison, C., & Amento, B. (2007). CollaboraTV: Using asynchronous communication to make TV social again. Adjunct Proceedings of EuroITV2007, 218–222. [[4]]
[[5]] In their paper cited above, Ducheneaut, Moore et al. also point to MST3K as a reference point for their observation that some types of content (such as a B-movie) tend to free up people’s attention for more discussion and interaction. [[5]]
[[6]] Bourdieu, P., & Ferguson, P. P. (1998). On television and journalism. Pluto Press. [[6]]
[[7]] Ang, I. (1985). Watching Dallas: Soap opera and the melodramatic imagination (p. 148). Routledge & Kegan Paul. [[7]]

What will YouView mean for the arts?

YouView - a new unified IPTV offering from BT, the BBC and the major UK broadcasters.

I went to the Art of the Digital London’s IPTV meetup last night, organised by Simon Worthington and Caroline Heron at Mute Publishing, intended to bring together arts organisations to discuss IPTV in general, and the big silence around how YouView is going to affect London’s arts organisations, if at all.