Scientific Accuracy, Uncertainty and Documentary Truth

Lomax Boyd interviews creative technologist Vincent McCurley 

Brake or swerve? Quick! Decide! – This is the situation that Cardboard Crash, a virtual reality vignette showcased at the 2016 Sundance Film Festival, puts you in. Finding yourself in a self-driving car, you are forced to make a difficult decision in an unavoidable crash event. Do you depend on the decision your car’s artificial intelligence makes for you? Or do you want to decide for yourself? What could ethical standards be for algorithm-based decision-making? And given varied cultural and individual ethics, who should choose paradigms and how should algorithms be designed? There is plenty of data, but no easy answer.

In Cardboard Crash, NFB creative technologist Vincent McCurley experientially confronts the user with this dilemma.

In a conversation with Lomax Boyd, a Fulbright Scholar in residence at the National Film Board of Canada, Vincent explains how he approached the use of scientific data in storytelling, how he tries to make people think about complex situations rather than ‘deliver’ a (questionable) scientific truth behind them, and how he deals with the nature of uncertainty in documentary-making.

This post is one of a series of interviews conducted by Lomax Boyd with creatives and producers at the NFB. The series explores the complex interplay between (scientific) data, the representation of uncertainty, story and emotion in data-driven interactive factuals. The series offers insights for technologists, documentary artists and science communicators in imagining how data and other representations of scientific process might be used in the creative process.

“… to be continued…” – For more on this topic, stay with i-docs and follow the discussion on Facebook!

LB: When approaching the story of Cardboard Crash, how did you go about searching for relevant scientific data?

VM: At the beginning, Cardboard Crash was definitely not based on data. At that time, we were doing a lot of research on tanker trucks. I wanted to get an idea of how valuable those trucks were in economic terms, which meant weighing the economic loss of a truck. I looked at the volumetric capacity, assumed the truck was carrying gasoline, and multiplied the price of gasoline by the volumetric capacity to get a rough value for the loss of one of those trucks. But a lot of the other numbers used in the documentary vignette, like your chance of dying falling off a cliff that is 50 meters high – I couldn’t find that data very easily, so I ended up coming up with numbers. Some of the numbers I came up with, like 9.8%, were a nod to gravitational acceleration. The numbers had satirical meaning, but not necessarily scientific meaning. If it was a 100% chance of death, then people would make the decision much more easily. But if it’s some abstract number that you have to weigh against the other ones, and it’s in a fuzzy area where it’s not a clear decision – those were the numbers I was going for, more for making people think than for purely giving scientific reasons.

The goal was to make people think about these situations rather than convey a scientific truth behind it.
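The rough tanker valuation described above is a single multiplication: cargo volume times the unit price of gasoline. A minimal sketch follows, assuming litres and a per-litre price. The function name and the example figures are hypothetical and are not the numbers used in Cardboard Crash.

```python
def tanker_cargo_value(capacity_litres: float, price_per_litre: float) -> float:
    """Rough economic value of a full tanker's cargo: volume times unit price."""
    if capacity_litres < 0 or price_per_litre < 0:
        raise ValueError("capacity and price must be non-negative")
    return capacity_litres * price_per_litre


# Hypothetical inputs, for illustration only (not figures from the project).
example_value = tanker_cargo_value(capacity_litres=30_000, price_per_litre=1.20)
print(f"Rough cargo value: {example_value:,.2f}")
```

This covers only the cargo. The vehicle itself, cleanup costs and any other losses are outside this back-of-the-envelope estimate.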

LB: Stories in documentary often focus on the perceptual, while scientific practice is more about the physical. What role do you think (scientific) accuracy has in documentary when it’s trying to portray science?

VM: I think it is really important if you are looking at something that has actually occurred in the past. Documentary tends to look at things that have already happened so you have data on what has happened. Cardboard Crash is a little different in this regard as it projects into the future and comes up with a potential scenario. It tries to make people think about the philosophical situation of technology and humans.

The struggle with scientific journalism and scientific documentary is that often the truth is kind of boring and it has to compete with all this other stuff which is vying for people’s attention. In every project, it’s a question of how to find a balance.

Herzog uses a phonebook as an example: say Johnny Apple is in the phonebook. That is data. You can go to where he lived, but you don’t know if he lived alone and cried himself to sleep at night. There’s only so much historical data that you have in science; for the rest, you have to paint the picture that makes it compelling and interesting to the audience.

LB: You mentioned that scientific data was difficult to access. From your perspective, as a creative technologist, what could be done to make scientific information more accessible? What would be ideal from a technological perspective in documentary projects?

VM: The ideal would be to reach into an API and grab that data in real time, so if the data changes, your project gets the most recent data. The problem we always face when we tap into all these data sources is that they are not always stable. So if they break, the corresponding component of our project breaks. If a server no longer exists, or the university changes the folder structure, or something about the web API isn’t maintained properly, then you can’t pull that data in. In the past, we tried to solve this problem using backups, so we had static versions that the project could default to if it could not grab the latest data. We did that with Test Tube and Bear 71.
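The live-data-with-static-backup pattern described above can be sketched in a few lines. This is an illustrative outline only, not the code used in Test Tube or Bear 71. The function names, the JSON format and the `backup_path` file are assumptions made for the example.

```python
import json
import urllib.request
from pathlib import Path
from typing import Any, Callable, Tuple


def fetch_json(url: str, timeout: float = 5.0) -> Any:
    """Download and parse JSON from a live endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def load_with_fallback(
    url: str,
    backup_path: Path,
    fetch: Callable[[str], Any] = fetch_json,
) -> Tuple[Any, str]:
    """Return (data, source), where source is "live" or "backup".

    Tries the live endpoint first. If the server is gone, the request
    times out, or the response is not valid JSON, the static backup
    file shipped with the project is used instead.
    """
    try:
        return fetch(url), "live"
    except (OSError, ValueError):
        # Network failures (URLError, timeouts) are OSError subclasses;
        # malformed JSON raises a ValueError subclass.
        backup = json.loads(backup_path.read_text(encoding="utf-8"))
        return backup, "backup"
```

Returning the source alongside the data lets the project decide whether to tell the audience they are looking at a snapshot rather than live figures.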

LB: Which components of those projects were live?

VM: In Test Tube, we have the program grab all the live tweets related to the word you typed at the beginning in response to the question: “What would you do if you had one extra minute?” You type it in and it goes out and grabs all the live tweets that have that keyword in them. If you say, for example, ‘sleep’, then you get everyone’s tweets talking about sleep right now. But since we were presenting that project at conferences that didn’t have good wifi connections, we had to build a fallback that senses if there is no internet connection. We then had a default that comes up in all the microbes. The strength of the network is also its weakness.

Test Tube – Pulling in live data from tweets to convey an ephemeral glimpse of what ‘the world’ is talking about right at that moment

LB: How can we make data accessible to all, including researchers, creatives and the general public?

VM: There are two aspects to making more data available: you can throw up your data on a server, but people need to know that it exists and they must be able to pull that data. If you have data available on millions of servers, then people need to know how to get to that data; that might be a search engine problem. One common solution is to create an entity which combines all this data so that people have one spot to go to. But then you have the chance that it becomes a monopoly. This might be good from the point of view of developers because they have one point of contact that they can get everything from, but it’s not great if you’re trying to be open and have competition. It’s the old problem of centralizing power in one spot. We are battling convenience versus the openness of data.

LB: Science attempts to be explicit about the uncertainty of information. How do you convey uncertainty in data-driven documentary?

VM: If you’ve done data collection, for example, out in the mossy fields measuring pH, you know that there’s a lot of variation in the data collection – depending on whether you take it from this spot or that spot. I think a lot of conclusions we come up with are based on shaky information. There should be a variable or probability of accuracy. I know there are mathematical models which can convey that, but I don’t know how you can convey the actual data collection variability. Most of the population does not understand statistics. You give them a number – say this is plus or minus 5% variance. I don’t think they really understand what that means. Part of the issue is in education.
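One simple way to attach a measure of spread to repeated field measurements, such as the pH readings mentioned above, is to report the mean together with the sample standard deviation. The sketch below uses Python's standard `statistics` module. The readings are made-up values for illustration, and the function name is hypothetical.

```python
import statistics
from typing import Sequence, Tuple


def summarize_readings(readings: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) for repeated measurements."""
    if len(readings) < 2:
        raise ValueError("need at least two readings to estimate spread")
    return statistics.mean(readings), statistics.stdev(readings)


# Hypothetical pH readings taken at different spots in the same field.
ph_readings = [5.1, 5.6, 4.9, 5.4, 5.0]
mean_ph, spread = summarize_readings(ph_readings)
print(f"pH {mean_ph:.2f} ± {spread:.2f} (n={len(ph_readings)})")
```

A summary like this captures how much the readings disagree with each other. It says nothing about systematic problems in how or where the samples were collected, which is the harder issue raised in the answer above.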

LB: How do you bring uncertainty into the narrative?

VM: How do we convey uncertainty in documentary truth? I would ask: What is the goal of the documentary? Go back to that core idea you are trying to convey. If you are trying to get people to think and be more open-minded about a situation, then I think it is justified to explain that grayscale fuzziness around the data. If the goal is more action-based, you want to have a very clear statement.

In the end, I think, it depends on how much you trust your audience to make a decision based on the data you provide. It’s about knowing your audience.

I always go back to the basic idea: Why do we want to do something? What is the value or purpose of doing something? If we can go back to this first principle of why then it helps inform how we do things. A lot of us operate on our immediate goals. But in the end, the question should be: What is the ultimate goal of this project? That is what drives the art, story and user interaction.

LB: How do you think about using data as a component of narrative? Can data, like other media assets and techniques (e.g. time-lapse, montage, etc.) be used in a way that can be emotive or will it always be a sterile abstraction that appeals to the geeks?

VM: I think it can be used in more emotive ways beyond the infographic. Our brains like infographics because they are easy-to-consume, easy-to-understand pieces of information, as they are very visual. But you could definitely turn data into a more interactive experience where you have to choose or make a decision based on the data.

Cardboard Crash – A data-driven VR documentary that tries to make people think about complex situations rather than only to ‘deliver’ scientific facts behind data

LB: If we are thinking more about story structure, like the use of montage to convey meaning through juxtaposition, for example, would you say that data needs to be visual or can it be something else?

VM: I think the goal of media creation is to ask: “How do you take that data and convey it most effectively in the shortest amount of time possible?” In Cardboard Crash, for example, reading the numbers about the impact of crashes is striking, but actually seeing that animated, with the mushrooming cloud getting larger and larger, and seeing its effective blast radius – this conveys the information not only much more quickly but also much more impressively. All representation of data should strive to do that. We constantly have to consider our audience and respect their time. As media creators, we are trying to curate the most useful information into an understandable format. That’s the skill – it doesn’t come naturally.