Here’s What Transport for London Learned From Tracking Your Phone On the Tube

from: Gizmodo.co.uk

At the end of last year, between 21st November and 19th December, Transport for London carried out an intriguing trial: It was going to track your phone on the London Underground.

Today, thanks to the Freedom of Information Act, Gizmodo UK can exclusively reveal some of the utterly fascinating findings that the agency has been able to make from all of our data – and how the plan, if the trial is deemed a success and tracking is implemented full time, is also to use the data to inform advertising decisions on the Tube network.

Where and Why The Trial Took Place

This four-week period was merely a pilot project – baby steps to test the water and see what could be learned – as well as, presumably, what the backlash would be like once passengers knew that TfL was hoovering up data from phones whether they were connected to TfL’s Virgin Media wifi network or not. Wisely, it was accompanied by a publicity campaign which included posters in stations and articles in the Metro – so that passengers could at least be informed that it was happening, rather than being horrified several months later when somebody spilled the beans.

To have had your data collected in the trial, all you needed was to have your wifi switched on – then the various wifi hotspots around the Tube network would be able to pick up the unique MAC address of your phone (or tablet, or laptop, or whatever), which enables your device to be identified.

The good news for the paranoid is that TfL appears to have gone out of its way to make sure everything is above board. In the documents that Giz UK has seen, TfL makes clear that it is only MAC data that’s collected (i.e. they’re not monitoring the websites you visit) – and that this data is stored as encrypted hashes – so even if hackers could somehow break in and obtain the collected data, they wouldn’t be able to get any MAC address data.
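For the curious, here is a minimal sketch of what storing a MAC address as a hash can look like. The documents don't say which scheme TfL used, so the keyed SHA-256 approach, the `SECRET_KEY` value and the `pseudonymise_mac` function name are all assumptions made purely for illustration.

```python
import hashlib
import hmac

# Hypothetical secret key: the documents do not describe how TfL keys or salts its hashes.
SECRET_KEY = b"example-key-not-real"


def pseudonymise_mac(mac: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed SHA-256 digest of a MAC address (illustrative only)."""
    # Normalise the format so "AA-BB-..." and "aa:bb:..." give the same digest.
    normalised = mac.lower().replace("-", ":").encode("ascii")
    return hmac.new(key, normalised, hashlib.sha256).hexdigest()


# The same device always maps to the same token, so journeys can still be
# linked together, but the stored value is not the MAC address itself.
token_a = pseudonymise_mac("AA-BB-CC-DD-EE-FF")
token_b = pseudonymise_mac("aa:bb:cc:dd:ee:ff")
```

Using a key matters in a sketch like this because the space of possible MAC addresses is small enough that an unkeyed hash could be reversed by simply hashing every candidate.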

As it was only a trial, only 54 out of 270 Tube stations were involved: mostly those in Zone 1, including everything in the red patch below apart from Tottenham Court Road, which doesn’t have wifi yet (because they’re too busy building Crossrail).

Though as you can see, the trial did extend out further up the Metropolitan and Northern Lines. According to the documents, the idea was to test whether a station being underground or not has an impact on wifi usage. If someone is at, say, Finchley Road, which is above ground, will they just use their phone’s mobile signal, or will they connect?

According to TfL’s one-day analysis of Vauxhall station, for every three people who touched in at the Oyster gateline, one wifi device was picked up by the wifi traps. This either means that the Virgin Media wifi is really popular – or there are a lot of people out there walking around with their wifi switched on.

So what has TfL learned? Here’s what we think are the most interesting results from the documents.

Route Tracking

Perhaps the number one reason to do the trial was to better understand the journeys that people actually make on the Tube. At the moment, TfL can tell what station you started and ended your journey at based on your Oyster card – but it can’t tell how you got between the two locations. It sometimes supplements this data with a Rolling Origin and Destination Survey (RODS) to figure out specific routes, but this is done manually, which is expensive and time consuming.

So one immediately obvious benefit of the wifi data is being able to collect the same data much faster, on a larger scale, and for a fraction of the cost. If you look at the slide below, you can see how popular different routes between Liverpool Street and Victoria are.

So if you travel via Oxford Circus, you do the same as 44% of other people. If you lazily sit on the Circle line you do the same as 26% of people making the same journey. And if you change twice – once at Holborn, then again at Green Park – then congratulations, you’re a psychopath.
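The route percentages above boil down to counting how often each sequence of stations turns up for the same start and end point. A rough sketch of that tally follows; the `route_shares` function and the sample journeys are invented for illustration and are not TfL's method or figures.

```python
from collections import Counter


def route_shares(journeys):
    """Map each route (a tuple of stations) to its share of all journeys."""
    counts = Counter(tuple(journey) for journey in journeys)
    total = sum(counts.values())
    return {route: n / total for route, n in counts.items()}


# Invented sample: each journey lists the start, any interchanges, and the end.
sample = [
    ["Liverpool Street", "Oxford Circus", "Victoria"],
    ["Liverpool Street", "Oxford Circus", "Victoria"],
    ["Liverpool Street", "Victoria"],  # no change: stayed on the Circle line
    ["Liverpool Street", "Holborn", "Green Park", "Victoria"],
]
shares = route_shares(sample)
```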

According to one document, the Finchley Road to Wembley Park section of the Jubilee and Metropolitan lines (they run next to each other – the Jubilee just stops at more stations in between) was deliberately included in order to observe customer behaviour when there are two options and one is obviously faster than the other (it takes 5 minutes on the Met, 12 on the Jubilee).

TfL even checked whether this data was accurate by matching it up with actual train timetables, and was able to demonstrate how, on one journey southbound down the Victoria Line, it could match the wifi data of one passenger and figure out which specific train they were travelling on.
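The documents we have seen don't spell out how the timetable matching was done, but one plausible way to sketch it is to pick the train whose scheduled times sit closest to when the device was seen at each station. Everything here (the `best_matching_train` name, the train IDs and the times in seconds) is hypothetical.

```python
def best_matching_train(sightings, timetable):
    """Return the train whose schedule is closest to a device's sightings.

    sightings: dict of station -> time the device was seen (seconds)
    timetable: non-empty dict of train id -> dict of station -> scheduled time (seconds)
    """
    def average_gap(schedule):
        shared = [station for station in sightings if station in schedule]
        if not shared:
            return float("inf")  # this train never calls where the device was seen
        return sum(abs(sightings[s] - schedule[s]) for s in shared) / len(shared)

    return min(timetable, key=lambda train: average_gap(timetable[train]))


# Invented example: one device seen at three southbound Victoria Line stations.
device_seen = {"Euston": 100, "Warren Street": 190, "Oxford Circus": 300}
trains = {
    "T1": {"Euston": 0, "Warren Street": 90, "Oxford Circus": 200},
    "T2": {"Euston": 95, "Warren Street": 185, "Oxford Circus": 290},
}
matched = best_matching_train(device_seen, trains)
```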

The upshot of this is fairly obvious. As TfL says itself: “By using Wi-Fi data, merged with aggregated Oyster and Contactless ticketing data we would have a far richer data source to ensure optimal and evidence based decision making for a wide range of planning decisions.”

In-Station Tracking

It isn’t just travel across the whole network that can be tracked by wifi. It’s even possible to track your location within an individual station – presumably by working out which access point you’re closest to.

This means that TfL can use the data to make cool maps, like this:

This is a heat map of Euston tube station and shows where passengers walked around the station. Comparing this to the excellent 3D tube map from the Station Master app reveals that the busiest platform by some distance is the southbound Victoria Line. Which perhaps isn’t surprising as that’s the line you need if you want to get to Oxford Circus.
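If our guess about access points is right, the raw material for a heat map like this could be as simple as assigning each device to the access point that hears it loudest, then counting devices per area. The sketch below assumes signal strength (RSSI) readings are available, which the documents don't confirm; the access point IDs and zone names are made up.

```python
from collections import defaultdict


def nearest_zone(readings, ap_zones):
    """Return the zone of the access point with the strongest signal.

    readings: dict of access point id -> RSSI in dBm (closer to 0 is stronger)
    ap_zones: dict of access point id -> name of the area it covers
    """
    best_ap = max(readings, key=readings.get)
    return ap_zones[best_ap]


def zone_counts(sightings, ap_zones):
    """Count the distinct devices placed in each zone."""
    devices = defaultdict(set)
    for device, readings in sightings:
        devices[nearest_zone(readings, ap_zones)].add(device)
    return {zone: len(ids) for zone, ids in devices.items()}


# Hypothetical access points and readings.
ap_zones = {"ap1": "ticket hall", "ap2": "Victoria Line southbound"}
sightings = [
    ("device-a", {"ap1": -80, "ap2": -45}),
    ("device-b", {"ap1": -50, "ap2": -75}),
    ("device-c", {"ap2": -60}),
]
counts = zone_counts(sightings, ap_zones)
```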

TfL hopes that this data could be used to analyse crowding. For example, the Northern Line was included in the trial because its two branches enable TfL to compare how the City and Charing Cross branches impact each other. The documents also seem to suggest that if TfL switched on tracking full time it could offer real-time crowding information to passengers – so we could see a CityMapper of the not-too-distant future telling us which stations to avoid.

TfL also thinks the crowding data could be used to “Inform decisions on how many staff needed at each station and in what role”. This no doubt nods towards the recent reorganisations which have seen ticket offices close across the Tube network – which has provoked huge controversy in some quarters. One thing that everyone will like, though, is that the same data could also be used to monitor how long passengers have been stuck on trains or held outside of stations – and refunds could be offered as a result.

The in-station tracking enabled TfL to work out the average travel times between different parts of stations. Pictured above is Victoria – which reveals that it takes on average 86 seconds to take the escalator from the ticket hall to the Victoria Line platforms, and 67 seconds to walk along the platform from end to end. This could also have health and safety benefits, as this data could be fed into evacuation planning, and also help TfL test new initiatives. For example – remember the Holborn escalator trial? Using wifi data could give TfL researchers better information on whether such initiatives are making journeys faster.
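An average like the ticket-hall-to-platform time can be sketched by taking, for each device, the gap between when it was first seen in one area and when it was first seen in the other. The `average_transit_seconds` function, the zone names and the timestamps below are illustrative assumptions, not TfL's data or code.

```python
from statistics import mean


def average_transit_seconds(events, origin, destination):
    """Average time from first sighting at origin to first sighting at destination.

    events: list of (device, zone, timestamp in seconds), in any order
    Returns None if no device was seen in both zones in that order.
    """
    first_seen = {}
    for device, zone, ts in sorted(events, key=lambda event: event[2]):
        first_seen.setdefault((device, zone), ts)

    durations = []
    for (device, zone), start in first_seen.items():
        if zone != origin:
            continue
        end = first_seen.get((device, destination))
        if end is not None and end > start:
            durations.append(end - start)
    return mean(durations) if durations else None


# Invented sightings: device "c" never reaches the platform, so it is ignored.
events = [
    ("a", "ticket hall", 0), ("a", "platform", 80),
    ("b", "ticket hall", 10), ("b", "platform", 102),
    ("c", "ticket hall", 5),
]
average = average_transit_seconds(events, "ticket hall", "platform")
```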

Advertising Potential

In what will no doubt be the most controversial aspect of the trial, the possibilities of using the data to inform advertising are a big motivation. And to be fair to TfL, you can understand why. “TfL is under increasing financial pressure”, the documents note, adding that “The Department of Transport grant we receive (£591m in 2015/16) will be removed from 2018. In addition, fares are to be frozen over the current mayoral term (2016 to 2020).”

So could wifi tracking provide the extra cash to pay for Mayor Khan’s big fare freeze? According to a detailed grid outlining the rationale for the trial, the advertising upshot could be valued at hundreds of millions of pounds because TfL will be able to offer better analytics to advertisers about exactly how many people are looking at their ads around the tube network, because they know where you’re standing.

Being able to estimate the footfall in different parts of each station – and even roughly how long you’ll be staring at each advert – means that they can offer differential pricing depending on how good each advertising slot is. Being able to “demonstrate customer journey pattern volumes” will “enable advertising assets to be sold on a campaign level where the same customer views the same advert”. In other words – as TfL know your commute, if you’re the consumer a company wants to attract, they’ll know exactly the right places to buy ads so that you’ll see them.

The thinking on advertising has gone into some detail too. The docs reckon the data could also be used to choose which advertising slots on the Tube could be upgraded to digital displays next – and using the timing data even decide how long each digital ad will be displayed before switching to the next one.

Customer Attitudes to Tracking

Based on the documents we’ve seen, it’d be easy to write a scaremongering hit-piece based on scary quotes about tracking and advertising. But what’s interesting to see is the level of care TfL took before going through with the trial.

For example, there are numerous privacy assessments, and different tasks are assessed for how they’d use the data. One document raises privacy concerns – pointing out that this new data could conceivably be mashed up with data from Oyster or CCTV to enable the close tracking of individuals. This wasn’t done in this trial – though it’s clear that if it wanted to, TfL could conceivably create an Orwellian nightmare for Londoners – so if this does get switched on full time, it’ll be something new for privacy watchdogs to keep a close eye on.

The cache of documents also contains the results of research that TfL commissioned from a company called 2CV, aimed at analysing customer attitudes to the tracking of their data – which makes for interesting reading.

For example, it revealed that customers are much more okay about sharing data when they feel that they are making an “informed decision”, and that many people are “apprehensive” about mobile tracking, because it is so new. The sharing of location data in particular is “viewed differently” to other private information too.

“It is clear that communicating the technology and raising awareness of its use will be critical in driving acceptance of TfL using it”, the research notes. Apparently once people understand the benefits, they are much more accepting of it.

Perhaps most intriguing though was that TfL decided to focus group reactions not just to the Tube wifi tracking – but to other potentially trackable aspects of London’s transport too. Unfortunately all we have to go on is these two slides – but this doesn’t mean we can’t wildly speculate. For example – it proposes using Bluetooth to track vehicles in order to collect real-time congestion data. It also suggests that by using an app, a customer could share their location data with TfL directly and have it automatically hooked up to their Oyster or Congestion Charging account.

Perhaps most intriguing is mobile phone tracking – which appears to be an ambition to do something similar to the Tube tracking but for all of London. If TfL could get data from the mobile networks, it would know where we’re travelling to and from, and would be able to better optimise cycle and bus routes.

As you can see above the reaction to these different scenarios was mixed – with mobile tracking the foggiest by some distance. Conversely, everyone appeared to like the tube tracking idea – as it has both tangible benefits for the customer, and it is obvious why TfL would need the data.

So that’s essentially what TfL has learned so far, as far as we can tell. Just before publication we reached out to them to find out what the plan is going forward – will it be rolled out more fully? A spokesperson told me that they’re still assessing the data, as the trial was only completed relatively recently – but we wouldn’t be surprised if this quickly becomes standard in the future.

James is Interim Editor of Gizmodo UK and tweets as @Psythor.