Saturday , 20 July 2024

MOBILE PHONE Data in transport

The location precision of CDR records is quite low, and activity can typically be resolved only to a locality within the city. It is not possible to pinpoint the exact location from which a call is made, but this is sufficient for estimating travel demand while maintaining the privacy of users. Another dataset available from telecom systems is the location update (LA) data. The location of each mobile phone is polled periodically to determine the cell tower it is connected to. This polling takes place as long as the phone is turned on, even if there is no activity on the phone. The LA dataset has the same spatial precision as CDR data, but a higher sampling rate.

It is also possible to obtain more precise location information from telecom systems. Some telecom backbone equipment can output bulk location data of users; for example, Ericsson systems have an add-on product called Anonymous Bulk Location Data (ABLD). Such products can provide a more precise location for all users in an anonymous format. Telecom systems that do not provide such features can be retrofitted with extra hardware to obtain Received Signal Strength Indicator (RSSI) data and estimate the precise location of phones. For example, probes can be installed on base station controllers (BSC) to record RSSI information from all connected mobile phones. These methods provide data at a higher sampling rate and better spatial precision, and can deliver the output in near-real-time so that it can also be used for transport operation applications.

Regardless of the type of data used, data from a mobile operator represents only a sample of the overall population movement, even if the sampling rate is high. The exact sampling rate is always unknown and varies both spatially and temporally, especially for activity-based datasets such as the CDR. Hence, an estimation methodology that can handle large datasets such as the CDR is needed to obtain an unbiased estimate of travel demand.
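To see why a non-uniform sampling rate matters, consider the small worked example below. It is only an illustrative sketch: the zone names, trip totals and sampling rates are made-up assumptions, and in practice the per-zone rates are not known. Expanding the observed trips with a single overall factor recovers the correct total but distorts the split between zones.

```python
# Hypothetical true trips and per-zone sampling rates (illustrative numbers only).
true_trips = {"zone_a": 1000, "zone_b": 1000}
sampling_rate = {"zone_a": 0.40, "zone_b": 0.10}  # unknown in practice

# Trips that would actually appear in the operator's data.
observed = {z: true_trips[z] * sampling_rate[z] for z in true_trips}

# Expand with one overall factor, as if the sampling rate were uniform.
overall_rate = sum(observed.values()) / sum(true_trips.values())
expanded = {z: observed[z] / overall_rate for z in observed}

# Per-zone error of the uniformly expanded estimate.
error = {z: expanded[z] - true_trips[z] for z in true_trips}
```

With these assumed numbers the two zones generate the same number of trips, yet the uniformly expanded estimate attributes four times as many trips to `zone_a` as to `zone_b`.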

The Solution


Amrithanshu Sinha

The CDAC-ITSPE team obtained anonymised CDR data from a mobile operator for a one-month period for the Mumbai circle. The proof-of-concept was to use this data to estimate the travel demand from, to and within South Mumbai for the AM peak period.

The first step in the solution was to come up with a naïve demand estimate, or Origin-Destination (OD) matrix, by combining the CDR data with location information of cell towers. Typical representation of CDR records and cell tower information is shown in tables below.

The catchment area of each cell tower is determined by creating a Voronoi diagram around cell tower locations. In other words, zones are created around cell towers such that, for every point inside a zone, the closest cell tower is the one the zone is centred on. A phone’s home and work locations are estimated from the phone’s activity during weekdays over the course of the month. The most frequent location of a phone’s activity during the night is deemed the home zone, whereas the most frequent activity location during the day is deemed the work zone. It is assumed that most journeys to and from South Mumbai are work trips since it is a business district, and a naïve initial OD estimate of travel demand is created on this basis.
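The zone and home/work logic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the team's implementation: the tower coordinates, the record layout `(phone_id, hour_of_day, tower_id)`, the night window of 21:00 to 06:00 and all function names are hypothetical. Finding the nearest tower to a point is equivalent to finding the Voronoi zone that contains it.

```python
from collections import Counter, defaultdict
from math import hypot

# Hypothetical tower coordinates (x, y in km) keyed by tower id.
towers = {"T1": (0.0, 0.0), "T2": (5.0, 0.0), "T3": (0.0, 5.0)}

def nearest_tower(x, y):
    """Return the tower whose Voronoi zone contains the point (x, y)."""
    return min(towers, key=lambda t: hypot(towers[t][0] - x, towers[t][1] - y))

# Hypothetical anonymised weekday CDR records: (phone_id, hour_of_day, tower_id).
cdr = [
    ("p1", 23, "T1"), ("p1", 2, "T1"), ("p1", 11, "T2"), ("p1", 15, "T2"),
    ("p2", 22, "T3"), ("p2", 1, "T3"), ("p2", 10, "T2"), ("p2", 14, "T2"),
    ("p2", 12, "T1"),
    ("p3", 23, "T1"), ("p3", 13, "T1"),
]

def is_night(hour):
    # Assumed night window (21:00 to 06:00); the article does not state the hours.
    return hour >= 21 or hour < 6

def home_and_work(records):
    """Map each phone to (home_zone, work_zone) by most frequent night/day tower."""
    night = defaultdict(Counter)
    day = defaultdict(Counter)
    for phone, hour, tower in records:
        (night if is_night(hour) else day)[phone][tower] += 1
    result = {}
    # Only phones seen both at night and during the day get a home/work pair.
    for phone in night.keys() & day.keys():
        home = night[phone].most_common(1)[0][0]
        work = day[phone].most_common(1)[0][0]
        result[phone] = (home, work)
    return result

def naive_od(records):
    """Count one home-to-work trip per phone; keys are (origin, destination)."""
    od = Counter()
    for home, work in home_and_work(records).values():
        od[(home, work)] += 1
    return od

od_matrix = naive_od(cdr)
```

In this sketch each phone contributes exactly one trip, so the resulting counts inherit whatever sampling bias the CDR data carries.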

Kruti Barot (ITSPE)


However, the naïve estimate is biased because of unknown and non-uniform sampling rates. An independent reference dataset is needed to update the naïve estimate. To address this issue, traffic counts were obtained manually at a number of locations on the road network to serve as reference points. Traffic flows on the road network were also estimated using a traffic assignment model with the naïve OD estimate as the input. Reasonable assumptions were made regarding mode choice (the percentage of users using rail versus road) and average vehicle occupancy rates before running the traffic assignment. The difference between modelled and actual traffic flows is the result of estimation error in traffic demand.
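The comparison between modelled and counted flows can be illustrated as follows. This is a simplified sketch, not the traffic assignment model used in the study: the fixed `link_use` table stands in for an assignment result, and the road share, vehicle occupancy, OD values and counts are all made-up assumptions.

```python
# Hypothetical naive OD demand in person trips for the AM peak.
naive_od = {("A", "S"): 4000, ("B", "S"): 2000}

# Assumed behavioural parameters (illustrative values, not from the article).
road_share = 0.4         # fraction of trips made by road; the rest use rail
vehicle_occupancy = 2.0  # average persons per vehicle

# Stand-in for an assignment result: the fraction of each OD pair's
# vehicles that passes each counted link.
link_use = {
    "link_1": {("A", "S"): 1.0, ("B", "S"): 0.0},
    "link_2": {("A", "S"): 0.5, ("B", "S"): 1.0},
}

# Hypothetical manual traffic counts (vehicles) on the same links.
observed_counts = {"link_1": 1000, "link_2": 900}

def modelled_flows(od):
    """Convert person trips to vehicles and load them onto the counted links."""
    flows = {}
    for link, shares in link_use.items():
        flows[link] = sum(
            od[pair] * road_share / vehicle_occupancy * share
            for pair, share in shares.items()
        )
    return flows

flows = modelled_flows(naive_od)

# Ratio of counted to modelled flow; values away from 1 indicate demand error.
ratio = {link: observed_counts[link] / flows[link] for link in flows}
```

With these assumed inputs both links carry a modelled flow of 800 vehicles, against counts of 1000 and 900, which suggests the naïve demand is underestimated on the routes using those links.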
