CISUC

Methods for pre-processing smartcard data to improve data quality

Authors

Abstract

In recent years, smartcards have been implemented in many transit systems around the world as a means by which passengers pay for travel. In addition to allowing speedier boarding, smartcard systems offer many secondary benefits, including a better understanding of the travel patterns and behaviour of travellers. Such research depends on the smartcard system correctly recording the boarding stop and, where available, the alighting stop. It also depends on the smartcard system correctly aggregating individual rides into trips.
This paper identifies reasons why smartcard systems may not record such information correctly. The first contribution of the paper is a set of rules for aggregating individual rides into a single trip. This is critical for research in activity-based modelling as well as for charging the passenger correctly. The second contribution of the paper is an approach for identifying erroneous tap-out data, whether caused by system problems or by the user. The output from this analysis is then used to identify faulty vehicles or data supply using the “comparison against peers” approach. This third contribution of the paper identifies where transit agencies and operators should target resources to improve the performance of their Automatic Vehicle Location systems. This method could also be used to identify users who appear to be tapping out too early.
The approaches are tested using smartcard data from the Singapore public transport network for one week in April 2011. The results suggest that approximately 7.7% of all smartcard rides recorded the passenger as alighting one stop before the bus stop at which they most probably alighted. A further 0.7% of smartcard rides recorded the passenger as alighting more than one stop before the bus stop at which they most probably alighted. There was no evidence that smartcards overestimated the distance travelled by the passenger.

Keywords

Smart cards, data quality improvement, data analysis

Subject

Smart cards, data quality improvement, data analysis

Journal

Transportation Research Part C, Vol. 49, pp. 43-58, Elsevier, December 2014

DOI

