Estimating disaggregated employment size from Points-of-Interest and census data: From mining the web to model implementation and visualization
Authors
Filipe Rodrigues
Ana Cristina da Costa Oliveira Alves
Evgheni Polisciuc
Shan Jiang
Joseph Ferreira
Francisco Câmara Pereira
Abstract
The global spread of internet access and the ubiquity of internet-capable devices have led to an increased online presence on the part of companies and businesses, notably in collaborative platforms called local directories, where Points-of-Interest (POIs) are usually classified with a set of categories and tags. Such information can be extremely useful, especially if aggregated under a common (shared) taxonomy. This article proposes a complete framework for the urban planning task of disaggregated employment size estimation based on collaborative online POI data, collected using web mining techniques. To make the analysis possible, we present a machine learning approach that automatically classifies POIs into a common taxonomy: the North American Industry Classification System. This hierarchical taxonomy is applied in many areas, particularly in urban planning, since it allows the data to be analyzed at different levels of detail, depending on the practical application at hand. The classified POIs are then used to estimate disaggregated employment size, at a finer level than previously possible, using a maximum likelihood estimator. We empirically show that the automatically classified online POIs are competitive with proprietary gold-standard POI data. This finding is then supported by a set of new visualizations that allow us to understand the spatial distribution of the classification error and its relation to the employment size error.
Keywords
Data mining, Points of Interest, GIS, Urban planning
Subject
POI mining, machine learning, urban planning
Related Project
Crowds - Understanding urban land use from digital footprints of crowds
Journal
International Journal on Advanced Intelligent Systems, Vol. 6, #2, pp. 41-52, December 2013
Cited by
Year 2018 : 4 citations
Folch, D. C., Spielman, S. E., & Manduca, R. (2018). Fast food data: Where user-generated content works and where it does not. Geographical Analysis, 50(2), 125-140.
Novack, T., Peters, R., & Zipf, A. (2018). Graph-Based Matching of Points-of-Interest from Collaborative Geo-Datasets. ISPRS International Journal of Geo-Information, 7(3), 117.
Gervasoni, L., Fenet, S., Perrier, R., & Sturm, P. (2018, October). Convolutional neural networks for disaggregated population mapping using open data. In IEEE International Conference on Data Science and Advanced Analytics (DSAA).
Gervasoni, L., Fenet, S., & Sturm, P. (2018, January). Une méthode pour l’estimation désagrégée de données de population à l’aide de données ouvertes. In 18ème Conférence Internationale sur l'Extraction et la Gestion des Connaissances.
Year 2017 : 1 citation
Touya, G., Antoniou, V., Olteanu-Raimond, A. M., & Van Damme, M. D. (2017). Assessing crowdsourced POI quality: Combining methods based on reference data, history, and spatial relations. ISPRS International Journal of Geo-Information, 6(3), 80.
Year 2016 : 1 citation
Jonietz, D., & Zipf, A. (2016). Defining Fitness-for-Use for Crowdsourced Points of Interest (POI). ISPRS International Journal of Geo-Information, 5(9), 149. doi:10.3390/ijgi5090149
Year 2015 : 1 citation
DRAFT, S. 2015, Why so many people? Explaining non-habitual transport overcrowding with internet data.
Year 2014 : 4 citations
Montini, L., Rieser-Schüssler, N., Horni, A., & Axhausen, K. (2014). Trip purpose identification from GPS tracks. Transportation Research Record: Journal of the Transportation Research Board, (2405), 16-23.
Montini, L., & Rieser, N. (2014). Implementation and pretest of the trip purpose detection.
Bakillah, M., Liang, S., Mobasheri, A., Jokar Arsanjani, J., & Zipf, A. (2014). Fine-resolution population mapping using OpenStreetMap points-of-interest. International Journal of Geographical Information Science, 28(9).
Yang, Y., Herrera-Yagüe, C., Eagle, N., & González, M. C. (2014). Limits of Predictability in Commuting Flows in the Absence of Data for Calibration. Scientific Reports, 4, Article number: 5662. doi:10.1038/srep05662. http://www.nature.com/srep/2014/140711/srep05662/full/srep05662.html
Year 2013 : 1 citation
S Jiang, GA Fiore, Y Yang, J Ferreira Jr…, A review of urban computing for mobile phone traces: current methods, challenges and opportunities, Proceedings of the 2nd …, 2013