Abstract
Identification of individuals based on transit modes is of great importance in user tracking systems. However, identifying users in real-life studies is not trivial owing to the following challenges: 1) activity data containing both temporal and spatial context are high-order and sparse; 2) traditional two-step classifiers depend on trajectory patterns as input features, which limits accuracy, especially when the data are scattered and diverse; 3) in some cases, positive instances are few and difficult to detect. Consequently, approaches that rely on statistics-based or trajectory-based features do not work effectively, and deep learning methods face the problem of how to represent trajectory vectors for user classification. Here, we propose a novel end-to-end scenario-based deep learning method to address these challenges, based on the observation that individuals may visit the same place for different reasons. We first define a scenario using critical places and related trajectories. Next, we embed scenarios via path-based or graph-based approaches using extended embedding techniques. Finally, a two-level convolutional neural network is constructed for the classification. Our model is applied to the problem of detecting addicts directly from transit records, without feature engineering, using real-life data collected from mobile devices. Using scenarios constructed from dense trajectories, our model outperforms classical classification approaches, anomaly detection methods, state-of-the-art sequential deep learning models, and graph neural networks. Moreover, we provide statistical analyses and intuitive explanations to interpret the characteristics of resident and addict mobility. Our method could be generalized to other trajectory-related tasks involving scattered and diverse data.
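As a rough illustration of the first step, the sketch below groups each user's transit records into scenarios keyed by critical place. The abstract does not give the formal definition of a scenario, so the function name `build_scenarios`, the `window` parameter, the record layout, and the toy place names are all assumptions made for illustration and not the authors' implementation. Each resulting path is the kind of place sequence that a path-based embedding could take as input.

```python
from collections import defaultdict


def build_scenarios(records, critical_places, window=2):
    """Group each user's visits into scenarios keyed by critical place.

    records: iterable of (user_id, timestamp, place_id) tuples (assumed layout).
    Returns {user_id: {critical_place: [path, ...]}}, where each path lists the
    place ids visited within `window` steps of a visit to the critical place.
    """
    by_user = defaultdict(list)
    for user_id, timestamp, place_id in records:
        by_user[user_id].append((timestamp, place_id))

    scenarios = {}
    for user_id, visits in by_user.items():
        visits.sort()  # chronological order by timestamp
        places = [place for _, place in visits]
        per_place = defaultdict(list)
        for i, place in enumerate(places):
            if place in critical_places:
                start = max(0, i - window)
                per_place[place].append(places[start:i + window + 1])
        scenarios[user_id] = dict(per_place)
    return scenarios


# Toy records; the place names are hypothetical.
records = [
    ("u1", 1, "home"), ("u1", 2, "station_a"), ("u1", 3, "clinic"),
    ("u1", 4, "station_a"), ("u1", 5, "home"),
    ("u2", 1, "office"), ("u2", 2, "clinic"), ("u2", 3, "mall"),
]
scenarios = build_scenarios(records, critical_places={"clinic"}, window=1)
```

In this toy data both users visit the same critical place, but the surrounding paths differ, which is the kind of contrast the scenario idea is meant to capture.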
Acknowledgements
Our research is supported by the Natural Science Foundation of Zhejiang Province of China (Grant No. LY21F020003) and the Zhejiang Provincial Key Research and Development Program of China (No. 2021C01164).
About this article
Cite this article
Jin, C., Chen, D., Lin, Z. et al. How do you visit: Identifying addicts from large-scale transit records via scenario deep embedding. Geoinformatica 25, 799–820 (2021). https://doi.org/10.1007/s10707-021-00448-9
DOI: https://doi.org/10.1007/s10707-021-00448-9