Abstract
With the development of mobile social networks, more and more crowdsourced data are generated on the Web or collected from real-world sensing. The fragmented, heterogeneous, and noisy nature of online/offline crowdsourced data, however, makes them difficult to understand. Traditional content-based analysis methods suffer from issues such as high computational cost and poor performance. To address these issues, this paper presents CrowdMining. In particular, we observe that the knowledge hidden in the process of data generation, regarding individual/crowd behavior patterns (e.g., mobility patterns and community contexts such as social ties and structure) and crowd-object interaction patterns (e.g., flickering or tweeting patterns), is neglected in crowdsourced data mining. Therefore, we propose a novel approach that leverages implicit human intelligence (implicit HI) for crowdsourced data mining and understanding. Two studies, CrowdEvent and CrowdRoute, showcase its usage, with implicit HI extracted from either online or offline crowdsourced data. A generic model for CrowdMining is further proposed based on a set of existing studies. Experiments on real-world datasets demonstrate the effectiveness of CrowdMining.
Funding
This work was partially supported by the National Key R&D Program of China (No. 2017YFB1001803), the National Basic Research Program of China (No. 2015CB352400), and the National Natural Science Foundation of China (Nos. 61772428 and 61725205).
Additional information
This article belongs to the Topical Collection: Special Issue on Smart Computing and Cyber Technology for Cyberization
Guest Editors: Xiaokang Zhou, Flavia C. Delicato, Kevin Wang, and Runhe Huang
Cite this article
Guo, B., Chen, H., Liu, Y. et al. From crowdsourcing to crowdmining: using implicit human intelligence for better understanding of crowdsourced data. World Wide Web 23, 1101–1125 (2020). https://doi.org/10.1007/s11280-019-00718-5
DOI: https://doi.org/10.1007/s11280-019-00718-5