In-memory, distributed content-based recommender system

Dooms, Simon; Audenaert, Pieter; Fostier, Jan; De Pessemier, Toon; Martens, Luc

doi:10.1007/s10844-013-0276-1

Published: 06 September 2013

Volume 42, pages 645–669 (2014)

Journal of Intelligent Information Systems

Abstract

Burdened by their popularity, recommender systems increasingly take on larger datasets while they are expected to deliver high-quality results within a reasonable time. To meet these ever-growing requirements, industrial recommender systems often turn to parallel hardware and distributed computing. While the MapReduce paradigm is generally accepted for massively parallel data processing, it often entails complex algorithm reorganization and suboptimal efficiency because intermediate values are typically read from and written to hard disk. This work implements an in-memory, content-based recommendation algorithm and shows how it can be parallelized and efficiently distributed across many homogeneous machines in a distributed-memory environment. By focusing on data parallelism and carefully constructing the definition of work in the context of recommender systems, we are able to partition the complete calculation process into any number of independent and equally sized jobs. An empirically validated performance model is developed to predict parallel speedup and promises high efficiencies for realistic hardware configurations. For the MovieLens 10M dataset we note efficiency values up to 71% for a configuration of 200 computing nodes (eight cores per node).
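The abstract's idea of cutting the calculation into independent, equally sized jobs can be illustrated with a minimal sketch. The abstract does not state the paper's actual definition of work, so the sketch assumes one unit of work is a single (user, item) pair and that all pairs cost the same. The function names `partition_work`, `pair_from_index` and `parallel_efficiency` are hypothetical, and efficiency is taken in its usual sense of speedup divided by the number of workers.

```python
from typing import List, Tuple


def partition_work(n_users: int, n_items: int, n_jobs: int) -> List[Tuple[int, int]]:
    """Split the flat (user, item) pair space into n_jobs contiguous ranges.

    Assumption: one unit of work is one (user, item) pair. Each range is a
    half-open interval [start, stop) over a flat pair index, and range sizes
    differ by at most one pair.
    """
    if n_jobs <= 0:
        raise ValueError("n_jobs must be positive")
    total = n_users * n_items
    base, extra = divmod(total, n_jobs)
    ranges = []
    start = 0
    for job in range(n_jobs):
        # The first `extra` jobs take one additional pair each.
        size = base + (1 if job < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def pair_from_index(index: int, n_items: int) -> Tuple[int, int]:
    """Map a flat pair index back to (user_index, item_index)."""
    return divmod(index, n_items)


def parallel_efficiency(t_serial: float, t_parallel: float, n_workers: int) -> float:
    """Parallel efficiency: speedup (t_serial / t_parallel) per worker."""
    return (t_serial / t_parallel) / n_workers


if __name__ == "__main__":
    print(partition_work(4, 5, 3))
    print(pair_from_index(7, 5))
    print(parallel_efficiency(100.0, 12.5, 10))
```

Because every job is just an index range, no job needs data produced by another, which is what makes the partitioning embarrassingly parallel under the stated assumption.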

Acknowledgements

The described research activities were funded by a PhD grant to Simon Dooms of the Agency for Innovation by Science and Technology (IWT Vlaanderen). The computational resources (Stevin Supercomputer Infrastructure) and services used in this work were provided by Ghent University, the Hercules Foundation and the Flemish Government—department EWI.

Author information

Authors and Affiliations

Wica, iMinds-Ghent University, G. Crommenlaan 8, box 201, 9050, Ghent, Belgium
Simon Dooms, Toon De Pessemier & Luc Martens
IBCN, iMinds-Ghent University, G. Crommenlaan 8, box 201, 9050, Ghent, Belgium
Pieter Audenaert & Jan Fostier

Corresponding author

Correspondence to Simon Dooms.

About this article

Cite this article

Dooms, S., Audenaert, P., Fostier, J. et al. In-memory, distributed content-based recommender system. J Intell Inf Syst 42, 645–669 (2014). https://doi.org/10.1007/s10844-013-0276-1

Received: 08 April 2013
Revised: 08 August 2013
Accepted: 14 August 2013
Published: 06 September 2013
Issue Date: June 2014
DOI: https://doi.org/10.1007/s10844-013-0276-1
