Abstract
Random forest (RF) is widely used in many applications because of its good classification performance. However, its voting mechanism gives every base classifier the same weight. It is more reasonable for some trees to have relatively high weights and others relatively low weights, because the randomness of bootstrap sampling and attribute selection cannot guarantee that all trees have the same decision-making ability. In this paper we focus on the weighted voting mechanism and propose a novel weighted RF. Experiments on 6 public datasets show that our method outperforms both the standard RF and another weighted RF. We also apply our method to credit card fraud detection, where it again performs best in our experiments.
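To make the difference between equal and weighted voting concrete, the following is a minimal sketch in Python, not the weighting scheme proposed in the paper (which the abstract does not specify). The function names `majority_vote` and `weighted_vote`, and the per-tree weights, are hypothetical and chosen only for illustration; the weights stand in for some per-tree quality score such as a validation accuracy.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight for one sample.

    predictions: one predicted label per tree.
    weights: one non-negative weight per tree (hypothetical values here;
    how the weights are derived is an assumption, not the paper's method).
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per tree")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # Ties are broken by the smallest label so the result is deterministic.
    return min(totals, key=lambda label: (-totals[label], label))


def majority_vote(predictions):
    """Standard RF voting: every tree counts equally."""
    return weighted_vote(predictions, [1.0] * len(predictions))


# Votes of five trees for a single transaction (1 = fraud, 0 = legitimate).
tree_predictions = [1, 0, 0, 1, 0]
# Illustrative per-tree weights, e.g. accuracies on held-out data.
tree_weights = [0.95, 0.55, 0.60, 0.90, 0.52]

print(majority_vote(tree_predictions))                # 0: three of five trees
print(weighted_vote(tree_predictions, tree_weights))  # 1: 1.85 versus 1.67
```

In this toy case the two stronger trees outvote the three weaker ones, which is the behaviour that equal-weight voting cannot express.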
Acknowledgments
The authors would like to thank the reviewers for their helpful comments, and Professor Changjun Jiang, who provided a great deal of assistance with data and experiments. This paper is supported in part by the National Natural Science Foundation of China under grant no. 61572360 and in part by the Shanghai Shuguang Program under grant no. 15SG18. The corresponding author is G.J. Liu.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Xuan, S., Liu, G., Li, Z. (2018). Refined Weighted Random Forest and Its Application to Credit Card Fraud Detection. In: Chen, X., Sen, A., Li, W., Thai, M. (eds) Computational Data and Social Networks. CSoNet 2018. Lecture Notes in Computer Science, vol 11280. Springer, Cham. https://doi.org/10.1007/978-3-030-04648-4_29
DOI: https://doi.org/10.1007/978-3-030-04648-4_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04647-7
Online ISBN: 978-3-030-04648-4
eBook Packages: Computer Science, Computer Science (R0)