Attacking Machine Learning Models for Social Good

Belavadi, Vibha; Zhou, Yan; Kantarcioglu, Murat; Thuriasingham, Bhavani

doi:10.1007/978-3-030-64793-3_25

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 12513)

Included in the following conference series:

International Conference on Decision and Game Theory for Security

Abstract

As machine learning (ML) techniques become widely used, awareness of the harmful effects of automation is growing. In particular, in problem domains where critical decisions are made, ML-based applications may raise ethical issues with respect to fairness and privacy. Existing research on fairness and privacy in the ML community mainly focuses on providing remedies during the model training phase. Unfortunately, such remedies may not be voluntarily adopted by an industry that is concerned about profits. In this paper, we propose to apply, from the user’s end, a fair and legitimate technique to “game” the ML system and ameliorate its social accountability issues. We show that although adversarial attacks can be exploited to tamper with ML systems, they can also be used for social good. We demonstrate the effectiveness of the proposed technique on real-world image and credit data.

The research reported herein was supported in part by NIH award 1R01HG006844, NSF awards CNS-1837627, OAC-1828467, IIS-1939728 and ARO award W911NF-17-1-0356.

Notes

1. Although gender information is not privacy sensitive, we use it as a substitute for a more privacy-sensitive concept such as sexual orientation.

Author information

Authors and Affiliations

University of Texas at Dallas, Richardson, TX, 75080, USA
Vibha Belavadi, Yan Zhou, Murat Kantarcioglu & Bhavani Thuriasingham

Corresponding author

Correspondence to Murat Kantarcioglu.

Editor information

Editors and Affiliations

Tandon School of Engineering, New York University, Brooklyn, NY, USA
Quanyan Zhu
ISR, University of Maryland, College Park, MD, USA
John S. Baras
Electrical Engineering, University of Washington, Seattle, WA, USA
Radha Poovendran
New York University, New York, NY, USA
Juntao Chen

Appendix A

We consider changing the following attributes in the German Credit data:

1. Purpose: the purpose of the loan, e.g., car (new), car (old), repairs, education.
2. Duration: increase or decrease the duration (in months) to see the effect on the loan decision.
3. Credit amount: increase or decrease the credit amount as a percentage of the original amount, e.g., 1.05x, 1.10x, 0.90x, 0.85x, where x is the current amount.
4. Savings account/bonds: change the savings account/bonds status from None (A65) to ‘...100 DM’ (A61).
5. Other installment plans: change from None (A143) to Bank/Store (A141/A142).
6. Telephone: change the telephone ownership from None (A191) to registered in the user’s name (A192).
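
The attribute changes listed above can be sketched as a small candidate generator. This is a minimal illustration under stated assumptions, not the authors' implementation: the field names (purpose, duration, credit_amount, savings, other_installment_plans, telephone), the function name candidate_changes, and the 6-month duration step are hypothetical, and the purpose codes A40, A41, A45, A46 are taken from the UCI German Credit documentation rather than from this appendix.

```python
from typing import Any, Dict, Iterator, Tuple

Record = Dict[str, Any]

# Purpose codes assumed from the UCI documentation:
# A40 car (new), A41 car (used), A45 repairs, A46 education.
PURPOSE_CODES = ("A40", "A41", "A45", "A46")
# Assumed step size in months; the appendix does not state one.
DURATION_DELTAS = (-6, 6)
# Multipliers listed in the appendix (1.05x, 1.10x, 0.90x, 0.85x).
AMOUNT_FACTORS = (1.05, 1.10, 0.90, 0.85)
# Categorical changes listed in the appendix: current code -> allowed new codes.
CATEGORY_SWAPS = {
    "savings": {"A65": ("A61",)},
    "other_installment_plans": {"A143": ("A141", "A142")},
    "telephone": {"A191": ("A192",)},
}


def candidate_changes(record: Record) -> Iterator[Tuple[str, Record]]:
    """Yield (description, modified copy) for each single-attribute change."""
    for code in PURPOSE_CODES:
        if code != record["purpose"]:
            yield f"purpose -> {code}", {**record, "purpose": code}
    for delta in DURATION_DELTAS:
        months = record["duration"] + delta
        if months > 0:  # skip non-positive durations
            yield f"duration -> {months}", {**record, "duration": months}
    for factor in AMOUNT_FACTORS:
        amount = round(record["credit_amount"] * factor)
        yield f"credit_amount -> {amount}", {**record, "credit_amount": amount}
    for field, swaps in CATEGORY_SWAPS.items():
        # Only swap when the record currently holds the "None" code.
        for code in swaps.get(record[field], ()):
            yield f"{field} -> {code}", {**record, field: code}


# Hypothetical applicant record, for illustration only.
example = {
    "purpose": "A43",
    "duration": 12,
    "credit_amount": 2000,
    "savings": "A65",
    "other_installment_plans": "A143",
    "telephone": "A191",
}
changes = list(candidate_changes(example))
for description, _ in changes:
    print(description)
```

Each candidate record would then be submitted to the credit model to check whether the loan decision changes. Because every candidate is a copy, the original record is left untouched.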

About this paper

Cite this paper

Belavadi, V., Zhou, Y., Kantarcioglu, M., Thuriasingham, B. (2020). Attacking Machine Learning Models for Social Good. In: Zhu, Q., Baras, J.S., Poovendran, R., Chen, J. (eds) Decision and Game Theory for Security. GameSec 2020. Lecture Notes in Computer Science, vol 12513. Springer, Cham. https://doi.org/10.1007/978-3-030-64793-3_25

DOI: https://doi.org/10.1007/978-3-030-64793-3_25
Published: 22 December 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-64792-6
Online ISBN: 978-3-030-64793-3
eBook Packages: Computer Science, Computer Science (R0)
