Abstract
In adversarial classification, the interaction between classifiers and adversaries can be modeled as a game between two players. It is natural to model this interaction as a dynamic game of incomplete information, since the classifier does not know the exact intentions of the different types of adversaries (senders). For these games, equilibrium strategies can be approximated and used as input for classification models. In this paper we show how to model such interactions between players, as well as give directions on how to approximate their mixed strategies. We propose perceptron-like machine learning approximations as well as novel Adversary-Aware Online Support Vector Machines. Results in a real-world adversarial environment show that our approach is competitive with benchmark online learning algorithms, and provides important insights into the complex relations among players.
Notes
A regular instance refers to an object (activity) that has an innocent intention but could nevertheless be classified as a malicious activity.
Parameter k is the total number of messages that can be drawn by Nature; it has to be determined depending on the context. See, for example, the application in Sect. 6.
An information set is said to be off-path if it cannot be reached with the given strategy profile. The concept of sequential equilibria puts almost no discipline on the beliefs at information sets that are not reached in equilibria. A significant literature on “refinements” has tried to rule out unreasonable equilibria. See, for example, Cho and Kreps (1987).
This procedure is based on input parameters defined in the Gambit software tool for the QRE computation (Turocy 2010).
Based on the dynamic update of parameters after classification mistakes, as originally proposed by Rosenblatt (1962).
In the following, for the sake of simplicity, messages at time \(\tau \) of type \(j, \mathbf {x}^{j}_{\tau }\), will be considered as a generic message of type \(\mathbf {x}^{j}\).
Available at http://bit.ly/jnazariophishing [online: accessed 01/01/2013]
“Ham” is the name used to describe regular messages that are neither spam nor phishing.
Available at http://spamassassin.apache.org/publiccorpus/ [online: accessed 01/01/2013]
This expression represents a lower bound for \(\hat{P}_{A}(x'|+1)\). It has been shown to have a good performance in practice (Dalvi et al. 2004).
These are the same values considered in Dalvi et al. (2004), which showed less variability in the overall utility functions of both players.
References
APWG (2012) Phishing activity trends report, 2nd quarter 2012. Tech. rep., APWG
Basne R, Mukkamala S, Sung AH (2008) Detection of phishing attacks: a machine learning approach. In: Studies in fuzziness and soft computing. Springer, Berlin, pp 373–383
Bergholz A (2009) Antiphish: lessons learnt. In: Proceedings of the ACM SIGKDD workshop on CyberSecurity and intelligence informatics, ACM, CSI-KDD ’09, New York, pp 1–2
Bergholz A, Beer JD, Glahn S, Moens MF, Paass G, Strobel S (2010) New filtering approaches for phishing email. J Comput Secur 18(1):7–35
Biggio B, Fumera G, Roli F (2008) Adversarial pattern classification using multiple classifiers and randomisation. In: SSPR & SPR ’08: Proceedings of the 2008 joint IAPR international workshop on structural, syntactic, and statistical pattern recognition. Springer, Berlin, pp 500–509
Biggio B, Fumera G, Roli F (2009) Multiple classifier systems for adversarial classification tasks. In: MCS ’09: Proceedings of the 8th international workshop on multiple classifier systems. Springer, Berlin, pp 132–141
Biggio B, Nelson B, Laskov P (2011) Support vector machines under adversarial label noise. In: JMLR workshop and conference proceedings: 3rd Asian conference on machine learning (ACML 2011), vol 20, Taoyuan, pp 97–112
Bíró I, Siklósi D, Szabó J, Benczúr AA (2009) Linked latent Dirichlet allocation in web spam filtering. In: AIRWeb ’09: Proceedings of the 5th international workshop on adversarial information retrieval on the web, ACM, New York, pp 37–40
Blei DM, Ng AY, Jordan MI (2003) Latent Dirichlet allocation. J Mach Learn Res 3:993–1022
Bravo C, Thomas LC, Weber R (2015) Improving credit scoring by differentiating defaulter behaviour. J Oper Res Soc 66(5):771–781. doi:10.1057/jors.2014.50
Brückner M, Scheffer T (2011) Stackelberg games for adversarial prediction problems. In: Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, KDD ’11, New York, pp 547–555
Brückner M, Kanzow C, Scheffer T (2012) Static prediction games for adversarial learning problems. J Mach Learn Res 13:2617–2654
Cho IK, Kreps DM (1987) Signaling games and stable equilibria. Q J Econ 102(2):179–221
Crespo F, Weber R (2005) A methodology for dynamic data mining based on fuzzy clustering. Fuzzy Sets Syst 150(2):267–284
Dalvi N, Domingos P, Mausam, Sanghai S, Verma D (2004) Adversarial classification. In: Proceedings of the tenth international conference on knowledge discovery and data mining, ACM Press, Seattle, vol 1, pp 99–108
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990) Indexing by latent semantic analysis. J Am Soc Inf Sci 41:391–407
Fette I, Sadeh N, Tomasic A (2007) Learning to detect phishing emails. In: WWW ’07: Proceedings of the 16th international conference on World Wide Web, ACM, New York, pp 649–656
Fudenberg D, Tirole J (1991) Game theory. MIT Press, Cambridge
Gibbons R (1992) Game theory for applied economists. Princeton University Press, Princeton
Halkidi M, Batistakis Y, Vazirgiannis M (2001) On clustering validation techniques. J Intell Inf Syst 17(2/3):107–145
Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The weka data mining software: an update. SIGKDD Explor Newsl 11(1):10–18
Hamming R (1950) Error detecting and error correcting codes. Bell Syst Tech J 29(2):147–160
Harsanyi JC (1968) Games with incomplete information played by Bayesian players. Part III. The basic probability distribution of the game. Manag Sci 14(7):486–502
Kantarcioglu M, Xi B, Clifton C (2011) Classifier evaluation and attribute selection against active adversaries. Data Min Knowl Discov 22:291–335
L’Huillier G, Hevia A, Weber R, Rios S (2010) Latent semantic analysis and keyword extraction for phishing classification. In: ISI’10: Proceedings of the IEEE international conference on intelligence and security informatics, Vancouver, pp 129–131
Liu W, Chawla S (2010) Mining adversarial patterns via regularized loss minimization. Mach Learn 81:69–83
Lowd D, Meek C (2005) Adversarial learning. In: KDD ’05: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, ACM, New York, pp 641–647
McKelvey R, Palfrey T (1998) Quantal response equilibria for extensive form games. Exp Econ 1(1):9–41
McKelvey RD, McLennan AM, Turocy TL (2010) Gambit: Software tools for game theory, version 0.2010.09.01. [online: Accessed 25 Nov 2012], http://www.gambit-project.org
Nazario J (2007) Phishing corpus. [online: Accessed 25 Nov 2012], http://bit.ly/jnazariophishing
Papadimitriou CH, Tamaki H, Raghavan P, Vempala S (1998) Latent semantic indexing: a probabilistic analysis. In: PODS ’98: Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, ACM, New York, pp 159–168
Peters G, Weber R, Nowatzke R (2012) Dynamic rough clustering and its applications. Appl Soft Comput 12(10):3193–3207
Peters G, Crespo F, Lingras P, Weber R (2013) Soft clustering: fuzzy and rough approaches and their extensions and derivatives. Int J Approx Reason 54(2):307–322
Platt JC (1999) Fast training of support vector machines using sequential minimal optimization. In: Advances in kernel methods, MIT Press, Cambridge, pp 185–208
Rosenblatt F (1962) Principles of neurodynamics: perceptrons and the theory of brain mechanisms. Spartan Books, Washington
Salton G, Wong A, Yang CS (1975) A vector space model for automatic indexing. Commun ACM 18(11):613–620
Sculley D, Wachman GM (2007) Relaxed online SVMs for spam filtering. In: SIGIR ’07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, ACM, New York, pp 415–422
Tambe M (2011) Security and game theory: algorithms, deployed systems, lessons learned. Cambridge University Press, New York
Turocy TL (2005) A dynamic homotopy interpretation of the logistic quantal response equilibrium correspondence. Games Econ Behav 51(2):243–263
Turocy TL (2010) Using quantal response to compute nash and sequential equilibria. Econ Theory 42(1):255–269
Vapnik VN (1995) The nature of statistical learning theory. Springer, New York
Velasquez JD, Rios SA, Bassi A, Yasuda H, Aoki T (2005) Towards the identification of keywords in the web site text content: a methodological approach. Int J Web Inf Syst 1(1):53–57
Wu X, Srihari R (2004) Incorporating prior knowledge with weighted margin support vector machines. In: KDD ’04: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, New York, pp 326–333
Xu R, Wunsch DC (2005) Survey of clustering algorithms. IEEE Trans Neural Netw 16(3):645–678
Zareapoor M, Seeja K (2015) Text mining for phishing e-mail detection. In: Jain LC, Patnaik S, Ichalkaranje N (eds) Intelligent computing, communication and devices, advances in intelligent systems and computing. Springer India, pp 65–71. doi:10.1007/978-81-322-2012-1_8
Zhou Y, Kantarcioglu M, Thuraisingham B, Xi B (2012) Adversarial support vector machine learning. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, KDD ’12, New York, pp 1059–1067
Acknowledgments
Support from the Chilean “Instituto Sistemas Complejos de Ingeniería” (ICM: P-05-004-F, CONICYT: FBO16; www.isci.cl), the Chilean Anillo project ACT87 “Quantitative Methods in Security” (www.ceamos.cl), and the Master’s Degree program in Operations Management at the University of Chile is gratefully acknowledged. The third author acknowledges support from Fondecyt project 1140831.
Additional information
Responsible editor: Bing Liu.
Appendices
Appendix 1: Text mining for feature extraction
This appendix describes the text mining methods used to extract useful features from the emails analyzed in the application presented in Sect. 6.
First, documents are tokenized to extract every word used in the message, and stopword removal and stemming are applied to the words in the messages. Second, three feature extraction methodologies are executed: keyword extraction, SVD, and LDA. Third, the extracted features are combined to obtain the features that characterize a given phishing message (see Fig. 13). Finally, feature selection methodologies are applied to reduce the complexity of the classification task.
Let,
-
\(\varOmega \) be the set of features determined by keyword extraction algorithm.
-
\(\varUpsilon \) be the set of features determined by SVD.
-
\(\varGamma \) be the set of features determined by LDA.
-
\(\varXi \) be the set of basic structural features.
Then the final set of features \(\mathcal {F}\) is given by,
$$\begin{aligned} \mathcal {F} = \varXi \cup ((\varGamma \cap \varUpsilon ) \cup (\varOmega \cap \varUpsilon ) ) = \varXi \cup ( (\varGamma \cup \varOmega ) \cap \varUpsilon ) \end{aligned}$$(42)
As shown in Eq. (42), the final set of features \(\mathcal {F}\) combines the basic structural features \(\varXi \), which are independent of the content-based feature sets \(\varGamma , \varOmega \), and \(\varUpsilon \). The content-based sets, however, are not independent of each other. All features are binary, indicating whether or not a word is present in a given message.
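The combination in Eq. (42) reduces to plain set operations. The sketch below illustrates it on toy feature sets; all concrete words and feature names are invented for illustration and are not taken from the paper.

```python
# Illustrative placeholders for the four feature sets of Eq. (42).
keywords = {"account", "verify", "bank"}         # Omega: keyword extraction
svd = {"verify", "bank", "login", "update"}      # Upsilon: SVD-selected terms
lda = {"login", "bank", "paypal"}                # Gamma: LDA topic terms
structural = {"num_links", "has_html_form"}      # Xi: structural features

# F = Xi ∪ ((Gamma ∪ Omega) ∩ Upsilon), as in Eq. (42)
features = structural | ((lda | keywords) & svd)
```

By distributivity, the same set is obtained from the left-hand form \(\varXi \cup ((\varGamma \cap \varUpsilon ) \cup (\varOmega \cap \varUpsilon ))\).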
1.1 Keyword extraction
Based on a TF-IDF representation of the set of messages, the following keyword extraction technique was applied to define a set of relevant features: the entire collection of phishing emails is grouped into clusters, and for each cluster the geometric mean of each word's weights over the messages in that cluster is computed. The words are then ranked, and the top-ranked words of each cluster are selected. This procedure is based on the work of Velasquez et al. (2005).
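The geometric-mean ranking step can be sketched as follows. The toy weight matrix, term list, and cluster labels are invented for illustration; in practice the labels would come from the clustering step, and the weights from the TF-IDF representation.

```python
import numpy as np

# Toy TF-IDF weights: rows = messages, columns = terms (illustrative values).
terms = ["account", "verify", "bank", "lottery", "winner"]
W = np.array([
    [0.9, 0.8, 0.7, 0.0, 0.0],   # cluster 0: credential-phishing messages
    [0.8, 0.9, 0.6, 0.0, 0.1],
    [0.0, 0.1, 0.0, 0.9, 0.8],   # cluster 1: lottery-scam messages
    [0.1, 0.0, 0.0, 0.8, 0.9],
])
labels = np.array([0, 0, 1, 1])  # cluster assignment from any clustering algorithm

def cluster_keywords(W, labels, cluster, top_n=2, eps=1e-9):
    """Rank terms of one cluster by the geometric mean of their weights."""
    member_rows = W[labels == cluster]
    # Geometric mean computed in log space; eps guards against log(0).
    gmean = np.exp(np.log(member_rows + eps).mean(axis=0))
    order = np.argsort(gmean)[::-1]
    return [terms[i] for i in order[:top_n]]
```

Terms that appear strongly in every message of a cluster dominate its ranking, while terms absent from even one message are heavily penalized by the geometric mean.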
1.2 Singular value decomposition
The singular value decomposition (SVD) of the TF-IDF matrix (Salton et al. 1975) reduces the dimension of the \(Term \times Document\) space. SVD yields a new representation of the feature space in which the underlying semantic relationship between terms and documents is revealed.
As described by Papadimitriou et al. (1998), SVD preserves the relative distances in the vector space model matrix (e.g., TF-IDF), while projecting it into a lower-dimensional semantic space model. This is similar to what principal component analysis achieves by projecting features into their principal components, maintaining the information needed for an appropriate representation of the dataset.
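A minimal sketch of the projection, assuming a toy term-document matrix with invented values: documents that share terms end up close together in the truncated semantic space, even after the dimension is reduced.

```python
import numpy as np

# Toy term-document matrix (e.g., TF-IDF weights); rows = terms, columns = documents.
X = np.array([
    [1.0, 0.8, 0.0, 0.0],
    [0.9, 1.0, 0.1, 0.0],
    [0.0, 0.1, 1.0, 0.9],
    [0.0, 0.0, 0.8, 1.0],
])

# Rank-k truncated SVD projects documents into a k-dimensional semantic space.
k = 2
U, s, Vt = np.linalg.svd(X, full_matrices=False)
docs_k = (np.diag(s[:k]) @ Vt[:k]).T   # each row: one document in the latent space

def cos(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
```

Documents 0 and 1 share the same terms, so they remain close in the latent space, while document 2 (which shares almost no terms with document 0) remains distant, mirroring the distance preservation noted by Papadimitriou et al. (1998).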
1.3 Probabilistic topic models
A topic model is a probabilistic model that relates documents and words through latent variables representing the main topics inferred from the text itself. In this context, a document can be viewed as a mixture of topics, represented by probability distributions that generate the words of the document given these topics. Inferring these latent variables, or topics, is the key component of such a model, whose main objective is to learn the distribution of the underlying topics from a given corpus of text documents.
A leading topic model is LDA (Bíró et al. 2009; Blei et al. 2003), a Bayesian model in which the latent topics of documents are inferred from probability distributions estimated over a training data set. In LDA, every topic is modeled as a probability distribution over the set of words in the vocabulary (\(w\in \mathcal {V}\)), and every document as a probability distribution over a set of topics (\(\mathcal {T}\)). These distributions are sampled from Dirichlet priors.
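The generative process behind LDA can be sketched directly with NumPy. The vocabulary, the topic-word distributions `phi`, and the Dirichlet parameter are invented placeholders for illustration; real LDA infers `phi` and the per-document mixture `theta` from data rather than fixing them.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["bank", "verify", "account", "meeting", "lunch", "report"]
n_topics, n_words = 2, 20

# Topic-word distributions (each row sums to 1): topic 0 leans toward
# phishing-like terms, topic 1 toward regular office terms. Illustrative only.
phi = np.array([
    [0.40, 0.30, 0.25, 0.02, 0.02, 0.01],
    [0.01, 0.02, 0.02, 0.35, 0.30, 0.30],
])

# Generative process for one document:
theta = rng.dirichlet(alpha=[0.5, 0.5])          # document's topic mixture
z = rng.choice(n_topics, size=n_words, p=theta)  # a topic for each word slot
words = [str(rng.choice(vocab, p=phi[t])) for t in z]
```

Inference runs this process in reverse: given only the words of many documents, it estimates the topic mixtures and topic-word distributions that most plausibly generated them.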
1.4 Structural features
As presented in Basne et al. (2008), Bergholz et al. (2010) and Fette et al. (2007), the extraction of basic structural features is needed for a minimal representation of phishing messages in a given application. In the case of phishing emails, these features are associated with structural properties of email messages, such as their links, embedded code, and the output of traditional spam filters. All of these features are extracted directly from the email messages.
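A few such structural features can be extracted with the standard library alone. The three features below (link count, presence of an HTML form, IP-based URL) are illustrative examples in the spirit of the cited works, not the paper's exact feature set.

```python
import re

def structural_features(raw_email: str) -> dict:
    """Extract a few illustrative structural features from an email body.

    The chosen features are examples in the spirit of Fette et al. (2007);
    the exact feature set is application-specific.
    """
    urls = re.findall(r"https?://[^\s\"'>]+", raw_email)
    return {
        "num_links": len(urls),
        "has_html_form": "<form" in raw_email.lower(),
        "has_ip_url": any(
            re.match(r"https?://\d{1,3}(\.\d{1,3}){3}", u) for u in urls
        ),
    }
```

IP-based URLs and embedded forms are classic phishing indicators, which is why such binary flags complement the content-based features.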
Appendix 2: List of symbols
Throughout the paper, we used the following notation:
Notation | Explanation |
---|---|
A | Total number of features |
N | Total number of instances or objects |
\(\mathbf {x}\) | Set of features \(\mathbf {x} = \{x_{1},\dots ,x_{A}\}\) |
\(x_{a}\) | ath feature of an object \(\mathbf {x}, a\in \{1,\dots ,A\}\) |
y | Dependent variable |
\((\mathbf {x}^{i},y_{i})\) | Instance or object \(i, i\in \{1,\dots ,N\}\) |
\(\mathcal {F}_{x_{a}}\) | Set of possible values for feature \(x_{a}\) |
\(\mathcal {F}_{\mathbf {x}}\) | All possible values for all features \(x_{a}\in \mathbf {x}\) |
\(\mathcal {F}_{y}\) | Set of target values |
\(\mathcal {T}\) | Set of instances \(\mathcal {T} = \{\mathbf {x}^{i},y_{i}\}_{i = 1}^{N}\) |
\(\mathcal {T}_{r}\) | Training dataset, subset of \(\mathcal {T}\) |
\(\mathcal {T}_{e}\) | Testing dataset, subset of \(\mathcal {T}\) |
\(\mathcal {N}\) | Agent Nature |
\(\mathcal {A}\) | Agent Adversary |
\(\mathcal {C}\) | Agent Classifier |
R | Regular adversary type (non-malicious) |
M | Malicious adversary type |
k | Number of information sets. Total number of types of messages that can be drawn by Nature |
\(\{I_{i}\}_{i=1}^{k}\) | Information sets for types of messages |
\(T_{\mathcal {A}}\) | Adversary types, \(t\in T_{\mathcal {A}} = \{R,M\}\times \{1,\dots ,k\}\), also denoted as \(T_{\mathcal {A}} = \{t_{R,j}\}_{j=1}^{k}\cup \{t_{M,j}\}_{j=1}^{k}\) |
\(T_{M}\) | \(\{t_{M,j}\}_{j=1}^{k}\) |
\(T_{R}\) | \(\{t_{R,j}\}_{j=1}^{k}\) |
p(t) | Probability distribution for each type \(t\in T_{\mathcal {A}}\) |
\(\{\mathbf {x}^{1},\dots ,\mathbf {x}^{k}\}\) | Set of all possible types of messages |
\(\varDelta _{k}\) | Simplex of dimension \(k-1\) |
\(\mathcal {C}: \mathbb {R}^{A}\rightarrow \varDelta _{2}\) | Classification function |
\(\mathcal {\phi }: \mathbb {R}^{A}\rightarrow \varDelta _{k}\) | Mimetic function that changes \(\mathbf {x}\) to \(\mathbf {x}'\), interpreted as the probability of sending a given message i when preferred message is \(l, \forall i, l \in \{\mathbf {x}^{1},\dots ,\mathbf {x}^{k}\}\) |
\(\mu (t|\mathbf {x})\) | Classifier’s belief that message is type t when observed instance is \(\mathbf {x}\) |
\(\sigma ^{*}_{\mathcal {A}}\) | Optimal strategy for the Adversary |
\(\sigma ^{*}_{\mathcal {C}}\) | Optimal strategy for the Classifier |
\(d: \mathbb {R}^{A}\rightarrow \{+1,-1\}\) | Decision function |
\(h_{\tau }\in \{+1,-1\}\) | Hypothesis classification for message \(\tau \) |
\(\{c_{i}\}_{i=1}^{k}\) | Centroids associated with all types of messages |
\(\delta : \mathbb {R}^{A}\times \mathbb {R}^{A}\rightarrow \mathbb {R}\) | Distance function between messages’ centroids |
I(a) | Information set for action a |
\(\pi \) | Strategy profile |
\(\pi _{a}\) | Probability that an action a is chosen if information set I(a) is reached |
\(\lambda \) | Parameter that controls the A-QRE algorithm |
\(\eta _{\epsilon }, \eta _{A}\) | Factors which amplify the distance function between messages’ centroids |
\(U_{\mathcal {A}}: T_{\mathcal {A}}\times \mathbb {R}^A \times \{+1,-1\}\rightarrow \mathbb {R}\) | Utility function for the Adversary |
\(u_{\mathcal {A}}\) | Baseline utility for the Adversary |
\(U_{\mathcal {C}}:T_{\mathcal {A}}\times \mathbb {R}^A\times \{+1,-1\}\rightarrow \mathbb {R}\) | Utility function for the Classifier |
\(u_{\mathcal {C}}\) | Baseline utility for the Classifier |
\(\theta _{1}\) | Cases where messages are correctly classified |
\(\theta _{2}\) | Cases where messages are incorrectly classified |
\(\mathcal {E}rr: \{+1,-1\}\times \{+1,-1\}\rightarrow \mathbb {R}\) | Error function |
\(\nu \) | Classification threshold |
\(\epsilon _{M}\) | Classification gain when classifying correctly a malicious message |
\(\epsilon _{R}\) | Classification gain when classifying correctly a regular message |
\(\gamma _{M}\) | Classification cost when classifying incorrectly a malicious message |
\(\gamma _{R}\) | Classification cost when classifying incorrectly a regular message |
Figueroa, N., L’Huillier, G. & Weber, R. Adversarial classification using signaling games with an application to phishing detection. Data Min Knowl Disc 31, 92–133 (2017). https://doi.org/10.1007/s10618-016-0459-9