
1 Introduction

Block ciphers are used as basic building blocks in symmetric cryptography for encryption, authentication, the construction of hash functions and so on. The evaluation of their practical security has been an active research topic over the decades, giving rise to different analysis techniques. Statistical attacks exploit non-uniform behaviors of the plaintext-ciphertext data to extract information about the key. One of the most prominent statistical attacks is linear cryptanalysis. Early works assumed that linear trails behave identically for every key [3, 4, 17, 20]. Later, once many trails were considered within one approximation [24, 25], the linear hull effect raised interesting questions about the fixed-key behavior of single linear approximations [21, 22]. Daemen et al. gave a fixed-key probability distribution for single linear correlations [13], leading to subsequent works on, e.g., fundamental assumptions [9], the effect of key schedules [1] and measures for data complexity [19], all for single linear attacks. However, we still do not understand the situation in multidimensional linear cryptanalysis.

A collection of linear approximations has a capacity, which measures the deviation of their joint distribution from the uniform distribution. One important open problem in multidimensional linear cryptanalysis is to estimate the capacity and data complexity when a large number of different keys are considered. In previous work, the capacity was assumed to take its average value for most of the keys, and the data complexity was usually measured by the reciprocal of the average capacity. However, neither is correct. As we know, the key equivalence hypothesis has been questioned for single linear approximations and differential trails [5, 9, 12]. We show that this hypothesis also requires adjustment in the multidimensional linear setting.

It has also always been difficult to compute the average data complexity over the keys in linear cryptanalysis. Using Jensen's inequality, Murphy [22] points out that the Fundamental Theorem [24] can only give a lower bound on the average data complexity when a collection of linear trails in a linear approximation is used. Leander shows that in single linear attacks we should focus on the median complexity instead of the average complexity, since the latter is usually infinite [19]. Neither Murphy's nor Leander's concern has been addressed yet in the scenario of multidimensional linear attacks.

As one of the most powerful variants of linear attacks, multidimensional linear attacks notably reduce the data complexity, both in theory and in practice [10, 11, 15, 16, 23]. Moreover, the multidimensional linear distinguisher has been shown to have connections with other statistical distinguishers, e.g., truncated differential distinguishers [6], statistical saturation distinguishers [19], and integral distinguishers [8]. All of the above suggests the importance of multidimensional linear cryptanalysis; hence, the lack of knowledge on fundamental aspects of this attack is especially surprising and deserves more attention.

Our Contributions. In this paper, we point out that under a reasonable assumption, the distribution of the key-dependent capacity can be explicitly formulated as a Gamma distribution, parametrized by the average linear probability and the dimension (Sect. 3). This distribution is verified experimentally on the round-reduced PRESENT cipher. Then, we derive the distribution of the data complexity, an Inverse Gamma distribution based on the same parameters (Sect. 4). Our results allow a more accurate measurement of multidimensional linear attacks.

With these distributions, in Sect. 5 we discuss three well-known measures of the data complexity of multidimensional linear attacks: the reciprocal of the average capacity, the average complexity and the (general) median complexity. The following fundamental questions in single linear attacks are then generalized to multidimensional linear attacks and solved.

Firstly, we consider the standard key equivalence hypothesis. We discover that instead of holding for a majority of keys, the average capacity is actually reached by less than half of the keys, no matter how many linear approximations are used. Hence, we modify the hypothesis in a way which is more in line with the practical situation.

Secondly, as we know, the average data complexity of single linear attacks is difficult to calculate, since the linear hull effect may result in zero correlation for some keys. However, we show that the situation changes when multiple linear approximations are involved: in this case the average data complexity can be easily calculated from the Inverse Gamma distribution. Then, by generalizing Murphy's idea from the case of linear hulls to the case of multiple linear approximations, the reciprocal of the average capacity is proved to be only a lower bound on the average data complexity. We also derive the exact difference between this lower bound and the average data complexity.

Thirdly, we solve the open problem proposed by Leander in [19] by extending the use of the median complexity to multidimensional linear attacks. Finally, all measures of data complexity are compared under different dimensions. An interesting observation is that the median complexity approaches the average one arbitrarily closely as the dimension increases.

In Sect. 6, we revisit Cho's multidimensional linear attack on 25-round PRESENT [10], which targets the most rounds of PRESENT with a data complexity less than the whole codebook. As an application of our theoretical analysis, we can directly estimate the average capacity, instead of carrying out a complex proof as in [10]. Our results are very close to Cho's. Moreover, the exact knowledge of the capacity distribution allows us to compute the ratio of weak keys precisely. Using Cho's attack method with adjusted parameters, \(2^{123.24}\) weak keys of 26-round PRESENT can be recovered with no more than \(2^{62.5}\) plaintext-ciphertext pairs.

2 Preliminaries

2.1 Block Ciphers and Linear Cryptanalysis

Let \(\mathbb {F}_2\) be the binary field with two elements and \(\mathbb {F}_2^n\) be the n-dimensional vector space over \(\mathbb {F}_2\). The inner product on \(\mathbb {F}_2^n\) is defined by \(a \cdot b = \sum _{i=1}^{n}a_ib_i\), where a, b \(\in \mathbb {F}_2^n\).

A block cipher is a mapping \(E : \mathbb {F}_2^n \times \mathbb {F}_2^\kappa \rightarrow \mathbb {F}_2^n\) with \(E_k(\cdot ) \overset{def}{=} E(k,\cdot )\) for each \(k \in \mathbb {F}_2^\kappa \). If \(y = E_k(x)\), x, y and k are referred to as the plaintext, the ciphertext and the master key, respectively. A key-alternating cipher is a block cipher consisting of an alternating sequence of unkeyed rounds and simple bitwise key additions.

Linear cryptanalysis uses a linear relation between bits of x, y and k. A linear approximation (u, v) is a probabilistic linear relation expressed as a boolean function of these bits, i.e.,

$$\begin{aligned} B(k) \overset{def}{=} u \cdot x \oplus v \cdot E_k(x), \end{aligned}$$
(1)

where (u, v) is called the text mask. B(k) is a boolean random variable characterized by its fixed-key probability \(p(k) \overset{def}{=} \Pr _x[B(k) = 0]\), taken over uniformly random plaintexts x.

We call \(c(k) = 2p(k) - 1\) the fixed-key correlation of the linear approximation (u, v). The linear probability (LP) of the approximation (u, v) is defined as \(LP(k) = c(k)^2\). Both c(k) and LP(k) vary over different keys, and can be regarded as real-valued random variables over the whole key space.

In a linear approximation (u, v), there may be many paths with different intermediate masks, but sharing the same input and output masks (u, v). A path that specifies the linear relation round by round is called a linear trail (or linear characteristic). Note that in a key-alternating cipher, the LP of a linear trail is independent of the subkeys.

2.2 Multidimensional Linear Approximations and Data Complexity

Multidimensional linear attacks use m approximations with linearly independent text masks, called base approximations, to construct an m-dimensional vectorial boolean function f. Let \(p = (p_0, p_1, \dots , p_{2^m-1})\) be the probability distribution of f. It can be computed by the following lemma.

Lemma 1

([15, Corollary 1]) Let \(f: \mathbb {F}_2^n \mapsto \mathbb {F}_2^m\) be a vectorial boolean function with the probability distribution p. Then, we have

$$c_a = \sum _{\eta \in \mathbb {F}_2^m}(-1)^{a \cdot \eta }p_{\eta },\ for \ all \ a \in \mathbb {F}_2^m$$

and

$$p_{\eta } = 2^{-m}\sum _{a \in \mathbb {F}_2^m}(-1)^{a \cdot \eta }c_a,\ for \ all \ \eta \in \mathbb {F}_2^m.$$

Here, \(c_a\) is the correlation of the boolean function \(a \cdot f\), \(a \in \mathbb {F}_2^m\).

In a multidimensional linear attack, \(c_a\) is indeed the correlation of the approximation obtained by combining the base approximations linearly.

Let \(q = (q_0,...,q_{2^m-1})\) be another discrete probability distribution of an m-bit random variable. Then, the capacity of p and q is defined as follows.

Definition 1

The capacity between two probability distributions p and q is defined by

$$C(p,q) = \sum _{\eta =0}^{2^m-1}(p_{\eta }-q_{\eta })^2q_{\eta }^{-1}.$$

The capacity of multidimensional linear approximations with probability distribution p is \(C(p) = C(p,\theta )\), where \(\theta \) is the uniform distribution.

Lemma 2

([15, Corollary 2]) Given an m-dimensional vectorial boolean function f with the probability distribution p, the capacity is

$$\begin{aligned} C(p)= \sum _{a \in \mathbb {F}_2^m, a \ne 0}c_a^2. \end{aligned}$$

Thus, the capacity of multidimensional linear approximations is computed from the m base approximations and the other \(2^m-1-m\) approximations that are XOR sums of the base approximations. These \(2^m-1-m\) approximations, referred to as combined approximations, are linearly spanned by the m base approximations.
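To make the relation between the probability distribution and the correlations concrete, the following sketch (in Python with numpy, which we assume available; the toy function is hypothetical) derives the correlations of a random vectorial boolean function via Lemma 1 and checks that Definition 1 and Lemma 2 give the same capacity.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 10, 3

f = rng.integers(0, 2**m, size=2**n)           # a random f: F_2^n -> F_2^m
p = np.bincount(f, minlength=2**m) / 2**n      # its probability distribution

# c_a = sum_eta (-1)^{a.eta} p_eta for all a in F_2^m  (Lemma 1)
a = np.arange(2**m)
sign = (-1.0) ** np.array([[bin(ai & eta).count("1") for eta in a] for ai in a])
c = sign @ p                                   # c[0] = 1 since p sums to one

capacity_def = np.sum((p - 2.0**-m) ** 2 / 2.0**-m)  # Definition 1, q uniform
capacity_lem = np.sum(c[1:] ** 2)                    # Lemma 2, masks a != 0
assert np.isclose(capacity_def, capacity_lem)
```

The equality is an instance of Parseval's relation for the transform of Lemma 1.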

To estimate the data complexity of multidimensional linear cryptanalysis, the Chernoff information \(D^*\) can be considered [2].

Theorem 1

([2, Theorem 1]) Let \(BestAdv_N(p,q)\) be the best advantage for distinguishing probability distribution p from probability distribution q, using N samples. We have

$$1-BestAdv_N(p,q) = 2^{-ND^*(p,q)+o(N)}.$$

Hence, the data complexity is \(N \approx \frac{1}{D^*(p,q)}\). When q is the uniform distribution and p is close to q, the Chernoff information can be approximated by the capacity C(p) [2, Theorem 7]:

$$D^*(p,q) \simeq \frac{C(p)}{8\ln 2}.$$

In this case, when the optimal distinguisher based on LLR-statistic (or \(\chi ^2\)-statistic) is used, the data complexity is given as \(\frac{\lambda }{C(p)}\), where \(\lambda \) depends on the success probability of the distinguisher.

The probability distribution p of an m-dimensional linear approximation actually varies over different keys, and so does the capacity (as we will show later). Hereafter, instead of C(p(k)), we write C(k) for the key-dependent capacity.

2.3 Related Distributions and Assumptions

Note 3

Let \(\mathcal {N}(\mu ,\sigma ^2)\) be the normal distribution with mean \(\mu \) and variance \(\sigma ^2\). Let \(\varGamma (\alpha ,\theta )\) be the Gamma distribution under the shape-scale parametrization, with mean \(\alpha \theta \), the probability density function g and the cumulative distribution function \(\mathcal {G}\). If \(X \sim \mathcal {N}(0,\sigma ^2)\), then \(X^2 \sim \varGamma (1/2,2\sigma ^2)\). Inv-Gamma(\(\alpha \),\(\beta \)) denotes the inverse-Gamma distribution with mean \(\frac{\beta }{\alpha -1}\) for \(\alpha > 1\). If \(X \sim \varGamma (\alpha ,\theta )\), then \(\frac{1}{X} \sim \) Inv-Gamma(\(\alpha ,\theta ^{-1})\).
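As a quick sanity check of these facts (an illustrative sketch, assuming numpy and scipy; the variance is arbitrary), one can sample from \(\mathcal {N}(0,\sigma ^2)\) and compare the squared samples against \(\varGamma (\frac{1}{2},2\sigma ^2)\), and their reciprocals against Inv-Gamma\((\frac{1}{2},(2\sigma ^2)^{-1})\):

```python
import numpy as np
from scipy import stats

sigma2 = 2.0**-16                                  # an arbitrary variance
x = np.random.default_rng(1).normal(0.0, np.sqrt(sigma2), 200_000)

# X ~ N(0, sigma^2)  =>  X^2 ~ Gamma(1/2, 2*sigma^2): compare the means
print((x**2).mean(), stats.gamma(a=0.5, scale=2*sigma2).mean())

# 1/X^2 ~ Inv-Gamma(1/2, 1/(2*sigma^2)); its mean is infinite (alpha <= 1),
# so we compare the medians instead
print(np.median(1/x**2), stats.invgamma(a=0.5, scale=1/(2*sigma2)).ppf(0.5))
```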

Daemen et al. give the distribution of the fixed-key LP of linear approximations when linear hull effect is considered [13].

Approximation 4

[13, Theorem 22] Given a key-alternating cipher with independent round keys, when the number of linear trails of (u, v) is large enough and their LP are small compared to ELP(u, v), the fixed-key correlation of (u, v), c(k), which is a real-valued random variable, follows

$$c(k) \sim \mathcal {N}(0,ELP(u,v)).$$

The fixed-key LP(k) follows the distribution \(\varGamma (\frac{1}{2},2ELP(u,v))\), with mean ELP(u, v) and variance \(2ELP(u,v)^2\), where \(ELP(\cdot )\) is the average linear probability of the approximation over all keys.

ELP(u, v) can be denoted as \(\overline{c^2}\) and computed by the following proposition for key-alternating ciphers.

Proposition 1

[12, 24] Let E be a key-alternating block cipher and assume that all subkeys are independent. The average LP of a linear approximation is the sum of the LP of all linear trails \(t_j\), denoted LPT(\(t_j\)), between the input and output masks of this approximation, i.e.,

$$ELP(u,v) = \sum _{t_j \in (u,v)}LPT(t_j).$$

3 Key-Dependent Capacity in Multidimensional Linear Approximations

In this section, we study the distribution of the key-dependent capacity. Let c(k) (resp. LP(k)) be a real-valued random variable representing the fixed-key correlation (resp. linear probability) of a linear approximation; their distributions are given by Approximation 4. When multiple linear approximations are used, we index them by a subscript i, e.g., \(c_{i}(k)\) denotes the fixed-key correlation of the ith linear approximation. W.l.o.g., we use \(i = 1,\dots ,m\) as the indices of the m base approximations.

In [16], the authors claim that in practical experiments the probability distributions vary a lot with the keys while the capacity remains rather constant. However, in this section we point out, from the theoretical point of view, that the capacity also varies over different keys, and we give experimental verification. We focus on two cases, both of which occur in practical block ciphers. These two cases are treated in Propositions 2 and 3, respectively.

Proposition 2

Consider an m-dimensional linear attack using m base approximations whose correlations \(c_i(k)\) are i.i.d. as \(\mathcal {N}(0,\overline{c^2})\) over the keys, where \(\overline{c^2}\) is the average LP. If, for each fixed key, the binary random variables associated with the base approximations are statistically independent, then the fixed-key capacity of this m-dimensional linear approximation, C(k), approximately follows the Gamma distribution \(\varGamma (\frac{m}{2},2\overline{c^2})\).

Proof

Let \(f_1(k),\dots ,f_m(k)\) be the m linearly independent base approximations used to construct the m-dimensional approximation, so that \(f(k) = (f_1(k),\dots ,f_m(k))\) is an m-dimensional vectorial boolean function with probability distribution \(p(k)=\{p_\eta (k)\}\), where \(\eta \in \mathbb {F}_2^m\) and \(p_\eta (k)\) is the probability that \(f(k) = \eta \). Indeed, \(f_i(k)\) is a binary random variable with correlation \(c_i(k)\). Since the \(f_i(k)\) are statistically independent of each other for each fixed key k,

$$p_\eta (k) = \prod _{i = 1}^{m}\Big (\frac{1}{2}+(-1)^{\eta _i}\frac{c_i(k)}{2}\Big ), \quad \eta \in \mathbb {F}_2^m,$$

where \(\eta _i\) denotes the ith bit of \(\eta \).

According to Definition 1,

$$\begin{aligned} C(k)&= \sum _{\eta \in \mathbb {F}_2^m}(p_\eta (k)-2^{-m})^2/2^{-m} = 2^m\sum _{\eta \in \mathbb {F}_2^m}(p_\eta (k)-2^{-m})^2\\&= 2^m\sum _{\eta \in \mathbb {F}_2^m}\Big (\prod _{i = 1}^{m}\big (\frac{1}{2}+(-1)^{\eta _i}\frac{c_i(k)}{2}\big )-2^{-m}\Big )^2 \end{aligned}$$

For each fixed key, the correlations are small, so products of two or more of them are negligible: \(|c_i(k) \cdot c_j(k)| \ll |c_i(k)|\). Keeping only the terms linear in the \(c_i(k)\),

$$\begin{aligned} C(k)&= 2^m\sum _{\eta \in \mathbb {F}_2^m}\Big [\sum _{i = 1}^{m}(-1)^{\eta _i}\frac{c_i(k)}{2\cdot 2^{m-1}}\Big ]^2\\&= 2^m\sum _{\eta \in \mathbb {F}_2^m}\frac{1}{2^{2m-2}}\Big [\sum _{i = 1}^{m}\Big (\frac{c_i(k)}{2}\Big )^2 + 2\sum _{i < j}(-1)^{\eta _i+\eta _j}\frac{c_i(k)}{2}\frac{c_j(k)}{2}\Big ] \end{aligned}$$

Since \(\sum _{\eta \in \mathbb {F}_2^m}\sum _{i < j}(-1)^{\eta _i+\eta _j}\frac{c_i(k)}{2}\frac{c_j(k)}{2} = 0\),

$$C(k) = \frac{2^m}{2^{2m-2}}\sum _{\eta \in \mathbb {F}_2^m}\sum _{i = 1}^{m}(\frac{c_i(k)}{2})^2 = \sum _{i = 1}^{m}c_i(k)^2 = \sum _{i = 1}^{m}LP_i(k)$$

Since the \(c_i(k)\) are i.i.d. as \(\mathcal {N}(0,\overline{c^2})\), the \(LP_i(k)\) are i.i.d. as \(\varGamma (\frac{1}{2},2\overline{c^2})\), \(i = 1,\dots ,m\). Thus, C(k) is the sum of m independent \(\varGamma (\frac{1}{2},2\overline{c^2})\) variables. Hence, \(C(k) \sim \varGamma (\frac{m}{2},2\overline{c^2})\). \(\square \)
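The proposition is easy to check by simulation. The following sketch (hypothetical parameters; numpy and scipy assumed) draws the m base correlations per key as in the hypothesis, forms \(C(k) = \sum _i c_i(k)^2\), and compares the sample with \(\varGamma (\frac{m}{2},2\overline{c^2})\):

```python
import numpy as np
from scipy import stats

m, cbar2, keys = 4, 2.0**-16, 50_000
c = np.random.default_rng(2).normal(0.0, np.sqrt(cbar2), size=(keys, m))
C = (c**2).sum(axis=1)                    # C(k) = sum_i c_i(k)^2 per key

ref = stats.gamma(a=m/2, scale=2*cbar2)   # the claimed distribution
print(C.mean(), ref.mean())               # both close to m * cbar2
print(stats.kstest(C, ref.cdf).pvalue)    # large p-value: no rejection
```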

Recall that for one-dimensional linear approximations, \(\overline{c^2}\) can be calculated by Proposition 1 when the dominant trails in a linear approximation are known.

Proposition 2 considers the scenario where the LP of the base approximations are dominant. In this case, we approximate the capacity by summing the LP of the base approximations and ignoring the LP of the combined approximations (see Lemma 2). To show the reasonableness of this approximation, we also bound its error; for this part of the analysis, please see Appendix B.

On the other hand, Proposition 3 considers the case where not only the m base approximations but also the \(2^m-1-m\) combined approximations make a non-negligible contribution to the capacity. In this case, the correlations of the \(2^m-1-m\) combined approximations are no longer independent. Thus, we derive the capacity in this case under another hypothesis.

Proposition 3

In an m-dimensional linear attack using an m-dimensional linear approximation whose probabilities \(p_{\eta }(k)\) are i.i.d. as the normal distribution \(\mathcal {N}(2^{-m},\sigma ^2)\), \(\eta \in \mathbb {F}_2^m\), the fixed-key capacity of this m-dimensional linear approximation, C(k), follows the Gamma distribution \(\varGamma (\frac{2^m-1}{2},2 \cdot 2^m\sigma ^2)\).

Proof

Since the \(p_{\eta }(k)\) are i.i.d. as \(\mathcal {N}(2^{-m},\sigma ^2)\), subject to the normalization constraint \(\sum _{\eta }p_{\eta }(k) = 1\) which removes one degree of freedom,

$$Q = \sum _{\eta =0}^{2^m-1}\frac{(p_{\eta }(k)-2^{-m})^2}{\sigma ^2} \sim \chi ^2(2^m-1) = \varGamma (\frac{2^m-1}{2},2)$$

According to the definition of capacity,

$$C(k) = \sum _{\eta =0}^{2^m-1}\frac{(p_{\eta }(k)-2^{-m})^2}{2^{-m}}= 2^m \sigma ^2 Q \sim \varGamma (\frac{2^m-1}{2},2 \cdot 2^m\sigma ^2)$$

\(\square \)

Compared with Proposition 2, which considers only the m base approximations with equally dominant correlations, Proposition 3 addresses the situation where the correlations \(c_a(k)\) of all \(2^m-1\) approximations are identically distributed (for the proof please refer to Appendix A). Thus, the average LP of the \(2^m-1\) approximations are equal, denoted again as \(\overline{c^2}\). Since the average capacity is the sum of the average LP of the involved approximations, i.e., \((2^m-1)\cdot 2^m\sigma ^2 = (2^m-1)\overline{c^2}\), the capacity distribution in Proposition 3 can also be written as \(\varGamma (\frac{2^m-1}{2},2\overline{c^2})\).
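Proposition 3 can be checked the same way. In the sketch below (hypothetical parameters; numpy and scipy assumed), the probabilities \(p_\eta (k)\) are drawn i.i.d. from \(\mathcal {N}(2^{-m},\sigma ^2)\) and then re-centered so that they sum to one, which is the constraint costing the one degree of freedom in the proof:

```python
import numpy as np
from scipy import stats

m, sigma2, keys = 4, 2.0**-30, 50_000
rng = np.random.default_rng(3)
p = rng.normal(2.0**-m, np.sqrt(sigma2), size=(keys, 2**m))
p += 2.0**-m - p.mean(axis=1, keepdims=True)  # enforce sum_eta p_eta(k) = 1

C = ((p - 2.0**-m)**2).sum(axis=1) / 2.0**-m  # Definition 1 with q uniform

ref = stats.gamma(a=(2**m - 1)/2, scale=2 * 2**m * sigma2)
print(C.mean(), ref.mean())     # both close to (2^m - 1) * 2^m * sigma2
print(stats.kstest(C, ref.cdf).pvalue)
```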

Experimental Verification. In order to verify that the above analysis reflects reality with reasonable accuracy, we have experimentally computed the capacity distributions sampled from 5000 randomly chosen keys for 5-round PRESENT. A set of usable one-dimensional linear approximations is given in [26], with theoretical average LP computed as \(2^{-16.83}\). Thus, the correlation distribution of these approximations is \(\mathcal {N}(0,2^{-16.83})\), and the LP distribution is \(\varGamma (\frac{1}{2},2^{-15.83})\).

We can select linearly independent approximations from this set as the base approximations. Here we examine the 2-dimensional and 4-dimensional linear approximations for the case of Proposition 2.

In this case, the base approximations are chosen with input masks from different S-boxes in the first round and output masks from different S-boxes in the last round. According to Proposition 2, the theoretical distribution of the 2-dimensional capacity is \(\varGamma (1,2^{-15.83})\) and that of the 4-dimensional capacity is \(\varGamma (2,2^{-15.83})\). The experimental distributions of the 2-dimensional and 4-dimensional capacity sampled over 5000 keys are shown in (a) and (b) of Fig. 1, respectively.

Fig. 1. Experimental (black) and theoretical (red) distributions of the capacity for the 2- and 4-dimensional approximations of the first case (Color figure online)

As illustrated in Fig. 1, the experimental distribution of the capacity follows the theoretical estimate closely. The scattering of the data points occurs because we plot a raw histogram of the samples rather than averaged values.

4 Distribution of Data Complexity

With the knowledge of the capacity distribution, the distribution of the data complexity, which is approximately \(\lambda \) times the reciprocal of the capacity, can be obtained formally. Hereafter we focus on the case of Proposition 2; the case of Proposition 3 can be treated in a similar way.

Corollary 1

If the fixed-key capacity of the multidimensional linear approximation follows \(C(k) \sim \varGamma (\frac{m}{2},2\overline{c^2})\), then the fixed-key data complexity of the corresponding multidimensional attack follows \(N(k) \sim \) Inv-Gamma(\(\frac{m}{2}\),\(\frac{\lambda }{2\overline{c^2}})\).

Corollary 1 is derived directly from Proposition 2 (also refer to Note 3), and addresses the case where the m correlations of the base approximations play the prominent role in the capacity. Since \(\lambda \) is a constant for any fixed success probability in an attack, w.l.o.g. hereafter we study the above data complexity distribution as Inv-Gamma(\(\frac{m}{2},\frac{1}{2\overline{c^2}})\). For each key k, N(k) is asymptotically inversely proportional to C(k). The average data complexity over all keys is denoted by N, \(N = E_k[N(k)]\), which is proportional to

$$E_k \bigg [\frac{1}{C(k)}\bigg ] = \frac{1}{|\mathcal {K}|}\sum _{k \in \mathcal {K}}\frac{1}{C(k)},$$

where \(\mathcal {K}\) denotes the whole key space, and \(E_k(\cdot )\) denotes an expected value taken over the whole key space. According to Corollary 1 and the mean of the inverse Gamma distribution (see Note 3), \(E_k[\frac{1}{C(k)}] = \frac{1}{2\overline{c^2}(m/2-1)} = \frac{1}{(m-2)\overline{c^2}}\).

Remark. The data complexity distribution in Corollary 1 also holds for single linear attacks, where \(m = 1\). In the case of \(m = 1\), the average data complexity is infinite, as pointed out by [19], which corresponds to the fact that the mean of the distribution Inv-Gamma(\(\frac{1}{2},\frac{1}{2\overline{c^2}}\)) does not exist. When m is equal to 2, the mean of the inverse Gamma distribution does not exist either, since the mean of Inv-Gamma(\(\alpha ,\beta \)) is only finite for \(\alpha > 1\).

Similarly, the average capacity over the keys

$$E_k[C(k)] = \frac{1}{|\mathcal {K}|}\sum _{k \in \mathcal {K}}C(k)$$

is equal to \(m\overline{c^2}\), derived from the mean of the Gamma distribution in Proposition 2 (see Note 3).
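Both closed-form means are one-liners with scipy (a sketch under the parameters of Example 5 below; \(\lambda \) is dropped as before). Note that scipy reports an infinite mean for the inverse Gamma when the shape is at most 1, matching the remark above:

```python
import numpy as np
from scipy import stats

cbar2 = 2.0**-40
for m in (2, 4, 6, 8, 20):
    EC    = stats.gamma(a=m/2, scale=2*cbar2).mean()         # E_k[C(k)] = m*cbar2
    EinvC = stats.invgamma(a=m/2, scale=1/(2*cbar2)).mean()  # inf for m <= 2
    print(m, np.log2(EC), np.log2(EinvC) if np.isfinite(EinvC) else "inf")
```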

Fig. 2. Distributions of the data complexity for m = 2, 4, 6, 8, 20.

Fig. 3. Distributions of the capacity for m = 2, 4, 6, 8, 20.

Example 5

For clearer exposition, a simple example which closely matches real situations in practical ciphers is used in the rest of our analysis. We take \(\overline{c^2}\) as \(2^{-40}\), which roughly corresponds to the case of 15-round PRESENT, and take m as 2, 4, 6, 8 and 20, respectively. In this example, the distribution functions of the data complexity are shown in Fig. 2, and the distribution functions of the capacity are shown in Fig. 3.

5 Evaluation of the Data Complexity

In practical attacks, \(E_k[\frac{1}{C(k)}]\) and \(\frac{1}{E_k[C(k)]}\) are closely related to the evaluation of the data complexity. Since \(E_k[\frac{1}{C(k)}]\) is hard to estimate, the complexity is usually measured by \(\frac{1}{E_k[C(k)]}\). In this section, we first propose a refined key equivalence hypothesis for \(E_k[C(k)]\) (Sect. 5.1). With the exact description of the data complexity distribution, the difficulty of evaluating \(E_k[\frac{1}{C(k)}]\) is overcome, and a basic question about the relation between the average capacity and the average data complexity is studied (Sect. 5.2). We also extend Leander's idea of exploiting median data complexities [19] to multidimensional linear attacks (Sect. 5.3). Finally, all measures are compared.

5.1 Adjusted Key Equivalence Hypothesis

Regarding the connection between the fixed-key capacity and the average capacity in a multidimensional linear system, the traditional key equivalence hypothesis states that the fixed-key capacity does not deviate significantly from its average value [14, 18]. This hypothesis can be interpreted as follows: \(C(k) \approx E_k[C(k)]\), for almost all keys k. As we have shown, the capacity is actually Gamma distributed, so this hypothesis does not hold. Thus, two questions arise: which value is suitable for evaluating the attack complexity, and is the average value adequate? We start with the following conjecture to show that the average capacity is far from representative of the majority of keys.

Conjecture 1

There are always less than half of the keys having a capacity larger than the average capacity, that is, \(|\{k^{*} \in \mathcal {K} \mid C(k^{*}) \ge E_k[C(k)]\}| < \frac{1}{2}|\mathcal {K}|\), where \(\mathcal {K}\) is the whole key space. Hence, with a data complexity of \(\frac{\lambda }{E_k[C(k)]}\), the right key can be recovered for less than half of the keys.

Table 1. The ratio of keys that have a capacity larger than the average capacity

This conjecture is illustrated in Table 1 with Example 5. As m increases, the ratio of keys that have a capacity larger than the average capacity approaches \(\frac{1}{2}\), but never reaches it. This is because, for a right-skewed Gamma distribution such as that of Proposition 2, the median is always smaller than the mean. It can be concluded that, using a number of ciphertexts equal to \(\frac{\lambda }{E_k[C(k)]}\), more than half of the keys cannot be recovered with a reasonable success probability. Thus, the average capacity does not give a sound estimate of the attack complexity for most keys, especially when m is small.
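The ratios in Table 1 are simply the survival function of the Gamma distribution evaluated at its mean, which the following sketch reproduces (scipy assumed; note that the ratio does not depend on \(\overline{c^2}\), since the scale cancels):

```python
from scipy import stats

for m in (2, 4, 6, 8, 20):
    # P[C(k) >= E_k[C(k)]] for C(k) ~ Gamma(m/2, 2*cbar2); the scale cancels,
    # so evaluate Gamma(m/2, 2) at its mean m
    print(m, stats.gamma(a=m/2, scale=2).sf(m))   # e.g. exp(-1) ~ 0.368 for m = 2
```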

Since the capacity is highly dependent on the choice of the key, we are concerned with how much data a multidimensional attack needs in order to succeed for a majority of keys. A natural way to adjust the hypothesis is to consider an upper bound on the data complexity that holds for, e.g., 90 % of the keys, meaning that for these 90 % of keys this amount of data guarantees a successful attack with high probability, even if for some of these keys the data complexity is overestimated.

Hypothesis 6

(Adjusted Key Equivalence Hypothesis) If the capacity distribution of an m-dimensional linear attack satisfies Proposition 2, then \(90\,\%\) of the keys in the key space have a capacity no smaller than \(\mathcal {G}^{-1}(0.1)\), where \(\mathcal {G}\) is the cumulative distribution function of \(\varGamma (\frac{m}{2},2\overline{c^2})\). Using \(\frac{\lambda }{\mathcal {G}^{-1}(0.1)}\) data is enough for recovering \(90\,\%\) of the keys in the key space.
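Concretely, the data bound of Hypothesis 6 is the 10 % quantile of the capacity distribution, e.g. (a sketch with the parameters of Example 5 and \(\lambda = 1\), both assumptions):

```python
import numpy as np
from scipy import stats

cbar2, lam = 2.0**-40, 1.0
for m in (2, 4, 6, 8, 20):
    c_low = stats.gamma(a=m/2, scale=2*cbar2).ppf(0.1)  # G^{-1}(0.1)
    print(m, np.log2(lam / c_low))   # log2 of the data sufficient for 90% of keys
```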

5.2 On Average Data Complexity

Why is the Average Data Complexity Calculable? It is known that in classical single linear attacks considering the linear hull effect, the average data complexity is hard to derive and usually infinite because of the existence of zero correlations. This difficulty can now be resolved in the setting of m-dimensional linear attacks, since for m larger than 2 the average value can be derived directly from the exact distribution of the data complexity. From the point of view of the capacity distribution, we can also understand why the average data complexity becomes calculable in multidimensional attacks.

In the single linear setting, the keys with zero C(k) may make the average complexity infinite, so this set of keys deserves attention. Here, we point out that by taking multiple linear approximations into consideration simultaneously instead of only one, the fraction of keys with capacity close to zero becomes very small, so that the average complexity turns out to be computable.

We consider the ratio of keys yielding C(k) between zero and \(\epsilon \), where \(\epsilon \) is a fixed value very close to zero. From (b) of Fig. 3, it is evident that the ratio of keys with capacity close to zero decreases as m increases. This ratio is shown for several fixed \(\epsilon \) in Table 2, and it drops dramatically as m grows. This is because, as the number of approximations grows, for each key there is a higher probability that at least one approximation yields a non-zero LP, and hence a non-zero capacity. Hence, for a fixed \(\epsilon \), the more base approximations are used, the fewer keys cause infinite data complexities. When \(\epsilon \) is small enough and m has a reasonable size, this ratio becomes negligible in the whole key space. In this case it is sound to assume that no key causes zero capacity, so that the average data complexity is computable.
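The ratios behind Table 2 are Gamma cdf values at \(\epsilon \), which can be reproduced as follows (a sketch; scipy assumed, with \(\overline{c^2} = 2^{-40}\) and \(\epsilon = 2^{-45}\) chosen for illustration):

```python
from scipy import stats

cbar2, eps = 2.0**-40, 2.0**-45
for m in (1, 2, 4, 8, 20):
    # P[C(k) <= eps] for C(k) ~ Gamma(m/2, 2*cbar2): drops sharply with m
    print(m, stats.gamma(a=m/2, scale=2*cbar2).cdf(eps))
```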

A Difference Between \(E_k[\frac{1}{C(k)}]\) and \(\frac{1}{E_k[C(k)]}\). The problem discussed here was first identified in the context of the linear hull effect by Murphy [22]. We extend it to multidimensional linear attacks and investigate it further.

In some attack analyses, e.g. [10], the reduction in data complexity given by multiple approximations is based on the assertion that the data complexity N is proportional to \(\frac{1}{E_k[C(k)]}\). As with the effectiveness issue of the linear hull effect studied in [22], there is also a difference between \(\frac{1}{E_k[C(k)]}\) and the actual average data complexity. By Jensen's Inequality and the fact that the reciprocal is a convex function on the positive reals, we have

$$E_k\bigg [\frac{1}{C(k)}\bigg ] \ge \frac{1}{E_k[C(k)]}.$$

Thus, \(\frac{1}{E_k[C(k)]}\) can only serve as a lower bound on the average data complexity.

Jensen’s Inequality gives a general comparison without considering the details of the variables. When the distributions of both C(k) and \(\frac{1}{C(k)}\) are known, \(E_k[\frac{1}{C(k)}]\) and \(\frac{1}{E_k[C(k)]}\) can be derived as in Sect. 4. Their difference is \(\frac{1}{(m-2)\overline{c^2}}-\frac{1}{m\overline{c^2}} = \frac{2}{m(m-2)\overline{c^2}}\). Therefore, equality never holds for m larger than 2, i.e., \(E_k[\frac{1}{C(k)}]\) is always larger than \(\frac{1}{E_k[C(k)]}\). For instance, with \(m = 4\) and \(\overline{c^2} = 2^{-40}\) as in Example 5, \(E_k[\frac{1}{C(k)}] = 2^{39}\) while \(\frac{1}{E_k[C(k)]} = 2^{38}\): the true average is twice the commonly used lower bound. The difference can be ignored only when m is large enough. Figure 4 shows the difference for \(m = 4\) and \(m = 20\). For small m the difference is far from negligible, and \(\frac{1}{E_k[C(k)]}\) does not reflect the real average data complexity. As more approximations are involved, the difference shrinks more quickly. For a fixed m, the smaller the average LP is, the larger the difference becomes. That is, as \(\overline{c^2}\) decreases, which is typical since cryptanalysts always try to break as many rounds of the cipher as possible, the difference between \(E_k[\frac{1}{C(k)}]\) and \(\frac{1}{E_k[C(k)]}\) becomes huge.

Table 2. The ratio of keys with capacity close to zero for different m and \(\epsilon \)

5.3 On Median Data Complexity

Leander proposed a way to overcome the problem of infinite data complexities for single linear attacks [19]. Namely, instead of studying the average complexity, he studied the median complexity \(\widetilde{N}\) such that for half of the keys the data complexity of an attack is less than or equal to \(\widetilde{N}\). So far, the use of the median complexity in multidimensional linear attacks has remained unexplored; we discuss it in this section. A general definition of \(N_{p}\) is as follows, where \(\widetilde{N} = N_{1/2}\).

Definition 2

([19, Definition 1]) \(N_p\) is defined as the complexity such that the probability that for a given key the attack complexity is lower than \(N_p\), is p.

Although Leander gave this general definition, he focused on the case of \(N_{1/2}\) in single linear attacks. With the knowledge of the exact distribution of the data complexity, we generalize Leander's Theorem 2 in [19] not only to the multidimensional linear model but also from \(N_{1/2}\) to \(N_p\).

Theorem 2

Assume independent subkeys in an m-dimensional linear attack using m base approximations whose LP are i.i.d. as \(\varGamma (\frac{1}{2},2\overline{c^2})\). Then a fraction p of the keys yields a capacity of at least \(\mathcal {G}^{-1}(1-p)\), where \(\mathcal {G}\) is the cumulative distribution function of \(\varGamma (\frac{m}{2},2\overline{c^2})\). Thus, the complexity of this m-dimensional linear attack is less than \(\frac{\lambda }{\mathcal {G}^{-1}(1-p)}\) with probability p.

Fig. 4. The difference between \(E_k[\frac{1}{C(k)}]\) and \(\frac{1}{E_k[C(k)]}\) with \(\overline{c^2}\) ranging from \(2^{-60}\) to \(2^{-40}\).

Leander’s Theorem 2 is the special case of Theorem 2 with m = 1 and p = \(\frac{1}{2}\), when the noisy linear trails in the linear hull effect are ignored (if the noisy trails are considered, the ratio of keys is reduced by a factor of 2). Explaining Leander's Theorem 2 in our context, we use the fact that \(F^{-1}(1/2) = 0.46\overline{c^2}\), where F is the cumulative distribution function of \(\varGamma (\frac{1}{2},2\overline{c^2})\) (see [19] for more details).

As illustrated in (b) of Fig. 3, at the value 1/2 on the Y-axis, the median capacity increases with m. That is, when the LP of the base approximations are i.i.d., the more approximations we use, the lower the data complexity required for the same ratio of weak keys. Given a fixed capacity (and hence a fixed data complexity), the ratio of keys yielding a larger capacity than the fixed one increases when more base approximations are used. Thus, the ratio of weak keys resulting in a data complexity lower than the fixed one also increases.

Considering Example 5 again, we take different values of p, and fix the same \(\lambda \) (as 1, w.l.o.g.) for each m. The highest data complexity required by the m-dimensional linear attacks for a fraction p of the keys is shown in Table 3.

When the general median complexity \(N_p\) is applied, a question arises: which p is most suitable for measuring and comparing the strength of a linear attack? Obviously, it is meaningless to compare \(N_{1/3}\) and \(N_{2/3}\) directly. A natural and simple way is to consider the value of \(\frac{N_p}{p}\), because dividing by p normalizes the comparison across different \(N_p\) to a reasonably great extent. For example, if the attack complexity is lower than \(N_{1/3}\) with probability 1/3, then the attack needs to be repeated 3 times for a sufficiently sound success rate; this should be compared with, say, an attack with complexity lower than \(N_{1/2}\) that has to be repeated twice. By confirming the existence of the minimal \(\frac{N_p}{p}\), we can evaluate different multidimensional linear attacks by the value of \(\min _{p}\frac{N_p}{p}\), as in the sketch below. The results are shown in Table 4.
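The minimization is straightforward numerically; a sketch follows (Example 5 parameters, \(\lambda = 1\), and a simple grid over p, all of which are assumptions):

```python
import numpy as np
from scipy import stats

cbar2, lam = 2.0**-40, 1.0
ps = np.linspace(0.01, 0.99, 981)           # grid over the probability p
for m in (2, 4, 6, 8, 20):
    G = stats.gamma(a=m/2, scale=2*cbar2)
    ratio = lam / (G.ppf(1 - ps) * ps)      # N_p / p with N_p = lam / G^{-1}(1-p)
    i = np.argmin(ratio)
    print(m, ps[i], np.log2(ratio[i]))      # the minimizing p and min_p N_p / p
```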

Table 3. The highest data complexity for different m and different ratios of keys
Table 4. Comparison of the average data complexity, the median data complexity, the reciprocal of average capacity, and \(\min _{p}\frac{N_p}{p}\).

Moreover, comparing \(E_k[\frac{1}{C(k)}]\), \(\frac{1}{E_k[C(k)]}\) and the median complexity, we observe that the average complexity is always larger than the median one, and the median complexity is always larger than the reciprocal of the average capacity. As m increases, the differences between these three values decrease. When m is large enough, the values are approximately equal (see Table 4), since the Gamma and Inverse Gamma distributions tend to normal distributions.

6 Application to Cho’s Multidimensional Attack on PRESENT

6.1 Cho’s Attack on 25-Round PRESENT

The structure of PRESENT [7] makes it vulnerable to multidimensional attacks: there are several strong one-dimensional approximations. The linear hull of each such approximation with non-negligible correlation consists of several equally strong single-bit trails, whose intermediate masks have Hamming weight one. The average LP \(\overline{c^2}\) of all such approximations is \(2^{-4r}L(r)\) [26], where L(r) is the number of r-round trails in each approximation. The best result on PRESENT so far is Cho's attack on 25 rounds [10]. Nine 23-round multidimensional linear approximations are used simultaneously, each of dimension \(m = 8\), starting at one of the S-boxes \(S_i\), i = 5, 9 or 13, and ending at one of the S-boxes \(S_j\), j = 5, 6 or 7. The attack recovers 16 bits of key in the first round and 16 bits of key in the last round; please refer to [10] for more details. Cho proved that the average capacity is \(2^{-52.77}\), and gave the formula for the data complexity as in [10]:

$$\begin{aligned} N = (\sqrt{advantage\cdot 4\cdot M}+4(\varPhi ^{-1}(2P_s-1))^2)/C = \lambda /C \end{aligned}$$
(2)

where \(\varPhi \) is the cumulative distribution function of the standard normal distribution, \(P_s\) is the success probability, C is the capacity, and M is the number of linear approximations used in the attack. In Eq. (2), if the advantage is equal to a bits, then the right key candidate is ranked within the top \(2^{\ell -a}\) candidates, where \(\ell \) is the number of targeted key bits. Cho chose \(\lambda = 2^{9.08}\) (advantage of 32 bits, \(M = 9\cdot (2^8-1)\), \(P_s = 0.95\)), and estimated the average data complexity as about \(2^{61.85}\).
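For concreteness, the multiplier \(\lambda \) of Eq. (2) can be recomputed directly (a sketch; scipy assumed). With Cho's parameters it comes out near \(2^{9.1}\), consistent with the \(2^{9.08}\) used in [10]:

```python
import numpy as np
from scipy.stats import norm

advantage, M, Ps = 32, 9 * (2**8 - 1), 0.95   # Cho's parameters from [10]
lam = np.sqrt(advantage * 4 * M) + 4 * norm.ppf(2*Ps - 1)**2
print(np.log2(lam))                           # ~9.10, close to 2^9.08
```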

6.2 Our Investigation on Cho’s Attack

We give a simpler but close estimation of the capacity and data complexity of Cho's attack. The authors of [16] claimed that Cho observed in practical experiments that the probability distribution of multidimensional linear approximations varies a lot with the keys, while the capacity remains rather constant. We have shown, from both theoretical and experimental viewpoints, that the capacity also varies for different keys.

In order to attack 25-round PRESENT, 23-round approximations are used, thus \(r = 23\). According to [26], \(L(23) = 367261713\), thus \(\overline{c^2} = 2^{-63.55}\). By Propositions 2 and 3, the fixed-key capacity of the nine 8-dimensional approximations is estimated as \(\varGamma (9 \cdot \frac{2^8-1}{2},2^{-62.55})\). Hence, the average capacity is \(2^{-52.39}\). With the same \(\lambda \) as Cho, we obtain the data complexity \(N = \frac{2^{9.08}}{C(k)} \sim \) Inv-Gamma\((9 \cdot \frac{2^8-1}{2},2^{71.63})\). The average data complexity is \(2^{61.47}\). This result is very close to the estimate in Cho's attack, but easier to compute.

In the same way, we compute the capacity distribution for 26-round PRESENT, which approximates \(\varGamma (9 \cdot \frac{2^8-1}{2},2^{-65.16})\). With the knowledge of the distributions, we can derive the exact number of weak keys corresponding to different attack scenarios. Using Cho's attack method with \(\lambda = 2^{7.58}\) (advantage of 4 bits, \(P_s=0.8\)), there are \(2^{123.24}\) weak keys (3.7 % of the whole key space) with capacity larger than \(2^{-54.92}\). That means that for \(2^{123.24}\) keys out of \(2^{128}\), 26-round PRESENT can be attacked using less than \(2^{62.5}\) plaintext-ciphertext pairs, with success probability 0.8.
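These weak-key figures follow directly from the Gamma survival function (a sketch; scipy assumed). The threshold capacity is \(\lambda /N = 2^{7.58-62.5} = 2^{-54.92}\):

```python
import numpy as np
from scipy import stats

alpha, theta = 9 * (2**8 - 1) / 2, 2.0**-65.16  # capacity distribution, 26 rounds
threshold = 2.0**7.58 / 2.0**62.5               # = 2^-54.92

ratio = stats.gamma(a=alpha, scale=theta).sf(threshold)
print(ratio, 128 + np.log2(ratio))   # ~0.037 of keys, i.e. ~2^123.2 weak keys
```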

7 Conclusion and Further Work

In this paper, we deal with multidimensional linear attacks using m base approximations with i.i.d. correlations (linear probabilities). We focus on the case where the base linear approximations can be regarded as statistically independent. In this case, we point out that the capacity of multidimensional linear approximations follows a Gamma distribution, which in turn yields an exact Inverse Gamma distribution for the data complexity. Both distributions are parametrized by the dimension and the average linear probability of each approximation. These theoretical results have been verified by experiments on PRESENT. We establish an explicit connection between the fixed-key behaviour and the average behaviour. Based on the distributions, several fundamental issues are discussed in more detail. Multidimensional linear attacks not only benefit the data complexity, but also make the average data complexity easier to measure, because the ratio of keys with capacity close to zero decreases as the dimension increases. The relations between the median and average data complexity, as well as the reciprocal of the average capacity, are derived; when the dimension is large enough, these three values are arbitrarily close. We also propose a modified key equivalence hypothesis that is more suitable for practical situations. Finally, the multidimensional linear attacks on 25- and 26-round PRESENT are analyzed based on our theoretical results.

In future work, more complicated cases of the relations between LP distributions should be studied, which may bring a more precise evaluation of multidimensional attacks. The measure \(\frac{N_p}{p}\) can be extended to single linear attacks. Moreover, given the close relation between statistical saturation attacks and multidimensional linear attacks, our results may allow a clearer understanding of the capacity of statistical saturation attacks, whose key-dependent performance still lacks an accurate measurement.