Keywords

1 Introduction

Cloud storage is a concept that extends cloud computing. It combines cluster applications, network technology, and distributed file systems to provide data storage and service access, and it uses application software to make many different types of networked storage devices work together. It is an emerging network storage technique with several advantages. First, it integrates storage resources to automate data storage management and make it more intelligent. Second, it improves storage efficiency and supports flexible expansion through virtualization technology, which reduces wasted storage space and lowers operating costs [1,2,3,4]. Because of its flexible management and low rental prices, many users and businesses choose to store their data in the cloud.

As cloud services have spread, the industry has found that cloud storage, for all its convenience, has shortcomings, and that the biggest obstacle to wider adoption is data security. Users suspect that the cloud service cannot adequately protect their data, which discourages them from moving more data and business platforms to the cloud. Solving this problem requires satisfying two conditions. The first is integrity and confidentiality: the cloud storage server should ensure that data and operations in the cloud are not lost, destroyed, leaked, or illegally used, whether maliciously or not. The second is access privacy: when users access sensitive data, the system should prevent potential adversaries from inferring the users' behavior from their access patterns. At present, the main way to address such problems is cryptography. Users typically encrypt their sensitive data before uploading it to the cloud to protect its confidentiality from adversaries. This method is the simplest and most straightforward, but it is not practical in real scenarios, because plain encryption prevents the server from searching the data on the user's behalf. After a long period of research, searchable encryption has been found to be a good tool for the former condition. In this paper, we focus on how to realize dynamic data confidentiality and private retrieval control in the cloud.

1.1 Related Work

The earliest research on searchable symmetric encryption (SSE) can be traced back to the work of Chor et al. [5] in 1995. They proposed the first retrieval scheme over encrypted data stored in a database, which enables a user to search for a specific encrypted file without leaking anything about the data. Since then, searchable symmetric encryption has been studied in depth, with most work focusing on improving search performance, search patterns, and security [6,7,8]. Cash et al. [9] redesigned the encrypted data structure and proposed the first sub-linear SSE scheme supporting boolean queries over large databases, at the cost of leaking the search pattern to the server. Because the scheme of Cash et al. supports only a single client, Jarecki et al. extended their OXT protocol to a multi-client OXT [10] by providing each client with a set of partial trapdoors for its permitted keywords. Their core idea is to define a sequence of attributes for each query on an element of the keyword set, so that the token can be computed only when it satisfies the attributes.

Many other works focus on dynamic search, multi-client search, and other functions [11,12,13,14,15]. Compared with single-user SE, which can be regarded as data outsourcing, multi-user SE enables the sharing of sensitive data. Many existing SE schemes use key sharing, key distribution, proxy re-encryption, broadcast encryption, or other techniques to extend from a single user to multiple users. For example, in 2006, Curtmola et al. [16] proposed the first multi-user SE system, built on broadcast encryption, which incurs a high cost for user revocation. In 2008, Bao et al. [17] also proposed a multi-user SE scheme. Because the users' access rights depend on the corresponding attribute set, the efficiency of the system is affected by the number of users. Dong et al. [18] constructed a multi-user system based on proxy re-encryption, in which each user has a unique key to encrypt, search, and decrypt data; the scheme therefore needs a trusted server to manage keys. More recently, many systems have been based on attribute-based encryption (ABE), in which a user's attribute set defines the user's search rights [19,20,21,22]. Wang et al. [19] achieve fine-grained access control for authorized users with different access rights using standard CP-ABE without key sharing. In 2016, Wang et al. [20] proposed an efficient multi-user searchable attribute-based encryption scheme with attribute revocation and grant for cloud storage, in which the attribute revocation and grant processes are delegated to a proxy server. In 2015, Rompay et al. [23] introduced a third party, called a proxy, that transforms a single user query into one query per targeted document. In this way, the server cannot access the content of the query or its result, which achieves query privacy.

1.2 Our Contribution

In this work, we present a multi-client dynamic searchable symmetric encryption system (MC-DSSE) for retrieving encrypted private data in the cloud. Its main properties are as follows:

  1. Multi-client. For practical use, this work focuses on a single-writer/multi-reader search mode. It allows the data owner to delegate search capability to multiple clients through an RSA-based approach. We distinguish clients by granting each of them the ability to search a different set of permitted keywords. When a client wants to search for a keyword, he first requests a partial search token from the data owner and then generates the full search token for the desired keyword.

  2. Dynamic. To make the scheme more flexible, we add the AddKeyword and DeleteFile algorithms, which make it dynamic for the data owner. With these algorithms, the data owner can use his private key to add a new keyword and can delete an encrypted file with a delete token.

  3. Privacy. The proposed scheme achieves IND-CKA2 security against probabilistic polynomial-time adversaries. Users can search the encrypted data stored on the cloud platform for files containing given keywords using a unique token, without leaking anything about the original data. Moreover, we show that our scheme is secure for multiple clients by employing the RSA function.

1.3 Organization

The rest of this paper is organized as follows. In Sect. 2, we give the definition of an MC-DSSE scheme and state the hardness assumption. In Sects. 3 and 4, we propose a novel DSSE scheme supporting multiple clients and give its security proof. Section 5 analyzes its communication and computation costs. Finally, Sect. 6 concludes the paper.

2 Preliminaries

In this section, we first review the definition of multi-client dynamic searchable symmetric encryption with keyword search, and then introduce the hardness problem and complexity assumption used in our security proof.

2.1 MC-DSSE Definition and Related Database Structure

Here we introduce the syntax of multi-client dynamic searchable symmetric encryption and briefly describe the necessary database structures.

Definition 1

(MC-DSSE) [24]. An MC-DSSE scheme consists of the following five polynomial-time algorithms run among a data owner, a client, and a server:

  • Setup: The data owner takes a security parameter \({\lambda }\) and a database \(\mathrm {DB}\) as input, generates the system master key \(\mathrm {MK}\) and public key \(\mathrm {PK}\), and sends the encrypted database \(\mathrm {EDB}\) to the server, which stores it.

  • ClientKGen: The data owner takes \(\mathrm {MK}\) and a set w of permitted keywords as input and generates a search-authorized private key sk for the client.

  • AddKeyword: The data owner takes a new file-keyword pair (id, w) and his secret key as input, then generates the corresponding ciphertexts and sends them to the server. The server takes \(\mathrm {EDB}\) as input and inserts these ciphertexts into \(\mathrm {EDB}\).

  • DeleteFile: The data owner takes the file's identifier and his secret key as input and returns a delete token to the server. The server takes \(\mathrm {EDB}\) as input and deletes all ciphertexts of the file with identifier id from \(\mathrm {EDB}\).

  • Search: The client takes the keyword and his secret parameters as input and generates a search token for the server. The server then takes \(\mathrm {EDB}\) as input and returns the identifiers of the matching files.

To keep the proposed scheme concise and practical, we employ two data structures \(\mathcal {D}\) and \(\mathcal {T}\), which denote a List and a Dictionary respectively, together with the four database operations Create, Get, Update, and Remove from [24], which are also used in our construction.
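The exact semantics of the four database operations are given in [24] and are not reproduced here. As a rough illustration only, the sketch below models a dictionary that maps an index label to a data item and supports creation, lookup, update, and removal. The class name, the method names, and the use of byte strings are assumptions made for this example.

```python
from typing import Dict, Optional


class Dictionary:
    """Hypothetical label -> data store; semantics assumed, not taken from [24]."""

    def __init__(self) -> None:
        # Plays the role of the creation operation: start with an empty store.
        self._items: Dict[bytes, bytes] = {}

    def get(self, label: bytes) -> Optional[bytes]:
        # Lookup: return the data stored under `label`, or None if absent.
        return self._items.get(label)

    def update(self, label: bytes, data: bytes) -> None:
        # Update: insert a new pair or overwrite the existing one.
        self._items[label] = data

    def remove(self, label: bytes) -> None:
        # Removal: delete the pair if present; do nothing otherwise.
        self._items.pop(label, None)


d_w = Dictionary()
d_w.update(b"L_w", b"D_w")
stored = d_w.get(b"L_w")
d_w.remove(b"L_w")
after_removal = d_w.get(b"L_w")
```

In this sketch a server-side step such as Update\((\mathcal D_W, (L_w, D_w))\) would correspond to a single `update` call.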

2.2 Security Definition and Hardness Assumptions

In this paper, we consider the IND-CKA2 security of our MC-DSSE scheme. We first define four response rules that the simulator \(\mathcal {S}\) follows when answering each query (Setup, AddKeyword, DeleteFile, Search) of the adversary \(\mathcal {A}\); these rules are used in our security model. We then give the detailed IND-CKA2 security model for our multi-client searchable encryption scheme.

  • When \(\mathcal A\) submits a chosen database DB to \(\mathcal {S}\) to test the Setup protocol, \(\mathcal {S}\) takes the leakage function \(\mathcal {L}_{Setup}\) as input and simulates an encrypted database EDB.

  • When \(\mathcal {A}\) submits a new file-keyword pair to \(\mathcal {S}\) to test the AddKeyword protocol, \(\mathcal {S}\) takes the leakage function \(\mathcal {L}_{AddKeyword}\) as input and generates the corresponding searchable ciphertexts.

  • When \(\mathcal {A}\) submits a chosen file to \(\mathcal {S}\) to test the DeleteFile protocol, \(\mathcal {S}\) takes the leakage function \(\mathcal {L}_{DeleteFile}\) as input and generates the corresponding delete token.

  • When \(\mathcal {A}\) submits a chosen keyword to \(\mathcal {S}\) to test the Search protocol, \(\mathcal {S}\) takes the leakage function \(\mathcal {L}_{Search}\) as input and simulates the corresponding search token.

Definition 2

(IND-CKA2 Security) [24]. Let \(\varPi \) = (Setup, AddKeyword, DeleteFile, ClientKGen, Search) be a multi-client dynamic searchable symmetric encryption scheme, and let \(\mathcal A\) and \(\mathcal S\) denote the adversary and the simulator, respectively. Let (\(\mathcal {L}_{Setup}\), \(\mathcal {L}_{AddKeyword}\), \(\mathcal {L}_{DeleteFile}\), \(\mathcal {L}_{ClientKGen}\), \(\mathcal {L}_{Search}\)) be five leakage functions, and consider the following two probabilistic games:

Real\(_{A}(1^k)\): \(\mathcal {A}\) chooses an initial database \(\mathrm {DB}\). A challenger runs Setup to generate \((\mathrm {MK}, \mathrm {PK}, \mathrm {EDB})\), where \((\mathrm {PK}, \mathrm {MK})\) denotes the public/secret key pair of the data owner and \(\mathrm {EDB}\) denotes the encryption of the database \(\mathrm {DB}\). Once \(\mathcal {A}\) receives \(\mathrm {EDB}\) from the challenger, it makes a polynomial number of queries to the protocols AddKeyword, DeleteFile, ClientKGen, and Search. The challenger returns the corresponding result for each query. Finally, \(\mathcal {A}\) outputs a bit b as the result of the game.

Ideal\(_{A, S}(1^k)\): \(\mathcal {A}\) chooses an initial database \(\mathrm {DB}\). Given the leakage \(\mathcal {L}_{Setup}\), \(\mathcal {S}\) computes an encrypted database \(\mathrm {EDB}\) and sends it to \(\mathcal {A}\). Then \(\mathcal {A}\) makes a polynomial number of queries to the same protocols as above. For each query, \(\mathcal S\) is given the relevant leakage function from the tuple (\(\mathcal {L}_{AddKeyword}\), \(\mathcal {L}_{DeleteFile}\), \(\mathcal {L}_{ClientKGen}, \mathcal {L}_{Search}\)) and returns the corresponding simulated result to \(\mathcal {A}\). Finally, \(\mathcal {A}\) outputs a bit b as the result of the game.

We say that a multi-client DSSE scheme is IND-CKA2 secure with the leakage functions above if \(|Pr[\mathbf {Real}_{A}(k) = 1]-Pr[\mathbf {Ideal}_{A,S}(k) = 1]|\) is negligible in the security parameter k.

Definition 3

(Strong RSA Problem) [25]. Let p and q be two large k-bit primes, and set \(n=pq\). Choose \(g \in \mathbb {Z}_n^*\) at random. We say that an efficient algorithm \(\mathcal {A}\) solves the strong RSA problem if, on input the tuple (n, g), it outputs a pair (z, e) with \(e > 1\) such that \(z^e = g \mod n\). The strong RSA assumption states that no efficient algorithm solves this problem with non-negligible probability.
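To make the problem concrete, the following sketch checks a candidate solution (z, e) and shows that whoever knows the factorization of n can compute e-th roots directly. This is the trapdoor the data owner relies on in the construction. The primes, the base g, and the function names are toy assumptions for illustration and offer no security; `pow(e, -1, phi)` requires Python 3.8 or later.

```python
from math import gcd

# Toy, insecure parameters (assumed for illustration only).
p, q = 1019, 1031
n = p * q
phi = (p - 1) * (q - 1)
g = 2


def root_with_trapdoor(g: int, e: int, n: int, phi: int) -> int:
    """Compute z with z^e = g (mod n), using knowledge of phi(n)."""
    if e <= 1 or gcd(e, phi) != 1:
        raise ValueError("e must exceed 1 and be invertible modulo phi(n)")
    d = pow(e, -1, phi)  # d = 1/e mod phi(n)
    return pow(g, d, n)  # z = g^(1/e) mod n


def is_strong_rsa_solution(n: int, g: int, z: int, e: int) -> bool:
    """Check the winning condition: e > 1 and z^e = g (mod n)."""
    return e > 1 and pow(z, e, n) == g % n


e = 7
z = root_with_trapdoor(g, e, n, phi)
```

Without p and q, no efficient way to find such a pair is known, which is what the assumption formalizes.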

3 Our MC-DSSE Construction

Let the data owner, the server, and the client be the participants in the DSSE scheme. Using the four database operations described in Sect. 2.1, we now present our multi-client dynamic searchable symmetric encryption scheme in detail. It consists of the following five phases.

Setup(\(1^{k}\), DB, NULL):

  • Data owner: Take a security parameter k and a database DB as inputs. Let F:\(\{0,1\}^{k} \times \{0,1\}^* \rightarrow \{0,1\}^{k}\) be a keyed pseudo-random function, and let H: \(\{0,1\}^* \rightarrow \{0,1\}^{2k+1}\) and G: \(\{0,1\}^* \rightarrow \{0,1\}^{3k+1}\) be two cryptographic hash functions. Choose two large primes p, q and an element \(g \in \mathbb {Z}_n^*\) for \(n=pq\), and pick \(k_1, k_2 \in \{0,1\}^k\) at random. Then output the master key MK \(= (p, q, k_1, k_2, g)\) and the public key PK\(=(n=pq, F, G, H)\). Finally, run Algorithm 1 to generate \(\mathrm {EDB}\), send it to the server, and keep \(\mathcal T_P\) secret.

  • Server: Store the encrypted database \(\mathrm {EDB}\).

figure a

ClientKGen(MK, w):

  • Client: A legitimate client who wishes to search over the keywords \(\mathbf w =(w_1, w_2, \dots , w_n)\) sends \(\mathbf w\) to the data owner to request a private key for these keywords.

  • Data owner: The data owner generates a corresponding private key as:

    $$\begin{aligned} sk_{\mathbf{w}}=(sk_{\mathbf{w},1},sk_{\mathbf{w},2},sk_{\mathbf{w},3}) \leftarrow (k_1, k_2, g^{1/\prod _{j=1}^{n}w_j} \mod n) \end{aligned}$$

    and then sends \(sk_{\mathbf{w}}\) back to the client together with \(\mathbf w\).
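The key generation step can be sketched as follows. The sketch assumes that each keyword has already been encoded as an integer coprime to \(\varphi (n)\), for instance a distinct odd prime; the text does not spell out this encoding, so it is an assumption here. The parameters are toy values and the function name `client_keygen` is hypothetical.

```python
from math import gcd, prod
from typing import Sequence, Tuple


def client_keygen(k1: bytes, k2: bytes, g: int, p: int, q: int,
                  keywords: Sequence[int]) -> Tuple[bytes, bytes, int]:
    """Return sk_w = (k1, k2, g^(1 / prod(keywords)) mod n)."""
    n, phi = p * q, (p - 1) * (q - 1)
    e = prod(keywords)
    if gcd(e, phi) != 1:
        # The root exists and is computable this way only if e is invertible.
        raise ValueError("keyword product must be invertible modulo phi(n)")
    return k1, k2, pow(g, pow(e, -1, phi), n)


# Toy, insecure parameters chosen only for illustration.
p, q, g = 1019, 1031, 2
keywords = [3, 7, 11]  # assumed integer encodings of w_1, w_2, w_3
sk = client_keygen(b"\x01" * 16, b"\x02" * 16, g, p, q, keywords)
```

Raising the third key component to the product of all permitted keywords recovers g, which confirms that it is the claimed root.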

AddKeyword((MK, \(\mathcal D_P, id, w)\), EDB):

  • Data owner: Take the master key MK \(= (k_1, k_2, p, q)\), the dictionary \(\mathcal D_P\), the encrypted database EDB \(=(\mathcal D_W, \mathcal D_F, DT_{id})\), and a chosen file-keyword pair (id, w) as inputs, then execute the AddKeyword algorithm to add the new ciphertext of the pair to EDB.

  • Server: Take \(EDB=(\mathcal D_W, \mathcal D_F, \mathcal D_{F,W})\) and \((L_w, D_w, L_{id}, D_{id}, L_{id,w}, D_{id,w})\) as inputs, and then run the standard database operations Update\((\mathcal D_W, (L_w, D_w))\) and Update\((\mathcal D_{F,W}, (L_{id,w}, D_{id,w}))\).

figure b

DeleteFile((MK, id), EDB):

  • Data owner: Take \(K = (k_1, k_2, p, q)\), \(\mathcal D_P\) and a file identifier id as inputs, generate and send a delete token

    $$\begin{aligned} DT_{id} = (F_{k_1}(g^{1/id}\mod n), F_{k_2}(g^{1/id}\mod n)) \end{aligned}$$

    to the server.

  • Server: Take the encrypted database \(EDB=(\mathcal D_W, \mathcal D_F)\) and the delete token \(DT_{id}\) as inputs, set \(L_{id}=F_{k_1}(g^{1/id}\mod n)\), and execute the following algorithm to delete the specified file.

figure c

Search\(((sk_{\mathbf{w}}), \mathrm {EDB})\):

  • Client: Whenever a client with search capability for the keywords \(\mathbf w =(w_1,w_2,\cdots ,w_n)\) wants to search for files containing the keyword \(w_i\), he takes his private key as input and computes the search token

    $$\begin{aligned} ST_{w_i}=\left(F_{sk_{\mathbf{w},1}}\left(sk_{\mathbf{w},3}^{\prod _{w \in \mathbf{w} \setminus \{w_i\}}w}\mod n\right),\ F_{sk_{\mathbf{w},2}}\left(sk_{\mathbf{w},3}^{\prod _{w \in \mathbf{w} \setminus \{w_i\}}w}\mod n\right)\right) \end{aligned}$$

    Because \(sk_{\mathbf{w},3} = g^{1/\prod _{j=1}^{n}w_j} \mod n\), this token equals \(ST_{w_i}=(F_{k_1}(g^{1/w_i}\mod n),F_{k_2}(g^{1/w_i}\mod n))\), which the client sends to the server.

  • Server: Take \(\mathrm {EDB}\) \(= (\mathcal D_W, \mathcal D_F)\) and the token \(ST_{w_i} = (F_{k_1}(g^{1/w_i}\mod n), F_{k_2}(g^{1/w_i}\mod n))\) as inputs, initialize an empty set \(\mathcal I\), a temporary index-data pair (\(L^t_w = NULL, D_w^t = NULL\)), and a temporary pointer \(P_w^t = NULL\), set \(L_w = F_{k_1}(g^{1/w_i}\mod n)\), and do the following steps:

figure d
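The client-side token derivation in the Search phase can be checked numerically. The sketch below is not the authors' implementation: HMAC-SHA256 stands in for the pseudo-random function F, keywords are assumed to be encoded as small primes coprime to \(\varphi (n)\), and the parameters are toy values. It verifies that the token a client derives from \(sk_{\mathbf{w},3}\) matches the token the data owner would compute directly from \(g^{1/w_i}\).

```python
import hashlib
import hmac
from math import prod
from typing import Sequence, Tuple


def prf(key: bytes, x: int) -> bytes:
    """Stand-in for F_k(x): HMAC-SHA256 over the big-endian bytes of x."""
    data = x.to_bytes((x.bit_length() + 7) // 8 or 1, "big")
    return hmac.new(key, data, hashlib.sha256).digest()


def search_token(sk: Tuple[bytes, bytes, int], keywords: Sequence[int],
                 w_i: int, n: int) -> Tuple[bytes, bytes]:
    """Derive ST_{w_i} from the client key, without knowing p, q, or g."""
    k1, k2, root = sk
    # Raising to the product of the *other* keywords leaves g^(1/w_i) mod n.
    exponent = prod(w for w in keywords if w != w_i)
    base = pow(root, exponent, n)
    return prf(k1, base), prf(k2, base)


# Toy, insecure parameters (assumed for illustration only).
p, q, g = 1019, 1031, 2
n, phi = p * q, (p - 1) * (q - 1)
k1, k2 = b"\x01" * 16, b"\x02" * 16
keywords = [3, 7, 11]

# Client key as issued by ClientKGen: (k1, k2, g^(1/(3*7*11)) mod n).
sk = (k1, k2, pow(g, pow(prod(keywords), -1, phi), n))
client_token = search_token(sk, keywords, 7, n)

# Owner-side reference value computed with the trapdoor phi(n).
owner_base = pow(g, pow(7, -1, phi), n)
owner_token = (prf(k1, owner_base), prf(k2, owner_base))
```

The two tokens coincide because \(\prod _{w \ne w_i} w / \prod _{j} w_j = 1/w_i\) in the exponent modulo \(\varphi (n)\).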

4 Security Analysis

In this section, we show that our proposed protocol is IND-CKA2 secure, first against an adaptive server and then against the client, following [24] except for some of the leakage functions. For the proof, we need a simulator \(\mathcal S\), as defined in Sect. 2, that answers the queries of \(\mathcal {A}\) and takes the leakage functions stated in Theorem 1 as input.

Theorem 1

Suppose the hash functions H and G and the keyed pseudo-random function \(F_{k_1}\) are modeled as three random oracles. Then our complete DSSE scheme is IND-CKA2 secure in the random oracle model with the leakage functions \(\mathcal {L}_{Setup}=|DB|\), \(\mathcal {L}_{AddKeyword}= New(id, w)\), \(\mathcal {L}_{DeleteFile} = (Old(id), New(id))\), and \(\mathcal {L}_{Search} = (DB(w), Old(w), New(w))\).

The proof of Theorem 1 relies on Lemmas 1, 2, 3, and 4 below, which follow [24] and simply construct a map \(f: x \rightarrow g^x\), where x can be w or id. We need an efficient simulator \(\mathcal S\) to play the game Ideal\(_{\mathcal A, \mathcal S}(k)\) with an adversary \(\mathcal A\). Our main argument is that, for each lemma below, the simulated output is computationally indistinguishable from the real one in the view of \(\mathcal A\), given the leakage functions and under the stated complexity assumptions. The leakage functions are as follows:

  1. \(\mathcal {L}_{Setup} = |DB|\): After the Setup algorithm is run, one can count the number of file-keyword pairs in DB from the size of EDB.

  2. \(\mathcal {L}_{AddKeyword} = New(id, w)\): When the AddKeyword algorithm is run, one obtains the newly generated ciphertexts New(id, w) by comparing the database with its former state.

  3. \(\mathcal {L}_{DeleteFile} = (Old(id), New(id))\): When a selected file id is deleted, one learns all deleted ciphertexts of the file whose identifier is id; these ciphertexts were generated by the Setup or AddKeyword protocol.

  4. \(\mathcal {L}_{Search} = (DB(w), Old(w), New(w))\): When searching for files that contain the keyword w, one learns all matched ciphertexts and their parent files in DB(w); the first part of these ciphertexts was generated by the Setup or AddKeyword protocol.

Lemma 1

Suppose that an adversary \(\mathcal A\) runs the Setup protocol to obtain the corresponding encrypted database from \(\mathcal S\), with leakage function \(\mathcal {L}_{Setup} = |DB|\). Then \(\mathcal {A}\) cannot distinguish the simulated \(\mathrm {EDB}\) from a real one.

Lemma 2

Suppose that H and G are random oracles. Then for any polynomial-time adversary \(\mathcal {A}\), there exists an algorithm \(\mathcal {S}_{AddKeyword}\) whose output \(\mathcal {A}\) cannot distinguish from the real one generated in game Real\(_{\mathcal {A}}(k)\).

Lemma 3

Suppose that H and \(F_{k_1}\) are random oracles. Then for any polynomial-time adversary \(\mathcal {A}\), there exists an algorithm \(\mathcal {S}_{Search}\) whose output \(\mathcal {A}\) cannot distinguish from the real one generated in game Real\(_{\mathcal {A}}(k)\).

Lemma 4

Suppose that G and \(F_{k_1}\) are random oracles. Then for any polynomial-time adversary \(\mathcal {A}\), there exists an algorithm \(\mathcal {S}_{DeleteFile}\) whose output \(\mathcal {A}\) cannot distinguish from the real one generated in game Real\(_{\mathcal {A}}(k)\).

Let \(C_i\), for \(i=1,2,\cdots ,4\), denote the events corresponding to the algorithms Setup, DeleteFile, AddKeyword, and Search, respectively. From the four lemmas above, the distinguishing advantage for each of them can be written as \(|Pr[\mathbf {Real}_{\mathcal {A}}^{C_i}(k) = 1]- Pr[\mathbf {Ideal}_{\mathcal {A},\mathcal {S}}^{C_i}(k) = 1]| \le \epsilon _i\), where \(1 \le i \le 4\) and each \(\epsilon _i\) is negligible.

In summary, the indistinguishability of the four protocols above implies that \(\mathcal A\) cannot distinguish game Ideal\(_{\mathcal {A},\mathcal {S}}(k)\) from game Real\(_{\mathcal {A}}(k)\). The probability \(|Pr[\mathbf {Real}_{\mathcal {A}}(k) = 1]- Pr[\mathbf {Ideal}_{\mathcal {A},\mathcal {S}}(k) = 1]|\) is bounded by the sum of the four individual advantages and is therefore also negligible:

$$\begin{aligned}&|Pr[\mathbf {Real}_{\mathcal {A}}(k) = 1]- Pr[\mathbf {Ideal}_{\mathcal {A},\mathcal {S}}(k) = 1]| \\&\le \sum \limits _{i = 1}^4 |Pr[\mathbf {Real}_{\mathcal {A}}^{C_i}(k)=1]-Pr[\mathbf {Ideal}_{\mathcal {A},\mathcal {S}}^{C_i}(k)=1]| \,\le \,\sum \limits _{i = 1}^4\epsilon _i \\ \end{aligned}$$

This completes the proof of Theorem 1.

Theorem 2

Our scheme \(\varPi \) is secure against malicious clients, i.e., search tokens in \(\varPi \) are unforgeable against adaptive attacks, assuming that the strong RSA assumption holds.

Assume that there exists an adversarial client \(\mathcal {A}\) who can generate a valid search token for some non-authorized keyword \(w'\), and who can therefore obtain the correct value \(g^{1/w'} \mod n\). Then we can use \(\mathcal {A}\) to construct an efficient algorithm \(\mathcal {B}\) that solves the strong RSA problem with non-negligible probability by means of the Euclidean algorithm. By the properties of the RSA function, no one can generate a valid search token for a non-authorized keyword \(w'\) without computing the correct value \(g^{1/w'} \mod n\).
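The proof sketch invokes the Euclidean algorithm without giving details. One standard way it enters strong RSA reductions, shown here as an assumption about the intended argument rather than as the authors' reduction, is root combination: given \(x = g^{1/a}\) and \(y = g^{1/b}\) with \(\gcd (a,b)=1\), the extended Euclidean algorithm yields s, t with \(sa + tb = 1\), and then \(y^{s}x^{t} = g^{1/(ab)} \mod n\). The sketch below checks this identity on toy parameters; the function names are hypothetical, and negative exponents in `pow` require Python 3.8 or later.

```python
from typing import Tuple


def ext_gcd(a: int, b: int) -> Tuple[int, int, int]:
    """Return (d, s, t) with s*a + t*b = d = gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    d, s, t = ext_gcd(b, a % b)
    return d, t, s - (a // b) * t


def combine_roots(x: int, a: int, y: int, b: int, n: int) -> int:
    """Given x = g^(1/a) and y = g^(1/b) mod n, return g^(1/(a*b)) mod n."""
    d, s, t = ext_gcd(a, b)
    if d != 1:
        raise ValueError("a and b must be coprime")
    # g^(1/(ab)) = g^((s*a + t*b)/(ab)) = (g^(1/b))^s * (g^(1/a))^t
    return (pow(y, s, n) * pow(x, t, n)) % n


# Toy, insecure parameters (assumed for illustration only).
p, q, g = 1019, 1031, 2
n, phi = p * q, (p - 1) * (q - 1)
a, b = 3, 7
x = pow(g, pow(a, -1, phi), n)  # g^(1/a), computed here with the trapdoor
y = pow(g, pow(b, -1, phi), n)  # g^(1/b)
r = combine_roots(x, a, y, b, n)
```

Note that `combine_roots` itself never uses p, q, or \(\varphi (n)\); the trapdoor appears only in preparing the test inputs.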

5 Comparison and Analysis

In this section, we briefly analyze the efficiency of our scheme by giving its communication and computation costs. For ease of comparison, we set the number of keywords to one. Let \(|G|\) and \(|\mathbb Z_p|\) denote the size of an element of the group \(\mathbb G\) and the size of the security parameter, respectively, and let exp denote the computation cost of an exponentiation. Table 1 lists several classical searchable encryption schemes for comparison.

Table 1. The communication and computation costs of some classical retrieval schemes

From the table, we can see that our searchable encryption scheme balances functionality and communication cost. The size of \(\mathrm {EDB}\) remains O(|DB|), which is similar to the scheme proposed by Xu [24]. We also realize the multi-client function without greatly increasing the computation cost.

6 Conclusion

We construct an efficient and practical multi-client searchable symmetric encryption scheme with physical deletion, based on the RSA function in the random oracle model, and we prove its security using the strong RSA assumption and four lemmas. The scheme gives a general method for extending a single-reader searchable encryption scheme to multiple readers. We also present the detailed communication and computation costs of the proposed scheme and, by comparing it with several classical searchable encryption schemes in each phase, point out that it is more efficient than they are.