Sensors (Basel, Switzerland). 2019 Mar 1;19(5):1059. doi: 10.3390/s19051059

Searchable Encryption Scheme for Personalized Privacy in IoT-Based Big Data

Shuai Li 1, Miao Li 1, Haitao Xu 1,*, Xianwei Zhou 1
PMCID: PMC6427167  PMID: 30832294

Abstract

The Internet of things (IoT) has become a significant part of our daily life. Composed of millions of intelligent devices, IoT can interconnect people with the physical world. With the development of IoT technology, the amount of data generated by sensors or devices is increasing dramatically. IoT-based big data has become a very active research area. One of the key issues in IoT-based big data is ensuring the utility of data while preserving privacy. In this paper, we deal with the protection of big data privacy in the data storage phase and propose a searchable encryption scheme satisfying personalized privacy needs. Our proposed scheme works for all file types including text, audio, image, video, etc., and meets different privacy needs of different individuals at the expense of high storage cost. We also show that our proposed scheme satisfies index indistinguishability and trapdoor indistinguishability.

Keywords: Internet of Things, big data, searchable encryption, personalized privacy needs, index indistinguishability, trapdoor indistinguishability

1. Introduction

The Internet of Things (IoT) has become a significant part of our daily life over the past few years. A huge number of sensors and intelligent devices have been integrated to interconnect people with the physical world, which also generates massive sensing data. Data generated by IoT devices are collected, disseminated, and exchanged among different people, businesses, and societies. With the development of IoT, the amount of data generated by organizations or individuals is increasing dramatically [1].

Although the massive data generated in the IoT environment is of significant value, exploring and using this value increases the risk of privacy breaches [2]. The collection, storage, and reuse of our personal data for profit pose a serious threat to our privacy. Consequently, researchers are faced with the challenge of ensuring the utility of data while preserving privacy. Various techniques have been developed to protect data privacy. Generally, these techniques can be grouped by the stages of the big data life cycle, as follows [3].

  • Data generation: In the data generation phase, access restriction and data falsification techniques are used.

  • Data storage: The approaches in the data storage phase are mainly based on encryption techniques.

  • Data processing: Anonymization techniques as well as clustering, classification, and association rule mining-based techniques are used in the data processing phase.

In this paper, we focus on the protection of big data privacy in the data storage phase of the big data life cycle. In the IoT environment, the sensing data generated by various sensors and devices are collected and uploaded to cloud servers, which provide massive storage and cloud computing services. As noted above, privacy protection in the data storage phase relies mainly on encryption techniques. When a large amount of encrypted data is stored in cloud servers, the first consideration is the confidentiality of the data, which can be ensured by secure and efficient encryption schemes. However, when the data user wants to retrieve the data containing a specific keyword, the cloud server cannot respond to the retrieval request, because it cannot decrypt the encrypted data. This problem can be solved by searchable encryption schemes [4,5], such as searchable symmetric encryption [6] and public key encryption with keyword search [7]. A searchable encryption scheme mainly includes three entities: the data owner, the data user, and the cloud server. The data owner outsources the encrypted data to the cloud server. The data user queries the cloud server for the encrypted data containing a specific keyword. The cloud server stores and retrieves the encrypted data.

In existing searchable encryption schemes, the data user can access all the data owned by the data owner, which can result in a privacy breach for the data owner. On the one hand, the data owner may be willing to share the data with some specific data users, but not with other data users. On the other hand, the data owner may be willing to share specific data with the data user, but not willing to share other data. Furthermore, additional information in the data owned by the data owner can also result in a privacy breach for the data owner. Privacy is subjective, and different people have different privacy needs. For example, the hidden text in a typical Word file includes a lot of sensitive personal information [8]. However, this additional information, which may disclose the privacy of the data owner, is useless for some data users. In data mining, data preprocessing is used to transform raw data into an understandable format [9]. In natural language processing, text feature extraction is used to transform a list of words into a feature set that is usable by a classifier [10]. In speech recognition and image recognition, feature extraction is a key step [11,12]. This means that the additional information may be discarded by the data user in the feature extraction phase. In summary, letting the data user access all the data owned by the data owner can result in a privacy breach for the data owner without improving the utility of the data.

In this paper, we will propose a searchable encryption scheme for personalized privacy protection in IoT-based big data. The main contributions of our proposed scheme are as follows:

  • In our proposed scheme, the data owner generates the file features at different levels, and uploads the encrypted file features to the cloud server.

  • The proposed scheme makes a trade-off between ensuring the utility of the data and preserving the privacy, and meets the different privacy needs of different individuals.

The rest of this paper is organized as follows. Section 2 reviews recent searchable encryption schemes. Section 3 presents the necessary notations and definitions. Section 4 formalizes the searchable encryption scheme for meeting personalized privacy needs in big data and presents the main security definitions. Section 5 describes the detailed construction of our proposed scheme. Section 6 discusses the security of our proposed scheme. Section 7 presents experimental results and compares our proposed scheme with existing schemes. The last section concludes the paper.

2. Related Work

Several different searchable encryption schemes have been proposed to allow the data user to retrieve the encrypted data [4,5]. In this section, we briefly review existing work on searchable encryption schemes.

In 2000, Song et al. [6] first proposed a searchable encryption scheme based on the symmetric encryption algorithm, which is called searchable symmetric encryption (SSE). However, their scheme has the following limitations: it is not proven to be a secure searchable encryption scheme; the distribution of the underlying plaintexts is vulnerable to statistical attacks; and the search time is linear to the length of the document collection. To overcome these limitations, Goh et al. [13] and Chang and Mitzenmacher [14] deployed a masked index table for SSE and introduced the notion of security for indexes. Curtmola et al. [15] generalized the security definitions of SSE and proposed two SSE schemes which are secure under the new security definitions. The search time of their schemes is linear to the number of documents. Subsequently, several SSE schemes were proposed for improvement. For example, Cash et al. [16] proposed an SSE scheme that supports conjunctive search and general Boolean queries on outsourced symmetrically encrypted data; Salam et al. [17] proposed a privacy-preserving data storage and retrieval system in cloud computing; Li et al. [18] proposed three different SSE schemes that can guard against a coercer by using the deniable encryption idea; Soleimanian et al. [19] proposed an SSE scheme to be publicly verifiable.

Although SSE schemes have high efficiency, they suffer from complicated secret key distribution. To resolve this problem, Boneh et al. [7] introduced a searchable encryption scheme based on public key cryptography, namely public key encryption with keyword search (PEKS). Waters et al. [20] showed that PEKS schemes based on bilinear maps could be applied to build encrypted and searchable auditing logs. However, the bilinear pairing operation is computationally expensive. Di et al. [21] introduced a PEKS scheme without bilinear pairing. The original PEKS scheme in [7] requires a secure channel to transmit the trapdoors. To overcome this limitation, Baek et al. [22] proposed a new PEKS scheme that does not require a secure channel. Byun et al. [23] introduced the off-line keyword-guessing attack (KGA) and pointed out that the original PEKS scheme in [7] was susceptible to KGA. Rhee et al. [24] proposed the notion of trapdoor indistinguishability and showed that trapdoor indistinguishability is a sufficient condition for preventing outside KGAs. Jeong et al. [25] showed that constructing PEKS schemes secure against inside KGA is impossible under the original PEKS framework in [7]. Xu et al. [26] proposed a PEKS scheme that resists inside KGA. More recently, various improved PEKS schemes have been proposed. For example, Liang et al. [27] proposed a searchable attribute-based proxy re-encryption system to achieve privacy-preserving keyword search and encrypted data sharing as well as keyword update; Chen et al. [28] proposed a dual-server PEKS scheme that resists inside KGA launched by a malicious server; Yang et al. [29] proposed a semantic keyword searchable proxy re-encryption scheme for secure cloud storage using lattice-based cryptographic primitives; Wu et al. [30] designed an efficient and secure searchable encryption protocol using the trapdoor permutation function for cloud-based IoT; Yin et al. [31] proposed a ciphertext-policy attribute-based searchable encryption scheme to achieve keyword-based search and fine-grained access control over encrypted data.

Table 1 shows a simple comparison of some existing searchable encryption schemes. In the design of searchable encryption scheme, privacy is a key concern. However, in all the existing searchable encryption schemes, the data user can access all the data owned by the data owner, which can result in a privacy breach for the data owner.

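Table 1. A comparison of some existing searchable encryption schemes.

| Type | Limitation | Characteristic | Literature |
|---|---|---|---|
| SSE | need key distribution | masked index | [13,14] |
| SSE | need key distribution | boolean queries | [16] |
| SSE | need key distribution | against the coercer | [18] |
| SSE | need key distribution | publicly verifiable | [19] |
| PEKS | lower search efficiency | without bilinear pairing | [21] |
| PEKS | lower search efficiency | without secure channel | [22] |
| PEKS | lower search efficiency | keyword update | [27] |
| PEKS | lower search efficiency | against inside KGA | [28] |
| PEKS | lower search efficiency | synonym keyword search | [29] |
| PEKS | lower search efficiency | fine-grained access control | [31] |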

3. Preliminaries

A summary of the notations used in this paper is presented in Table 2.

Table 2. Summary of notations.

| Notation | Description |
|---|---|
| $\lambda$ | The security parameter |
| $G$ | A cyclic group of order $q$ |
| $g$ | A generator of $G$ |
| $\mathrm{negl}(\lambda)$ | A negligible function with respect to $\lambda$ |
| $(pk_o, sk_o)$ | The public/private key pair of the data owner |
| $(pk_u, sk_u)$ | The public/private key pair of the data user |
| $n$ | The number of files of the data owner |
| $F_i$ | The $i$-th file of the data owner ($1 \le i \le n$) |
| $n_f+1$ | The number of file feature levels |
| $l$ | A file feature level ($0 \le l \le n_f$) |
| $L_i$ | The set of authorized file feature levels of $F_i$ |
| $F_i^l$ | The file feature of $F_i$ at level $l$ |
| $F$ | The file features set $\{F_i^l : 1 \le i \le n,\ 0 \le l \le n_f\}$ |
| $\overline{F}$ | The encrypted file features set |
| $W_{l_0}$ | The keyword set of the file features set $\{F_i^{l_0} : 1 \le i \le n\}$ |
| $w$ | A keyword in $W_{l_0}$ |
| $Ind$ | The index set |
| $\overline{Ind}$ | The encrypted index set |
| $T_{w,l}$ | The trapdoor with respect to $w$ and $l$ |

The set of all binary strings of length $n$ is denoted as $\{0,1\}^n$, and the set of all finite binary strings is denoted as $\{0,1\}^*$.

An index table (or dictionary) denotes the data structure of the form I[key]=value. Given a key, the value matching the key is returned.

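A function $\mu:\mathbb{N}\to\mathbb{R}$ is negligible if for every positive polynomial $p(\cdot)$ and all sufficiently large $\lambda$, $\mu(\lambda)<\frac{1}{p(\lambda)}$. We similarly write $f(\lambda)=\mathrm{negl}(\lambda)$ to mean that there exists a negligible function $\mu(\cdot)$ such that $f(\lambda)\le\mu(\lambda)$ for all sufficiently large $\lambda$.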

The following basic cryptographic primitives can be found in [32].

A symmetric encryption scheme is a tuple $E=(\mathrm{Gen},\mathrm{Enc},\mathrm{Dec})$ of probabilistic, polynomial-time (PPT) algorithms, where Gen takes the security parameter $\lambda$ as input, and outputs a secret key $k$; Enc takes a key $k$ and a message $m\in\{0,1\}^*$ as input, and outputs a ciphertext $c=\mathrm{Enc}(k,m)$; Dec takes a key $k$ and a ciphertext $c$ as input, and outputs $m$ if $c=\mathrm{Enc}(k,m)$.

For any symmetric encryption scheme $E=(\mathrm{Gen},\mathrm{Enc},\mathrm{Dec})$, any adversary A and any value $\lambda$ for the security parameter, the chosen-plaintext attack (CPA) indistinguishability experiment $\mathrm{SE}^{\mathrm{cpa}}_{A,E}(\lambda)$ is defined as:

  1. A random key $k$ is generated by running $\mathrm{Gen}(\lambda)$.

  2. The adversary A is given input $\lambda$ and oracle access to $\mathrm{Enc}(k,\cdot)$, and outputs a pair of messages $m_0$, $m_1$ of the same length.

  3. A random bit $b\in\{0,1\}$ is chosen, and then a ciphertext $c=\mathrm{Enc}(k,m_b)$ is computed and given to A. $c$ is called the challenge ciphertext.

  4. The adversary A continues to have oracle access to $\mathrm{Enc}(k,\cdot)$, and outputs a bit $b'$.

  5. The output of the experiment is defined to be 1 if $b'=b$, and 0 otherwise. In the case $\mathrm{SE}^{\mathrm{cpa}}_{A,E}(\lambda)=1$, we say that A succeeded.

Definition 1.

A symmetric encryption scheme $E=(\mathrm{Gen},\mathrm{Enc},\mathrm{Dec})$ is CPA-secure if for all PPT adversaries A there exists a negligible function negl such that

$$\Pr[\mathrm{SE}^{\mathrm{cpa}}_{A,E}(\lambda)=1]\le\frac{1}{2}+\mathrm{negl}(\lambda),$$

where the probability is taken over the random coins used by A, as well as the random coins used in the CPA indistinguishability experiment.
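The five steps of the CPA indistinguishability experiment can be sketched in Python as follows. This is a minimal illustration, not the scheme E used in the paper: the functions `gen`, `enc`, and `dec` are a toy randomized cipher built here from HMAC-SHA256 as a keystream (an assumption of this sketch), and the adversary is modeled by two hypothetical callbacks, `choose` and `guess`.

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 16  # bytes of per-message randomness (assumption of this sketch)


def gen(key_len: int = 32) -> bytes:
    """Gen: sample a uniformly random secret key."""
    return secrets.token_bytes(key_len)


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Expand (key, nonce) into `length` pseudorandom bytes via HMAC-SHA256."""
    out = b""
    counter = 0
    while len(out) < length:
        block = hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256)
        out += block.digest()
        counter += 1
    return out[:length]


def enc(key: bytes, message: bytes) -> bytes:
    """Enc: randomized encryption; a fresh nonce is prepended to the output."""
    nonce = secrets.token_bytes(NONCE_LEN)
    pad = _keystream(key, nonce, len(message))
    return nonce + bytes(m ^ p for m, p in zip(message, pad))


def dec(key: bytes, ciphertext: bytes) -> bytes:
    """Dec: recover the message from nonce || masked message."""
    nonce, body = ciphertext[:NONCE_LEN], ciphertext[NONCE_LEN:]
    pad = _keystream(key, nonce, len(body))
    return bytes(c ^ p for c, p in zip(body, pad))


def cpa_experiment(choose, guess) -> int:
    """Steps 1-5 of the experiment; returns 1 if the adversary guesses b."""
    key = gen()                                   # step 1
    oracle = lambda m: enc(key, m)                # oracle access to Enc(k, .)
    m0, m1 = choose(oracle)                       # step 2
    if len(m0) != len(m1):
        raise ValueError("challenge messages must have equal length")
    b = secrets.randbits(1)                       # step 3
    challenge = enc(key, m1 if b else m0)
    b_guess = guess(oracle, challenge)            # step 4
    return int(b_guess == b)                      # step 5


# A trivial adversary that ignores the challenge and guesses at random.
def choose(oracle):
    return b"reading=20.5C", b"reading=31.0C"


def guess(oracle, challenge):
    return secrets.randbits(1)


wins = sum(cpa_experiment(choose, guess) for _ in range(1000))
print(f"random-guess adversary won {wins} of 1000 runs")
```

A random-guess adversary should win about half of the runs, which is the baseline of 1/2 in Definition 1. Note that `enc` is randomized: an adversary facing a deterministic cipher could query the oracle on one challenge message and compare the result with the challenge ciphertext.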

For any adversary A and any value $\lambda$ for the security parameter, the computational Diffie-Hellman (CDH) experiment $\mathrm{CDH}_{A,\mathrm{Setup}}(\lambda)$ is defined as:

  1. Run $\mathrm{Setup}(\lambda)$ to obtain output $(G,q,g)$, where $G$ is a cyclic group of order $q$ (with bit length $\lambda$) and $g$ is a generator of $G$.

  2. Randomly choose $a,b\in\mathbb{Z}_q$.

  3. A is given $G$, $q$, $g$, $g^a$, $g^b$ and outputs $h\in G$.

  4. The output of the experiment is defined to be 1 if $h=g^{ab}$, and 0 otherwise.

Definition 2.

The CDH problem is hard relative to Setup if for all PPT adversaries A there exists a negligible function negl such that

$$\Pr[\mathrm{CDH}_{A,\mathrm{Setup}}(\lambda)=1]\le\mathrm{negl}(\lambda).$$
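The CDH experiment can be illustrated with a deliberately tiny group. The parameters below (p = 2039, q = 1019, g = 4) are an assumption of this sketch, chosen only so that the code runs instantly; in such a small group the problem is easy, which the hypothetical `brute_force_adversary` demonstrates. In a group of cryptographic size the same exhaustive search would be infeasible.

```python
import secrets

# Toy parameters (assumption): p = 2q + 1 with p, q prime, and g = 4
# generates the subgroup of order q in Z_p^*.
P, Q, G = 2039, 1019, 4


def cdh_experiment(adversary) -> int:
    """Return 1 if adversary(P, Q, G, g^a, g^b) outputs g^(ab), else 0."""
    a = secrets.randbelow(Q)            # step 2: a, b uniform in Z_q
    b = secrets.randbelow(Q)
    g_a, g_b = pow(G, a, P), pow(G, b, P)
    h = adversary(P, Q, G, g_a, g_b)    # step 3
    return int(h == pow(G, a * b, P))   # step 4


def brute_force_adversary(p, q, g, g_a, g_b):
    """Recover a by exhaustive search, then compute (g^b)^a.

    Feasible only because q is tiny in this sketch.
    """
    for x in range(q):
        if pow(g, x, p) == g_a:
            return pow(g_b, x, p)
    raise ValueError("g_a is not in the subgroup generated by g")


def random_adversary(p, q, g, g_a, g_b):
    """Output a uniformly random element of the subgroup."""
    return pow(g, secrets.randbelow(q), p)


print(cdh_experiment(brute_force_adversary))  # exhaustive search always wins here
```

The random adversary wins with probability 1/q, which is the kind of success probability Definition 2 requires to be negligible for every efficient adversary.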

4. System Model

The searchable encryption scheme for personalized privacy protection mainly includes three entities, i.e., the data owner, the data user, and the cloud server. The data owner outsources the encrypted file features to the cloud server. The data user queries the cloud server for the encrypted file features containing a specific keyword. The cloud server stores and retrieves the encrypted file features. As in existing searchable encryption schemes, the data owner is considered fully trusted. The data user is considered malicious, which means it may attempt to learn more information than it is authorized to retrieve. The cloud server is considered honest but curious in the sense that it correctly executes the searchable encryption protocol but may try to learn as much information as possible from the stored encrypted data.

Given $n$ files $F_i$, $1\le i\le n$, and a non-negative integer $l$, let $F_i^l$ denote the file feature of $F_i$ at level $l$. In particular, let $F_i^0=F_i$, i.e., the file feature of $F_i$ at level 0 is $F_i$ itself.

Let $n_f+1$ denote the number of file feature levels (FFLs). The data owner wishes to store the file features set $F=\{F_i^l : 1\le i\le n,\ 0\le l\le n_f\}$ on the cloud server. The objectives of the data owner are as follows:

  • For $1\le i\le n$, $0\le l\le n_f$, the file features $F_i^l$ are stored on the cloud server such that the confidentiality of $F_i^l$ is preserved.

  • The data user queries for a keyword $w$ and an FFL $l$ to retrieve all authorized file features $F_i^l$ such that $w\in F_i^{l_0}$ for a given $l_0$ in a secure and efficient way.

4.1. Formal Definition

The searchable encryption scheme for meeting the personalized privacy needs consists of the following algorithms:

  • $\mathrm{Setup}(\lambda)$: This algorithm is run by the data owner. It takes the security parameter $\lambda$ as input, and outputs the global parameter $\Lambda$.

  • $\mathrm{KeyGen}(\Lambda)$: This algorithm is run by the data owner and the data user, respectively. It takes the global parameter $\Lambda$ as input, and outputs public/private key pairs $(pk_o,sk_o)$ and $(pk_u,sk_u)$ for the data owner and the data user, respectively.

  • $\mathrm{Store}(F,pk_u,sk_o)$: This algorithm is run by the data owner. It takes the file features set $F$, the data user’s public key $pk_u$ and the data owner’s private key $sk_o$ as input, and outputs the encrypted file features set $\overline{F}$ and the encrypted index set $\overline{Ind}$.

  • $\mathrm{Trapdoor}(w,l,pk_o,sk_u)$: This algorithm is run by the data user. It takes a keyword $w$, an FFL $l$, the data owner’s public key $pk_o$, and the data user’s private key $sk_u$ as input, and outputs the trapdoor $T_{w,l}$.

  • $\mathrm{Search}(\overline{F},\overline{Ind},T_{w,l})$: This algorithm is performed interactively between the cloud server and the data user. It takes the encrypted file features set $\overline{F}$, the encrypted index set $\overline{Ind}$, and the trapdoor $T_{w,l}$ as input, and outputs all authorized file features $F_i^l$ such that $w\in F_i^{l_0}$ for a given $l_0$.

4.2. Security Definition

The searchable encryption scheme for meeting the personalized privacy needs must satisfy the index indistinguishability and the trapdoor indistinguishability under chosen keyword-FFL pair attack. Following [15], we define two challenge-response games $\mathrm{Game}_I$ and $\mathrm{Game}_T$ between the adversary A and the challenger C to capture the index indistinguishability and the trapdoor indistinguishability under chosen keyword-FFL pair attack, respectively.

The adversary A plays $\mathrm{Game}_I$ with the challenger C and attempts to distinguish an encrypted index of the given keyword-FFL pair from some encrypted indexes. If A wins $\mathrm{Game}_I$, then A has obtained some useful information from some encrypted indexes.

  $\mathrm{Game}_I$:

  • Setup: Challenger C runs $\mathrm{Setup}(\lambda)$ and $\mathrm{KeyGen}(\Lambda)$ to generate the global parameter $\Lambda$ and the public/private key pairs $(pk_o,sk_o)$ and $(pk_u,sk_u)$ of the data owner and the data user respectively, and sends $\Lambda$, $pk_o$ and $pk_u$ to A.

  • Adaptive query: The adversary A makes the following queries to C:

    • The adversary A adaptively selects the keyword-FFL pair $(w,l)$ for the encrypted index query. C responds with $\overline{Ind}[\overline{w}]$.

    • The adversary A adaptively selects the keyword-FFL pair $(w,l)$ for the trapdoor query. C responds with $T_{w,l}$.

  • Challenge: The adversary A sends two challenged keyword-FFL pairs $(w_0,l_0)$, $(w_1,l_1)$ to C. C picks a random bit $b\in\{0,1\}$ and sends the encrypted index $\overline{Ind}[\overline{w_b}]$ of the keyword-FFL pair $(w_b,l_b)$ to A.

  • Guess: The adversary A outputs $b'\in\{0,1\}$ and wins the game if $b'=b$.

Definition 3.

We say the searchable encryption scheme for meeting the personalized privacy needs satisfies the index indistinguishability under chosen keyword-FFL pair attack if for all PPT adversaries A there exists a negligible function negl such that

$$\Pr[A\ \text{wins}\ \mathrm{Game}_I]\le\frac{1}{2}+\mathrm{negl}(\lambda).$$

Adversary A plays $\mathrm{Game}_T$ with challenger C and attempts to distinguish a trapdoor of the given keyword-FFL pair from some trapdoors. If A wins $\mathrm{Game}_T$, then A has obtained some useful information from some trapdoors.

  $\mathrm{Game}_T$:

  • Setup: C runs $\mathrm{Setup}(\lambda)$ and $\mathrm{KeyGen}(\Lambda)$ to generate the global parameter $\Lambda$ and the public/private key pairs $(pk_o,sk_o)$ and $(pk_u,sk_u)$ of the data owner and the data user respectively, and sends $\Lambda$, $pk_o$ and $pk_u$ to A.

  • Adaptive query: A makes the following queries to C:

    • Adversary A adaptively selects the keyword-FFL pair $(w,l)$ for the encrypted index query. C responds with $\overline{Ind}[\overline{w}]$.

    • Adversary A adaptively selects the keyword-FFL pair $(w,l)$ for the trapdoor query. C responds with $T_{w,l}$.

  • Challenge: Adversary A sends two challenged keyword-FFL pairs $(w_0,l_0)$, $(w_1,l_1)$ to C. C picks a random bit $b\in\{0,1\}$ and sends the trapdoor $T_{w_b,l_b}$ of the keyword-FFL pair $(w_b,l_b)$ to A.

  • Guess: Adversary A outputs $b'\in\{0,1\}$ and wins the game if $b'=b$.

Definition 4.

We say the searchable encryption scheme for meeting the personalized privacy needs satisfies the trapdoor indistinguishability under chosen keyword-FFL pair attack if for all PPT adversaries A there exists a negligible function negl such that

$$\Pr[A\ \text{wins}\ \mathrm{Game}_T]\le\frac{1}{2}+\mathrm{negl}(\lambda).$$

5. Proposed Scheme

In this section, we present our proposed searchable encryption scheme for meeting the personalized privacy needs. It consists of the following algorithms.

Setup(λ) is run by the data owner. It takes the security parameter λ as input, and performs the following:

  1. Choose a cyclic group G of prime order q and a generator g of G.

  2. Choose a symmetric encryption scheme E=(Gen,Enc,Dec).

  3. Choose two collision-resistant hash functions $H_1:G\to\{0,1\}^\lambda$ and $H_2:\{0,1\}^*\to\{0,1\}^\lambda$.

  4. Set the global parameter $\Lambda=(G,q,g,E,H_1,H_2)$.

KeyGen(Λ) is run by the data owner and the data user, respectively. It takes the global parameter Λ as input, and performs the following:

  1. Randomly select two elements $k_o$ and $k_u$ in $\mathbb{Z}_q$ as the private keys of the data owner and the data user, respectively.

  2. Compute $g^{k_o}$ and $g^{k_u}$ in $G$ as the public keys of the data owner and the data user, respectively.
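The key pairs produced by KeyGen are what later allow the data owner and the data user to derive the same symmetric key from each other's public key. The sketch below illustrates this under stated assumptions: a toy group (p = 2039, q = 1019, g = 4) stands in for G, SHA-256 stands in for the hash function H1, and the names `keygen` and `h1` are hypothetical.

```python
import hashlib
import secrets

# Toy group (assumption): g = 4 generates the subgroup of prime order q in Z_p^*.
P, Q, G = 2039, 1019, 4


def keygen():
    """Return (pk, sk) with sk drawn from Z_q and pk = g^sk."""
    sk = secrets.randbelow(Q)
    return pow(G, sk, P), sk


def h1(element: int) -> bytes:
    """Hash a group element to a 256-bit string (stand-in for H1)."""
    width = (P.bit_length() + 7) // 8
    return hashlib.sha256(element.to_bytes(width, "big")).digest()


pk_o, sk_o = keygen()   # data owner: (g^{k_o}, k_o)
pk_u, sk_u = keygen()   # data user:  (g^{k_u}, k_u)

k1 = h1(pow(pk_u, sk_o, P))   # owner side: H1((g^{k_u})^{k_o})
k2 = h1(pow(pk_o, sk_u, P))   # user side:  H1((g^{k_o})^{k_u})
print(k1 == k2)
```

Both sides arrive at the hash of the same group element because exponentiation commutes, so no secret key needs to be transmitted between the two parties.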

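$\mathrm{Store}(F,pk_u,sk_o)$ is run by the data owner. It takes the file features set $F$, the data user’s public key $pk_u=g^{k_u}$ and the data owner’s private key $sk_o=k_o$ as input, and performs the following:

  1. Compute $k_1=H_1((g^{k_u})^{k_o})$.

  2. For $1\le i\le n$, $0\le l\le n_f$, randomly select $id_i^l\in\{0,1\}^\lambda$ as the identifier of $F_i^l$, run algorithm $\mathrm{Gen}(\lambda)$ to generate the encryption key $ek_i^l$ of $F_i^l$, and compute $\overline{id}_i^l=\mathrm{Enc}(k_1,id_i^l)$, $\overline{ek}_i^l=\mathrm{Enc}(k_1,ek_i^l)$, $\overline{F}_i^l=\mathrm{Enc}(ek_i^l,F_i^l)$.

  3. Create the index table $\overline{F}$ such that $\overline{F}[id_i^l]=\overline{F}_i^l$ for every $1\le i\le n$ and $0\le l\le n_f$.

  4. Given an FFL $l_0$, create the keyword set $W_{l_0}$ of the file features set $\{F_i^{l_0}:1\le i\le n\}$.

  5. For $w\in W_{l_0}$, compute $\overline{w}=\mathrm{Enc}(k_1,H_2(w))$.

  6. For $0\le l\le n_f$, compute $\overline{l}=\mathrm{Enc}(k_1,H_2(l))$.

  7. For $1\le i\le n$, construct the set $L_i$ of the authorized FFLs of the file $F_i$. In other words, $l\in L_i$ implies that the data user is authorized to access the file feature $F_i^l$.

  8. Create the index table $\overline{Ind}$ such that $\overline{Ind}[\overline{w}]=\{(\overline{id}_i^l,\overline{ek}_i^l,\overline{l}) : w\in F_i^{l_0},\ l\in L_i,\ 1\le i\le n\}$ for every $w\in W_{l_0}$.

  9. Send $\overline{F}$ and $\overline{Ind}$ to the cloud server.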

$\mathrm{Trapdoor}(w,l,pk_o,sk_u)$ is run by the data user. It takes a keyword $w$, an FFL $l$, the data owner’s public key $pk_o=g^{k_o}$ and the data user’s private key $sk_u=k_u$ as input, and performs the following:

  1. Compute $k_2=H_1((g^{k_o})^{k_u})$.

  2. Compute $T_{w,l}=(\mathrm{Enc}(k_2,H_2(w)),\mathrm{Enc}(k_2,H_2(l)))$.

$\mathrm{Search}(\overline{F},\overline{Ind},T_{w,l})$ is performed interactively between the cloud server and the data user. It takes the encrypted file features set $\overline{F}$, the encrypted index set $\overline{Ind}$ and the trapdoor $T_{w,l}$ as input, and performs the following:

  1. The cloud server: Given $T_{w,l}=(T_1,T_2)$, search $\overline{Ind}[T_1]$ to obtain the set $S=\{(s_1,s_2,s_3)\in\overline{Ind}[T_1] : s_3=T_2\}$ and send $S$ to the data user.

  2. The data user: Given $S$, create two index tables $S_1$ and $S_2$ such that $S_1[r_s]=\mathrm{Dec}(k_2,s_1)$, $S_2[r_s]=\mathrm{Dec}(k_2,s_2)$ for every $s=(s_1,s_2,s_3)\in S$, where $k_2=H_1((g^{k_o})^{k_u})$ and $r_s$ ($s\in S$) are randomly selected in $\{0,1\}^\lambda$. Send $S_1$ to the cloud server and store $S_2$.

  3. The cloud server: Given $S_1$, create the index table $R$ such that $R[r_s]=\overline{F}[S_1[r_s]]$ for every key $r_s$ in $S_1$ and send $R$ to the data user.

  4. The data user: Given $S_2$ and $R$, compute $\mathrm{Dec}(S_2[r_s],R[r_s])$ for every key $r_s$ in $S_2$.

Remark 1.

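Please note that $k_1=H_1((g^{k_u})^{k_o})=H_1((g^{k_o})^{k_u})=k_2$, so $T_1=\overline{w}$ and $T_2=\overline{l}$. Thus, $s_1=\overline{id}_i^l$, $s_2=\overline{ek}_i^l$, $S_1[r_s]=id_i^l$, $S_2[r_s]=ek_i^l$, and $R[r_s]=\overline{F}[S_1[r_s]]=\overline{F}_i^l$ for every $s=(s_1,s_2,s_3)\in S$, where $w\in F_i^{l_0}$, $l\in L_i$, $1\le i\le n$. Therefore, our proposed scheme is correct.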
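The Store, Trapdoor, and Search algorithms can be traced end to end in a short Python sketch. It is an illustration under several assumptions, not the authors' implementation: a toy group (p = 2039, q = 1019, g = 4) replaces G; SHA-256 plays the role of both hash functions; `enc`/`dec` are a toy HMAC-based randomized cipher standing in for E; keywords are obtained by whitespace splitting; and the hypothetical helper `tag` replaces Enc(k, H2(·)) for keywords and levels with a deterministic HMAC, because the server matches the trapdoor against the index by equality, which requires equal inputs to produce equal outputs.

```python
import hashlib
import hmac
import secrets

P, Q, G = 2039, 1019, 4  # toy group (assumption): g has prime order q in Z_p^*


def keygen():
    sk = secrets.randbelow(Q)
    return pow(G, sk, P), sk


def h1(element: int) -> bytes:
    return hashlib.sha256(element.to_bytes(2, "big")).digest()


def h2(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _pad(key: bytes, nonce: bytes, n: int) -> bytes:
    blocks = (hmac.new(key, nonce + i.to_bytes(4, "big"), hashlib.sha256).digest()
              for i in range(n // 32 + 1))
    return b"".join(blocks)[:n]


def enc(key: bytes, msg: bytes) -> bytes:
    nonce = secrets.token_bytes(16)
    return nonce + bytes(a ^ b for a, b in zip(msg, _pad(key, nonce, len(msg))))


def dec(key: bytes, ct: bytes) -> bytes:
    nonce, body = ct[:16], ct[16:]
    return bytes(a ^ b for a, b in zip(body, _pad(key, nonce, len(body))))


def tag(key: bytes, value: bytes) -> bytes:
    """Deterministic stand-in for Enc(k, H2(value)) so that lookups match."""
    return hmac.new(key, h2(value), hashlib.sha256).digest()


def store(files, authorized, l0, pk_u, sk_o):
    """files[i][l] is F_i^l (bytes); authorized[i] is L_i; files[i][l0] is text."""
    k1 = h1(pow(pk_u, sk_o, P))
    enc_files, index = {}, {}
    for i, levels in files.items():
        entries = {}
        for l, feature in enumerate(levels):
            ident, ek = secrets.token_bytes(32), secrets.token_bytes(32)
            enc_files[ident] = enc(ek, feature)              # table keyed by id
            entries[l] = (enc(k1, ident), enc(k1, ek), tag(k1, str(l).encode()))
        for w in set(levels[l0].split()):
            index.setdefault(tag(k1, w), []).extend(
                entries[l] for l in sorted(authorized[i]))
    return enc_files, index


def trapdoor(w, l, pk_o, sk_u):
    k2 = h1(pow(pk_o, sk_u, P))
    return tag(k2, w), tag(k2, str(l).encode())


def search(enc_files, index, trap, pk_o, sk_u):
    t1, t2 = trap
    s = [e for e in index.get(t1, []) if e[2] == t2]            # step 1, server
    k2 = h1(pow(pk_o, sk_u, P))                                 # step 2, user
    s1, s2 = {}, {}
    for enc_id, enc_ek, _ in s:
        r = secrets.token_bytes(16)
        s1[r], s2[r] = dec(k2, enc_id), dec(k2, enc_ek)
    results = {r: enc_files[ident] for r, ident in s1.items()}  # step 3, server
    return [dec(s2[r], results[r]) for r in s2]                 # step 4, user


pk_o, sk_o = keygen()
pk_u, sk_u = keygen()
files = {
    1: [b"kitchen temperature sensor log", b"temperature summary"],
    2: [b"garage humidity sensor log", b"humidity summary"],
}
authorized = {1: {1}, 2: {0, 1}}  # file 1: level 1 only; file 2: both levels
enc_files, index = store(files, authorized, 0, pk_u, sk_o)
hits = search(enc_files, index, trapdoor(b"sensor", 1, pk_o, sk_u), pk_o, sk_u)
print(sorted(hits))  # [b'humidity summary', b'temperature summary']
```

In this example the data user may see only the level-1 feature of file 1, so a query for `sensor` at level 0 returns the level-0 feature of file 2 alone, even though the keyword also occurs in file 1.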

Given an FFL $l_0$, creating the keyword set $W_{l_0}$ of the file features subset $\{F_i^{l_0}:1\le i\le n\}$ means that $F_i^{l_0}$, $1\le i\le n$, must be text. Thus, our proposed scheme works for all file types including text, audio, image, video, etc., as long as there exists an FFL $l_0$ such that the file feature of the file at $l_0$ is text.

If the authorized FFL set of the original file is created only by the data owner, then the data user cannot access the unauthorized file features; thus, our proposed scheme meets the different privacy needs of different individuals.

Our proposed scheme can be extended to the multi-user scenario. Let $n_o$ and $n_u$ be the number of data owners and data users, respectively. In the multi-user scenario, the public/private key pairs are first generated for every data owner and data user; the file features stored on the cloud server form an $n_o$-ary vector, where the $i$-th element is the encrypted file features set of the $i$-th data owner; the index stored on the cloud server is an $n_o\times n_u$ matrix, where the element in the $i$-th row and $j$-th column is the encrypted index set that the $i$-th data owner created for the $j$-th data user.
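The server-side layout described for the multi-user scenario can be pictured as a list of file tables plus a matrix of indexes. The type aliases and the function `empty_storage` below are hypothetical names used only for this sketch.

```python
from typing import Dict, List, Tuple

EncryptedFiles = Dict[bytes, bytes]             # identifier -> encrypted file feature
IndexEntry = Tuple[bytes, bytes, bytes]         # (encrypted id, encrypted key, level tag)
EncryptedIndex = Dict[bytes, List[IndexEntry]]  # keyword tag -> matching entries


def empty_storage(n_o: int, n_u: int):
    """One file table per data owner and one index per (owner, user) pair."""
    files: List[EncryptedFiles] = [{} for _ in range(n_o)]
    indexes: List[List[EncryptedIndex]] = [
        [{} for _ in range(n_u)] for _ in range(n_o)
    ]
    return files, indexes


files, indexes = empty_storage(n_o=3, n_u=2)
# indexes[i][j] holds the index that owner i created for user j.
print(len(files), len(indexes), len(indexes[0]))  # 3 3 2
```

Each owner's encrypted features are stored once, while the index is repeated per user because it is built from the key shared between that owner and that user.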

The storage space required by our proposed scheme grows as $n_f$ increases. In particular, our proposed scheme has similar storage space to the existing searchable encryption schemes when $n_f=0$.
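This growth can be made explicit by counting the stored items, assuming (as the definition of $F$ suggests) that every file has exactly one feature at every level:

```latex
|\overline{F}| \;=\; \bigl|\{F_i^l : 1 \le i \le n,\ 0 \le l \le n_f\}\bigr| \;=\; n\,(n_f + 1)
```

For $n_f=0$ this reduces to $n$ encrypted items, one per file, which matches the case of storing only the original files.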

6. Security Analysis

In this section, we show that our proposed scheme satisfies the index indistinguishability and the trapdoor indistinguishability under chosen keyword-FFL pair attack.

Theorem 1.

If $E=(\mathrm{Gen},\mathrm{Enc},\mathrm{Dec})$ is CPA-secure and the CDH problem is hard relative to Setup, then our proposed scheme satisfies the index indistinguishability under chosen keyword-FFL pair attack.

Proof. 

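If there exists a PPT adversary A that wins $\mathrm{Game}_I$, then there exists a simulator B such that $\mathrm{SE}^{\mathrm{cpa}}_{B,E}(\lambda)=1$ or $\mathrm{CDH}_{B,\mathrm{Setup}}(\lambda)=1$.

In the setup phase, C runs $\mathrm{Setup}(\lambda)$ and $\mathrm{KeyGen}(\Lambda)$ to generate the global parameter $\Lambda=(G,q,g,E,H_1,H_2)$ and the public/private key pairs $(pk_o=g^{k_o},sk_o=k_o)$ and $(pk_u=g^{k_u},sk_u=k_u)$ of the data owner and the data user, respectively. Then, C sends $\Lambda$, $pk_o=g^{k_o}$ and $pk_u=g^{k_u}$ to A.

In the adaptive query phase, assume A makes $n_q-1$ queries to C adaptively. The $q$-th query can be:

  • A adaptively selects the keyword-FFL pair $(w_q,l_q)$ for the encrypted index query. C responds with $\overline{Ind}[\overline{w_q}]=\{(\overline{id}_i^{l_q},\overline{ek}_i^{l_q},\overline{l_q}) : w_q\in D_i,\ l_q\in L_i,\ 1\le i\le n\}$, where $L_i$ is the authorized FFL set of $F_i$, $\overline{id}_i^{l_q}=\mathrm{Enc}(k_1,id_i^{l_q})$, $\overline{ek}_i^{l_q}=\mathrm{Enc}(k_1,ek_i^{l_q})$, $\overline{l_q}=\mathrm{Enc}(k_1,H_2(l_q))$, and $k_1=H_1((g^{k_u})^{k_o})$.

  • A adaptively selects the keyword-FFL pair $(w_q,l_q)$ for the trapdoor query. C responds with $T_{w_q,l_q}=(\mathrm{Enc}(k_2,H_2(w_q)),\mathrm{Enc}(k_2,H_2(l_q)))$, where $k_2=H_1((g^{k_o})^{k_u})$.

In the challenge phase, A sends two challenged keyword-FFL pairs $(w_0,l_0)$, $(w_1,l_1)$ to C. C picks a random bit $b\in\{0,1\}$ and sends the encrypted index $\overline{Ind}[\overline{w_b}]=\{(\overline{id}_i^{l_b},\overline{ek}_i^{l_b},\overline{l_b}) : w_b\in D_i,\ l_b\in L_i,\ 1\le i\le n\}$ of the keyword-FFL pair $(w_b,l_b)$ to A, where $\overline{id}_i^{l_b}=\mathrm{Enc}(k_1,id_i^{l_b})$, $\overline{ek}_i^{l_b}=\mathrm{Enc}(k_1,ek_i^{l_b})$, $\overline{l_b}=\mathrm{Enc}(k_1,H_2(l_b))$ and $k_1=H_1((g^{k_u})^{k_o})$.

In the guess phase, A outputs its guess $b'\in\{0,1\}$ indicating whether the challenge $\overline{Ind}[\overline{w_b}]$ is the encrypted index of $(w_0,l_0)$ or $(w_1,l_1)$.

From the perspective of A, $\overline{id}_i^{l_q}=\mathrm{Enc}(k_1,id_i^{l_q})$ and $\overline{ek}_i^{l_q}=\mathrm{Enc}(k_1,ek_i^{l_q})$ are random values in $\{0,1\}^\lambda$ for every $1\le i\le n$ and $2\le q\le n_q$. Please note that $k_1=H_1((g^{k_u})^{k_o})=H_1((g^{k_o})^{k_u})=k_2$. Then the information obtained by the adversary A in $\mathrm{Game}_I$ is the same as the information obtained by a simulator B in the CPA indistinguishability experiment $\mathrm{SE}^{\mathrm{cpa}}_{B,E}(\lambda)$ and in the CDH experiment $\mathrm{CDH}_{B,\mathrm{Setup}}(\lambda)$. Thus, if A wins $\mathrm{Game}_I$ then $\mathrm{SE}^{\mathrm{cpa}}_{B,E}(\lambda)=1$ or $\mathrm{CDH}_{B,\mathrm{Setup}}(\lambda)=1$, i.e.,

$$\Pr[A\ \text{wins}\ \mathrm{Game}_I]\le\Pr[\mathrm{SE}^{\mathrm{cpa}}_{B,E}(\lambda)=1]+\Pr[\mathrm{CDH}_{B,\mathrm{Setup}}(\lambda)=1]\le\frac{1}{2}+\mathrm{negl}(\lambda).$$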

Therefore, our proposed scheme satisfies the index indistinguishability under chosen keyword-FFL pair attack if E=(Gen,Enc,Dec) is CPA-Secure and the CDH problem is hard relative to Setup. □

Similarly, we can prove the following theorem:

Theorem 2.

If $E=(\mathrm{Gen},\mathrm{Enc},\mathrm{Dec})$ is CPA-secure and the CDH problem is hard relative to Setup, then our proposed scheme satisfies the trapdoor indistinguishability under chosen keyword-FFL pair attack.

7. Performance Analysis

Table 3 presents a comparison of the computation cost between our proposed scheme and some existing searchable encryption schemes. The notations used in Table 3 are as follows:

  1. $T_{bp}$: Time cost for a bilinear pairing.

  2. $T_h$: Time cost for a hash function.

  3. $T_{exp}$: Time cost for an exponentiation operation in $G$.

  4. $T_{mul}$: Time cost for a multiplication operation in $G$.

  5. $T_{enc}$: Time cost for an encryption process of $E$.

  6. $T_{dec}$: Time cost for a decryption process of $E$.

Table 3. Computation cost: a comprehensive comparison.

| Scheme | Storage phase | Trapdoor phase | Search phase |
|---|---|---|---|
| Boneh et al. [7] | $T_{bp}+2T_h+2T_{exp}$ | $T_h+T_{exp}$ | $T_{bp}+T_h$ |
| Rhee et al. [24] | $T_{bp}+2T_h+2T_{exp}$ | $2T_h+3T_{exp}$ | $T_{bp}+2T_h+2T_{exp}+T_{mul}$ |
| Xu et al. [26] | $2T_{bp}+4T_h+4T_{exp}$ | $2T_h+2T_{exp}$ | $2T_{bp}+2T_h$ |
| Chen et al. [28] | $T_h+4T_{exp}+2T_{mul}$ | $T_h+4T_{exp}+2T_{mul}$ | $7T_{exp}+3T_{mul}$ |
| Our scheme | $T_{exp}+3T_h+5T_{enc}$ | $T_{exp}+3T_h+2T_{enc}$ | $T_{exp}+T_h+2T_{dec}$ |
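The expressions in Table 3 can be evaluated mechanically once unit times for each primitive are available. The sketch below encodes the table as operation counts; the dictionary `COSTS` and the function `phase_cost` are hypothetical names, and no timings are assumed. Passing a unit time of 1 for every primitive simply counts operations and says nothing about running time, since a bilinear pairing is typically far more expensive than a hash or a symmetric encryption.

```python
# Operation counts per phase, transcribed from Table 3.
COSTS = {
    "Boneh et al. [7]": {
        "storage": {"bp": 1, "h": 2, "exp": 2},
        "trapdoor": {"h": 1, "exp": 1},
        "search": {"bp": 1, "h": 1},
    },
    "Rhee et al. [24]": {
        "storage": {"bp": 1, "h": 2, "exp": 2},
        "trapdoor": {"h": 2, "exp": 3},
        "search": {"bp": 1, "h": 2, "exp": 2, "mul": 1},
    },
    "Xu et al. [26]": {
        "storage": {"bp": 2, "h": 4, "exp": 4},
        "trapdoor": {"h": 2, "exp": 2},
        "search": {"bp": 2, "h": 2},
    },
    "Chen et al. [28]": {
        "storage": {"h": 1, "exp": 4, "mul": 2},
        "trapdoor": {"h": 1, "exp": 4, "mul": 2},
        "search": {"exp": 7, "mul": 3},
    },
    "Our scheme": {
        "storage": {"exp": 1, "h": 3, "enc": 5},
        "trapdoor": {"exp": 1, "h": 3, "enc": 2},
        "search": {"exp": 1, "h": 1, "dec": 2},
    },
}


def phase_cost(scheme: str, phase: str, unit_time: dict) -> float:
    """Sum count * unit_time over the primitives used in one phase."""
    return sum(n * unit_time[op] for op, n in COSTS[scheme][phase].items())


# With every unit time set to 1, the result is just the number of operations.
ones = dict.fromkeys(["bp", "h", "exp", "mul", "enc", "dec"], 1)
for phase in ("storage", "trapdoor", "search"):
    print(phase, phase_cost("Our scheme", phase, ones))
```

To compare schemes in time rather than in operation counts, `unit_time` would need to hold measured values for the chosen curve, hash function, and cipher on the target machine.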

To meet the basic security level for comparison, SHA-256 and AES-256 are selected as the collision-resistant hash function and the symmetric encryption scheme, respectively. The cyclic group $G$ of order $q$ is generated by a point on an elliptic curve $E(\mathbb{F}_p)$, where $q$ and $p$ are 256-bit and 521-bit prime numbers, respectively. To evaluate the efficiency of the five schemes, we perform our experiments on a computer with a 2.4 GHz Intel Core i7 processor and 8 GB of RAM.

As shown in Figure 1, Figure 2 and Figure 3, our proposed scheme is the most efficient in the storage phase and the search phase. In the trapdoor phase, our proposed scheme has a higher computational cost than that of Boneh et al. [7], although it is still lower than that of the other schemes. In summary, our proposed scheme is more efficient overall than the four schemes studied in [7,24,26,28].

Figure 1. Computation cost at storage phase.

Figure 2. Computation cost at trapdoor phase.

Figure 3. Computation cost at search phase.

8. Conclusions

In this paper, we have proposed a searchable encryption scheme for meeting personalized privacy needs. Our proposed scheme mainly includes three entities, i.e., the data owner, the data user, and the cloud server. The data owner outsources the encrypted file features to the cloud server. The data user queries the cloud server for the encrypted file features containing a specific keyword. The cloud server stores and retrieves the encrypted file features. Compared with the existing searchable encryption schemes, our proposed scheme works for all file types including text, audio, image, video, etc., and meets different privacy needs of different individuals at the expense of high storage cost. We also show that our proposed scheme satisfies index indistinguishability and trapdoor indistinguishability under chosen keyword-FFL pair attack. In other words, our proposed scheme is secure against inside KGA. Performance analysis shows that our proposed scheme is efficient in the storage phase, the trapdoor phase, and the search phase.

Considering the decreasing costs of storage, storage cost is not a problem if $n_f+1$, i.e., the number of FFLs, is small in our proposed scheme. However, storage cost is still a problem if $n_f$ is too large. Thus, choosing an appropriate $n_f$ is important future work.

Acknowledgments

The authors would like to thank the editor and the anonymous reviewers for their valuable comments and suggestions that improved the quality of this paper.

Abbreviations

The following abbreviations are used in this manuscript:

IoT Internet of Things
SSE searchable symmetric encryption
PEKS public key encryption with keyword search
KGA keyword-guessing attack
FFL file feature level

Author Contributions

Writing—original draft preparation, S.L.; writing—review and editing, M.L., H.X. and W.Z.

Funding

This work is supported by the National Natural Science Foundation of China under Grants No. U1603116 and No. 61701020.

Conflicts of Interest

The authors declare no conflicts of interest. The founding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  • 1. Lohr S. The age of big data. New York Times. Feb 11, 2012.
  • 2. John Walker S. Big Data: A Revolution That Will Transform How We Live, Work, and Think. Houghton Mifflin Harcourt; Boston, MA, USA: 2014.
  • 3. Mehmood A., Natgunanathan I., Xiang Y., Hua G., Guo S. Protection of big data privacy. IEEE Access. 2016;4:1821–1834. doi: 10.1109/ACCESS.2016.2558446.
  • 4. Bösch C., Hartel P., Jonker W., Peter A. A survey of provably secure searchable encryption. ACM Comput. Surv. 2015;47:18. doi: 10.1145/2636328.
  • 5. Poh G.S., Chin J.J., Yau W.C., Choo K.K.R., Mohamad M.S. Searchable symmetric encryption: Designs and challenges. ACM Comput. Surv. 2017;50:40. doi: 10.1145/3064005.
  • 6. Song D.X., Wagner D., Perrig A. Practical techniques for searches on encrypted data; Proceedings of the 2000 IEEE Symposium on Security and Privacy; Berkeley, CA, USA. 14–17 May 2000; pp. 44–55.
  • 7. Boneh D., Di Crescenzo G., Ostrovsky R., Persiano G. Public key encryption with keyword search; Proceedings of the International Conference on the Theory and Applications of Cryptographic Techniques; Interlaken, Switzerland. 2–6 May 2004; Berlin/Heidelberg, Germany: Springer; 2004. pp. 506–522.
  • 8. Byers S. Information leakage caused by hidden data in published documents. IEEE Secur. Privacy. 2004;2:23–27. doi: 10.1109/MSECP.2004.1281241.
  • 9. Hand D.J. Principles of data mining. Drug Safety. 2007;30:621–622. doi: 10.2165/00002018-200730070-00010.
  • 10. Feldman R., Sanger J. The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data. Cambridge University Press; Cambridge, UK: 2007.
  • 11. Hirsch H.G., Pearce D. The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions; Proceedings of the ASR2000-Automatic Speech Recognition: Challenges for the New Millenium ISCA Tutorial and Research Workshop (ITRW); Beijing, China. 16–20 October 2000.
  • 12. Hong Z.Q. Algebraic feature extraction of image for recognition. Pattern Recogn. 1991;24:211–219. doi: 10.1016/0031-3203(91)90063-B.
  • 13. Goh E.J. Secure indexes. IACR Cryptol. ePrint Arch. 2003;2003:216.
  • 14. Chang Y.C., Mitzenmacher M. Privacy preserving keyword searches on remote encrypted data; Proceedings of the International Conference on Applied Cryptography and Network Security; New York, NY, USA. 7–10 June 2005; pp. 442–455.
  • 15. Curtmola R., Garay J., Kamara S., Ostrovsky R. Searchable symmetric encryption: Improved definitions and efficient constructions. J. Comput. Secur. 2011;19:895–934. doi: 10.3233/JCS-2011-0426.
  • 16. Cash D., Jarecki S., Jutla C., Krawczyk H., Roşu M.C., Steiner M. Advances in Cryptology–CRYPTO 2013. Springer; Berlin/Heidelberg, Germany: 2013. Highly-scalable searchable symmetric encryption with support for boolean queries; pp. 353–373.
  • 17. Salam M.I., Yau W.C., Chin J.J., Heng S.H., Ling H.C., Phan R.C., Poh G.S., Tan S.Y., Yap W.S. Implementation of searchable symmetric encryption for privacy-preserving keyword search on cloud storage. Hum. Centr. Comput. Inf. Sci. 2015;5:19. doi: 10.1186/s13673-015-0039-9.
  • 18. Li H., Zhang F., Fan C.I. Deniable searchable symmetric encryption. Inf. Sci. 2017;402:233–243. doi: 10.1016/j.ins.2017.03.032.
  • 19. Soleimanian A., Khazaei S. Publicly verifiable searchable symmetric encryption based on efficient cryptographic components. Des. Codes Cryptogr. 2019;87:123–147. doi: 10.1007/s10623-018-0489-y.
  • 20. Waters B.R., Balfanz D., Durfee G., Smetters D.K. Building an Encrypted and Searchable Audit Log. Volume 4. NDSS; San Diego, CA, USA: 2004. pp. 5–6.
  • 21. Di Crescenzo G., Saraswat V. Public key encryption with searchable keywords based on Jacobi symbols; Proceedings of the International Conference on Cryptology in India; Chennai, India. 9–13 December 2007; pp. 282–296.
  • 22. Baek J., Safavi-Naini R., Susilo W. Public key encryption with keyword search revisited; Proceedings of the International conference on Computational Science and Its Applications; Perugia, Italy. 30 June–3 July 2008; pp. 1249–1259.
  • 23. Byun J.W., Rhee H.S., Park H.A., Lee D.H. Off-line keyword guessing attacks on recent keyword search schemes over encrypted data; Proceedings of the Workshop on Secure Data Management; Seoul, Korea. 10–11 September 2006; pp. 75–83.
  • 24. Rhee H.S., Park J.H., Susilo W., Lee D.H. Trapdoor security in a searchable public-key encryption scheme with a designated tester. J. Syst. Softw. 2010;83:763–771. doi: 10.1016/j.jss.2009.11.726.
  • 25. Jeong I.R., Kwon J.O., Hong D., Lee D.H. Constructing PEKS schemes secure against keyword guessing attacks is possible? Comput. Commun. 2009;32:394–396. doi: 10.1016/j.comcom.2008.11.018.
  • 26. Xu P., Jin H., Wu Q., Wang W. Public-key encryption with fuzzy keyword search: A provably secure scheme under keyword guessing attack. IEEE Trans. Comput. 2013;62:2266–2277. doi: 10.1109/TC.2012.215.
  • 27. Liang K., Susilo W. Searchable attribute-based mechanism with efficient data sharing for secure cloud storage. IEEE Trans. Inf. Forensics Secur. 2015;10:1981–1992. doi: 10.1109/TIFS.2015.2442215.
  • 28. Chen R., Mu Y., Yang G., Guo F., Wang X. Dual-server public-key encryption with keyword search for secure cloud storage. IEEE Trans. Inf. Forensics Secur. 2016;11:789–798. doi: 10.1109/TIFS.2016.2599293.
  • 29. Yang Y., Zheng X., Chang V., Tang C. Semantic keyword searchable proxy re-encryption for postquantum secure cloud storage. Concurr. Comput. Pract. Exp. 2017;29:e4211. doi: 10.1002/cpe.4211.
  • 30. Wu L., Chen B., Zeadally S., He D. An efficient and secure searchable public key encryption scheme with privacy protection for cloud storage. Soft Comput. 2018;22:7685–7696. doi: 10.1007/s00500-018-3224-8.
  • 31. Yin H., Zhang J., Xiong Y., Ou L., Li F., Liao S., Li K. CP-ABSE: A Ciphertext-Policy Attribute-Based Searchable Encryption Scheme. IEEE Access. 2019;7:5682–5694. doi: 10.1109/ACCESS.2018.2889754.
  • 32. Lindell Y., Katz J. Introduction to Modern Cryptography. Chapman and Hall/CRC; Boca Raton, FL, USA: 2014.

Articles from Sensors (Basel, Switzerland) are provided here courtesy of Multidisciplinary Digital Publishing Institute (MDPI)
