
Monday, 6 August 2018

A Privacy Leakage Upper Bound Constraint-Based Approach For Cost-Effective Privacy Preserving Of Intermediate Data Sets In Cloud

Cloud computing is an evolving paradigm with tremendous momentum, but its unique aspects exacerbate security and privacy challenges. Cloud computing provides massive computation power and storage capacity, enabling users to deploy computation- and data-intensive applications without infrastructure investment. While processing such applications, a large volume of intermediate data sets is generated, and these are often stored to save the cost of re-computing them. However, preserving the privacy of intermediate data sets becomes a challenging problem because adversaries may recover privacy-sensitive information by analyzing multiple intermediate data sets. Encrypting ALL data sets in the cloud is widely adopted in existing approaches to address this challenge. But we argue that encrypting all intermediate data sets is neither efficient nor cost-effective, because it is very time-consuming and costly for data-intensive applications to encrypt and decrypt data sets frequently while performing any operation on them. In this paper, we propose a novel upper-bound privacy leakage constraint-based approach to identify which intermediate data sets need to be encrypted and which do not, so that privacy-preserving cost can be saved while the privacy requirements of data holders can still be satisfied. Evaluation results demonstrate that the privacy-preserving cost of intermediate data sets can be significantly reduced with our approach over existing ones where all data sets are encrypted.

Index Terms -- Cloud Computing, Data Sets, Privacy Preserving, Data Privacy Management, Privacy Upper Bound

Cloud computing is the use of computing resources (hardware and software) that are delivered as a service over a network (typically the Internet). The name comes from the common use of a cloud-shaped symbol as an abstraction for the complex infrastructure it contains in system diagrams.

Figure 1: Structure of cloud computing

Cloud computing entrusts remote services with a user's data, software, and computation. It consists of hardware and software resources made available on the Internet as managed third-party services. These services typically provide access to advanced software applications and high-end networks of server computers. The goal of cloud computing is to apply traditional supercomputing, or high-performance computing power, normally used by military and research facilities to perform tens of trillions of computations per second, to consumer-oriented applications such as financial portfolios, delivering personalized information, providing data storage, or powering large, immersive computer games. Cloud computing uses networks of large groups of servers, typically running low-cost consumer PC technology, with specialized connections to spread data-processing chores across them. This shared IT infrastructure contains large pools of systems that are linked together. Often, virtualization techniques are used to maximize the power of cloud computing.

Cloud users can selectively store valuable intermediate data sets when processing original data sets in a data-intensive application, in order to curtail overall expenses by avoiding frequent re-computation of these data sets. Data users often reanalyze results, conduct new analyses, or share some intermediate results with others for collaboration. The proposed approach identifies which intermediate data sets need to be encrypted and which do not, so that privacy-preserving cost can be saved. The technical approaches for preserving the privacy of data sets stored in the cloud mainly include encryption and anonymization. On one hand, encrypting all data sets, an effective approach, is widely adopted in current research.

However, processing encrypted data sets efficiently is a challenging task, because most applications run on unencrypted data sets. Although homomorphic encryption theoretically allows performing computation on encrypted data sets, applying its algorithms is rather expensive due to their inefficiency. On the other hand, partial information about data sets, e.g., aggregate information, must be exposed to data users in most cloud applications such as data mining and analytics. In such cases, data sets are anonymized rather than encrypted to ensure both data utility and privacy preservation. Current privacy-preserving techniques such as generalization can withstand most privacy attacks on a single data set, but preserving privacy across multiple data sets is still a challenging problem. Thus, for preserving the privacy of multiple data sets, it is promising to anonymize all data sets first and then encrypt them before storing or sharing them in the cloud. Usually, the volume of intermediate data sets is huge; hence, encrypting all intermediate data sets leads to high overhead and low efficiency when they are frequently accessed or processed. To address this issue, the system proposes to encrypt only a part of the intermediate data sets, rather than all of them, to reduce the privacy-preserving cost.
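The idea of encrypting only a subset of intermediate data sets can be sketched as a selection problem: leave as many data sets unencrypted as possible, provided the total leakage of the unencrypted ones stays under an upper bound. The greedy rule, the class fields, and all names below are illustrative assumptions, not the paper's actual algorithm:

```python
# Hypothetical sketch: pick which intermediate data sets to encrypt so that the
# total privacy leakage of the data sets left unencrypted stays under a given
# upper bound, preferring to encrypt sets with high leakage per unit cost.
# The leakage/cost model and the greedy rule are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    leakage: float   # privacy leakage if this data set stays unencrypted
    enc_cost: float  # cost of encrypting/decrypting this data set


def select_for_encryption(data_sets, leakage_threshold):
    """Greedily encrypt the highest leakage-per-cost data sets until the
    residual leakage of the plain (unencrypted) sets meets the threshold."""
    plain = sorted(data_sets, key=lambda d: d.leakage / d.enc_cost, reverse=True)
    encrypted = []
    residual = sum(d.leakage for d in plain)
    while residual > leakage_threshold and plain:
        d = plain.pop(0)          # most leakage-dense remaining data set
        encrypted.append(d)
        residual -= d.leakage
    return encrypted, plain


sets = [DataSet("D1", 0.5, 10), DataSet("D2", 0.3, 2), DataSet("D3", 0.1, 5)]
enc, plain = select_for_encryption(sets, leakage_threshold=0.4)
print([d.name for d in enc])    # data sets chosen for encryption
print([d.name for d in plain])  # data sets that may remain unencrypted
```

In this toy run, D2 and D1 are encrypted and D3 stays plain, since 0.1 of residual leakage falls under the 0.4 bound; the full problem in the paper is a constrained optimization, of which a greedy pass like this is only a rough stand-in.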


This paper has proposed an approach that identifies which part of the intermediate data sets needs to be encrypted while the rest does not, in order to save privacy-preserving cost. A tree structure has been modeled from the generation relationships of intermediate data sets to analyze privacy propagation among data sets. The problem of saving privacy-preserving cost has been modeled as a constrained optimization problem, which is addressed by decomposing the privacy leakage constraints. A practical heuristic algorithm has been designed accordingly. Evaluation results on real-world data sets and larger extensive data sets have demonstrated that the cost of preserving privacy in the cloud can be reduced significantly with this approach over existing ones where all data sets are encrypted.
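The generation relationships described above form a tree: each intermediate data set is derived from its parent, and the overall leakage upper bound can be decomposed into per-subtree budgets. The sketch below illustrates that decomposition with a proportional-split rule; the rule, the node fields, and the budget semantics are all assumptions for illustration, not the paper's actual heuristic:

```python
# Hypothetical sketch of the generation-relationship tree: each node is an
# intermediate data set derived from its parent. The overall leakage budget
# (the privacy upper bound) is decomposed down the tree, here proportionally
# to each child's subtree leakage. The decomposition rule is illustrative.
class Node:
    def __init__(self, name, leakage, children=None):
        self.name = name
        self.leakage = leakage        # leakage if this data set stays plain
        self.children = children or []

    def subtree_leakage(self):
        """Total leakage of this data set and everything derived from it."""
        return self.leakage + sum(c.subtree_leakage() for c in self.children)


def decompose_budget(node, budget, plan=None):
    """Return a dict: data set name -> True if it may stay unencrypted
    within its share of the budget, False if it should be encrypted."""
    if plan is None:
        plan = {}
    plan[node.name] = node.leakage <= budget
    # Only unencrypted nodes consume budget; encrypted ones leak nothing.
    remaining = budget - node.leakage if plan[node.name] else budget
    total = sum(c.subtree_leakage() for c in node.children)
    for c in node.children:
        share = remaining * c.subtree_leakage() / total if total else 0.0
        decompose_budget(c, share, plan)
    return plan


root = Node("D0", 0.2, [Node("D1", 0.3), Node("D2", 0.1)])
plan = decompose_budget(root, budget=0.5)
print(plan)  # per-data-set decision: True = may stay plain, False = encrypt
```

With this toy tree and budget, the root stays plain while both derived data sets are marked for encryption, since neither child's leakage fits its proportional share of the remaining budget. A heuristic like the paper's would tune how the constraint is split to keep more low-leakage sets unencrypted.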