Advanced Approach in Sensitive Rule Hiding

In this paper, we establish the boundedness of strongly singular integral operators $T$ and commutators $T_b$ on generalized Morrey spaces, where $T_b$ is generated by a $BMO(\mathbb{R}^n)$ function $b$ and the strongly singular integral operator $T$.


Introduction
The concept of privacy preservation has recently been proposed in response to concerns about preserving personal or sensitive information derived from data mining algorithms. Successful applications of data mining have been demonstrated in marketing, business, medical analysis, product control, engineering design, bioinformatics and scientific exploration, among others. The current status of data mining research reveals that one of the main technical challenges is the development of techniques that incorporate security and privacy issues. The main reason is that the increasingly popular use of data mining tools has opened great opportunities in several application areas, which also require special attention regarding privacy protection. Two types of privacy have been considered in data mining. The first type, called output privacy, is that the data is minimally altered so that the mining result will preserve certain privacy (Evfimievski, 2002; Oliveira, Zaiane, 2003a; Oliveira, Zaiane, 2003b). The second type, input privacy, is that the data is manipulated so that the mining result is not affected or only minimally affected (Dasseni, Verykios, Elmagarmid, Bertino, 2001).
For example, through data mining, one is able to infer sensitive information, including personal information, or even patterns from non-sensitive information or unclassified data. As a motivating example of the privacy issue in data mining, discussed in (Yi-Hung Wu, Chia-Ming Chiang, and Arbee L.P. Chen, 2007), consider a supermarket and two bread suppliers A and B. If the transaction database of the supermarket is released, A (or B) can mine the association rules related to his/her bread and apply the rules to sales promotion and goods supply. As a result, a supplier is willing to offer the supermarket a lower price for goods in exchange for the database. From this aspect, it is good for the supermarket to release the database. However, the conclusion can be the opposite if a supplier uses the mining methods in a different way. For instance, if A finds the association rules related to B's bread, saying that most customers who buy cheese also buy B's bread, he/she can run a coupon that gives a 10 percent discount when buying A's bread together with cheese. Gradually, the sales of B's bread go down and B cannot give a low price to the supermarket as before. Finally, A monopolizes the bread market and is unwilling to give a low price to the supermarket as before. From this aspect, releasing the database is bad for the supermarket. Therefore, the supermarket needs an effective way to release the database with the sensitive rules hidden. This leads to the research on sensitive rule hiding.
In this work, the sensitive rules are given and the algorithm ISSRH is proposed to modify data in the database so that sensitive rules containing specified sensitive items on the right hand side of the rule cannot be inferred through association rule mining. The proposed algorithm is based on modifying or perturbing the database transactions so that the confidence of the association rules is reduced.
The rest of the paper is organized as follows. Section 2 reviews previous work. Section 3 presents the statement of the problem. Section 4 presents the framework of the approach. Section 5 presents the proposed algorithm for sensitive rule hiding. Section 6 gives an example of the proposed algorithm. Section 7 analyzes the characteristics of the algorithm. Concluding remarks and future work are described in Section 8.

Related Work
In output privacy, given specific rules or patterns to be hidden, many data altering techniques for hiding association, classification and clustering rules have been proposed. For association rule hiding, two basic approaches have been proposed. The first approach (Saygin, Verykios, Clifton, 2001; Verykios, Elmagarmid, Bertino, Saygin, Dasseni, 2004) hides one rule at a time. It first selects the transactions that contain the items in a given rule. It then modifies them transaction by transaction until the confidence or support of the rule falls below the minimum confidence or minimum support. The modification is done either by removing items from a transaction or by inserting new items into it. The second approach (Oliveira, Zaiane, 2002a; Oliveira, Zaiane, 2002b; Oliveira, Zaiane, 2003a; Oliveira, Zaiane, 2003b) deals with groups of restricted patterns or association rules at a time. It first selects the transactions that contain the intersecting patterns of a group of restricted patterns. Depending on the disclosure threshold given by users, it sanitizes a percentage of the selected transactions in order to hide the restricted patterns. However, both approaches require the hidden rules or patterns to be given in advance.
The work presented here differs from the related work in the following aspects. First, database indexing is performed. Second, correlations among the sensitive rules are considered. Third, unnecessary modification of transactions is avoided once the confidence of the sensitive rule has been reduced. Fourth, the transactions are altered within the cluster and the changes are written to the database only at the end, which reduces the database update time.

Problem Statement
The problem of mining association rules was introduced in (Agrawal, Imielinski, Swami, 1993). Let $I = \{i_1, i_2, \ldots, i_m\}$ be a set of literals, called items. Given a set of transactions $D$, where each transaction $T$ in $D$ is a set of items such that $T \subseteq I$, an association rule is an expression $X \Rightarrow Y$ where $X \subseteq I$, $Y \subseteq I$, and $X \cap Y = \emptyset$. $X$ and $Y$ are called respectively the body (left hand side) and head (right hand side) of the rule. An example of such a rule is that 90% of customers who buy hamburgers also buy Coke. The 90% here is called the confidence of the rule, which means that 90% of the transactions that contain $X$ (hamburgers) also contain $Y$ (Coke). The confidence is calculated as $|X \cup Y| / |X|$, where $|X|$ is the number of transactions containing $X$ and $|X \cup Y|$ is the number of transactions containing both $X$ and $Y$. Note that $\cup$ here is not a set union of transactions: $|X \cup Y|$ counts the transactions that contain both $X$ and $Y$. The support of the rule is the percentage of transactions that contain both $X$ and $Y$, which is calculated as $|X \cup Y| / N$, where $N$ is the number of transactions in $D$. In other words, the confidence of a rule measures the degree of correlation between item sets, while the support of a rule measures the significance of the item sets. A typical association rule mining algorithm first finds all the sets of items that appear frequently enough to be considered significant and then derives from them the association rules that are strong enough to be considered interesting. The problem of mining association rules is to find all rules whose support and confidence exceed the user-specified minimum support and minimum confidence.
The objective of data mining is to extract hidden or potentially unknown but interesting rules or patterns from databases. However, the objective of privacy preserving data mining is to hide certain sensitive information so that it cannot be discovered through data mining techniques (Agrawal, Imielinski, Swami, 1993; Evfimievski, Gehrke, Srikant, 2003).
In this work, an algorithm ISSRH (Increase Support Sensitive Rule Hiding) is proposed to hide the sensitive rules that contain sensitive items, so that sensitive rules containing specified sensitive items on the right hand side of the rule cannot be inferred through association rule mining. More specifically, given a transaction database $D$, a minimum support, a minimum confidence and a set of sensitive items $Y$, the objective is to minimally modify the database $D$ such that no sensitive rules containing sensitive items $Y$ on the right hand side of the rule will be discovered.
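As a minimal sketch of the two measures defined above, the following Python snippet computes the support and confidence of a rule over a small transaction set. The toy transactions and the function names `support_count`, `support` and `confidence` are illustrative assumptions and do not come from the paper.

```python
def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t)


def support(x, y, transactions):
    """Support of X => Y: fraction of transactions containing both X and Y."""
    return support_count(set(x) | set(y), transactions) / len(transactions)


def confidence(x, y, transactions):
    """Confidence of X => Y: |X U Y| / |X| (assumes X occurs at least once)."""
    both = support_count(set(x) | set(y), transactions)
    return both / support_count(x, transactions)


# Hypothetical toy database, used only for illustration.
transactions = [
    frozenset({"hamburger", "coke"}),
    frozenset({"hamburger", "coke", "fries"}),
    frozenset({"hamburger"}),
    frozenset({"coke"}),
    frozenset({"fries"}),
]

# 2 of the 5 transactions contain both items.
print(support({"hamburger"}, {"coke"}, transactions))
# 2 of the 3 hamburger transactions also contain coke.
print(confidence({"hamburger"}, {"coke"}, transactions))
```

A rule would be reported by a miner only if both values reach the user-specified thresholds.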

Framework of the approach
Figure 1 shows the framework of the approach, which consists of six processes. Initially, indexing is performed on the database. Then association rules are mined from the database. Sensitive items are identified in order to find the sensitive rules, and the sensitive rules are then generated. Clustering is performed on the sensitive rules to group similar items. The rule hiding process is performed, the transactions are updated in the transaction table, and finally the changes are written to the original database. The main challenge of rule hiding is how to select the items and transactions to modify. The proposed framework hides the sensitive rules.

Proposed Algorithm
In order to hide a sensitive rule $X \Rightarrow Y$, one can either decrease its support ($|X|/N$ or $|X \cup Y|/N$) below the pre-specified minimum support, or decrease its confidence ($|X \cup Y|/|X|$) below the pre-specified minimum confidence. Increasing the support of $X$ only, the left hand side of the rule, in transactions that do not contain both $X$ and $Y$ reduces the confidence of the rule. This work considers hiding sensitive rules with two items, $x \Rightarrow z$, where $z$ is a sensitive item and $x$ is a single large (frequent) item. In theory, association rules may have more specific rules that contain more items, e.g., $xY \Rightarrow z$, where $Y$ is a large item set. However, for such a rule to exist, its confidence must be greater than the confidence of $x \Rightarrow z$, i.e., $\mathrm{conf}(xY \Rightarrow z) > \mathrm{conf}(x \Rightarrow z)$, or equivalently $|xYz| > \mathrm{conf}(x \Rightarrow z) \cdot |xY|$. For higher confidence rules, such as $\mathrm{conf}(x \Rightarrow z) = 1$, there will be no more specific rules. In addition, once the more general rule is hidden, the more specific rule might be hidden as well.
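The effect of increasing the support of the left hand side can be made explicit with a short calculation. The symbols $k$ (the number of modified transactions) and $\mathit{minconf}$ (the minimum confidence threshold) are introduced here for illustration only. The calculation assumes that $X$ is added to $k$ transactions that previously did not contain $X$ and do not contain $Y$, so that $|X|$ grows while $|X \cup Y|$ stays the same.

```latex
\[
\mathrm{conf}(X \Rightarrow Y) = \frac{|X \cup Y|}{|X|}
\quad\longrightarrow\quad
\mathrm{conf}'(X \Rightarrow Y) = \frac{|X \cup Y|}{|X| + k},
\]
\[
\frac{|X \cup Y|}{|X| + k} < \mathit{minconf}
\iff
k > \frac{|X \cup Y|}{\mathit{minconf}} - |X| .
\]
```

Under this assumption, each modified transaction strictly lowers the confidence, and the second line gives the number of modifications after which the rule falls below the threshold.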
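The algorithm tries to increase the support of the left hand side of the rule. Given a transaction database D, a minimum support, a minimum confidence and a set of sensitive items Y, it produces a transformed database D', where rules containing Y on the RHS will be hidden.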
Step 1: Indexing the transactional database.
Step 2: Generate the association rules.
Step 3: Selecting the sensitive rules with a single antecedent and a single consequent, with the sensitive item in the consequent ($x \Rightarrow y$).
Step 4: Constraint-based clustering: clustering the rules whose right hand sides share a common item, and indexing the rules.
Step 5: Check all the rules in the cluster.
Step 6: Update database D as the transformed database D'.
The algorithm first generates the association rules using (Agrawal, Imielinski, Swami, 1993). It then selects the sensitive rules that have sensitive items on the right hand side. The rules with a common item on the right hand side are clustered (Han, Kamber, 2001) and indexed using (Oliveira, Zaiane, 2003a).
(Yi-Hung Wu, Chia-Ming Chiang, and Arbee L.P. Chen, 2007) Therefore, it would take a maximum of k executions to hide the rule. The rules in every cluster will be hidden.
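The steps above can be sketched roughly as follows. This is an illustrative sketch under several assumptions rather than the authors' implementation: mining is restricted to rules between single items, clustering is reduced to grouping rules by their common right hand side, database indexing is omitted, and the names `mine_pair_rules`, `hide_sensitive_rules`, `min_sup` and `min_conf`, as well as the toy database, are hypothetical.

```python
from collections import defaultdict
from itertools import permutations


def count(items, db):
    """Number of transactions in db that contain all of `items`."""
    return sum(1 for t in db if items <= t)


def mine_pair_rules(db, min_sup, min_conf):
    """Mine rules x => y between single items (a simplifying assumption)."""
    n = len(db)
    all_items = sorted(set().union(*db))
    rules = []
    for x, y in permutations(all_items, 2):
        both = count({x, y}, db)
        if both / n >= min_sup and both / count({x}, db) >= min_conf:
            rules.append((x, y))
    return rules


def hide_sensitive_rules(db, sensitive_items, min_sup, min_conf):
    """Return a modified copy of db in which rules x => y with a sensitive y
    are pushed below min_conf by adding x to transactions lacking x and y."""
    work = [set(t) for t in db]  # alter a working copy, update once at the end
    clusters = defaultdict(list)  # group sensitive rules by right hand side
    for x, y in mine_pair_rules(work, min_sup, min_conf):
        if y in sensitive_items:
            clusters[y].append(x)
    for y, lhs_items in clusters.items():
        for x in lhs_items:
            for t in work:
                # Stop altering transactions once the rule is hidden.
                if count({x, y}, work) / count({x}, work) < min_conf:
                    break
                # Adding x here raises |x| but leaves |x U y| unchanged.
                if x not in t and y not in t:
                    t.add(x)
            # If no suitable transactions remain, the rule may stay visible.
    return [frozenset(t) for t in work]


# Hypothetical database with C as the sensitive item.
db = [
    frozenset({"A", "B", "C"}),
    frozenset({"B", "C"}),
    frozenset({"B", "C"}),
    frozenset({"B"}),
    frozenset({"A"}),
]
hidden = hide_sensitive_rules(db, {"C"}, min_sup=0.4, min_conf=0.7)
print(hidden[4])
print(count({"B", "C"}, hidden) / count({"B"}, hidden))
```

In this toy run the only transaction changed is the fifth one, which gains item B, and the confidence of B => C drops from 0.75 to 0.6. Mining the modified database again also yields A => B, which was not minable before, showing how hiding one rule can create another as a side effect.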

Example
This section gives an example that demonstrates how the proposed algorithm hides the sensitive rules.
The items in the database can be represented as a bit vector. The item C is considered the sensitive item. The sensitive rule with a single antecedent and a single consequent is $B \Rightarrow C$. The rule is clustered. In the fifth transaction, the item B is added, i.e. its bit is set to 1.
The database is then updated. After the update, the rule $B \Rightarrow C$ has a confidence of 0.6, which is less than the minimum confidence, so the rule is hidden. While hiding this rule, the rules $AB \Rightarrow C$ and $B \Rightarrow AC$ are also hidden, and the rule $A \Rightarrow B$ is generated as a side effect.

Analysis
This section analyses some of the characteristics of the proposed algorithm. The first characteristic is database indexing. Indexing reduces the number of database scans. The second characteristic is the time effect. The time taken to scan the database in search of the sensitive rules is reduced because the sensitive rules are clustered. The third characteristic is the database effect. A minimum number of transactions is modified because the correlation among the sensitive rules is taken into account. The fourth characteristic is the efficiency of the algorithm. The database is updated only after all the rules are hidden, which saves update time. The fifth characteristic is the transaction effect. The alteration of transactions stops as soon as the confidence of the sensitive rule falls below the minimum confidence.

Conclusion
In this work, the database privacy problems caused by data mining technology are discussed and an algorithm for hiding sensitive rules is presented. The proposed algorithm can automatically hide sensitive rule sets. Previous work does not consider the characteristics discussed here. An example illustrating the proposed algorithm is given and the characteristics of the algorithm are analyzed. In future work, the efficiency of the algorithm will be analyzed further and improved by reducing the side effects.

Wainger (Wainger, S. 1965) studied the boundedness of the operators $T$ on $L^q(\mathbb{R}^n)$. Chanillo (Chanillo, S. 1984) developed a weighted $L^q(\mathbb{R}^n)$ theory by virtue of a basic lemma which will be mentioned later. Garcia-Cuerva et al. (Garcia-Cuerva, J., Harboure, E., Segovia, C. and Torrea, J. L.) obtained the boundedness of higher-order commutators in the weighted setting. In this paper we will show the boundedness of strongly singular integral operators $T$ on generalized Morrey spaces by weighted inequalities. Furthermore, we will show the boundedness of the commutators generated by BMO functions $b$ and the operators $T$ by virtue of sharp estimates.

Boundedness of strongly singular integral operators T on generalized Morrey spaces
We fix the following notations in Lemma 1 and Theorem 1: , if T are strongly singular integrals then for , we have And by Lemma 1 we have in the last inequation.

Boundedness of the commutators on generalized Morrey spaces
Lemma 2 (Mizuhara, T. 1991). For , where $M$ is the Hardy-Littlewood maximal operator.

Keywords: Strongly singular integrals, Generalized Morrey spaces, Commutators
The commutators $T_b$ are generated by functions $b$ and operators $T$. Morrey (Morrey, C. B. 1938) proposed the classical Morrey spaces when he studied the properties of local solutions of second order elliptic equations. Mizuhara (Mizuhara, T. 1991) introduced the generalized Morrey spaces.
and T are strongly singular integrals then T are bounded on ) (