A Formal Concept Analysis Approach to Data Mining: The QuICL Algorithm for Fast Iceberg Lattice Construction



Introduction
Association rule mining (ARM) is the task of identifying meaningful implication rules of the form X → Y exhibited in a data set, where X and Y are subsets of the items (i.e., possible distinct values of columns of a data set) and X ∩ Y = ∅ (Agrawal et al., 1993). The degree to which a rule is meaningful is defined by: i) support, the number of times both X and Y occur in the data set, and ii) confidence, the number of times that X → Y holds true relative to all occurrences of X. Mining association rules typically involves two steps: i) identifying frequent item (FI) sets (i.e., X ∪ Y that meet a minimum support threshold), and ii) deriving association rules from the item sets that meet a level of confidence.
A well-known algorithm to extract FI sets from a data set is Apriori (Agrawal & Srikant, 1994). Apriori searches the space of all patterns in an iterative bottom-up breadth-first manner. Each iteration obtains counts for its current set of candidate patterns and removes from further consideration any candidate patterns that are not frequent or cannot be frequent. Apriori has proved to be efficient for mining frequent patterns of small length. However, for long patterns Apriori can be I/O intensive since each iteration requires a full scan of the data set. Furthermore, a bottom-up algorithm must obtain counts for each set in the power set of all items composing each frequent pattern. Thus, Apriori may be an intractable solution for FI sets of even moderate length (Han & Kamber, 2006).
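The level-wise search described above can be illustrated with a minimal Python sketch. It is not the implementation of Agrawal and Srikant; the function name `apriori`, the `min_count` parameter, and the four-transaction `demo` data are assumptions made only for illustration. The sketch shows why each level costs one pass over the data and why every subset of a frequent pattern must itself be counted.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {itemset: count} for itemsets found in at least min_count transactions."""
    transactions = [frozenset(t) for t in transactions]
    level = {frozenset([i]) for t in transactions for i in t}  # size-1 candidates
    frequent, k = {}, 1
    while level:
        # One full pass over the data per level: the I/O cost noted in the text.
        counts = {c: sum(1 for t in transactions if c <= t) for c in level}
        survivors = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(survivors)
        k += 1
        # Join frequent (k-1)-sets into candidate k-sets.
        joined = {a | b for a in survivors for b in survivors if len(a | b) == k}
        # Drop candidates that have an infrequent (k-1)-subset (cannot be frequent).
        level = {c for c in joined
                 if all(frozenset(s) in survivors for s in combinations(c, k - 1))}
    return frequent

# Hypothetical toy data, not taken from the paper.
demo = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
result = apriori(demo, min_count=2)
```

With this toy input the three single items and the three pairs are frequent, while the triple is counted in a third pass and then rejected.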
ARM has been an active area of research (Pei et al., 2000; Zaki & Hsiao, 2002; Wang et al., 2003; Uno et al., 2004; Lucchese et al., 2006). However, this research has primarily focused on innovative theory and techniques for efficient extraction of FI sets. As such, these works have fallen short of the overall task of mining association rules (Yahia et al., 2006). Key information not generated by these works is the upper covers of each FI set. An upper cover of a FI set I is a set of FI sets U whose members are immediate neighbors of I under set inclusion: every Iu ∈ U is related to I by strict inclusion, and there does not exist a FI set I2 that lies strictly between Iu and I. Upper covers are needed to produce a set of association rules whose size is constrained to a number that can be readily exploited by an end user (Zaki & Hsiao, 2005; Yahia et al., 2006).
An alternate approach to frequency counting can be found in formal concept analysis (FCA) (Ganter & Wille, 1997). FCA is a branch of applied mathematics that has been applied to a wide variety of applications including linguistics, text retrieval, and economics (Ganter et al., 2005). It originated in the early 1980s and was first formalized by Wille (1982). It has since inspired numerous publications (Priss, 2006). According to FCA, a concept is defined as a pair (O, I), where O is a set of object ids and I is a set of items, such that O is exactly the set of objects that possess every item in I, and I is exactly the set of items shared by every object in O. Furthermore, between any two concepts C1 = (O1, I1) and C2 = (O2, I2) an order < exists between C1 and C2 iff O1 ⊂ O2 (or equivalently I1 ⊃ I2). The set of object ids of a concept is its extent and the set of items is its intent.
Let L be the set of all concepts derived from a data set where the attribute-values define the set of items and the tuple ids define the set of object ids. The concepts of L can be arranged in a lattice such that a connection (i.e., edge) is made between any two concepts C1 and C2 for which order < exists and there is no concept C3 for which C1 < C3 < C2. Given this property, tree terminology can be applied to a lattice. An ancestor concept Ca of concept C1 is any concept for which an order C1 < Ca exists. A descendant concept Cd of concept C1 is any concept for which an order C1 > Cd exists. A parent concept Cp of concept C1 is an ancestor concept for which there is no concept C3 such that C1 < C3 < Cp. A child concept Cc of concept C1 is a descendant concept for which there is no concept C3 such that C1 > C3 > Cc. An example of a concept lattice is depicted in Figure 1.
Property 3: The extent of a concept C is the intersection of the O of all parent concepts of C, intersected with the set of O defined by each Ii ∈ I of C that is not in the I of a parent concept of C; dually, the intent of a concept C is the intersection of the I of all child concepts of C, intersected with the set of I defined by each Oi ∈ O of C that is not in the O of any child concept of C.
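The definitions of concept, order, and parent can be made concrete with a small Python sketch. It uses only the four items whose extents are quoted elsewhere in the text (a1, b1, b2, c1), so it yields a projection of the Figure 1 lattice rather than the full lattice. The names `build_concepts` and `parents`, and the use of integers 1 to 10 for O1 to O10, are assumptions for illustration.

```python
# Extents of four items, as quoted in the text (object ids O1..O10 as integers).
ITEM_EXTENTS = {
    "a1": frozenset({1, 2, 3, 4, 5, 8, 9, 10}),
    "b1": frozenset({1, 2, 8}),
    "b2": frozenset({3, 4, 5, 6, 7, 9, 10}),
    "c1": frozenset({2, 3, 4, 5, 6, 8, 9, 10}),
}
ALL_OBJECTS = frozenset(range(1, 11))

def build_concepts(item_extents, all_objects):
    """Map each concept extent to its intent for the given item extents."""
    extents = {all_objects}
    for ext in item_extents.values():
        # Closing under intersection yields every concept extent.
        extents |= {e & ext for e in extents}
    # The intent is every item whose extent covers the concept's extent.
    return {e: frozenset(i for i, x in item_extents.items() if e <= x)
            for e in extents}

def parents(extent, concepts):
    """Immediate greater concepts: larger extent with no concept in between."""
    above = [e for e in concepts if extent < e]
    return {p for p in above if not any(extent < m < p for m in above)}

CONCEPTS = build_concepts(ITEM_EXTENTS, ALL_OBJECTS)
```

For example, `CONCEPTS[frozenset({1, 2, 8})]` is the intent {a1, b1}, and its only parent is the concept whose extent is the extent of a1.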
Concept lattices are of benefit to ARM. A concept's intent corresponds to an item set and the cardinality of its extent corresponds to the item set's support. Furthermore, the definition of a concept embodies the mathematical notion of closure. Thus, nodes of the concept lattice represent only closed item sets (i.e., an item set whose closure yields the same set), whose number can be orders of magnitude lower than the number of all item sets (Stumme, 2002). The concept lattice still contains the necessary and sufficient information to extract association rules and to compute both support and confidence. For example, from the concept ({O1O2O8}, {a1b1}) of Figure 1, the association rule a1 → b1 can be mined. The support for a1 → b1 can be extracted from the lattice by traversing any path from the bottom of the lattice through concepts where {a1b1} is a subset of a concept's intent. Support is the size of the extent of the highest concept where {a1b1} is a subset of a concept's intent. In this case, support for a1 → b1 is 3, or 30%. Likewise support for a1, the antecedent of a1 → b1, can be extracted. The support for a1 is 8, or 80%. Confidence is computed as support(rule) / support(antecedent(rule)). Thus, the confidence of a1 → b1 is 37.5%. On the other hand, the confidence for b1 → a1 is 100%, since the antecedent, now b1, has a support of 30%. In the same manner the association rules a1 → b2 (50% supp, 62.5% conf), b2 → a1 (50% supp, 71.4% conf), and a1b2 → c1 (50% supp, 100% conf) can be mined from the concept ({O3O4O5O9O10}, {a1b2c1}). While a concept lattice contains the necessary and sufficient information to compute confidence and support, it includes concepts that do not meet the minimum support. Thus, use of a lattice construction algorithm for ARM may incur substantial overhead, since such concepts are essentially unnecessary artifacts.
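The support and confidence figures above can be checked with a short Python sketch. The item extents are the ones quoted in the text for relation R; the function names `extent`, `support`, and `confidence` are hypothetical, and the sketch computes the values directly from the extents rather than by traversing a lattice.

```python
# Extents of four items, as quoted in the text (object ids O1..O10 as integers).
ITEM_EXTENTS = {
    "a1": frozenset({1, 2, 3, 4, 5, 8, 9, 10}),
    "b1": frozenset({1, 2, 8}),
    "b2": frozenset({3, 4, 5, 6, 7, 9, 10}),
    "c1": frozenset({2, 3, 4, 5, 6, 8, 9, 10}),
}
N_OBJECTS = 10

def extent(itemset):
    """Objects that contain every item of the item set."""
    objs = set(range(1, N_OBJECTS + 1))
    for item in itemset:
        objs &= ITEM_EXTENTS[item]
    return objs

def support(itemset):
    """Fraction of all objects that contain the item set."""
    return len(extent(itemset)) / N_OBJECTS

def confidence(antecedent, consequent):
    """support(rule) / support(antecedent), computed on object counts."""
    rule_items = set(antecedent) | set(consequent)
    return len(extent(rule_items)) / len(extent(antecedent))

print(support({"a1", "b1"}), confidence({"a1"}, {"b1"}))  # 0.3 0.375
```

The same functions give 62.5% for a1 → b2 and 100% for a1b2 → c1, in agreement with the values read from the concept ({O3O4O5O9O10}, {a1b2c1}).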
An iceberg lattice is a lattice that contains only the concepts whose support meets a given threshold. For example, Figure 2 depicts the concept lattice of Figure 1 as an iceberg lattice for both a minimum support threshold of 60% and 40%. As the threshold is lowered, more detail of the underlying concept lattice is revealed. An iceberg lattice provides a model from which association rules can be efficiently mined (Stumme, 2002). Consider the alternate notation of an iceberg lattice depicted in Figure 3 that corresponds to the bottom iceberg lattice of Figure 2. Each concept node is labeled with a percentage representing the support together with any items for which there does not exist a greater concept containing the item. The edges are labeled with a percentage indicating the effective drop in confidence between two concepts. This notation enables association rules to be directly read from an iceberg lattice. An association rule α1 → α2 will hold with 100% confidence for any concepts C1 and C2 where C1 is labeled with α1, C2 is labeled with α2, and C1 < C2. The support for the association rule is the support of C1. For example, association rule d1 → a1 (50% supp, 100% conf) can be read from the lattice. Furthermore, an association rule α1α2 → α3 will hold with 100% confidence for any concepts C1, C2, and C3 where C1 is labeled with α1, C2 is labeled with α2, C3 is labeled with α3, and C3 > meet (i.e., greatest common sub-concept) of C2 and C1. The support of the association rule is the support of the meet concept. For example, the association rule a1b2 → c1 (50% supp, 100% conf) can be read. An association rule α1 → α2 with less than 100% confidence can be read from any concepts C1 and C2 where C1 is labeled with α1, C2 is labeled with α2, and C1 > C2. The support will be the support of C2. The confidence will be the product of the confidences noted on the edges along the path from C1 to node C2. For example, the association rule a1 → d1 (50% supp, 62.5% conf) can be read from the lattice of Figure 3. By a combination of the previous steps further association rules can be read. For example, the association rule a1b2 → d1 (40% supp, 80% conf) can be read from the lattice (the meet of a1b2 → the node labeled 40% support with 80% conf, and d1 is an ancestor of that node). Similarly, c1b2 → d1 (40% supp, 66.7% conf) can be read (the meet of c1b2 → the node labeled 60% support with 100% conf, the node labeled 60% → the node labeled 50% with 83.4% conf, the node labeled 50% → the node labeled 40% with 80% conf, therefore c1b2 → the node labeled 40% with a 66.7% conf drop in overall confidence (Note 1), and d1 is an ancestor of the node labeled 40%).
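The product-of-edges rule can be illustrated with a few lines of Python. The sketch assumes that each edge label equals the support of the lower node divided by the support of the upper node, which is consistent with the labels quoted above (40/50 = 80%); the function name `path_confidence` is hypothetical. Under that assumption the product telescopes to the ratio of the last support to the first.

```python
def path_confidence(supports):
    """Multiply edge confidences along a path given node supports in path order."""
    conf = 1.0
    for upper, lower in zip(supports, supports[1:]):
        conf *= lower / upper  # assumed edge label: support(lower) / support(upper)
    return conf

# Path for c1b2 -> d1: nodes labeled 60%, 50% and 40% support.
print(round(path_confidence([60, 50, 40]), 3))  # 0.667
# Path for a1 -> d1: nodes labeled 80% and 50% support.
print(path_confidence([80, 50]))  # 0.625
```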
Extracting association rules from a list of FI sets may yield an excessive number of rules, even when applying strict thresholds to both support and confidence. The rules may also contain highly redundant information. The excessive size and redundancy impede the usefulness of the extracted rules. What is desired is a meaningful subset that can be exploited by an end user. A basis is a minimal subset of association rules that can be combined to form all association rules without any loss of information. A basis can be extracted from an iceberg concept lattice using a systematic traversal of the lattice.
The Duquenne-Guigues (1986) basis provides extraction of a minimal set of association rules with 100% confidence and the Luxenburger (1991) basis provides extraction of a minimal set of association rules with less than 100% confidence. Stumme et al. (2001) offer algorithms to traverse and extract the Duquenne-Guigues basis and the Luxenburger basis from an iceberg concept lattice.
Figure 3. Iceberg lattice using an alternate notation. Each concept node is labeled with a percentage representing the support together with any items for which there does not exist a greater concept containing the item. Edges are labeled with a percentage indicating the effective drop in confidence between the two concepts.
Given that an iceberg concept lattice provides an analysis tool to succinctly identify a basis of association rules, algorithms to construct an iceberg lattice are needed. This paper presents the Quick Iceberg Concept Lattice (QuICL) algorithm used to efficiently construct an iceberg lattice. When combined with lattice traversal algorithms, such as Stumme et al. (2001), QuICL provides an efficient ARM solution to generate a basis of association rules that can be exploited by an end user. Beyond application to ARM, QuICL is a very efficient lattice construction algorithm that offers orders of magnitude improvement over past algorithms.

Classical Association Rule Mining - Mining of Frequent Item Sets
There have been a number of algorithms developed to address the mining of long FI sets. Most notable are CLOSET (Pei et al., 2000), CHARM (Zaki & Hsiao, 2002), and CLOSET+ (Wang et al., 2003). CHARM constructs an itemset-tidset (IT) tree whose nodes are similar to the nodes of a concept lattice. It is a top-down, depth-first search that exploits a notion of equivalence classes to skip levels in order to quickly identify closed FI sets. CHARM involves a vertical data representation (i.e., list of object ids per item) and uses a difference-based representation to enumerate the sets of object ids below the first level of its IT tree. It uses intersection to incrementally add data to its IT tree. The IT tree is dynamically pruned during processing using properties of set union and closure. Intersection is noted as an expensive operation that impedes the performance of the CHARM algorithm (Wang et al., 2003). Alternatively, CLOSET uses a frequent pattern (FP) tree to provide a compact representation of the data in memory. The FP tree is a horizontal representation that maintains counts, each relative to a context of an ordered list of frequent items. Branches are added to the FP tree upon processing an object whose items omit one or more items in the path of an existing branch. Following construction of the FP tree, a divide and conquer algorithm that performs physical bottom-up projections on the FP tree together with item set merging and sub-item set pruning is used to identify the set of closed FI sets. CLOSET is shown to be effective for dense data sets (i.e., many items per transaction with few distinct items), but CLOSET's performance degrades rapidly on sparse data sets (i.e., few items per transaction with many distinct items) as the minimum support threshold is lowered. CLOSET+ offers enhancements to CLOSET: a top-down pseudo projection algorithm to address sparse data sets, and item skipping to further prune the search space.
A survey provides an analysis of algorithms for mining closed FI sets from both a theoretical and analytical viewpoint (Yahia et al., 2006). Algorithms evaluated include TITANIC (Stumme et al., 2002), CLOSET, CLOSET+, and CHARM. TITANIC is a test and generate algorithm along the lines of Apriori that leverages theory from FCA. Yahia et al. draw several conclusions. There has been "frenzied activity" in developing algorithms to efficiently mine FI sets. These algorithms have made significant progress by leveraging theory in combination with carefully designed compact data structures. However, this activity has lost sight of the overall goal of producing a set of association rules that is "of exploitable size by end users". All of the algorithms fail to produce the upper covers and are therefore unable to generate a reasonable basis of association rules. Without the upper covers, the derivation of association rules from the FI sets of even a modest context will generate an excessive number of rules that cannot be reasonably comprehended by end users. Other studies derive the same conclusion (Valtchev et al., 2004; Zaki & Hsiao, 2005; Lakhal & Stumme, 2005).

Godin, Missaoui, and Alaoui Algorithm
The Godin, Missaoui, and Alaoui (1995) algorithm (GMA) is an often cited lattice construction algorithm. It is an incremental algorithm. That is, given a concept lattice L and a new object Oi with its set of items I, GMA will insert the new object into the lattice to produce a new concept lattice L+. Figure 4 depicts the incremental insertion of the first six objects of relation R of Figure 1. As seen in Figure 4, the insertion of a single object can modify the extent of several existing concepts, generate several new concepts, add links, and occasionally remove links.
The strategy for GMA is to partition the current set of concepts into three groups: modified, generator, and old. Modified concepts are those into which the object id of the next object is added. Generators are concepts that are used to generate new concepts. All other concepts are considered old and play no role in the insertion process. Modified concepts are readily identified. They are concepts with an intent that is a subset of the next object's items. The identification of generator concepts, on the other hand, is more involved. Any concept whose intent intersects with, but is not a subset of, the object's items is potentially a generator. However, not all such concepts are generators. A concept is a generator provided there does not exist an ancestor whose intent when intersected with the next object's items produces the same intersection set. For example, when inserting O6 in Figure 4 the concepts ({O3O4O5}, {a1b2c1}), ({O3O5}, {a1b2c1d1}), and ({O4}, {a1b2c1d4}) all have an intersection set of {a1c1}, but only ({O3O4O5}, {a1b2c1}) is a generator. Each generator concept is used to create a new concept having the extent of the generator union the object id as its extent and the intersection set as its intent.
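The partition into modified, generator, and old concepts can be sketched in Python as follows. This is an illustration of the rules stated above, not the GMA listing. The lattice fragment contains the three concepts named in the example plus one assumed ancestor ({O1O2O3O4O5}, {a1}); the new object id `Onew` and its item set are hypothetical and were chosen only so that the intersection set is {a1c1} as in the example.

```python
fs = frozenset

# Fragment of a lattice as (extent, intent) pairs; the first pair is an assumed ancestor.
LATTICE_FRAGMENT = [
    (fs({"O1", "O2", "O3", "O4", "O5"}), fs({"a1"})),
    (fs({"O3", "O4", "O5"}), fs({"a1", "b2", "c1"})),
    (fs({"O3", "O5"}), fs({"a1", "b2", "c1", "d1"})),
    (fs({"O4"}), fs({"a1", "b2", "c1", "d4"})),
]
NEW_ID, NEW_ITEMS = "Onew", fs({"a1", "c1", "x9"})  # hypothetical object

def partition(concepts, new_items):
    """Split concepts into modified, generator and old groups."""
    modified, generators, old = [], [], []
    for extent, intent in concepts:
        if intent <= new_items:            # intent is a subset: modified
            modified.append((extent, intent))
            continue
        common = intent & new_items
        # An ancestor (strictly larger extent) with the same intersection
        # disqualifies this concept as a generator.
        shadowed = any(e > extent and (i & new_items) == common
                       for e, i in concepts)
        target = generators if common and not shadowed else old
        target.append((extent, intent))
    return modified, generators, old

modified, generators, old = partition(LATTICE_FRAGMENT, NEW_ITEMS)
# Each generator yields a new concept: generator extent plus the new id,
# with the intersection set as intent.
new_concepts = [(e | {NEW_ID}, i & NEW_ITEMS) for e, i in generators]
```

Only ({O3O4O5}, {a1b2c1}) is reported as a generator; the two concepts above it in the fragment's list are shadowed by it and fall into the old group.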
New concepts must be further linked into the lattice by searching for the parents. A potential parent is any concept, existing or generated, whose intent is a subset of the new concept's intent. In order to preserve the lattice property (i.e., a connection exists between two concepts C1 and C2 provided there is no concept C3 for which C1 < C3 < C2), the potential parent is a parent only if it does not have a child whose intent is a subset of the new concept's intent. The search for potential parents can be constrained to only consider concepts that are modified or generated.
Occasionally a link between a parent and a child must be removed. This occurs when a parent for a concept is found and that parent is currently the parent of the generator concept that created the new concept. An example is the insertion of object O4 shown in Figure 4. In these cases the new concept is being inserted between the parent and the generator. The removal of the link is required to preserve the lattice connection property.
The complete GMA algorithm is given in Algorithm 1. Lines 1 through 10 bootstrap the lattice upon insertion of the first object; for subsequent objects they ensure the bottom concept contains all of the object's items. Line 11 declares a vector of sets that is used to: i) verify a potential generator is valid, and ii) limit the search for parents.
Only modified and generated concepts are placed into this vector (lines 15 and 22). Line 12 provides the main loop to iterate over all concepts in the lattice in a top-down breadth-first order (Note 2). Lines 13 through 17 identify and process modified concepts. Line 20 tests if a concept is a generator. If so, a new concept is generated (line 21) and linked into the lattice (line 23). Lines 24 through 33 search the Processed vector for the parents and link them to the new concept.

Other Lattice Construction Algorithms
Other notable lattice construction algorithms include Valtchev, Missaoui, and Lebrun divide and conquer (2002b), GALICIA-T (Valtchev et al., 2002), and Nourine and Raynaud (2002). Lindig and Datensystene begins by constructing a known concept, such as the top (or bottom concept), and then proceeds to generate its children (or parents). The process repeats for each found concept until the lattice is complete. Valtchev et al. divide and conquer recursively partitions the input data set into two sets, either based on items or objects. At each level a concept lattice is constructed for each set and the resulting lattices are then merged. GALICIA-T uses a trie data structure to represent the set of concept intents whereby each edge of the trie denotes the addition of an item in an item set. The GALICIA-T algorithm inserts the next object into the lattice through a guided traversal of the trie to produce an independent trie data structure. The generated trie represents a set of new concepts that are then merged back into the source trie. The lattice is thus an adjunct to the core trie structure. Similarly, Nourine and Raynaud use a trie to represent their lexicographic tree. Each edge in the lexicographic tree denotes an object and nodes corresponding to concepts are augmented with an item list. The incremental insertion is performed on an item by item basis by using a union operation on object ids of concepts represented in the trie. If the result of the union is present in the trie and augmented with an item list, the item is added to the node; otherwise it will be a new concept. For new concepts, the extent will be added to the trie as needed. Identification of children is performed by a test union and count procedure for each item in I that is not in the new concept's intent.
Let Concept be a tuple {O, I, Children} where O is a set of object ids, I is a set of items, and Children is a list of child concepts.
Let CBottom be a reference to the supremum of a concept lattice G
ADD(Oi, I)
[listing body, lines 1 through 34, not available]
35. return
Algorithm 1. Godin, Missaoui, and Alaoui (GMA) lattice construction algorithm
Kuznetsov and Obiedkov (2002) provide a comparative survey of several lattice construction algorithms. Algorithms include: GMA, Nourine and Raynaud, and Valtchev et al. divide and conquer. Findings indicate that there is no "best" algorithm and that each algorithm exhibits different performance depending on the data set. GMA is a good choice for sparse data sets, and batch algorithms are good for dense data sets. Valtchev et al. (2002) arrive at the same conclusions. Their study reports that GMA has good performance for data sets with density (Note 3) less than 0.10, but lags with densities greater than 0.50.

Iceberg Lattice Construction Algorithms
Three algorithms to construct an iceberg lattice were found in the literature: CHARM-L (Zaki & Hsiao, 2005), SPROUT (Choi, 2006), and Martin and Eklund (2008). CHARM-L, an extension to the CHARM algorithm, is an example of a lattice construction algorithm that is an integrated (Note 4) extension of a FI set miner. The lattice of the CHARM-L algorithm is maintained as an adjunct data structure to CHARM's IT tree. When the core CHARM processing identifies a new potential FI set, CHARM-L will attempt to insert a new concept representing the FI set into the lattice, as a child of the concept corresponding to the parent node in the IT tree. What remains is to identify concepts already in the lattice that are to become children of the new concept. Such identification is performed by intersecting concept id sets that are maintained within each node of the IT tree.
SPROUT is a lattice construction algorithm that provides an option to build an iceberg lattice. It begins by creating the top concept and then generates children by appending each object not in the concept's extent and querying the formal context for the item sets. Generated concepts are tested for closure and pre-existence. If not closed, the concept is discarded. If pre-existent, a parent-child link is added. The process is repeated for each new concept.
Martin and Eklund is another algorithm that generates a lattice from a set of closed FI sets found by a FI set miner. It maintains a border set of concepts that have been inserted into the lattice, thereby limiting the concepts that must be examined during the insertion of the next closed FI set.

Methodology
While GMA and like algorithms are not directly suitable to construct an iceberg lattice, adapting the algorithm to add data incrementally on an item by item basis (i.e., vertical representation) and interchanging the roles of the set of object ids (O) and the set of items (I) results in an algorithm that can construct an iceberg concept lattice. The algorithm still performs a top-down level-wise search and insert process; however, these changes effectively invert the lattice. The addition of a predicate to ensure that the minimum support threshold has been met is the only remaining change needed to construct an iceberg lattice.
Preliminary tests (Note 5) validated that the modified GMA algorithm functioned correctly. The Mushroom (Note 6) data set was used as the test case. The converted algorithm was tested with minimum supports of 50%, 30%, 10%, 1%, and 0%. The algorithm reported a number of concepts of 45, 427, 4,897, 51,672 and 238,709 respectively with execution times of 0.04, 0.39, 7.17, 160.28, and 1,198.08 seconds. The reported number of concepts is the same as found by the CHARM-L algorithm. While the execution time for high supports was comparable to CHARM-L, the performance degraded by an order of magnitude as support was lowered. Thus, the modified GMA algorithm cannot compete with the leading ARM algorithms.
This section describes the development of the Quick Iceberg Concept Lattice (QuICL, pronounced kwi-kəl) algorithm. QuICL provides incremental construction of a concept lattice along the lines of GMA, but approaches the insertion process from the bottom of the lattice as opposed to a top-down, level-wise search for generators. The structure of the lattice is used to navigate to a point of change. Recursion is used to facilitate the location of additional points of change and enable linkage between parent and child concepts. The result is an algorithm that constructs all 238,709 concepts derived from the Mushroom data set in less than three seconds, a performance improvement over GMA of nearly three orders of magnitude.

A Step Towards an Efficient Incremental Algorithm
A step towards an efficient incremental insertion algorithm for an iceberg lattice is to apply a few minor modifications to the representation of the lattice. In addition to interchanging the roles of the set of object ids (O) and the set of items (I) to invert the lattice, the cardinality of I in a given concept can be significantly reduced by exploiting the order property of the lattice. An item Ii ∈ I of concept C1 does not need to be physically recorded in a concept if there exists a concept C2 such that C2 > C1 and Ii ∈ I of concept C2. Instead, the item Ii is implied by the lattice structure. An item Ii need only be recorded in a concept at its maximal position (i.e., lowest position in the inverted lattice). This representation is also desirable for direct extraction of association rules (see Section 1). Another modification is to omit a topmost concept whose intent is the set of all items in the concept lattice. As a result, the concept lattice becomes a semi-lattice. The semi-lattice can be readily converted to a complete lattice by a post-construction step to add a common topmost parent for all concepts in the lattice that do not have parents. For the purpose of ARM, this post-step is not needed. The final modification is to redefine the bottom concept simply as an entry point into the lattice. Thus, the bottom concept does not hold any objects or items. It is created upon initial construction of an empty lattice and its intent and extent are not updated.
The previously mentioned changes will simplify the processing in the GMA algorithm without any loss of necessary information. The steps of GMA that add an item to the intent of concepts whose extent is a proper subset of the next item's objects are not needed, since the lattice structure will imply the item. As a result, concepts whose extent is a proper subset of the next item's objects will not need to be visited. Furthermore, the pre-steps to ensure the extent of the bottom concept includes new object ids can be eliminated. There is, however, one small side effect. In the event an item exists common to all objects, GMA would place that item and its object ids into the bottom concept. With the proposed changes, the item and object ids will be in a new concept that is the sole parent to the bottom concept. Given the proposed modifications to the lattice structure, Figure 5 depicts the progression of incremental item insertions of the data in relation R of Figure 1 into an inverted concept lattice. The final lattice of Figure 5 is the inverted form of the lattice given in Figure 1. Before presenting an algorithm to construct a lattice using the proposed structure, a few observations are noteworthy:
• Insertion of an item whose extent equals the extent of a concept Ci within the lattice is accomplished by simply adding the item to Ci. Ci can be found by traversing the lattice from the bottom along any path where the item's extent is a subset of a concept's extent. An example is inserting d3 in Figure 5.
• Except for the previous case, a new concept CNew will be added to the lattice. That concept will forever hold the item.
• If an empty lattice is defined as a bottom concept with an empty intent and extent, then any subsequent insertion of the new concept CNew will always be performed above another concept. Let the concept above which CNew is to be inserted be denoted as CBase. CBase can be identified by traversing the lattice along any path where the item's extent is a proper subset of a concept's extent. For example, when inserting d4 with object id set {O4} into the lattice of Figure 5 the base concept will be (∅, {O3O4O5O9O10}).
• For all parent concepts Cp of CBase such that the extent of Cp is not =, ⊂, or ⊃ of the new item's extent, the new concept CNew will be a sibling of each Cp. CBase will be a child of the new concept. If the extent of a Cp ∩ the item's extent is not empty then another new concept with an extent = the extent of Cp ∩ the item's extent must be found or inserted above Cp. Such a concept can be found, or if needed created, by recursing using a null item and the extent of Cp ∩ the item's extent as the set of object ids. The concept returned from the recursive call will also be a parent of CNew. An example of finding already existing concepts in the recursive call is inserting d2 in Figure 5. An example of creating a new concept in the recursive call is inserting b2.
• For all parent concepts Cp of CBase such that the extent of Cp is ⊂ of the extent of CNew, CNew will be inserted between CBase and Cp. CBase will no longer be a child of Cp. Instead CNew will be a child of Cp and CBase will be a child of CNew. An example is inserting c1 with object id set {O2O3O4O5O6O8O9O10} into the lattice of Figure 5. The object ids are a superset of the extent of concept (∅, {O3O4O5O9O10}). Thus, concept (∅, {O2O3O4O5O8O9O10}) is inserted between the base concept ({a1}, {O1O2O3O4O5O8O9O10}) and concept (∅, {O3O4O5O9O10}).
Given these observations, an alternative algorithm to GMA can be formulated. For each insertion, GMA processes all concepts in a top-down, breadth-first manner to modify existing concepts and to generate new concepts. The top-down traversal is used to facilitate identification of generators and limit the search for parent concepts. The noted observations, however, suggest an alternate approach. The identification of generator concepts can be performed from the bottom up using the lattice structure to navigate to a generator (i.e., a base concept). Furthermore, recursion can be used to find, or if needed create, the parent concepts.
Algorithm 2 presents an incremental insertion algorithm to construct a concept lattice. For this algorithm, each concept is a tuple composed of a list of items, a list of object ids, and a list of parent concepts. A designated empty concept named CBottom provides an entry point into the lattice. The algorithm begins with the BUILD-LATTICE function. This function accepts a formal context K{I, O, R}. BUILD-LATTICE creates an empty concept lattice consisting of the bottom concept (line 1) and then incrementally adds each item into the lattice using the INSERT function (lines 2 and 3). After inserting all items, the bottom concept is returned (line 4). The INSERT function provides the incremental insertion of an item into the lattice or sub-lattice. INSERT is passed a reference to a concept, referred to as the base concept CBase, above which an item id Ii together with its extent O is to be inserted. The item id can and will often be omitted when inserting into a sub-lattice (i.e., a subset of a lattice consisting of a concept and all its ancestors). INSERT involves three phases: i) navigate into the lattice and identify a list of concepts to be further processed, ii) if needed, construct a new concept, and iii) process the list of concepts identified by the first phase and link the new concept into the lattice. Both the navigation phase and link phase recursively call the INSERT function as needed.
INSERT first defines an empty list of tuples consisting of a type indicator with values SUP or ISET, an intersection set, and a reference to the concept that generated the intersection set (line 5). This list is populated during a navigate-prepare phase and is processed during the link phase. If O is a subset of the extent of any parent concept then INSERT recurses using the parent as the new base concept (lines 12 and 13). This effectively navigates into the concept lattice to locate the position above which the item will be inserted. If O is a superset of the extent of a parent then a tuple composed of SUP, a reference to the parent concept, and the parent's extent is added to the ToProcessList for later processing (lines 14 and 15). If O is neither equal to, a subset of, nor a superset of the extent of a parent concept, and O intersect the extent of the parent is non-empty, then a tuple composed of ISET, a reference to the parent concept, and O intersect the extent of the parent is added to the ToProcessList (lines 16 and 17).
If comparison of O with the extents of all parent concepts does not encounter a parent concept where O is equal to or a subset of the parent's extent, then a new concept node will be constructed (line 18). The new concept will contain the item I_i in its intent and O as its extent. The new concept will be a child of all SUP concepts in the ToProcessList, a sibling to the ISET concepts, and a parent to the base concept.

Algorithm 2. A recursive incremental lattice construction algorithm
After creating the new concept, the final phase of the algorithm processes the concepts in the ToProcessList and links the new concept into the lattice. For a parent concept in the ToProcessList with a SUP type, the parent will no longer be a parent of the base concept (line 20). Instead it will be the parent of the new concept. Thus, the parent concept is removed from the base concept's list of parents (line 21) and added to the new concept's parents (line 22). Each parent concept for which O is neither equal to, a subset of, nor a superset of the parent's extent will be a sibling to the new concept. Furthermore, if the intersection of O and the extent of a sibling is not empty, then additional processing is required to add the information about this intersection into the lattice. Such siblings are the concepts in the ToProcessList that have an ISET type. A concept representing the intersection of O and the extent of a sibling must be found within the lattice, or if absent created, and added as a parent of the new concept. To do this, the algorithm recurses using the sibling as the base concept, a null item, and the intersection of O and the sibling's extent as the set of object ids (line 24). The concept returned by the recursive call is added to the new concept's parents (line 25). Finally, the new concept is added to the parents of the base concept and the new concept is returned (lines 26 and 27).
{O1O2O3O4O5O8O9O10}), has an empty intersection with {O3O4O5O9O10}. Thus, the recursive call completes by creating the concept (∅, {O3O4O5O9O10}) and adding it as a parent of ({a1}, {O1O2O3O4O5O8O9O10}). The new concept is returned from the recursive call. The returned concept is added as a parent of ({b2}, {O3O4O5O6O7O9O10}) by the base invocation of INSERT.
Processing the {SUP, ({a2}, {O6O7}), {O6O7}} tuple involves removing ({a2}, {O6O7}) from the parents of C_Base, being {∅, ∅}, and adding it as a parent to C_New, being ({b2}, {O3O4O5O6O7O9O10}). At this time all tuples in the ToProcessList have been processed. The first invocation of INSERT completes by adding C_New as a parent to C_Base and returning a reference to C_New.
The walkthrough of inserting b2 given in Figure 6 demonstrates a majority of the execution paths through the algorithm. However, the walkthrough did not execute the paths where the O in the call to INSERT is equal to or a subset of the parent's extent. Such execution paths are readily apparent in many of the other insertions depicted in Figure 5. For example, insertion of d3 will call INSERT with C_Base referencing the bottom concept {∅, ∅}, I_i = d3, and O = {O7}. The navigate-prepare phase will recurse with C_Base referencing the concept ({b2}, {O3O4O5O6O7O9O10}), since O is a subset of its extent. The navigate-prepare phase of the recursive call will further recurse with C_Base referencing the concept ({a2}, {O6O7}). The navigate-prepare phase of this recursive call will encounter a parent concept whose extent equals O. That concept is ({c2}, {O7}). In this case I_i, being d3, is inserted into the intent of ({c2}, {O7}) and a reference to this concept is returned back through all invocations.
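The three phases of INSERT described above can be sketched in Python as follows. This is a minimal illustration reconstructed from the prose, not the paper's listing: the names `Concept`, `insert`, and `build_lattice` are assumptions, object ids are held in Python frozensets rather than sorted lists, and the sketch reproduces the uncorrected Algorithm 2 (no purging of the ToProcessList, no minimum support).

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based equality, so list.remove() targets one node
class Concept:
    items: set = field(default_factory=set)      # intent stored at this node
    extent: frozenset = frozenset()              # object ids
    parents: list = field(default_factory=list)  # upper neighbours


def insert(base, item, objs):
    """Insert `item` (may be None) with extent `objs` above `base`."""
    to_process = []  # tuples of (type, parent concept, intersection set)

    # Phase 1: navigate-prepare, comparing objs with each parent's extent.
    for parent in list(base.parents):
        if objs == parent.extent:
            if item is not None:
                parent.items.add(item)
            return parent
        if objs < parent.extent:
            return insert(parent, item, objs)  # navigate further up
        if objs > parent.extent:
            to_process.append(("SUP", parent, parent.extent))
        elif objs & parent.extent:
            to_process.append(("ISET", parent, objs & parent.extent))

    # Phase 2: no equal or containing parent was found, so create a concept.
    new = Concept({item} if item is not None else set(), objs)

    # Phase 3: link the new concept into the lattice.
    for kind, parent, inter in to_process:
        if kind == "SUP":
            base.parents.remove(parent)
            new.parents.append(parent)
        else:
            # Find or create the concept for the intersection above the sibling.
            new.parents.append(insert(parent, None, inter))
    base.parents.append(new)
    return new


def build_lattice(vertical):
    """`vertical` maps each item to the set of object ids that contain it."""
    bottom = Concept()
    for item, objs in vertical.items():
        insert(bottom, item, frozenset(objs))
    return bottom
```

Feeding it the extents quoted in the walkthrough (a1, a2, b2, c2, d3, in that assumed order) places d3 in the same concept as c2 and leaves ({b2}, ...) with the intersection concept and ({a2}, {O6O7}) as parents, matching the narrative above.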

A Shortcoming and a Correction
There is currently a defect in Algorithm 2 in that it may violate the lattice connection property (i.e., an edge is made between any two concepts C_1 and C_2 for which C_1 < C_2 holds and there is no concept C_3 for which C_1 < C_3 < C_2). These errors are due to relationships between the concepts referenced in the ToProcessList, either between two ISET tuples or between an ISET and a SUP tuple. The processing of all related tuples results in adding invalid parent-child links. If there exists a non-trivial meet in the lattice between the referenced concepts, then the intersection sets recorded in the ToProcessList tuples of the related concepts will be the extent of the meet, and therefore the intersection sets will be the same. Thus, an approach to correcting the flaw is to retain only one of any tuples in the ToProcessList that have the same intersection set. However, this approach is not sufficient, since there exist cases where the invalid link does not involve a concept that is currently in the lattice. These cases are still the result of a relationship between concepts in the ToProcessList. A case is depicted in Figure 7.
Here, the related concepts referenced in the ToProcessList are ({I1}, {O1O2O3O4}) and ({I2}, {O1O2O3O5}), and the meet concept is (∅, {O1O2O3}). The invalid link will occur regardless of the order in which the tuples of the ToProcessList are processed. Processing {ISET, ({I1}, {O1O2O3O4}), {O1O2}} before {ISET, ({I2}, {O1O2O3O5}), {O1O2O5}}, as shown, will create the concept (∅, {O1O2}) when processing {ISET, ({I1}, {O1O2O3O4}), {O1O2}}, then create the concept (∅, {O1O2O5}) when processing {ISET, ({I2}, {O1O2O3O5}), {O1O2O5}}. On the other hand, if {ISET, ({I2}, {O1O2O3O5}), {O1O2O5}} is processed first, then both concepts (∅, {O1O2}) and (∅, {O1O2O5}) will be created upon processing {ISET, ({I2}, {O1O2O3O5}), {O1O2O5}}. The subsequent processing of {ISET, ({I1}, {O1O2O3O4}), {O1O2}} will simply add the violating edge. Therefore a solution is to identify and remove all the tuples in the ToProcessList that have an intersection set that is a subset of the intersection set of another tuple. Thus, to fully correct the problem, an algorithm to purge such tuples from the ToProcessList is needed.
A purge subsets algorithm involves comparing the intersection set of each tuple with the intersection set of every other tuple in the ToProcessList. This introduces a potential O(n²m) asymptotic complexity, where n is the number of tuples in the ToProcessList and m is the size of the intersection sets. While the number of tuples in a given ToProcessList is bounded by the number of parent concepts of a given base concept, it is desired that the purge subsets algorithm be highly efficient and avoid any unneeded processing. There is no need to compare two SUP tuples, since SUP tuples cannot be a subset of other tuples. Furthermore, two ISET tuples cannot be both a subset and superset of each other. Therefore, the only tests needed between any two tuples are: i) a subset test when the first tuple is an ISET, and ii) a superset test when the second tuple is an ISET. The latter will only be performed if the first tuple is not an ISET, or if the result of the subset test is false. Furthermore, to obtain an O(n²m) complexity rather than O(n²m²), the sets of object ids must be maintained in sorted order. This is necessary for fast determination of subset and superset operations. These operations can be optimized to determine an outcome as soon as possible. A subset operation on sorted lists can report false if at any time an id is found in the first set that does not exist in the second, or the number of ids yet to be examined in the first set is greater than the number of ids yet to be examined in the second. Dually, a superset operation can report false if at any time an id is found in the second set that does not exist in the first, or the number of ids yet to be examined in the first set is less than the number of ids yet to be examined in the second.
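The early-exit subset test on sorted id lists described above might look like the following Python sketch. The function names are illustrative assumptions, and the inputs are assumed to be strictly ascending lists of object ids.

```python
def is_subset_sorted(a, b):
    """Return True if every id in sorted list `a` also occurs in sorted list `b`."""
    i = j = 0
    while i < len(a):
        # More ids remain in `a` than in `b`, so `a` cannot be a subset.
        if len(a) - i > len(b) - j:
            return False
        if a[i] == b[j]:
            i += 1
            j += 1
        elif a[i] > b[j]:
            j += 1  # skip an id that occurs only in `b`
        else:
            return False  # a[i] < b[j], so a[i] is missing from `b`
    return True


def is_superset_sorted(a, b):
    """Dual test: `a` is a superset of `b` exactly when `b` is a subset of `a`."""
    return is_subset_sorted(b, a)
```

Each call makes a single pass over the two lists, which is what keeps the pairwise purge linear in the size of the intersection sets.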
Algorithm 3 presents an efficient algorithm to purge tuples in the ToProcessList. Function PURGE-SUBSETS accepts the ToProcessList tuples. Lines 1 and 2 provide loops to compare each tuple with every other tuple. Lines 3 through 6 perform the comparisons between the tuples and removal of the subset tuples as needed.

PURGE-SUBSETS(ToProcessList)
// ToProcessList is a list of tuples {Type, Concept, O} with
// Type ∈ {SUP, ISET}, Concept a reference to a concept,
// and O a set of object ids
1. for each Pi ∈ ToProcessList:
2.     for each Pj ∈ ToProcessList | Pj comes after Pi:
3.
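Because the body of the listing is incomplete here, the following Python sketch illustrates one way the purge could work, based only on the prose description of the subset and superset tests. It is an assumption-laden illustration, not the paper's Algorithm 3: the name `purge_subsets`, the string tags "SUP" and "ISET", and the use of Python sets in place of sorted id lists are all choices made for brevity.

```python
def purge_subsets(to_process):
    """Drop ISET tuples whose intersection set is a subset of another tuple's set.

    `to_process` is a list of (type, concept, ids) tuples with type "SUP" or
    "ISET" and `ids` a set or frozenset of object ids. SUP tuples are never removed.
    """
    removed = set()  # indices of purged tuples
    n = len(to_process)
    for i in range(n):
        if i in removed:
            continue
        kind_i, _, ids_i = to_process[i]
        for j in range(i + 1, n):
            if j in removed:
                continue
            kind_j, _, ids_j = to_process[j]
            # Subset test, only meaningful when the first tuple is an ISET.
            if kind_i == "ISET" and ids_i <= ids_j:
                removed.add(i)
                break
            # Superset test, only meaningful when the second tuple is an ISET.
            if kind_j == "ISET" and ids_i >= ids_j:
                removed.add(j)
    return [t for k, t in enumerate(to_process) if k not in removed]
```

Applied to the two ISET tuples of the Figure 7 case, only the tuple carrying {O1O2O5} survives, whichever order the tuples arrive in.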

The QuICL Algorithm
In addition to calling the PURGE-SUBSETS function, QuICL makes three further enhancements to Algorithm 2: the first maintains the parent concepts in an order that may improve performance, the second enables specification of a minimum support threshold in order to construct iceberg lattices, and the third removes redundant intersect operations, thereby further improving performance. The rationale for maintaining parent concepts in a sorted order is to reduce the number of times the body of the navigate-prepare loop is executed. If, during the iteration over parents, a parent concept is encountered whose extent is equal to or a superset of the set of object ids, the algorithm returns without testing the remaining parents. To increase the probability that such parent concepts are encountered sooner rather than later, the parents are maintained in descending order of the cardinality of their extents.
To construct an iceberg lattice, the insertion must discard any item whose extent does not meet a minimum support threshold. In addition, the processing must prevent construction of concepts whose extent would not meet the threshold. Since the extent for a new concept resulting from an intersection with another concept is the intersection set that is stored in the tuples of the ToProcessList, a predicate on the size of the intersection set can be used to prevent construction of such concepts. The predicate can be tested before adding an ISET tuple to the ToProcessList.
Testing and analysis of Algorithm 2 revealed that more intersections are being performed than needed. This is the result of the same parent concepts being intersected from multiple invocations of the INSERT function. In such cases, each invocation has a different base concept that shares the given parent. Even though each invocation may be passed a different set of object ids, the resulting intersection set will be the same during insertion of a given item. This is the case since the intersection set is ultimately the intersection of the parent's extent and the extent of the item being inserted. Thus, an enhancement is to cache (Note 7) each intersection set with its parent concept for the duration of an item insertion. Between item insertions all cached intersection sets are discarded.
While the intersection set of each invocation is the same, the outcome of comparison (i.e., =, ⊂, ⊃, and ∩) on which Algorithm 2 is dependent can be different. The outcome of comparison can be readily determined by performing tests on the cardinalities of the cached intersection set, the parent's extent, and the object id set passed to the INSERT function. Table 1 identifies the outcome based on the cardinalities of these sets.
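As an illustration of how cardinalities alone can decide the outcome, the sketch below assumes that the cached set equals the intersection of O with the parent's extent, which is the property claimed in the preceding paragraph. The function name and the outcome labels are hypothetical, and the case analysis is derived from that assumption rather than copied from Table 1.

```python
def comparison_outcome(cached_is, objs, parent_extent):
    """Classify objs against parent_extent using set sizes only.

    Assumes cached_is == objs & parent_extent, so its size can never exceed
    the size of either operand.
    """
    n_is, n_o, n_p = len(cached_is), len(objs), len(parent_extent)
    if n_is == n_o == n_p:
        return "EQUAL"       # both sets coincide with their intersection
    if n_is == n_o:
        return "SUBSET"      # all of objs lies inside the parent's extent
    if n_is == n_p:
        return "SUPERSET"    # all of the parent's extent lies inside objs
    if n_is > 0:
        return "INTERSECT"   # partial overlap only
    return "DISJOINT"
```

Since only three integer comparisons are involved, no further pass over the id sets is needed once the intersection has been cached.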
In caching the intersection set in the parent concept, care must be taken to avoid incurring a penalty (Note 8) in memory consumption. A penalty can be avoided by using the appropriate reference as the intersection set. If the outcome of comparison is equal or a subset, then the cached intersection set is set to the object id set passed to INSERT. If the outcome of comparison is superset, then the cached set is set to C_Parent.O. If |O ∩ C_Parent.O| < minimum support, the cached set is set to an empty set. Only when the outcome of comparison is intersect and the intersection set meets the minimum support threshold will the new intersection set be cached. However, using a reference to this same set in the ISET tuples of the ToProcessList will result in no additional memory consumption. This set ultimately becomes the extent of a new concept that is added to the lattice.
Applying these three changes to Algorithm 2, together with a call to PURGE-SUBSETS, yields the QuICL algorithm, given in Algorithm 4. Line 30 provides the call to the PURGE-SUBSETS function. Lines 38 and 39 specify an order for the parents of a concept. Line 2 discards items that do not meet the minimum support threshold. Line 14 tests that the size of the intersection set meets the minimum support threshold. Lines 9 through 12 obtain a reference to the intersection set IS. If the intersection set was previously computed then it is obtained from the cache, otherwise it is computed. Lines 16, 20, 23, and 26 use tests against the cardinality of the intersection set to determine an outcome of comparison. Lines 17, 21, 24 and 27 cache the intersection set during insertion of a given item. Line 3 clears the cached intersection sets following item insertion.
In a preliminary test, the QuICL algorithm constructed the complete lattice for the Mushroom data set in three seconds. This represents a gain in excess of two orders of magnitude over the GMA algorithm.

Results
QuICL was empirically evaluated against the CHARM, CHARM-L, and iceberg-enhanced GMA algorithms. The C versions of CHARM and CHARM-L were downloaded from the author's web site and translated to Java (Note 9). The CHARM implementation utilized memory mapped I/O to read the object ids from a vertical representation of a data set. On translating to Java, the memory mapped I/O was converted to the available random access classes. This introduced a performance problem, since the CHARM implementation re-reads the sets of object ids multiple times when generating the first level of CHARM's IT tree. The implementation was enhanced to cache the object id sets in memory. The GMA algorithm with modifications for iceberg processing and QuICL (Algorithm 4) were directly implemented in Java.
items, can account for 38% (e.g., T10I4D100k at 0.01% supp using QuICL) to 72% (e.g., T25I10D10K at 0.0% supp using QuICL) of memory. Furthermore, for large lattices the number of parent-child links accounts for another 15%.
CHARM-L provides a reduction in memory by not retaining object ids in its lattice. However, CHARM-L does maintain object ids in its IT tree. These entries are dynamically constructed during the traversal of CHARM's IT tree and discarded upon completion of a branch. The memory consumed for each concept in the CHARM-L lattice is about three times the memory consumed by the concepts of QuICL. Due to the very different approaches, the reduction or gain in memory usage when compared against QuICL varies. CHARM-L exhibits the best memory usage on the Pumsb, Pumsb*, and Chess data sets. These data sets contain large object id sets. Thus the difference-based representation provides a significant reduction in memory usage. CHARM-L has comparable memory consumption on the Mushroom and T25I20D100K data sets, but a loss of around a factor of three on the T10I4D100k and T25I10D100k data sets. For the latter two data sets, the overhead to represent a concept degrades memory consumption, as these involve very large lattices containing concepts whose extents have small cardinality.
The CHARM algorithm does not construct a lattice. As such, its memory consumption is for processing its IT tree and constructing its list of FI sets. Since CHARM-L is an extension to CHARM that constructs a concept lattice, the memory consumption of CHARM is expected to be less than that of CHARM-L. This is indeed the case. However, the difference between the exhibited memory consumption of CHARM-L and CHARM should not be interpreted as the memory for the lattice, since memory used for the IT tree is released upon processing a branch and may be reused for the lattice.

Conclusions
This paper has presented the QuICL algorithm, used to incrementally construct an iceberg concept lattice. Its objective was to offer a lattice-based algorithm whose overall performance in constructing a lattice is comparable to algorithms used for ARM. Furthermore, it was proposed that such an algorithm would provide gains relative to the overall task of ARM. This objective has been met. The performance of QuICL is on the order of CHARM, a leading algorithm to mine FI sets, and QuICL additionally derives the upper covers. The lattices constructed by QuICL are of a form whereby association rules can be directly read and a basis can be readily generated. As such, the Stumme et al. (2001) algorithms can be used to extract the Duquenne-Guigues basis and Luxemburger basis. Thus, it is postulated that QuICL provides a significant gain in the overall task of ARM. QuICL enables the generation of association rules whose size is constrained to a number that can be exploited by the end user. Beyond this, it was proposed that new efficient algorithms to construct concept lattices may present a contribution to formal concept analysis. QuICL provides order-of-magnitude gains in performance over GMA, an often cited incremental lattice construction algorithm. It is noted that GMA provides good performance on data sets whose density is less than 0.10. QuICL provides excellent performance on both sparse and dense data sets. For example, on T10I4D100k, a sparse data set, QuICL provides a gain over GMA of two orders of magnitude (e.g., less than 120 seconds versus near 10,000 seconds at 0.0% supp). On Mushroom, a dense data set, the same gain of two orders of magnitude is realized (e.g., three seconds versus 200 seconds at 0.0% supp), likewise on Chess (e.g., less than ten seconds versus over 1,000 seconds at 55% supp). Literature has noted there is no known "best" algorithm for lattice construction and that each algorithm demonstrates different performance on different data sets, yet QuICL provides the best all-around performance.
QuICL differs from past lattice construction algorithms in three notable ways. First, QuICL is a pure incremental lattice construction algorithm. That is, the sole data structure driving its processing is the lattice. Many other algorithms are driven by some other data structure and separately construct the lattice, although as an integral sub-task. For example, GALICIA-T (Valtchev et al., 2002) uses a trie, Nourine and Raynaud (2002) use a lexicographic tree, and CHARM-L uses its IT tree as its primary data structure. By being a pure incremental lattice construction algorithm, the foundation of QuICL is based solely on FCA. Additional theory derived from FCA may provide for further improvements to QuICL. Second, QuICL has recognized that it is sufficient to store an item at only its maximal position. There is no need to include the item in all descendant concepts. Thus, for a given item insertion, the only modified concept will be the one where an item is inserted. This eliminates the need to modify a substantial number of concepts, thereby significantly improving performance. Third, in comparing QuICL to GMA, both identify generator concepts. For QuICL, the generators are the base concepts that do not have a parent whose extent is a superset of the incoming object id set. QuICL differs from GMA in that it identifies the lowest generator concepts first, whereas GMA first identifies the highest. Thus, QuICL eliminates the step of validating that a candidate generator is indeed a generator, a potentially time consuming process. Furthermore, since QuICL approaches the lattice from the bottom up, its recursion directly identifies the parent concepts. This eliminates the very expensive task of searching for parents incurred by GMA. This task is exacerbated on dense data sets. Given this discussion and the results presented herein, it is postulated that QuICL is the "best known" all-around incremental lattice construction algorithm. An evaluation against a broader set of data sets and other lattice construction algorithms is needed to validate this claim.
An issue for QuICL, as well as FI set miners and lattice construction algorithms in general, is memory consumption. The exponential nature of the problem can quickly exhaust available memory. All algorithms used in this study failed to produce a complete lattice for four of the seven data sets. In each case the failure was due to memory constraints. CHARM and CHARM-L were able to process lower supports than QuICL and GMA. Further investigation into CHARM's difference-based representation may shed light on additional improvements to QuICL.
Another issue for QuICL is seen in the runtime execution on the Chess, Pumsb, and Pumsb* data sets. These data sets contain items having very large object id sets (e.g., on Chess half the items exceed 1,500 object ids, on Pumsb some items have over 40,000 object ids). As a result many concepts have large extents. The time to perform intersections for these large sets is a considerable portion of execution time. Again, further investigation into CHARM's difference-based representation may provide insights in addressing this issue.
An enhancement to QuICL that addresses both the memory consumption and the large object id sets is to exploit the lattice property: if O_i ∈ extent of concept C_1 then ∀ C_2 | C_2 > C_1, O_i ∈ extent of C_2. Thus, an O_i ∈ O of concept C_2 does not need to be physically recorded in a concept if there exists a concept C_1 such that C_1 < C_2 and O_i ∈ O of concept C_1. Instead, a given object O_i need only be recorded in a concept at its minimal position (i.e., highest position in the inverted lattice). This forms a compressed lattice structure. The savings in memory come at the cost of a penalty in performance. Analysis of this tradeoff is a subject of future research.
QuICL was developed in 2009 (Smith). Since then, a more recent algorithm to generate an iceberg lattice for ARM has been developed (Szathmary et al., 2011). Szathmary et al. include an empirical evaluation of their algorithm, Snow-Touch, against CHARM-L using a number of the same data sets used to evaluate QuICL. A comparison of the results of Szathmary et al. against the results presented herein indicates that QuICL will exhibit leading performance. For example, on the Mushroom data set at 25% supp the reported execution time of Snow-Touch is around 40 seconds, whereas QuICL is under a tenth of a second (at 0% supp QuICL is around 3 seconds). Furthermore, the times of QuICL are better than the reported times for Snow-Touch on the Chess and T25I10D10k data sets.

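Definition 1:
Given a set of object identifiers (ids) $\mathcal{O}$, a set of items $\mathcal{I}$, and a relation $R$ such that $R \subseteq \mathcal{O} \times \mathcal{I}$, a formal concept is a pair of sets $O \subseteq \mathcal{O}$ and $I \subseteq \mathcal{I}$ iff: $O = \{o \in \mathcal{O} \mid \forall i \in I,\ oRi\}$ and $I = \{i \in \mathcal{I} \mid \forall o \in O,\ oRi\}$, where $oRi$ denotes object $o$ has item $i$ in relation $R$.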
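Definition 1 can be checked mechanically on a small context. The Python sketch below is only an illustration: the helper names are assumptions, and the relation is represented as a set of (object, item) pairs.

```python
def common_items(objs, relation, all_items):
    """Items shared by every object in `objs`."""
    return {i for i in all_items if all((o, i) in relation for o in objs)}


def common_objects(items, relation, all_objects):
    """Objects that have every item in `items`."""
    return {o for o in all_objects if all((o, i) in relation for i in items)}


def is_formal_concept(objs, items, relation, all_objects, all_items):
    """True when (objs, items) satisfies both conditions of Definition 1."""
    return (common_objects(items, relation, all_objects) == set(objs)
            and common_items(objs, relation, all_items) == set(items))


# Toy context: object 6 has item a2; object 7 has items a2 and c2.
R = {(6, "a2"), (7, "a2"), (7, "c2")}
OBJECTS, ITEMS = {6, 7}, {"a2", "c2"}
```

In this toy context ({6, 7}, {a2}) and ({7}, {a2, c2}) are formal concepts, whereas ({7}, {c2}) is not, because object 7 also has item a2.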

Figure 2. Examples of an iceberg concept lattice. Top: iceberg concept lattice at 60% support. Bottom: iceberg concept lattice at 40% support. These are derived from the lattice of Figure 1 by discarding concepts not meeting the minimum support threshold.

Figure 4. Progression of incremental object insertion into a concept lattice. Bold text indicates new concepts, inserted items or inserted objects, G a generator concept, m a modified concept, and dashed lines are removed links.
Ø
12. for each Ci ∈ G in ascending |I| order:
Cj ∈ Processed[|Intersect|] | Cj.I = Intersect:
21. CNew ← new Concept(Ci.O ∪ {Oi}, Intersect)
22.

Figure 5. Progression of incremental item insertion into a concept lattice. Bold text and weighted lines identify new elements. Dashed lines indicate removed links. R_v is the vertical representation of R of Figure 1.

Figure 7. Invalid edge generated between new concepts.
The intersection set is the result of intersecting the object set O passed to the INSERT function with the extent of a parent concept. A type SUP indicates O is a superset of the extent of the parent concept. Type ISET indicates that O is neither superset nor subset. INSERT proceeds to compare O with the extent of each parent of the base concept (lines 8 through 17). If O is equal to a parent concept's extent, then the item I_i, if supplied, is added to the concept's list of items (lines 9 and 10). The insertion is complete. For purposes discussed later, INSERT returns a reference to the modified concept (line 11).

Table 1. Determination of intersection outcome. C_Parent.IS is the cached intersection set of a parent concept, O is the object id set passed to INSERT, and C_Parent.O is the extent of a parent concept.

Table 2. Data set and lattice characteristics. |O| is the number of objects, |I| is the number of items, and |L| is the number of concepts. Average degree is the average number of concepts in the upper cover of each given concept. Maximum degree is the maximum number of concepts in the upper cover of any concept.
The Chess data set is a sequence of steps recorded for a game of chess. The Pumsb data set contains census data. The Pumsb* data set is the Pumsb data set with removal of items whose support is greater than or equal to 80%. The T10I4D100k, T25I10D10k, and T25I20D100k data sets are synthetic data sets generated by the IBM Synthetic Data Generator. It generates data sets that emulate retail transactions according to a set of input parameters (e.g., number of items, number of transactions, average transaction length). The Mushroom, Chess, Pumsb, Pumsb* and T10I4D100k data sets were downloaded from the University of Helsinki Frequent Item Set Mining Data set Repository. The T25I10D10k and T25I20D100k data sets were downloaded from the High Performance Computing Laboratory of The Institute of Information Science and Technologies, Pisa, Italy. The characteristics of these data sets together with characteristics of their generated concept lattices are given in