hlcm-0.1: Fast algorithm for mining closed frequent itemsets
HLCM
Description

Library for using the LCM algorithm to compute closed frequent patterns. Input must be a transaction database, either in text format (as a ByteString) or in [[Item]] format, where Item = Int.
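To make the expected result concrete, here is a minimal brute-force sketch of what "closed frequent itemsets" means. It is not the LCM algorithm and not code from this library: `support`, `closure`, and `closedFrequent` are hypothetical names introduced for illustration. The sketch assumes a minimum frequency of at least 1 and leaves out the empty itemset; whether the library reports the empty itemset is not stated in this documentation.

```haskell
import Data.List (nub, sort, subsequences)

type Item = Int
type Frequency = Int

-- Number of transactions that contain every item of the itemset.
support :: [[Item]] -> [Item] -> Int
support db itemset = length [t | t <- db, all (`elem` t) itemset]

-- Closure of an itemset: the items shared by all transactions that contain it.
-- An itemset is closed when it is equal to its own closure.
closure :: [[Item]] -> [Item] -> [Item]
closure db itemset = [i | i <- allItems, all (i `elem`) covering]
  where
    covering = [t | t <- db, all (`elem` t) itemset]
    allItems = sort (nub (concat db))

-- Brute force: enumerate every non-empty itemset and keep those that are
-- both frequent and closed. Exponential in the number of distinct items,
-- so this is for illustration on tiny inputs only.
closedFrequent :: [[Item]] -> Frequency -> [[Item]]
closedFrequent db minSup =
  [ s
  | s <- tail (subsequences allItems)
  , support db s >= minSup
  , closure db s == s
  ]
  where
    allItems = sort (nub (concat db))
```

For the database `[[1,2,3],[1,2],[2,3]]` and a threshold of 2, `[1]` is frequent but not closed, because every transaction containing 1 also contains 2; `[1,2]` is closed.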

Several benchmarking functions are also provided; they allow tuning the parallel strategy used and the depth cutoff.

Synopsis
type Frequency = Int
type Item = Int
runLCMstring :: ByteString -> Frequency -> [[Item]]
runLCMmatrix :: [[Item]] -> Frequency -> [[Item]]
benchLCM_parBuffer :: ByteString -> Frequency -> Int -> Int -> [[Item]]
benchLCM_parMap :: ByteString -> Frequency -> Int -> [[Item]]
Documentation
type Frequency = Int
type Item = Int
runLCMstring
:: ByteString  -- ^ The transaction database as one big string. Transactions are separated by newlines, items are separated by spaces.
-> Frequency   -- ^ Minimum frequency threshold for the frequent itemsets
-> [[Item]]    -- ^ Output: list of closed frequent itemsets
Takes the data as one long ByteString, parses it, and executes LCM to discover closed frequent itemsets.
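The text format described above (one transaction per line, items separated by spaces) can be sketched with a small parser. This is an illustration, not the library's parser: `parseTransactions` is a hypothetical name, it works on a plain `String` rather than a ByteString, and skipping blank lines is an assumption.

```haskell
type Item = Int

-- One transaction per line, items separated by spaces.
-- Blank lines are skipped (an assumption); `read` fails on non-numeric tokens.
parseTransactions :: String -> [[Item]]
parseTransactions = filter (not . null) . map (map read . words) . lines
```

The result has the `[[Item]]` shape that `runLCMmatrix` accepts according to its signature, so the same data can be supplied in either form.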
runLCMmatrix
:: [[Item]]   -- ^ The transaction database as a matrix of items (list of lists)
-> Frequency  -- ^ Minimum frequency threshold for the frequent itemsets
-> [[Item]]   -- ^ Output: list of closed frequent itemsets
Takes the data as a matrix of Items, parses it, and executes LCM to discover closed frequent itemsets.
benchLCM_parBuffer
:: ByteString  -- ^ The transaction database as one big string. Transactions are separated by newlines, items are separated by spaces.
-> Frequency   -- ^ Minimum frequency threshold for the frequent itemsets
-> Int         -- ^ Value for parBuffer
-> Int         -- ^ Depth for cutting parallelism
-> [[Item]]    -- ^ Output: list of closed frequent itemsets

For benchmarking. The parallel strategy is parBuffer, by Simon Marlow. This strategy does not have a space leak.
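As a rough illustration of the strategy named here, the sketch below applies `parBuffer` from Control.Parallel.Strategies (in the `parallel` package) to a list of independent computations. It does not show how the library uses the strategy internally: `expensive` and `evalWithBuffer` are hypothetical names, and treating the Int parameter as the buffer size handed to `parBuffer` is an assumption.

```haskell
import Control.Parallel.Strategies (parBuffer, rseq, withStrategy)

-- Hypothetical stand-in for an expensive, independent unit of work.
expensive :: Int -> Int
expensive n = sum [1 .. n * 1000]

-- Evaluate the list elements in parallel through a rolling buffer:
-- sparks are created only a bounded number of elements ahead of the
-- consumer instead of for the whole list at once.
evalWithBuffer :: Int -> [Int] -> [Int]
evalWithBuffer bufferSize =
  withStrategy (parBuffer bufferSize rseq) . map expensive
```

The strategy changes only how the list is evaluated, so `evalWithBuffer n xs` returns the same values as `map expensive xs`. Parallel speed-up additionally requires compiling with `-threaded` and running with `+RTS -N`.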

/Warning: outputs are unusable as is, because items are renamed internally and this function does not perform the reverse renaming. The original item names can be restored by copying the corresponding code from runLCMstring./

benchLCM_parMap
:: ByteString  -- ^ The transaction database as one big string. Transactions are separated by newlines, items are separated by spaces.
-> Frequency   -- ^ Minimum frequency threshold for the frequent itemsets
-> Int         -- ^ Depth for cutting parallelism
-> [[Item]]    -- ^ Output: list of closed frequent itemsets

For benchmarking. The parallel strategy is parMap from Control.Parallel.Strategies.
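The "depth for cutting parallelism" parameter can be pictured with the sketch below, which uses `parMap` only near the root of a recursive search and falls back to a sequential `map` at deeper levels. This is an assumed interpretation of the cutoff, not the library's code: `Tree`, `sumTree`, and `build` are hypothetical and stand in for LCM's recursive enumeration. It needs the `parallel` package.

```haskell
import Control.Parallel.Strategies (parMap, rseq)

-- Hypothetical search tree standing in for a recursive enumeration.
data Tree = Node Int [Tree]

-- Sum all node values. Children are explored in parallel only while the
-- current depth is smaller than the cutoff; deeper levels run sequentially,
-- which avoids creating many tiny sparks near the leaves.
sumTree :: Int -> Tree -> Int
sumTree cutoff = go 0
  where
    go :: Int -> Tree -> Int
    go depth (Node v children)
      | depth < cutoff = v + sum (parMap rseq (go (depth + 1)) children)
      | otherwise      = v + sum (map (go (depth + 1)) children)

-- A small complete tree, for trying the function out.
build :: Int -> Int -> Tree
build branching height
  | height <= 0 = Node 1 []
  | otherwise   = Node 1 (replicate branching (build branching (height - 1)))
```

A cutoff of 0 makes the traversal fully sequential. The result does not depend on the cutoff, only the amount of parallel work created does, which is what makes the parameter suitable for benchmarking.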

/Warning: outputs are unusable as is, because items are renamed internally and this function does not perform the reverse renaming. The original item names can be restored by copying the corresponding code from runLCMstring./
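The renaming mentioned in the warnings on both benchmarking functions can be sketched as a pair of lookup tables. The library's actual renaming scheme is not documented here, so everything below is an assumption for illustration: `renameTables` and `applyRenaming` are hypothetical names, and internal ids are simply assigned in sorted item order.

```haskell
import qualified Data.Map.Strict as M
import Data.List (nub, sort)

type Item = Int

-- Build a forward table (original item -> dense internal id) and its inverse.
-- Ids follow sorted item order here; the real scheme may differ.
renameTables :: [[Item]] -> (M.Map Item Item, M.Map Item Item)
renameTables db =
  (M.fromList (zip items [0 ..]), M.fromList (zip [0 ..] items))
  where
    items = sort (nub (concat db))

-- Apply a renaming table to every item of every itemset.
-- Items missing from the table are left unchanged.
applyRenaming :: M.Map Item Item -> [[Item]] -> [[Item]]
applyRenaming table = map (map (\i -> M.findWithDefault i i table))
```

Under this assumption, results expressed in internal ids are translated back by applying the inverse table, for example `applyRenaming back results` where `(_, back) = renameTables db`.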

Produced by Haddock version 2.6.1