ARM

Stephan Zavrel

The Association Rule Miner (ARM) is a shopping cart analyzer (association rule learner) that searches for recommendations of the form "users who bought/viewed item X also bought/viewed item Y", based on the Apriori algorithm of R. Agrawal and R. Srikant (1994). ARM tries to identify pairs of items <X,Y> that appear together significantly often in different user baskets, where a basket corresponds to the action history of a specific user.

Basic information

There are two major parameters for controlling association rule mining: support and confidence.

Support defines how often a set of items <X,Y> appears together in different user baskets, while confidence describes the likelihood that item Y occurs given that item X is present. More formally, confidence is defined as follows:

confidence=support(<X,Y>)/support(X)
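
The definition above can be illustrated with a small sketch. It is not ARM's actual implementation: the function names and the sample baskets are hypothetical, and it assumes support is counted as the number of baskets containing an item or an item pair.

```python
from collections import Counter
from itertools import combinations


def count_supports(baskets):
    """Count in how many baskets each item and each unordered item pair occurs."""
    item_support = Counter()
    pair_support = Counter()
    for basket in baskets:
        # De-duplicate so that every basket contributes at most once per item/pair.
        items = sorted(set(basket))
        item_support.update(items)
        pair_support.update(combinations(items, 2))
    return item_support, pair_support


def confidence(x, y, item_support, pair_support):
    """confidence(X -> Y) = support(<X,Y>) / support(X)."""
    pair = tuple(sorted((x, y)))  # pairs are stored in sorted order
    return pair_support[pair] / item_support[x]


# Hypothetical baskets, one per user.
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["bread", "milk"],
    ["butter"],
]
item_support, pair_support = count_supports(baskets)
```

Note that confidence is not symmetric: in this sample every basket with milk also contains bread, so confidence(milk -> bread) is 1.0, while confidence(bread -> milk) is only 2/3.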

How to tune

The parameters listed below help you control the association rule mining process. Go to the administration view and select the 'configure tenant' icon of the appropriate tenant in the 'Management' column of the tenant list. At the bottom of the page you will find three blocks named viewed together, good rated together and bought together, which provide the settings for the corresponding recommendations (e.g. otherUsersAlsoViewed). Each of these blocks provides the following parameters:

  • minimal support default: 2; range: >= 0
    Defines in at least how many baskets a pair of items <X,Y> must be found to be considered for creating association rules.

  • support percentage default: 0.0; range: [0..100]
    Defines in at least how many baskets a pair of items <X,Y> must be found, as a percentage of all baskets (users), to be considered for creating association rules.

For finding association rules, the higher of the two thresholds given by minimal support and support percentage is used. Both parameters are provided for more flexibility, automated adaptation and minimal quality assurance. As a guideline, the higher the support, the better (more significant) and rarer the rules.

  • confidence percentage default: 0.0; range: [0..100]
    Defines the likelihood (between 0% and 100%) that item Y occurs given that item X is present. As a guideline, the higher the confidence value, the better (more significant) and rarer the rules.

Please note: While the support value is only used as a filter criterion, the confidence value is also used as a sorting criterion when determining the best associated items for a given one. Currently only the best 50 association rules <X -> Y> are defined for an item X.
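
The filter-then-sort behaviour described in the note might look roughly like the following sketch. The function name, the tuple layout and the sample candidates are hypothetical, and how ARM breaks ties between rules of equal confidence is not stated in this page.

```python
def top_rules_for_item(candidates, min_support, min_confidence_pct, max_rules=50):
    """Select the best rules X -> Y for one item X.

    candidates: iterable of (y, pair_support, confidence_pct) tuples.
    Support acts only as a filter; confidence filters and also ranks.
    """
    kept = [
        (y, support, conf)
        for y, support, conf in candidates
        if support >= min_support and conf >= min_confidence_pct
    ]
    # Highest confidence first; Python's sort is stable, so ties keep input order.
    kept.sort(key=lambda rule: rule[2], reverse=True)
    return kept[:max_rules]


# Hypothetical candidate rules for a single item X.
candidates = [("a", 5, 40.0), ("b", 1, 90.0), ("c", 3, 75.0), ("d", 10, 10.0)]
best = top_rules_for_item(candidates, min_support=2, min_confidence_pct=20.0)
```

Here "b" is dropped despite its high confidence because its support is below the threshold, and "d" is dropped for low confidence; the remaining rules are ordered by confidence alone.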

Sample: Given 8,000 users, defining a minimal support of 4, a support percentage of 1% and a confidence of 100% means that ARM searches for all pairs of items <X,Y> which appear together in at least 80 (the maximum of 4 and 1% of 8,000) different user baskets and where X only appears together with Y (confidence = 100%).
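
The threshold arithmetic in this sample can be checked with a short sketch. The function name is hypothetical, and rounding the percentage threshold up to a whole number of baskets is an assumption; the page does not say how ARM rounds.

```python
import math


def effective_min_support(num_baskets, minimal_support=2, support_percentage=0.0):
    """Return the higher of the absolute and the percentage-based support threshold."""
    # Assumption: a fractional basket count is rounded up.
    percentage_threshold = math.ceil(num_baskets * support_percentage / 100.0)
    return max(minimal_support, percentage_threshold)


threshold = effective_min_support(8000, minimal_support=4, support_percentage=1.0)
```

With the values from the sample, the percentage-based threshold (80) dominates the absolute one (4). For a small tenant with 100 users the same settings would give a threshold of 4, because 1% of 100 is only 1.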

ARM Advanced Settings

Besides the two parameters support and confidence, additional metrics for measuring the importance of rules are provided. The following metrics are currently available:

  • lift
  • conviction
  • long tail correlation (ltc)

Lift and conviction are standard metrics describing the interestingness of a rule. ltc is an experimental, SAT-made metric describing how strongly a rule X -> Y leads from a common item X to a rare item Y. ltc is defined as

ltc(X -> Y) = support(X) x log(support(X) / support(Y))

and has the following characteristics:

  • ltc(X->Y) < 0: X is the rare item, pointing to a common item Y.
  • ltc(X->Y) = 0: X and Y are equally popular.
  • ltc(X->Y) > 0: A common item X is referring to a rare item Y.
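
As a rough illustration, the three metrics could be computed as follows. Lift and conviction use their common textbook definitions, which may differ in detail from what ARM implements. For ltc, the use of absolute basket counts and of the natural logarithm are assumptions, since the page specifies neither. All function names are hypothetical.

```python
import math


def lift(support_xy, support_x, support_y, num_baskets):
    """Textbook lift: confidence(X -> Y) divided by the relative support of Y."""
    conf = support_xy / support_x
    return conf / (support_y / num_baskets)


def conviction(support_xy, support_x, support_y, num_baskets):
    """Textbook conviction: (1 - relative support of Y) / (1 - confidence)."""
    conf = support_xy / support_x
    if conf == 1.0:
        return math.inf  # the rule is never violated
    return (1.0 - support_y / num_baskets) / (1.0 - conf)


def ltc(support_x, support_y):
    """ltc(X -> Y) = support(X) x log(support(X) / support(Y)); natural log assumed."""
    return support_x * math.log(support_x / support_y)


common_to_rare = ltc(100, 10)  # common X pointing to rare Y: positive
rare_to_common = ltc(10, 100)  # rare X pointing to common Y: negative
```

The sign of ltc does not depend on the logarithm base, so the three characteristics listed above hold under either assumption; only the magnitude changes.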

Further information

Rakesh Agrawal and Ramakrishnan Srikant. Fast algorithms for mining association rules in large databases. Proceedings of the 20th International Conference on Very Large Data Bases, VLDB, pages 487-499, Santiago, Chile, September 1994.

