Market Basket Analysis

Affinity analysis in data mining is the search for stable groups of events that occur together in a given subject area. It is based on finding association rules, which describe patterns in the relationships between events.

In retail, this technique is used to perform market basket analysis: identifying stable sets of products that supermarket customers buy in a single purchase (for example, "potatoes, onions and salad", "pasta and ketchup", "beer and chips", "tea and baked goods", etc.). The results help optimize the product range and store layout to "encourage" buyers through cross-selling or up-selling.

The method is also successfully used in other areas, e.g., for studying web page visits, analysing and predicting telecommunication equipment failures, in medicine, etc.

This example demonstrates market basket analysis for a retail chain that sells household chemicals.


Algorithm Description

1. Data Import

a) Input data

The dataset contains information from 5,000 receipts. A receipt with a list of purchased items is considered a transaction, and each item in the receipt is an element of the transaction.

Name    Caption
id      ID
item    Item
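As a minimal sketch of how rows in this two-column layout form transactions, the Python snippet below groups items by receipt id. The sample rows and the function name `group_transactions` are made up for illustration and are not part of the demo dataset or of Megaladata.

```python
from collections import defaultdict

# Illustrative rows in the (id, item) layout described above; the values are made up.
rows = [
    (1, "Laundry detergent"),
    (1, "Fabric conditioner"),
    (2, "Paper towels"),
    (2, "Air freshener"),
    (2, "Laundry detergent"),
    (3, "Paper towels"),
]


def group_transactions(rows):
    """Collect the items of each receipt id into one set (one transaction)."""
    transactions = defaultdict(set)
    for receipt_id, item in rows:
        transactions[receipt_id].add(item)
    return dict(transactions)


transactions = group_transactions(rows)
for receipt_id, items in transactions.items():
    print(receipt_id, sorted(items))
```

Using a set per receipt means an item listed twice on the same receipt is counted once, which is the usual convention for association rule mining.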

2. Discovering association rules

To search for the association rules, we use the FP-growth algorithm.
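For readers unfamiliar with FP-growth, the following compact Python sketch shows the general idea: build a prefix tree of transactions ordered by item frequency, then mine it recursively through conditional pattern bases. It is a textbook-style illustration under our own assumptions (absolute minimum count, made-up sample baskets, hypothetical names such as `FPNode` and `fp_growth`), not the implementation used inside the Association Rules node.

```python
from collections import defaultdict


class FPNode:
    """One node of the FP-tree: an item, its count, and links to parent/children."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_tree(weighted_transactions, min_count):
    """Build an FP-tree; return (header table, item frequencies)."""
    freq = defaultdict(int)
    for items, count in weighted_transactions:
        for item in set(items):
            freq[item] += count
    freq = {item: c for item, c in freq.items() if c >= min_count}

    root = FPNode(None, None)
    header = defaultdict(list)  # item -> all tree nodes holding that item
    for items, count in weighted_transactions:
        # Keep frequent items only, most frequent first (ties broken by name).
        ordered = sorted((i for i in set(items) if i in freq),
                         key=lambda i: (-freq[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, freq


def fp_growth(weighted_transactions, min_count, suffix=frozenset()):
    """Return {itemset: count} for all itemsets occurring at least min_count times."""
    header, freq = build_tree(weighted_transactions, min_count)
    result = {}
    for item, nodes in header.items():
        itemset = suffix | {item}
        result[itemset] = freq[item]
        # Conditional pattern base: the prefix path above every node of this item.
        conditional = []
        for node in nodes:
            path = []
            parent = node.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        result.update(fp_growth(conditional, min_count, itemset))
    return result


# Made-up baskets for illustration; each transaction has weight 1.
baskets = [
    {"soap", "sponge"},
    {"soap", "sponge", "bleach"},
    {"soap", "bleach"},
    {"sponge"},
]
frequent = fp_growth([(b, 1) for b in baskets], min_count=2)
for itemset, count in sorted(frequent.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), count)
```

The tree is scanned only through the header table, so candidate itemsets are never enumerated explicitly; this is the main difference from Apriori-style counting.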

The loaded transactions are fed into the Input Data Source port of the Association Rules node.

a) Configuring the Association Rules node

We set up the Association Rules node as follows:

  • Field id: Usage type Transaction
  • Field item: Usage type Item
  • Checkbox Exclude items with support greater than maximum: Marked
  • Maximum support, %: 20
  • Checkbox Exclude single sets: Marked
  • Minimum rule confidence, %: 25
  • Maximum number of consequences: 2

Whenever you change any settings, retrain the model.
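To make the effect of these settings concrete, here is a small Python sketch of how such thresholds could act as filters. It reflects our reading of the option names (for instance, that the maximum support applies to itemsets and that a support exactly equal to the maximum is kept); the constants and the functions `keep_itemset` and `keep_rule` are hypothetical and do not come from Megaladata.

```python
# Hypothetical constants mirroring the settings above, as fractions rather than percent.
MAX_SUPPORT = 0.20      # "Maximum support, %: 20"
MIN_CONFIDENCE = 0.25   # "Minimum rule confidence, %: 25"
MAX_CONSEQUENTS = 2     # "Maximum number of consequences: 2"


def keep_itemset(itemset, support):
    """Assumed filter for frequent sets: drop single-item sets and overly common sets."""
    if len(itemset) < 2:        # "Exclude single sets"
        return False
    if support > MAX_SUPPORT:   # "Exclude items with support greater than maximum"
        return False
    return True


def keep_rule(consequent, confidence):
    """Assumed filter for rules: enough confidence, not too many items on the right side."""
    return confidence >= MIN_CONFIDENCE and len(consequent) <= MAX_CONSEQUENTS


print(keep_itemset({"Paper towels", "Air freshener"}, 0.12))   # kept
print(keep_itemset({"Laundry detergent"}, 0.12))               # single set, dropped
print(keep_rule({"Air freshener"}, 0.30))                      # kept
```

A maximum support threshold is useful because items that appear in a large share of receipts tend to produce rules that are true but uninformative.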

Interpretation of Results

a) Output port Frequent sets

This port contains the itemsets that are most commonly found in transactions (frequent sets).

b) Output port Association rules

The port contains the identified association rules and their indicators: support, confidence, and lift.
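The three indicators follow the standard definitions: support is the share of transactions containing every item of the rule, confidence is that share relative to transactions containing the antecedent, and lift is confidence divided by the share of transactions containing the consequent. The sketch below computes them in Python for a handful of made-up baskets; the function name `rule_metrics` and the data are illustrative only.

```python
def rule_metrics(antecedent, consequent, transactions):
    """Return (support, confidence, lift) of the rule antecedent -> consequent.

    Assumes the antecedent and the consequent each occur in at least one transaction.
    """
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    both = antecedent | consequent
    n = len(transactions)
    count_antecedent = sum(1 for t in transactions if antecedent <= t)
    count_consequent = sum(1 for t in transactions if consequent <= t)
    count_both = sum(1 for t in transactions if both <= t)

    support = count_both / n
    confidence = count_both / count_antecedent
    lift = confidence / (count_consequent / n)
    return support, confidence, lift


# Made-up baskets for illustration.
baskets = [
    {"Laundry detergent", "Fabric conditioner"},
    {"Laundry detergent", "Fabric conditioner", "Paper towels"},
    {"Laundry detergent"},
    {"Paper towels"},
    {"Paper towels", "Air freshener"},
]

support, confidence, lift = rule_metrics(
    {"Fabric conditioner"}, {"Laundry detergent"}, baskets
)
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 means the consequent appears together with the antecedent more often than it would if the two were independent; a lift near 1 suggests the rule carries little information.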

c) Output port Apply rules

This port contains the transactions from the input dataset to which the identified rules apply.
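One way to picture this output is the Python sketch below. It assumes that a rule applies to a transaction when the transaction contains every item of the rule's antecedent; this is our interpretation, and the function `apply_rules`, the rules, and the receipts are made up for illustration.

```python
# Made-up rules as (antecedent, consequent) pairs.
rules = [
    (frozenset({"Fabric conditioner"}), frozenset({"Laundry detergent"})),
    (frozenset({"Paper towels"}), frozenset({"Air freshener"})),
]

# Made-up receipts keyed by transaction id.
receipts = {
    1: {"Fabric conditioner", "Sponge"},
    2: {"Paper towels", "Air freshener"},
    3: {"Sponge"},
}


def apply_rules(receipts, rules):
    """List (transaction id, antecedent, consequent) for every rule whose
    antecedent is fully contained in the transaction (assumed matching criterion)."""
    matches = []
    for transaction_id, items in receipts.items():
        for antecedent, consequent in rules:
            if antecedent <= items:
                matches.append((transaction_id, antecedent, consequent))
    return matches


matches = apply_rules(receipts, rules)
for transaction_id, antecedent, consequent in matches:
    print(transaction_id, sorted(antecedent), "->", sorted(consequent))
```

In this example, receipt 1 matches the first rule even though the consequent is absent from it, which is the case of interest for cross-selling.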

To present the results, we use the Table visualizer, configured separately for each port.

The Association rules table displays the sets of association rules and their indicators — support, confidence, and lift. This is the information that describes customer behavior. In the list obtained, we can see trivial patterns — for example, Fabric conditioner → Laundry detergent — as well as non-obvious ones (e.g., Paper towels → Air freshener).

Analysts should review each discovered rule and select those that are truly valuable.


Download and open the file in Megaladata. If necessary, you can install the free Megaladata Community Edition.

Download example
