
Iteration 2: Count Word Sets of Length 2

The starting point for Step 2 is the set of frequent words ${\cal F}_1$ computed in Step 1. We form a new candidate set ${\cal C}_2$ by taking all distinct pairs of members of ${\cal F}_1$. For our example,
we get ${\cal C}_2 = \{12, 13, 14, 15, 23, 24, 25, 34, 35, 45\}$ (note: 12 stands for the word set $\{1,2\}$).
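The pair-forming step can be sketched in a few lines of Python. This is only an illustration: the list `frequent_1` is inferred from the pairs listed in ${\cal C}_2$ (it is not restated in this section), and the variable names are hypothetical.

```python
from itertools import combinations

# Assumption: F1 = {1, 2, 3, 4, 5}, inferred from the pairs listed in C2.
frequent_1 = [1, 2, 3, 4, 5]

# Every unordered pair of distinct frequent words becomes a candidate.
candidates_2 = [frozenset(pair) for pair in combinations(frequent_1, 2)]

# Render each candidate in the shorthand of the text, where 12 means {1, 2}.
labels = ["".join(str(w) for w in sorted(c)) for c in candidates_2]
print(labels)
# ['12', '13', '14', '15', '23', '24', '25', '34', '35', '45']
```

With five frequent words there are 5*4/2 = 10 candidate pairs, which matches the size of ${\cal C}_2$.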

After forming the candidate set, we count the frequency of each candidate in the database:

   Candidates:     12      13      14       15   
   --------------------------------------------
   Count  :        6       4       3        4

   Candidates:     23      24      25
   ----------------------------------
   Count  :        4       3       4 
  
   Candidates:     34      35      45
   ----------------------------------
   Count  :        3       2       1
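One possible way to obtain such counts is a single pass over the database, sketched below. The function name `count_pairs` and the four-document `toy_database` are hypothetical; the toy data is not the database used in the text and does not reproduce the counts above. Because ${\cal C}_2$ contains every pair of frequent words, the sketch enumerates the frequent pairs present in each document instead of looping over the candidates.

```python
from collections import Counter
from itertools import combinations

def count_pairs(database, frequent_words):
    """Count, for each pair of frequent words, the documents containing both."""
    frequent = set(frequent_words)
    counts = Counter()
    for doc in database:
        # Keep only the frequent words, sorted so each pair has one key.
        present = sorted(set(doc) & frequent)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts

# Hypothetical toy database: each document is a set of word ids.
toy_database = [
    {1, 2, 3},
    {1, 2},
    {2, 3, 4},
    {1, 2, 4},
]

pair_counts = count_pairs(toy_database, [1, 2, 3, 4])
print(pair_counts[(1, 2)])  # 3
```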

We now select those candidates that are frequent (count at least 3) to get ${\cal F}_2 = \{12, 13, 14, 15, 23, 24, 25, 34 \}$.
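The selection step can be checked with a short sketch. The counts are copied from the tables above (the fourth entry of the first table is candidate 15); the names `MIN_COUNT`, `pair_counts`, and `frequent_2` are hypothetical.

```python
# Minimum count for a candidate to be considered frequent, as stated in the text.
MIN_COUNT = 3

# Counts for the ten candidates in C2, copied from the tables.
pair_counts = {
    "12": 6, "13": 4, "14": 3, "15": 4,
    "23": 4, "24": 3, "25": 4,
    "34": 3, "35": 2, "45": 1,
}

# Keep only the candidates whose count reaches the threshold.
frequent_2 = [pair for pair, count in pair_counts.items() if count >= MIN_COUNT]
print(frequent_2)
# ['12', '13', '14', '15', '23', '24', '25', '34']
```

Candidates 35 and 45 fall below the threshold, so they are the only pairs dropped.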



Mohammed Zaki
10/30/1998