RPI_1  -- computer_vision, computer_systems, semiconductors
RPI_2  -- computer_vision, computer_systems, programming_languages, computation_theory
MIT    -- computer_vision, computer_systems, programming_languages, computation_theory, semiconductors
NWU    -- computer_vision, computer_systems, programming_languages, semiconductors
UofR_1 -- computer_vision, computer_systems, programming_languages, computation_theory
UofR_2 -- computer_vision, computer_systems, semiconductors
For simplicity, let's assign each web page a unique number as follows:
RPI_1  = 1
RPI_2  = 2
MIT    = 3
NWU    = 4
UofR_1 = 5
UofR_2 = 6
Let's also assign each key word a unique number as follows:
computer_vision       = 1
computer_systems      = 2
programming_languages = 3
computation_theory    = 4
semiconductors        = 5
The new database now looks like:
WebPage  Num_KeyWords  KeyWords
1        3             1 2 5
2        4             1 2 3 4
3        5             1 2 3 4 5
4        4             1 2 3 5
5        4             1 2 3 4
6        3             1 2 5
This example web database has 6 department web sites, and there are 5 different key words that appear in the pages.
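
The renumbering above can be reproduced mechanically. The following is only a minimal sketch in Python, not part of the original database tooling; the names `PAGES`, `KEYWORDS`, and `encode` are hypothetical and chosen for illustration. It assumes pages are numbered in the order they are listed and key words in the order given in the key word table.

```python
# Page name -> key words, copied from the listing above (insertion order preserved).
PAGES = {
    "RPI_1": ["computer_vision", "computer_systems", "semiconductors"],
    "RPI_2": ["computer_vision", "computer_systems", "programming_languages",
              "computation_theory"],
    "MIT": ["computer_vision", "computer_systems", "programming_languages",
            "computation_theory", "semiconductors"],
    "NWU": ["computer_vision", "computer_systems", "programming_languages",
            "semiconductors"],
    "UofR_1": ["computer_vision", "computer_systems", "programming_languages",
               "computation_theory"],
    "UofR_2": ["computer_vision", "computer_systems", "semiconductors"],
}

# Key words in the order of the numbering table (1..5).
KEYWORDS = ["computer_vision", "computer_systems", "programming_languages",
            "computation_theory", "semiconductors"]


def encode(pages, keywords):
    """Return rows of (page number, key word count, sorted key word numbers)."""
    keyword_ids = {name: i for i, name in enumerate(keywords, start=1)}
    rows = []
    for page_id, words in enumerate(pages.values(), start=1):
        ids = sorted(keyword_ids[w] for w in words)
        rows.append((page_id, len(ids), ids))
    return rows


rows = encode(PAGES, KEYWORDS)
for page_id, count, ids in rows:
    print(page_id, count, " ".join(map(str, ids)))
```

Each printed line holds the same three fields as a row of the numeric table: the page number, the number of key words, and the key word numbers themselves.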