Dictionaries are like lists, except that the indices (keys) can be any immutable value, such as a string, not just integers starting at 0.
The following two store the same values under the same indices:
>>> listoption = ['a','b','c']
>>> dictoption = {0:'a', 1: 'b', 2:'c'}
You would access them in the same way:
>>> listoption[1]
'b'
>>> dictoption[1]
'b'
You would update them in the same way:
>>> listoption[1] = 'd'
>>> dictoption[1] = 'd'
Of course, the point of a dictionary is that keys do not have to be integers:
>>> d = {'Gru':3, 'Margo':4}
>>> d['Gru']
3
This dictionary has strings as keys and integers as values. The values can be of any type as well:
>>> d2 = {'Gru': set( [123,456] ), 'Margo': set( [456] ) }
>>> d2['Gru']
set([123, 456])
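One restriction is worth knowing: keys must be hashable, which in practice means immutable values such as numbers, strings, and tuples. The sketch below is only an illustration; the dictionary name grid and its contents are made up, and print() is written with parentheses so the code also runs under Python 3.

```python
# Tuples are immutable, so they can serve as dictionary keys.
grid = {(0, 0): 'start', (2, 3): 'goal'}
print(grid[(2, 3)])

# Lists are mutable, so using one as a key raises TypeError.
try:
    grid[[1, 1]] = 'wall'
    list_key_ok = True
except TypeError:
    list_key_ok = False
print(list_key_ok)
```

Values have no such restriction, which is why a set can be stored as a value even though it could not be used as a key.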
Because keys are not necessarily consecutive integers, you cannot use range to print or iterate through the values in a dictionary. Instead, keys() returns a list of all the keys in the dictionary:
>>> d2.keys()
['Gru', 'Margo']
>>> for key in d2.keys():
...     print key, d2[key]
Gru set([123, 456])
Margo set([456])
These are really the most common operations on dictionaries: put a value, read a value, iterate through keys to get values.
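These three operations combine naturally in a counting loop. The sketch below is illustrative only (the list of names is made up): it reads and updates a value when the key is already present, puts a new value otherwise, and then iterates through the keys. Sorting the keys gives a predictable output order regardless of how the dictionary stores them.

```python
counts = {}
for name in ['Gru', 'Margo', 'Gru', 'Edith', 'Gru']:
    if name in counts:
        counts[name] += 1      # read the old value, put the new one
    else:
        counts[name] = 1       # first time this key is seen

# Iterate through the keys in sorted order to get the values.
for name in sorted(counts):
    print("%s %d" % (name, counts[name]))
```

The test name in counts checks the keys, not the values, and is needed because reading a key that is not in the dictionary raises a KeyError.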
The values() method converts a dictionary to a list by returning only the values and discarding the keys. For example:
>>> d2.values()
[set([123, 456]), set([456])]
Given the following dictionary of hobbies for people:
hobby = {'Gru':set(['Hiking','Cooking']), 'Edith':set(['Hiking','Board Games'])}
write code that creates a new dictionary that lists the people for each hobby:
{'Hiking': set(['Gru','Edith']), 'Cooking':set(['Gru']), 'Board Games':set(['Edith'])}
Write a program that uses a dictionary that associates integers (the keys) with sets of strings (the values) to find the number of movies in each year of the IMDB. Start from:
imdb_file = raw_input("Enter the name of the IMDB file ==> ").strip()
years_and_movies = {}
for line in open(imdb_file):
    words = line.strip().split('|')
    movie_name = words[1].strip()
    year = int(words[2])
Write additional code that uses the years_and_movies dictionary to find the year that has the most movies.
Exercise: what is the output of the following?
>>> d = dict()
>>> d[15] = 'hi'
>>> L = []
>>> L.append(d)
>>> d[20] = 'bye'
>>> L.append(d.copy())
>>> d[15] = 'hello'
>>> del d[20]
>>> L
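The result may surprise you, but it reflects the difference between making an alias to an object and making a copy of an object.
Assignment between lists, between sets, and between dictionaries creates an alias, not a copy! Appending an object to a list likewise stores an alias. Only an explicit call such as copy() produces a new object.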
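The distinction can be checked directly. The sketch below uses made-up data and compares three cases: plain assignment (an alias), the dictionary's copy() method (a shallow copy, whose values are still shared), and copy.deepcopy from the standard copy module (fully independent).

```python
import copy

original = {'Gru': set([123, 456])}

alias = original                  # same dictionary object, second name
shallow = original.copy()         # new dictionary, but the set inside is shared
deep = copy.deepcopy(original)    # new dictionary and a new set

original['Margo'] = set([456])    # adds a key to the original dictionary
original['Gru'].add(789)          # changes the set stored under 'Gru'

print(alias is original)          # True
print(sorted(alias))              # ['Gru', 'Margo']
print(sorted(shallow))            # ['Gru']
print(sorted(shallow['Gru']))     # [123, 456, 789]
print(sorted(deep['Gru']))        # [123, 456]
```

The shallow copy does not see the new key, but it does see the change to the shared set; only the deep copy is unaffected by both changes.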
Many APIs (Application Programming Interfaces) return values as JSON strings, which are easily loaded into Python objects.
We will demo accessing Twitter through an API and processing the returned JSON object.
Accessing Twitter requires two modules:
Querying Twitter:
A pair is returned, containing a dictionary of information about the process of generating the query result, and a string containing the query result itself.
simplejson is used to parse this query string into a dictionary:
Two entries:
- ``search_metadata``, which has a dictionary of attributes
Each tweet in the list is a dictionary; one of the entries in this dictionary is the actual text.
Overall, this is a complicated hierarchy of lists and dictionaries, with each dictionary storing attribute/value pairs.
Once we understand the structure, we can write code to extract the information we want.
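As a rough sketch of this kind of extraction, the example below parses a small, made-up JSON string rather than a real query result. Several things here are assumptions and not taken from the notes: the standard json module stands in for simplejson (both provide loads), the key name statuses for the list of tweets is assumed, and all field values are invented.

```python
import json

# Made-up response text, shaped like the hierarchy described above.
response_text = '''
{
  "search_metadata": {"count": 2, "query": "minions"},
  "statuses": [
    {"text": "Banana!", "user": {"screen_name": "kevin"}},
    {"text": "Bello!", "user": {"screen_name": "stuart"}}
  ]
}
'''

# A JSON object becomes a dictionary; a JSON array becomes a list.
result = json.loads(response_text)

# Walk the list of tweets and pull the text entry out of each dictionary.
tweet_texts = [tweet['text'] for tweet in result['statuses']]

print(result['search_metadata']['count'])
for text in tweet_texts:
    print(text)
```

Printing sorted(result.keys()) and then the keys of one nested entry at a time is a simple way to explore an unfamiliar result before writing the extraction code.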
Create a dictionary to store the favorite colors of the following individuals
Then add some others of your own. Now, write code to change Fei’s preference to green and to remove Sandy’s preference from the dictionary.
Using the dictionary from the first problem, write code to find which color is most commonly preferred. Use a second dictionary, one that associates strings (representing the colors) with the counts. Output the most common color. If there are ties, output all tied colors.
Complete the fast, list-based solution to the movie counting problem, which relies on sorting, as outlined at the start of the lecture notes.
Use a dictionary to determine which last names are most common in the IMDB data we have provided. Count individual people, not the movies they appear in. For example, 'Hanks, Tom' counts as one instance of the name 'Hanks' despite the fact that he is in many movies. Assume that the last name ends at the first ',' in the actual name. Start this problem by thinking about what the dictionary keys and values should be.
Which two individuals have the most movies in common? To solve this, you will need to start from the dictionary that associates each individual with the set of movies s/he is involved in. Then you will need nested for loops, one inside the other, to compare every pair of individuals.