Find all words containing three consecutive pairs of double letters in a file of all English words located at: http://thinkpython.com/code/words.txt
Modules used: urllib
Author: Sibel Adali <adalis@rpi.edu>
Returns: All words matching condition and the count of found words
Pseudo Code:
open the file from the web with all the words in English
for each word in the file:
    for all positions l in the word
        if letters at positions (l and l+1) and (l+2 and l+3) and
           (l+4 and l+5) are the same then
            output word and increment the count
Code:
""" Find all words containing three consecutive pairs of double letters
in a file of all English words located at:
http://thinkpython.com/code/words.txt
**Modules used:** :py:mod:`urllib`
**Author**: Sibel Adali <adalis@rpi.edu>
**Returns:** All words matching condition and the count of found words
**Pseudo Code**::
    open the file from the web with all the words in English
    for each word in the file:
        for all positions l in the word
            if letters at positions (l and l+1) and (l+2 and l+3) and
               (l+4 and l+5) are the same then
                output word and increment the count
"""
__version__ = '1'
import urllib
def three_double(word):
    """ Returns True if the word contains three consecutive pairs of
    double letters and False otherwise.
    """
    for l in range(len(word)-5):
        if word[l] == word[l+1] and \
           word[l+2]==word[l+3] and \
           word[l+4]==word[l+5]:
            return True
    return False
# Comments that fit in a single line can be put in this format.
# Anything after a single pound sign is ignored.
# Main body of the program starts here
word_url = 'http://thinkpython.com/code/words.txt'
word_file = urllib.urlopen(word_url)
count = 0
for word in word_file:
    word = word.strip()   # strip() with no arguments also removes the newline
    if three_double(word):
        print word
        count = count + 1
if count == 0:
    print 'No words found'
else:
    print count, 'words are found'
Returns the surface area of a rectangular solid by adding the areas of its six rectangular faces (three pairs of equal faces). This module is used to illustrate the use of functions and how functions call other functions.
Example use of these functions:
>>> print area_solid(1,1,1)
>>> print area_solid(2,3,4)
>>> a = area_solid(1,1,1)
>>> print "Area is", a
Functions:
area_rectangle(length, width): Returns the area of a rectangle given length and width.
area_solid(length, width, height): Returns the surface area of a rectangular solid given length, width and height by finding and adding the areas of its six rectangular surfaces.
Code:
"""
Returns the surface area of a rectangular solid by adding the areas of its
six rectangular faces (three pairs of equal faces). This module is used to
illustrate the use of functions and how functions call other functions.
Example use of these functions::
>>> print area_solid(1,1,1)
>>> print area_solid(2,3,4)
>>> a = area_solid(1,1,1)
>>> print "Area is", a
"""
def area_rectangle(length, width):
    """Returns the area of a rectangle given length and width. """
    return length * width

def area_solid(length, width, height):
    """Returns the surface area of a rectangular solid given length, width
    and height by finding and adding the areas of its six rectangular
    surfaces.
    """
    surface_area = 2*area_rectangle(length,width)     ## top and bottom
    surface_area += 2*area_rectangle(length,height)   ## front and back
    surface_area += 2*area_rectangle(width,height)    ## left and right
    return surface_area
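As a quick check of the surface-area arithmetic, here is a minimal sketch, written for Python 3 rather than the Python 2 of the listing above. It compares the sum over three pairs of faces with the closed form 2(lw + lh + wh). The helper `closed_form` is a hypothetical name introduced only for this illustration.

```python
def area_rectangle(length, width):
    """Area of a single rectangular face."""
    return length * width


def area_solid(length, width, height):
    """Surface area: each of the three distinct faces appears twice."""
    return (2 * area_rectangle(length, width)
            + 2 * area_rectangle(length, height)
            + 2 * area_rectangle(width, height))


def closed_form(length, width, height):
    # Hypothetical cross-check, not part of the original module.
    return 2 * (length * width + length * height + width * height)


print(area_solid(1, 1, 1))   # 6
print(area_solid(2, 3, 4))   # 52
```

Using three different side lengths, as in the second call, is what exposes a face that has been counted twice or left out.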
Prints the area and volume of a cylinder. This module is used to illustrate the use of functions and how some functions return values and how others return nothing.
Note that function area_and_volume does not return a value. As such it returns the special value None.
Example use of these functions:
>>> print area_circle(5)
>>> vc = volume_cylinder(5,10)
>>> print "Volume of cylinder is", vc
>>> area_and_volume(5,10)
Functions:
area_cylinder(radius, height): Returns the surface area of a cylinder given radius and height.
volume_cylinder(radius, height): Returns the volume of a cylinder given radius and height.
area_and_volume(radius, height): Prints the area and volume of a cylinder given radius and height. The function returns nothing, i.e. None.
Code:
"""
Prints the area and volume of a cylinder. This module is used to illustrate
the use of functions and how some functions return values and how others
return nothing.
Note that function ``area_and_volume`` does not return a value. As such
it returns the special value ``None``.
Example use of these functions::
>>> print area_circle(5)
>>> vc = volume_cylinder(5,10)
>>> print "Volume of cylinder is", vc
>>> area_and_volume(5,10)
"""
def area_circle(radius):
    """ Returns the area of a circle given the radius. """
    pi = 3.14159
    return pi * radius ** 2

def volume_cylinder(radius,height):
    """ Returns the volume of a cylinder given radius and height. """
    area = area_circle(radius)
    return area * height

def area_cylinder(radius,height):
    """ Returns the surface area of a cylinder given radius and height. """
    pi = 3.14159
    circle_area = area_circle(radius)
    height_area = 2 * radius * pi * height
    return 2*circle_area + height_area

def area_and_volume(radius, height):
    """
    Prints the area and volume of a cylinder given radius and height.
    The function returns nothing, i.e. None.
    """
    print "For a cylinder with radius", radius, "and height", height
    print "The surface area is ", area_cylinder(radius,height)
    print "The volume is ", volume_cylinder(radius, height)
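The difference between returning a value and only printing one can be seen by capturing the result of each kind of function. The following is a minimal Python 3 sketch; it uses `math.pi` instead of the hard-coded constant in the listing, and `print_volume` is a hypothetical name used only here.

```python
import math


def volume_cylinder(radius, height):
    """Returns a value that the caller can store or reuse."""
    return math.pi * radius ** 2 * height


def print_volume(radius, height):
    """Only prints; there is no return statement, so the result is None."""
    print("The volume is", volume_cylinder(radius, height))


vc = volume_cylinder(5, 10)
result = print_volume(5, 10)
print(result)   # None
```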
This program finds all values greater than average, then returns the number of such values, and the median of these values.
Pseudo code:
List of co2_levels is given
Find average value in the list
Find the list of values greater than average,
Put them in a list
Return the length of the list (number of such values)
Find the median and print
Functions:
median(L): Given a list of values, returns the median value.
Code:
"""
This program finds all values greater than average,
then returns the number of such values, and the median of these values.
Pseudo code::
List of co2_levels is given
Find average value in the list
Find the list of values greater than average,
Put them in a list
Return the length of the list (number of such values)
Find the median and print
"""
def median(L):
    """Given a list of values, returns the median value. """
    vals = list(L)   ## make a copy to make sure the original list is not changed
    vals.sort()
    numvalues = len(vals)
    medianindex = numvalues//2   ## integer division, so the result is a valid index
    if numvalues%2 == 1:   ## Odd number of values
        return vals[medianindex]
    else:   ## Even number of values, return the average of two middle values
        return sum(vals[medianindex-1:medianindex+1])/2.0

######## Main program starts here.
co2_levels = [ 320.03, 322.16, 328.07, 333.91, 341.47,\
               348.92, 357.29, 363.77, 371.51, 382.47, 392.95 ]
avg = float(sum(co2_levels))/len(co2_levels)
gtavg = []
for value in co2_levels:
    if value > avg:
        gtavg.append(value)
print "average is", avg
print "Values greater than average", gtavg
print "%d values greater than average" %(len(gtavg))
medianval = median(gtavg)
print "Median value", medianval
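For comparison, the same steps can be sketched in Python 3 with the standard library's `statistics.median`, which applies the same rule for odd and even counts as the hand-written function. This is an alternative formulation, not the module's own code; the variable name `above` is an assumption.

```python
import statistics

co2_levels = [320.03, 322.16, 328.07, 333.91, 341.47,
              348.92, 357.29, 363.77, 371.51, 382.47, 392.95]

avg = sum(co2_levels) / len(co2_levels)

# Keep only the values strictly greater than the average.
above = [value for value in co2_levels if value > avg]

print("average is", avg)
print(len(above), "values greater than average")
print("Median value", statistics.median(above))
```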
This program illustrates the use of IF statements with the age, bmi calculation from the course notes. It shows three different ways you can construct the four conditions. The first one is hard to read. The last one is the most concise.
Code:
""" This program illustrates the use of IF statements with the
age, bmi calculation from the course notes. It shows three different
ways you can construct the four conditions. The first one is hard
to read. The last one is the most concise. """
def print_bmi(age, bmi):
    if age <= 45 and bmi < 22:
        print "low"
    elif age > 45 and bmi < 22:
        print "medium"
    elif age <= 45:
        print "medium"
    else:
        print "high"

def print_bmi2(age, bmi):
    if bmi < 22:
        if age <= 45:
            print "low"   ##bmi<22 and age <= 45
        else:
            print "medium"   ##bmi<22 and age > 45
    else:   ## bmi >= 22
        if age <= 45:
            print "medium"   ##bmi >= 22 and age <= 45
        else:
            print "high"

def print_bmi3(age, bmi):
    if bmi <22 and age <= 45:
        print "low"
    elif bmi >= 22 and age > 45:
        print "high"
    else:
        print "medium"

test_values = [[45,21], [50, 21], [45, 22], [50, 22]]
for val in test_values:
    print val
    print "Test 1:",
    print_bmi(val[0], val[1])
    print "Test 2:",
    print_bmi2(val[0], val[1])
    print "Test 3:",
    print_bmi3(val[0], val[1])
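One way to confirm that the three constructions are equivalent is to have each return its label instead of printing it, and then compare the results on the four boundary cases. The sketch below is written for Python 3; the names `risk_chained`, `risk_nested` and `risk_concise` are hypothetical stand-ins for the three printing functions.

```python
def risk_chained(age, bmi):
    if age <= 45 and bmi < 22:
        return "low"
    elif age > 45 and bmi < 22:
        return "medium"
    elif age <= 45:
        return "medium"
    else:
        return "high"


def risk_nested(age, bmi):
    if bmi < 22:
        return "low" if age <= 45 else "medium"
    return "medium" if age <= 45 else "high"


def risk_concise(age, bmi):
    if bmi < 22 and age <= 45:
        return "low"
    elif bmi >= 22 and age > 45:
        return "high"
    return "medium"


cases = [(45, 21), (50, 21), (45, 22), (50, 22)]
for age, bmi in cases:
    labels = {f(age, bmi) for f in (risk_chained, risk_nested, risk_concise)}
    print(age, bmi, sorted(labels))   # one label per case if all three agree
```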
This example illustrates the use of lists that contain other lists. It takes a list of gifts and the senders, and constructs “personalized” greeting cards for each gift giver.
Code:
"""This example illustrates the use of lists that contain other lists.
It takes a list of gifts and the senders, and constructs "personalized"
greeting cards for each gift giver.
"""
def print_thankyou(gifts):
    for gift in gifts:
        print "Dear %s," %gift[0]
        print "Thank you for the wonderful %s." %gift[1],
        print "It is very nice of you to think of me."
        print "The classes are getting harder, but I am keeping up with it."
        print "Hope to see you for the holiday."
        print "With love,"
        print "John Cleese"
        print

gifts = [["Uncle John", "iphone"], \
         ["Aunt Marie", "snowboard"], \
         ["Grandma", "spaceship"]]
print_thankyou(gifts)
This program checks whether two rectangles, whose coordinates are given in lists, overlap. Instead of testing for overlap directly, it first checks the cases in which the rectangles cannot overlap and returns False if one of them holds. If none of those checks succeeds, it returns True.
Functions:
rectangle_overlap(A, B): Assumes A, B are lists of the form [x1,y1,x2,y2]. Returns False if the rectangles do not overlap and True otherwise.
A rectangle A=[x1,y1,x2,y2] has its lower left corner at (x1,y1) and upper right corner at (x2,y2).
Code:
"""This program checks whether two rectangles with coordinates given
in lists overlap or not. Instead of checking for overlap, it checks
when they do not overlap first and returns false if this is so. If those
checks fail, then it returns true.
"""
def rectangle_overlap(A, B):
    """ Assumes A, B are lists of the form
    [x1,y1,x2,y2] and checks if they are not
    overlapping, and returns false if so. Otherwise
    it returns true.

    A rectangle A=[x1,y1,x2,y2] has its lower left corner
    at (x1,y1) and upper right corner at (x2,y2).
    """
    if (B[2] < A[0]) or (B[0]>A[2]):   ##B.x2 < A.x1 or B.x1 > A.x2
        return False
    elif (B[3]<A[1] or B[1]>A[3]):   ## B.y2 < A.y1 or B.y1 > A.y2
        return False
    return True

A = [0,0,10,10]
B = [0,5,5,12]
if rectangle_overlap(A,B):
    print "Rectangles", A,B, "overlap"
else:
    print "Rectangles", A,B, "do not overlap."
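The "rule out the non-overlapping cases first" idea is easier to trust with a few boundary cases. Below is a minimal Python 3 sketch of the same test with some sample rectangles; the sample coordinates are made up for illustration. Because the comparisons are strict, rectangles that only touch along an edge or at a corner count as overlapping.

```python
def rectangle_overlap(a, b):
    """a and b are [x1, y1, x2, y2]: lower left (x1, y1), upper right (x2, y2)."""
    if b[2] < a[0] or b[0] > a[2]:   # b lies entirely left or right of a
        return False
    if b[3] < a[1] or b[1] > a[3]:   # b lies entirely below or above a
        return False
    return True


base = [0, 0, 10, 10]
samples = [
    [0, 5, 5, 12],      # partly inside
    [11, 0, 12, 5],     # entirely to the right
    [2, 11, 3, 12],     # entirely above
    [10, 10, 12, 12],   # touches only at a corner
]
for other in samples:
    print(other, rectangle_overlap(base, other))
```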
This program illustrates how to print values from two parallel lists by using indexing and for loops.
Code:
""" This program illustrates how to print values from two parallel lists
by using indexing and for loops.
"""
myvals = [ 2,3,4,5,7 ]
labels = ['monday','tuesday','wednesday','thursday','friday']
print myvals
for i in range(len(myvals)):
    val = myvals[i]
    label = labels[i].capitalize()
    label = label + " "*(10-len(label))   ##make the label 10 characters long
    print "%s:\t%d" %(label, val)
print
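Indexing is one way to walk two parallel lists. A common alternative, sketched here in Python 3, is to pair the items with `zip` and pad the label with `str.ljust`. This is a different technique from the indexed loop in the listing and is shown only for comparison; the `lines` list is an assumption added so the output can be inspected.

```python
myvals = [2, 3, 4, 5, 7]
labels = ['monday', 'tuesday', 'wednesday', 'thursday', 'friday']

lines = []
for label, val in zip(labels, myvals):
    # ljust(10) pads the label with spaces to 10 characters
    lines.append("%s:\t%d" % (label.capitalize().ljust(10), val))

print("\n".join(lines))
```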
Checks if a given word has three consecutive double letters. This program is used to understand how indexing and looping works.
Note also how the function exits with a return as soon as the if statement is satisfied. If the loop finishes, no match was found, and the function returns False.
Code:
"""Checks if a given word has three consecutive double letters.
This program is used to understand how indexing and looping works.
Note also how the function exits with a return as soon as the if
statement is satisfied. If the loop finishes, no match was
found, and the function returns False.
"""
def three_double(s):
    print "Searching in", s
    for i in range(0, len(s)-5):
        p11 = s[i]
        p12 = s[i+1]
        p21 = s[i+2]
        p22 = s[i+3]
        p31 = s[i+4]
        p32 = s[i+5]
        print "=> (%s,%s), (%s,%s), (%s,%s)" %(p11, p12, p21, p22, p31, p32)
        if p11==p12 and p21==p22 and p31==p32:
            return True
    return False

print three_double("bookkeeper")
print three_double("transmogrifier")
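Once the indexing is clear, the six named positions can be folded into one condition. The Python 3 sketch below is a compact variant of the same sliding-window check, without the tracing output; using `all` over the offsets 0, 2 and 4 is my own formulation.

```python
def three_double(word):
    """True if some window of six letters forms three consecutive doubles."""
    for i in range(len(word) - 5):
        # offsets 0, 2, 4 are the starts of the three pairs in the window
        if all(word[i + k] == word[i + k + 1] for k in (0, 2, 4)):
            return True   # exit as soon as a match is found
    return False          # the loop finished without finding a match


print(three_double("bookkeeper"))       # True
print(three_double("transmogrifier"))   # False
```

Words shorter than six letters make the range empty, so the function falls through to `return False` without indexing out of bounds.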
Module for converting an image to gray value. Call this module by:
im = convert_to_gray(filename)
which will return a new image by converting all the pixels in the input file to gray scale.
Functions:
convert_to_gray(filename): Load the image as a 2D array of pixels. Then read each pixel, convert its color to gray scale and copy it to the pixel array of the output image. Return the resulting image.
Code:
"""Module for converting an image to gray value. Call this module by:
im = convert_to_gray(filename)
which will return a new image by converting all the pixels in the input file to gray scale.
"""
import Image
def convert_to_gray(filename):
    """Load the image as a 2D array of pixels. Then read each
    pixel, convert its color to gray scale and copy to the pixel array
    of the output image. Return the resulting image.
    """
    im = Image.open(filename)
    w,h = im.size
    pics = im.load()
    im_copy = Image.new("RGB", (w,h))
    pics_copy = im_copy.load()
    for i in range(0,w):   ## every column, including the last one
        for j in range(0,h):   ## every row, including the last one
            color = pics[i,j]
            gray_val = (color[0]+color[1]+color[2])/3
            pics_copy[i,j] = (gray_val, gray_val, gray_val)
    return im_copy
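The per-pixel conversion can be tried without an image library by treating an image as a nested list of (red, green, blue) tuples. This is a minimal Python 3 sketch under that assumption; `to_gray` and `convert_grid` are hypothetical helper names, and the nested list stands in for the pixel access object used in the listing.

```python
def to_gray(color):
    """Average the three channels and repeat the result in each channel."""
    gray = sum(color) // 3   # integer division keeps the value a valid channel
    return (gray, gray, gray)


def convert_grid(pixels):
    """Convert every pixel of a 2D grid, including the last row and column."""
    return [[to_gray(color) for color in row] for row in pixels]


grid = [[(30, 60, 90), (255, 0, 0)],
        [(0, 0, 0), (10, 20, 30)]]
print(convert_grid(grid))
```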
This program executes edge detection by finding the difference in color value between a pixel and the average of the eight surrounding pixels. For example, given pixel at location [i,j], the neighbors are at locations:
[i-1,j-1], [i-1,j], [i-1,j+1], [i,j-1], [i,j+1], [i+1,j-1], [i+1,j], [i+1,j+1]
Note that pixels are stored in a special datatype that requires indexing as above.
A pixel is a three tuple of red, green, blue values.
Functions:
diff_color(colorList, curcolor): Finds the difference between the average value of the pixels in a list of colors and a given color. It then converts the value to gray scale by taking the average of the pixel values and returns a color in gray scale. Gray scale colors have the same value for the red, green and blue channels.
Note: By subtracting a color from 256, we change from black to white.
Code:
"""This program executes edge detection by finding the difference
in color value between a pixel and the average of the eight surrounding
pixels. For example, given pixel at location [i,j], the neighbors are
at locations:
[i-1,j-1], [i-1,j], [i-1,j+1], [i,j-1], [i,j+1],
[i+1,j-1], [i+1,j], [i+1,j+1]
Note that pixels are stored in a special datatype that requires indexing
as above.
A pixel is a three tuple of red, green, blue values.
"""
import Image
def diff_color(colorList, curcolor):
    """Finds the difference between the average value of the pixels
    in a list of colors and a given color. It then converts the value
    to gray scale by taking the average of the pixel values and returns
    a color in gray scale. Gray scale colors have the same value for the
    red, green and blue channels.

    Note: By subtracting a color from 256, we change from black to white.
    """
    avg = [0,0,0]
    (r,g,b) = curcolor
    for item in colorList:
        avg[0] += item[0]
        avg[1] += item[1]
        avg[2] += item[2]
    avg[0] /= len(colorList)
    avg[1] /= len(colorList)
    avg[2] /= len(colorList)
    grayval = (256 -abs(avg[0]-r) +\
               256 - abs(avg[1]-g) +\
               256 - abs(avg[2]-b))/3
    return (grayval, grayval, grayval)

if __name__ == '__main__':
    im = Image.open("bolt.jpg")
    (w,h) = im.size
    newim = Image.new("RGB",(w,h),"white")
    pix = im.load()
    newpix = newim.load()
    for i in range(1,w-1):
        for j in range(1,h-1):
            colorlist = [ pix[i-1,j-1], \
                          pix[i-1,j],\
                          pix[i-1,j+1],\
                          pix[i,j-1],\
                          pix[i,j+1],\
                          pix[i+1,j-1],\
                          pix[i+1,j],\
                          pix[i+1, j+1] ]
            newcolor = diff_color(colorlist, pix[i,j])
            newpix[i,j] = newcolor
    newim.show()
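The neighbor arithmetic can be exercised on a tiny grid without an image library. The Python 3 sketch below treats the image as a nested list of RGB tuples and collects the eight neighbors with two small loops. Two things here are my own assumptions rather than the module's code: the helper `neighbors_of`, and subtracting from 255 instead of 256 so that the result stays inside the 0 to 255 channel range.

```python
def neighbors_of(grid, i, j):
    """The eight pixels surrounding grid[i][j]; (i, j) must not be on the border."""
    return [grid[i + di][j + dj]
            for di in (-1, 0, 1)
            for dj in (-1, 0, 1)
            if (di, dj) != (0, 0)]


def diff_color(color_list, current):
    """Gray value that is bright where the pixel matches its surroundings."""
    count = len(color_list)
    avg = [sum(color[k] for color in color_list) // count for k in range(3)]
    gray = sum(255 - abs(avg[k] - current[k]) for k in range(3)) // 3
    return (gray, gray, gray)


flat = [[(10, 10, 10)] * 3 for _ in range(3)]    # uniform area: no edge
spike = [[(0, 0, 0)] * 3 for _ in range(3)]
spike[1][1] = (255, 255, 255)                     # center differs sharply

print(diff_color(neighbors_of(flat, 1, 1), flat[1][1]))     # (255, 255, 255)
print(diff_color(neighbors_of(spike, 1, 1), spike[1][1]))   # (0, 0, 0)
```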
This module contains a function to illustrate the use of while loops.
Functions:
firstnegative(L): Given a list of values, finds and returns the index of the first negative value in the list. If the list does not contain any negative values, it returns -1.
Note that returning a negative value to signal “not found” is not a good idea, because -1 is itself a valid index in Python (it refers to the last element). A better choice would be to return None.
Good test cases are listed below:
>>> print 'first negative in [1,2,-1] is at index', firstnegative([1,2,-1])
first negative in [1,2,-1] is at index 2
>>> print 'first negative in [-1,2,-1] is at index', firstnegative([-1,2,-1])
first negative in [-1,2,-1] is at index 0
>>> print 'first negative in [1,2,1] is at index', firstnegative([1,2,1])
first negative in [1,2,1] is at index -1
Code:
"""This module contains a function to illustrate the use of
while loops.
"""
def firstnegative(L):
    """ Given a list of values, finds and returns the index of the first
    negative value in the list. If the list does not contain any
    negative values, it returns -1.

    Note that returning a negative value to signal "not found" is not
    a good idea, because -1 is itself a valid index in Python. A better
    choice would be to return None.

    Good test cases are listed below::

        >>> print 'first negative in [1,2,-1] is at index', firstnegative([1,2,-1])
        first negative in [1,2,-1] is at index 2
        >>> print 'first negative in [-1,2,-1] is at index', firstnegative([-1,2,-1])
        first negative in [-1,2,-1] is at index 0
        >>> print 'first negative in [1,2,1] is at index', firstnegative([1,2,1])
        first negative in [1,2,1] is at index -1
    """
    i = 0
    while (i < len(L)):
        val = L[i]
        if val < 0:
            return i
        i += 1
    return -1
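The suggestion to return None for "not found" can be sketched as follows in Python 3. The function name `first_negative` and the use of `enumerate` in place of the while loop are my own choices for this illustration.

```python
def first_negative(values):
    """Index of the first negative value, or None if there is none."""
    for index, value in enumerate(values):
        if value < 0:
            return index
    return None


print(first_negative([1, 2, -1]))    # 2
print(first_negative([-1, 2, -1]))   # 0
print(first_negative([1, 2, 1]))     # None
```

With None as the sentinel, a caller that forgets to check the result and writes `values[result]` gets an immediate TypeError instead of silently reading the last element.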
This code illustrates the reading of valid integer input using a while loop and a break until a correct input is entered.
User either enters the correct input or the system assumes 0 is entered after three attempts.
Functions:
read_input(): This function reads an input from the user, and if it is not a valid integer, continues to prompt and ask for more. However, after three unsuccessful attempts, it assumes 0 is entered.
Code:
"""This code illustrates the reading of valid integer input
using a while loop and a break until a correct input is entered.
User either enters the correct input or the system assumes 0
is entered after three attempts.
"""
def read_input():
    """This function reads an input from the user, and if it is
    not a valid integer, continues to prompt and ask for more.
    However, after three unsuccessful attempts, it assumes 0 is entered.
    """
    attempts = 0
    num = raw_input("Enter a number ==> ")
    while (True):
        if num.isdigit():   ## valid input, stop asking
            break
        ## the input is not a number, count the failed attempt
        attempts += 1
        if attempts >= 3:
            num = '0'
            print 'I assume you entered 0'
            break
        print "You did not enter a number"
        print "I wish you tried harder"
        num = raw_input("Enter a number ==> ")
    num = int(num)
    return num
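A retry loop like this is awkward to test when it reads from the keyboard directly. One option, sketched here in Python 3, is to pass the reading function in as a parameter so canned answers can be supplied. The names `read_int`, `prompt_fn` and `max_attempts` are assumptions made for this example. Note that `str.isdigit` rejects input such as "-5", so this sketch, like the listing, accepts only non-negative integers.

```python
def read_int(prompt_fn, max_attempts=3):
    """Ask up to max_attempts times; fall back to 0 if every answer is invalid."""
    for _ in range(max_attempts):
        text = prompt_fn("Enter a number ==> ")
        if text.isdigit():
            return int(text)
        print("You did not enter a number")
    print("I assume you entered 0")
    return 0


# Canned answers stand in for the keyboard; pass the built-in input for real use.
answers = iter(["abc", "12"])
value = read_int(lambda prompt: next(answers))
print(value)   # 12
```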
Example module for storing complex information in a dictionary, for example various information about houses. Keys are various attributes of houses, and each value is the value of that attribute for a specific house.
Code:
"""Example module for storing complex information in a
dictionary, for example various information about houses.
Keys are various attributes of houses, and each value is
the value of that attribute for a specific house.
"""
if __name__ == "__main__":
    house1 = {'bedrooms':3, 'bathrooms':2.5, 'price': 300000,\
              'squarefeet':2000, 'street':'110 8th Street',\
              'state': 'NY', 'city': 'Troy', 'zip': 12180, \
              'past_sale_info': [ (2010, 280000), (2000, 250000)] }
    house2 = {'bedrooms':3, 'bathrooms':2, 'price': 250000,\
              'squarefeet':1800, 'street':'112 8th Street',\
              'state': 'NY', 'city': 'Troy', 'zip': 12180, \
              'past_sale_info': [] }
    houses = [ house1, house2 ]
    for i in range(len(houses)):
        house = houses[i]
        print "House %d:" %(i+1)
        print "%s bdr %.1f bath $%d %d sqft" \
              %(house['bedrooms'], house['bathrooms'],\
                house['price'], house['squarefeet'])
        if len( house['past_sale_info'] ) >0:
            print "Past sale info:"
            for (year, price) in sorted(house['past_sale_info']):
                print "\t%d: $%d" %(year, price)
        print
Example program that takes a dictionary with people as keys and sets of hobbies as values, and creates a reverse dictionary with hobbies as keys and sets of people as values.
Code:
"""Example program that takes a dictionary
with people as keys and sets of hobbies as values,
and creates a reverse dictionary with
hobbies as keys and sets of people as values.
"""
if __name__ == "__main__":
    people = {'Vector': set(['Hiking','Cooking']), \
              'Edith': set(['Hiking', 'Board Games']) }
    ### Create a reverse dictionary, from hobbies to people
    hobbies = {}
    for key in people.keys():
        for value in people[key]:
            if value not in hobbies.keys():
                hobbies[value] = set()
                ##if we do not create the key the first time,
                #we will get an error.
            hobbies[value].add(key)
    print "Reverse dictionary"
    for key in hobbies.keys():
        print key, hobbies[key]
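The "create the key the first time" step can be delegated to `collections.defaultdict`, which builds an empty set whenever a missing key is first accessed. The following Python 3 sketch shows that alternative on the same data; it is a variation on the program above, not a replacement for the explicit check it is meant to teach.

```python
from collections import defaultdict

people = {'Vector': {'Hiking', 'Cooking'},
          'Edith': {'Hiking', 'Board Games'}}

# Missing hobbies start out as an empty set automatically.
hobbies = defaultdict(set)
for person, person_hobbies in people.items():
    for hobby in person_hobbies:
        hobbies[hobby].add(person)

print("Reverse dictionary")
for hobby in sorted(hobbies):
    print(hobby, sorted(hobbies[hobby]))
```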
This program illustrates the use of two mirror dictionaries to find actors in degree 1 of a given actor: actors that are in a movie with a given actor.
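* Dictionary actors: key: actor name, value: set of (movie, year) for the actor
* Dictionary movies: key: movie name, value: set of actors in the movie
Given an actor name (assuming entered correctly):
#. Find all the movies for the input actor
#. Then, find the set of all the actors in all these movies (these are the degree 1 actors of the given actor)
#. Continue to ask for a new actor name until the user enters -1
Try this program with actors:
Bacon, Kevin
Neeson, Liam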
Can you find actors with a large degree 1?
Can you compute higher degrees for the input actor?
Code:
"""This program illustrates the use of two mirror dictionaries to find
actors in degree 1 of a given actor: actors that are in a movie with a
given actor.
* Dictionary actors: key: actor name, value: set of (movie, year) for the actor
* Dictionary movies: key: movie name, value: set of actors in the movie
Given an actor name (assuming entered correctly):
#. Find all the movies for the input actor
#. Then, find the set of all the actors in all these movies (these are the degree 1 actors of the given actor)
#. Continues to ask for a new actor name until user enters -1
Try this program with actors:
Bacon, Kevin
Neeson, Liam
Can you find actors with a large degree 1?
Can you compute higher degrees for the input actor?
"""
if __name__ == "__main__":
    ##read the dictionary of actors from file
    ##actor key, set of (movie,year) as values
    actors = {}
    all_actors = set()
    for line in open("imdb_data.txt"):
        m = line.strip().split("|")
        for i in range(len(m)):
            m[i] = m[i].strip()
        actor = m[0]
        movie = m[1]
        year = int(m[2])
        if actor not in all_actors:
            all_actors.add ( actor)
            actors[actor] = set()
        actors[actor].add( (movie,year) )

    ## compute the dictionary of movies
    ## movie key, set of actors as values
    movies = {}
    all_movies = set()
    for actor in actors.keys():
        for (movie, year) in actors[actor]:
            if movie not in all_movies:
                movies[movie] = set()
                all_movies.add ( movie )
            movies[movie].add( actor )

    while (True):
        name = raw_input("Enter an actor name (lastname, firstname) and (-1 to end) ==> ")
        if name == '-1':
            break
        elif name in actors.keys():
            degree1_actors = set()
            for (movie,year) in actors[name]:
                degree1_actors = degree1_actors | movies[movie]
            print "Degree 1 actors:"
            degree1_actors = sorted(list(degree1_actors))
            for i in range(len(degree1_actors)):
                print "%d. %s" %(i+1, degree1_actors[i])
        else:
            print "This actor is not found"
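The two mirror dictionaries can be tried out without the data file. The Python 3 sketch below builds both from a handful of made-up records; the actor and movie names are placeholders, not real data, and the `degree1` helper is an assumption. Unlike the program above, it removes the input actor from the result, which is a design choice of this sketch.

```python
from collections import defaultdict

# Placeholder records in the (actor, movie, year) shape the program reads.
records = [
    ("Actor A", "Movie X", 2001),
    ("Actor B", "Movie X", 2001),
    ("Actor B", "Movie Y", 2005),
    ("Actor C", "Movie Y", 2005),
]

actors = defaultdict(set)   # actor name -> set of (movie, year)
movies = defaultdict(set)   # movie name -> set of actor names
for actor, movie, year in records:
    actors[actor].add((movie, year))
    movies[movie].add(actor)


def degree1(name):
    """Everyone who shares at least one movie with the given actor."""
    costars = set()
    for movie, _year in actors.get(name, set()):
        costars |= movies[movie]
    costars.discard(name)   # do not report the actor as their own co-star
    return costars


print(sorted(degree1("Actor B")))   # ['Actor A', 'Actor C']
```

Higher degrees can be approached by applying `degree1` again to each actor in the result and collecting the union.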
Class example for storing 2d objects, with x and y as member values.
Methods:
__add__(other): Adds two points and returns the resulting point. Can call as:
p1.__add__(p2)
p1 + p2
__sub__(other): Subtracts one point from another and returns the resulting point. Can call as:
p1.__sub__(p2)
p1 - p2
__str__(): Example of a function that does not change the object, but returns something, a string representation of the object:
print p1
print str(p1)
move_down(disp): Example of a function that changes the object, but does not return anything.
move_left(disp): Example of a function that changes the object, but does not return anything.
move_right(disp): Example of a function that changes the object, but does not return anything.
move_up(disp): Example of a function that changes the object, but does not return anything.
dist(other): Returns the Euclidean distance between two points.
manhattan(other): Returns the Manhattan distance between two points.
haversine(other): Returns the Haversine distance in miles between two points representing latitude for x and longitude for y. See also http://en.wikipedia.org/wiki/Haversine_formula
Code:
""" Class example for storing 2d objects, with x and y
as member values.
"""
import math
class Point2d(object):
def __init__ (self, x, y):
"""Create a new point object given x,y values. """
self.x = x
self.y = y
self.d = math.sqrt(self.x**2+self.y**2)
def move_left(self, disp):
"""Example of a function that changes the object,
but does not return anything.
"""
self.x -= disp
def move_right(self, disp):
"""Example of a function that changes the object,
but does not return anything.
"""
self.x += disp
def move_up(self, disp):
"""Example of a function that changes the object,
but does not return anything.
"""
self.y += disp
def move_down(self, disp):
"""Example of a function that changes the object,
but does not return anything.
"""
self.y -= disp
def __str__(self):
"""Example of a function that does not change the object,
but returns something, a string representation of the object::
print p1
print str(p1)
"""
return str(self.x) + "," + str(self.y)
def __add__(self, other):
"""Adds two points and return the resulting point.
Can call as::
p1.__add__(p2)
p1 + p2
"""
newp = Point2d(self.x, self.y)
newp.x += other.x
newp.y += other.y
return newp
def __sub__(self, other):
"""Subtracts one point from another and returns the resulting
point. Can call as::
p1.__sub__(p2)
p1 - p2
"""
newp = Point2d(self.x, self.y)
newp.x -= other.x
newp.y -= other.y
return newp
    def dist(self, other):
        """Returns the Euclidean distance between two points. """
        d = math.sqrt( (self.x-other.x)**2 + \
                       (self.y-other.y)**2 )
        return d
def manhattan(self, other):
"""Returns the Manhattan distance between two points. """
d = abs(self.x-other.x) + abs(self.y-other.y)
return d
def haversine(self, other):
"""Returns the Haversine distance in miles between
two points representing latitude for x and longitude for y.
See also
http://en.wikipedia.org/wiki/Haversine_formula
"""
lat1 = self.x * (math.pi / 180.0)
long1 = self.y * (math.pi / 180.0)
lat2 = other.x * (math.pi / 180.0)
long2 = other.y * (math.pi / 180.0)
dlat = (lat1-lat2)
dlong = (long1-long2)
a = math.sin(dlat/2)**2 + \
math.cos(lat1) * math.cos(lat2) * math.sin(dlong/2)**2
c = 2*math.atan2( math.sqrt(a), math.sqrt(1-a) )
R = 6371 / 1.609
return R*c
if __name__ == "__main__":
##Test code for the Point2d class
p1 = Point2d(10,20)
p2 = Point2d(30,40)
print "p1:", p1
print "p2:", p2
p1.move_left(5)
print "p1 after moving left by 5:", p1
print "p1+p2:", p1+p2
print "p1-p2:", p1-p2
print "Euclidean distance", p1.dist(p2)
print "Manhattan distance", p1.manhattan(p2)
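The Haversine computation is the least obvious method in the class, so here it is as a standalone Python 3 sketch that can be checked against a known case: two points on the equator 90 degrees of longitude apart are a quarter of the circumference from each other. The function name `haversine_miles` is an assumption; the radius constant is the one used in the listing.

```python
import math

EARTH_RADIUS_MILES = 6371 / 1.609   # same constant as the class above


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (latitude, longitude) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi1 - phi2
    dlam = math.radians(lon1 - lon2)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return EARTH_RADIUS_MILES * c


quarter = haversine_miles(0, 0, 0, 90)
print(quarter, EARTH_RADIUS_MILES * math.pi / 2)
```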
Class for storing time related information. Even though it is initialized with hour, minute and second information, it keeps time in seconds as a member value.
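Methods:
convert(): Return the hour, minute and second corresponding to the time value of the object.
__sub__(other): Subtract two times from each other and return a new time object.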
Code:
"""
Class for storing time related information. Even though it
is initialized with hour, minute and second information,
it keeps time in seconds as a member value.
"""
class Time(object):
def __init__(self, h, m, s):
self.seconds = s + m*60 + h*60*60
def convert(self):
"""Return the hour, minute and second corresponding to the time value of the object.
"""
h = self.seconds/(3600)
m = (self.seconds - h*3600)/60
s = self.seconds - h*3600 - m*60
return (h,m,s)
def __sub__(self, other):
"""Subtract two times from each other and return
a new time object.
"""
(h,m,s) = self.convert()
newt = Time(h,m,s)
newt.seconds -= other.seconds
if newt.seconds < 0 :
newt.seconds = 0
return newt
def timestr(self,val):
if val < 10:
return "0" + str(val)
else:
return str(val)
def __str__(self):
(h,m,s) = self.convert()
return '%s:%s:%s' %(self.timestr(h),\
self.timestr(m),\
self.timestr(s))
if __name__ == "__main__":
##Code to test the class
p1 = Time(10, 3, 4)
p2 = Time(12, 10, 3)
print "p1:", p1
print "p2:", p2
print "p1-p2:", p1 - p2
print "p2-p1:", p2 - p1
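The conversion between a seconds count and hours, minutes and seconds can also be written with `divmod`, which returns a quotient and remainder in one step. The Python 3 sketch below uses plain functions instead of the class; the names `to_hms`, `format_hms` and `subtract` are assumptions for this example. As in `__sub__` above, a negative difference is clamped to zero.

```python
def to_hms(total_seconds):
    """Split a seconds count into (hours, minutes, seconds)."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return (hours, minutes, seconds)


def format_hms(total_seconds):
    """Zero-padded hh:mm:ss string."""
    return "%02d:%02d:%02d" % to_hms(total_seconds)


def subtract(a_seconds, b_seconds):
    """Difference of two times, never below zero."""
    return max(a_seconds - b_seconds, 0)


t1 = 10 * 3600 + 3 * 60 + 4     # 10:03:04
t2 = 12 * 3600 + 10 * 60 + 3    # 12:10:03
print(format_hms(subtract(t2, t1)))   # 02:06:59
print(format_hms(subtract(t1, t2)))   # 00:00:00
```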
A class for the Game of Thrones fans. It really does not do anything useful or provide a useful educational experience.
Code:
"""A class for the Game of Thrones fans. It really does not
do anything useful or provide a useful educational experience.
"""
class Hodor(object):
def __init__ (self):
self.word = 'Hodor'
def hodor (self):
self.word = self.word + ' Hodor'
def __str__ (self):
return self.word
if __name__ == "__main__":
h = Hodor()
print str(h)
h.hodor()
print str(h)
Class example for a complex object that stores information and provides methods specific to that object. It illustrates the use of another class (Point2d) and its methods.
A business is populated with information from Yelp for businesses near RPI. An example use is given below.
Methods:
__init__(name, address, latitude, longitude, reviews): Create a new business object.
print_best_reviews(): Returns a string containing the top 2 and bottom 2 reviews for this business, based on the number of stars.
Code:
""" Class example for storing complex object for storing both information
and supporting methods that are specific to this object. It illustrates
the use of another class (Point2d) and its methods.
A business is populated with information from Yelp for businesses near
RPI. An example use is given below.
"""
import Point2d
import simplejson
import textwrap
class Business(object):
def __init__ ( self, name, address, latitude, longitude, reviews):
"""Create a new business object.
* self.name: name string
* self.latitude, self.longitude: floats for business location
* self.reviews: a list of lists of type [stars:string, text:string]
(number of stars the review gave and text of the review).
"""
self.name = name
self.address = address
self.location = Point2d.Point2d(latitude, longitude)
self.reviews = reviews
self.reviews.sort(reverse=True)
def __str__(self):
""" Print basic business info. """
return "%s (%s) " %(self.name,\
self.address.replace("\n", ", "))
def distance_to_rpi(self):
""" Return the Haversine distance to RPI Union. """
rpi = Point2d.Point2d(42.73, -73.68)
return self.location.haversine(rpi)
    def print_best_reviews(self):
        """ Returns a string containing the top 2 and bottom 2 reviews
        for this business, based on the number of stars.
        """
        outline = "*"*60 + "\n"
        for [stars,text] in (self.reviews[:2] + self.reviews[-2:]):
            for line in textwrap.wrap(text[:200]+"..." + "(%s stars)" %stars, 60):
                outline += line + "\n"
            outline += "*"*60 + "\n"
        return outline
if __name__ == "__main__":
reviews = {}
##key: business id, value: list of reviews: [stars, text]
for line in open("reviews.json"):
m = simplejson.loads(line)
review = [ m['stars'], m['text'] ]
bid = m['business_id']
if bid not in reviews.keys():
reviews[bid] = []
reviews[bid].append( review )
i = 0
blist = {}
for line in open("businesses.json"):
m = simplejson.loads(line)
bid = m['business_id']
b = Business( m['name'], \
m['full_address'], \
m['latitude'],\
m['longitude'],
reviews.get(bid, []))   ##a business may have no reviews
blist[bid] = b
i += 1
if i > 10:
break
bid_list = blist.keys()
while (True):
for i in range(len(bid_list)):
bid = bid_list[i]
print "%d. %s" %(i+1, blist[bid])
choice = raw_input("Enter an index (-1 to exit) ==> ")
if choice == '-1':
break
if choice.isdigit() and 1<= int(choice) <= len(bid_list):
choice = int(choice)
bid = bid_list[choice-1]
business = blist[bid]
print business.print_best_reviews()
raw_input("...")
print
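This module contains four functions for finding the indices of the two smallest values in a list of numbers. The functions do not handle the following extra cases:
* Lists with something other than numbers. The functions may work, but use the default ordering for these objects in Python.
* Lists with no values or a single value. The functions will fail in this case.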
The main code tests the performance of the functions using the time module from Python. To be able to see the difference between the different solutions, very large lists are required.
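smallest_two1(l): Scans the list once, checking each value first against the second smallest and then against the smallest.
smallest_two2(l): Scans the list once, checking each value first against the smallest and then against the second smallest.
smallest_two3(l): Sorts a copy of the list, takes the two smallest values, and finds their indices in the original list.
smallest_two4(l): Finds the minimum and its index, removes it from a copy of the list, and then finds the next smallest value and its index.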
Code:
"""This module contains four functions for finding the index of the
two smallest values in a list of numbers. The functions do not handle
the following extra cases:
* Lists with something other than numbers. The functions may work,
but use the default ordering for these objects in Python.
* Lists with no values or a single value. The functions will fail
in this case.
The main code tests the performance of the functions using the time
module from Python. To be able to see the difference between the
different solutions, very large lists are required.
"""
import time
import random
def smallest_two1(l):
"""Algorithm for finding the index of the two smallest values
in the input list:
#. i0: smallest value, i1: second smallest value
#. initialize i0,i1 to 0, 1 in correct order
#. go through the indices of the rest of the list one by one
#. first check against the **second smallest**, and then the **smallest**,
switch values as appropriate
"""
if l[0] <= l[1]:
i0,i1 = 0,1
else:
i0,i1 = 1,0
for i in range(2,len(l)):
if l[i] <= l[i1]:
if l[i] <= l[i0]:
i1 = i0
i0 = i
else:
i1 = i
return (i0,i1)
def smallest_two2(l):
"""Algorithm for finding the index of the two smallest values
in the input list:
#. i0: smallest value, i1: second smallest value
#. initialize i0,i1 to 0, 1 in correct order
#. go through the indices of the rest of the list one by one
#. first check against the **smallest**, and then the **second smallest**,
switch values as appropriate
"""
if l[0] <= l[1]:
i0,i1 = 0,1
else:
i0,i1 = 1,0
for i in range(2,len(l)):
if l[i] <= l[i0]:
i1 = i0
i0 = i
elif l[i] <= l[i1]:
i1 = i
return (i0,i1)
def smallest_two3(l):
"""Algorithm for finding the index of the two smallest values
in the input list:
#. make a copy of the list and sort
#. find the two smallest values
#. find their indices using :func:`list.index`.
#. if the two smallest values are the same, the index
of the second smallest must come after the index of the smallest.
"""
lcopy = l[:]
lcopy.sort()
v0 = lcopy[0]
v1 = lcopy[1]
i0 = l.index ( v0 )
if v0 != v1:
i1 = l.index ( v1 )
else:
i1 = l.index ( v1, i0+1)
return (i0, i1)
def smallest_two4(l):
"""Algorithm for finding the index of the two smallest values
in the input list:
#. find the min value, and then find the index of the min value
#. make a copy of the list, remove the smallest value (this will
remove the smallest value that is found first in the list hence
will match the index.)
#. find the next smallest value and its index.
"""
v0 = min(l)
i0 = l.index ( v0 )
lcopy = l[:]
lcopy.pop(i0)
v1 = min(lcopy)
if v0 != v1:
i1 = l.index ( v1 )
else:
i1 = l.index ( v1, i0+1 )
return (i0, i1)
if __name__ == "__main__":
##Test cases, two LONG lists in random order.
l = range(2000000)
random.shuffle(l)
l2 = range(2000000)
random.shuffle(l2)
print "Running version 1 of the smallest two function"
t0 = time.time()
(i0,i1) = smallest_two1(l)
(i0,i1) = smallest_two1(l2)
t1 = time.time()
print "It took %.3f seconds" %((t1 - t0)/2.0)
print "Running version 2 of the smallest two function"
t0 = time.time()
(i0,i1) = smallest_two2(l)
(i0,i1) = smallest_two2(l2)
t1 = time.time()
print "It took %.3f seconds" %((t1 - t0)/2.0)
print "Running version 3 of the smallest two function"
t0 = time.time()
(i0,i1) = smallest_two3(l)
(i0,i1) = smallest_two3(l2)
t1 = time.time()
print "It took %.3f seconds" %((t1 - t0)/2.0)
print "Running version 4 of the smallest two function"
t0 = time.time()
(i0,i1) = smallest_two4(l)
(i0,i1) = smallest_two4(l2)
t1 = time.time()
print "It took %.3f seconds" %((t1 - t0)/2.0)
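As a cross-check on the four functions above, the standard library can express the same idea in a few lines. The sketch below is not part of the original module: `smallest_two_heapq` is a hypothetical name, and it relies on `heapq.nsmallest`, which is documented as equivalent to `sorted(iterable, key=key)[:n]`, so ties resolve to the earlier index. Like the functions above, it assumes the list holds at least two values.

```python
import heapq


def smallest_two_heapq(l):
    """Return the indices of the two smallest values in l (hypothetical helper)."""
    # Rank the indices 0..len(l)-1 by the value stored at each index
    # and keep only the two best ones.
    i0, i1 = heapq.nsmallest(2, range(len(l)), key=l.__getitem__)
    return (i0, i1)
```

Because it is short and leans on a well-tested library call, a function like this may be useful as a reference when checking the hand-written versions on random input.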
Module for testing the smallest two module using the nose test module. Currently, it only includes test cases for two of the functions.
Code:
"""Module for testing the smallest two module using the :mod:`nose` test
module. Currently, it only includes test cases for two of the functions.
"""
import nose
from smallest_two import *
def test1():
(i0, i1) = smallest_two1([1,2,3,4])
assert i0==0 and i1==1
def test2():
(i0, i1) = smallest_two1([1,1,3,4])
assert i0==0 and i1==1
def test3():
(i0, i1) = smallest_two1([4,2,1,3])
assert i0==2 and i1==1
def test4():
(i0,i1) = smallest_two1([2,4,3,1])
assert i0==3 and i1==0
def test5():
(i0, i1) = smallest_two4([1,2,3,4])
assert i0==0 and i1==1
def test6():
(i0, i1) = smallest_two4([1,1,3,4])
assert i0==0 and i1==1
def test7():
(i0, i1) = smallest_two4([4,2,1,3])
assert i0==2 and i1==1
def test8():
(i0,i1) = smallest_two4([2,4,3,1])
assert i0==3 and i1==0
if __name__ == "__main__":
nose.runmodule()
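One possible way to extend coverage to the remaining functions is a table-driven check that takes the function under test as an argument. This is only a sketch: `CASES`, `check_smallest_two` and `reference_smallest_two` are hypothetical names, and the reference function is a stand-in so that the block runs without `smallest_two.py` on the path.

```python
# The four input/expected pairs used by test1..test8 above.
CASES = [
    ([1, 2, 3, 4], (0, 1)),
    ([1, 1, 3, 4], (0, 1)),
    ([4, 2, 1, 3], (2, 1)),
    ([2, 4, 3, 1], (3, 0)),
]


def check_smallest_two(f):
    """Run every case in CASES against the function f (hypothetical helper)."""
    for values, expected in CASES:
        # Pass a copy so that f cannot disturb the stored case.
        result = tuple(f(values[:]))
        assert result == expected, \
            "%s(%r) returned %r" % (f.__name__, values, result)


def reference_smallest_two(l):
    """Stand-in implementation used only to exercise the checker."""
    order = sorted(range(len(l)), key=l.__getitem__)
    return (order[0], order[1])


def test_reference():
    check_smallest_two(reference_smallest_two)
```

With the real module importable, calls such as `check_smallest_two(smallest_two2)` and `check_smallest_two(smallest_two3)` inside new test functions would exercise the two functions that currently have no test cases.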
This module shows two methods for finding the index of a value in a list; a third method is shown for comparison of computational complexity.
We can see the difference in running times by executing this module.
Linear search algorithm: O(N) complexity as it checks each item. Returns the index of the val if it is found in the list, and None if it is not found.
Binary search algorithm: O(log N) complexity as it checks about log N of the items. Assumes the input list is sorted.
Returns the index of the val if it is found in the list, and None if it is not found.
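Algorithm:
initialize min to the beginning of the list, max to the end of the list and mid to the midpoint; while min != max, move min past mid if the mid value is less than the value searched, otherwise move max to mid; once min equals max, return that index if it holds the value.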
Example function for testing other functions. Note: it takes a function as an argument and then calls this function.
Code:
"""This module shows two methods for finding the index of a value in a
list; a third method is shown for comparison of computational complexity.
* Method 1: linear search, O(N) complexity
* Method 2: binary search, O(log N) complexity
* Method 3: set solution (for membership check only), O(1) complexity
We can see the difference in running times by executing this module.
"""
import random
import time
def lin_search(l, val):
""" Linear search algorithm: O(N) complexity as it checks
each item. Returns the index of the val if it is found in the
list, and None if it is not found.
"""
for i in xrange(len(l)):
if l[i] == val:
return i
return None
def bin_search(l, val):
""" Binary search algorithm: O(log N) complexity as it checks
about log N of the items. Assumes the input list is sorted.
Returns the index of the val if it is found in the list, and None
if it is not found.
Algorithm:
#. initialize: min:beginning of list, max:end of list, mid:mid point
#. while min != max
#. if mid value is less than the value being searched, move min past mid
#. else move max to mid
#. recompute mid
#. when min == max, return mid if it holds the value being searched
"""
if len(l) == 0:
return None
min = 0
max = len(l)-1
mid = (min+max)//2
while min != max:
if l[mid] < val:
min = mid+1
else:
max = mid
mid = (min+max)//2
if l[mid]==val:
return mid
return None
def test_search(f, l, testvaluelist):
"""Example function for testing other functions. Note: it takes a
function as an argument and then calls this function.
"""
start = time.time()
for val in testvaluelist:
x = f(l,val)
end = time.time()
print "Function:%s, takes \t%.10f seconds per call" \
%(f.__name__, (end-start)/float(len(testvaluelist)))
if __name__=="__main__":
#Testing three ways to search a long list. Calling with 100
#test values.
N = 100000
l = []
for i in range(N):
l.append(random.randint(1,N*100))
l.sort()
lset = set(l)
testN = 100
testvals = []
for i in range(testN):
testvals.append(random.randint(1,N))
##Linear search is O(N), which is costly. But this is the most
##general algorithm. It works for lists, returns an index and
##works when the list is not sorted.
test_search( lin_search, l, testvals)
##Binary search is O(log N), but only works when the input list
##is sorted.
test_search( bin_search, l, testvals)
##Note the set solution is O(1), or constant time, the fastest of all
##three. But it is not equivalent to the above two functions as
##it does not give an index.
test_search( set.__contains__, lset, testvals)
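For comparison with the hand-written binary search, the standard library module `bisect` provides the same narrowing step. The following is a minimal sketch, not the module's own method: `bin_search_bisect` is a hypothetical name, and it assumes, like `bin_search`, that the input list is already sorted.

```python
import bisect


def bin_search_bisect(l, val):
    """Return an index of val in the sorted list l, or None (hypothetical helper)."""
    # bisect_left returns the leftmost position at which val could be
    # inserted while keeping l sorted.
    i = bisect.bisect_left(l, val)
    if i < len(l) and l[i] == val:
        return i
    return None
```

Since it takes the same arguments and returns an index or None, it could be passed to `test_search` in the same way as `lin_search` and `bin_search`.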
Illustrates the different sort functions and computes their cost.
Insertion sort (O(n^2))
Selection sort (O(n^2))
Merge sort (O(nlogn)) [recursive and iterative versions]
Internal sort, used when calling x.sort() (O(n log n))
Note: Internal sort is compiled C code and sorts the list in place without creating a copy. Hence, despite having the same complexity as merge sort, it is much faster.
Insertion sort algorithm:
For each location i in list l:
Shift all elements in l[:i] greater than l[i] to right
opening a location for l[i] in the list
Insert l[i] to the opened location (pointed by j+1)
Selection sort algorithm:
For each location i in the list l:
Find the minimum element in l[i:]
Swap ith location with the min element
Subfunction used for merge sort.
Merge two sorted lists into a single sorted list by
continuously popping the smallest element.
When while loop ends, at least one list is empty. The other
list is appended to the end of the merged list.
Recursive merge sort
If the list has more than one element:
Recursively sort the two halves of the list (l1= l[:mid], l2=l[mid:])
Merge the sorted versions of l1 and l2 and return
Else (list has already one or zero elements)
Return the list, it is already sorted by definition
Merge sort iteratively
Create a list of lists by placing each element in list l into a list
( lmerge = [ [1], [4], [3], [2] ] )
While there are more than 1 sublists in lmerge:
For each pair of lists l1, l2 in lmerge
Merge l1,l2 and append it to a new list, lmerge2
Replace lmerge with lmerge2
Return the single remaining element (sorted list) in the lmerge
Code:
"""Illustrates the different sort functions and computes their
cost.
#. Insertion sort (O(n^2))
#. Selection sort (O(n^2))
#. Merge sort (O(nlogn)) [recursive and iterative versions]
#. Internal sort, used when calling x.sort() (O(n log n))
Note: Internal sort is compiled C code and sorts the list in place
without creating a copy. Hence, despite having the same
complexity as merge sort, it is much faster.
"""
import random
import time
def sel_sort(l):
"""Selection sort algorithm:
::
For each location i in the list l:
Find the minimum element in l[i:]
Swap ith location with the min element
"""
for i in range(len(l)-1):
cur_min = l[i]
min_idx = i
for j in range(i+1,len(l)):
if l[j] <= cur_min:
cur_min = l[j]
min_idx = j
l[i], l[min_idx] = l[min_idx], l[i]
def ins_sort(l):
"""Insertion sort algorithm:
::
For each location i in list l:
Shift all elements in l[:i] greater than l[i] to right
opening a location for l[i] in the list
Insert l[i] to the opened location (pointed by j+1)
"""
for i in range(1, len(l)):
x = l[i]
j = i-1
while (j >= 0) and (l[j]>x):
l[j+1]=l[j]
j -= 1
l[j+1] = x
return l
def merge(l1, l2):
"""Subfunction used for merge sort.
::
Merge two sorted lists into a single sorted list by
continuously popping the smallest element.
When while loop ends, at least one list is empty. The other
list is appended to the end of the merged list.
"""
l = []
while (len(l1) > 0 and len(l2)>0):
if l1[0] < l2[0]:
l.append( l1.pop(0) )
else:
l.append( l2.pop(0) )
if len(l1) > 0:
l.extend(l1)
elif len(l2) > 0:
l.extend(l2)
return l
def merge_sort_rec(l):
"""Recursive merge sort
::
If the list has more than one element:
Recursively sort the two halves of the list (l1= l[:mid], l2=l[mid:])
Merge the sorted versions of l1 and l2 and return
Else (list has already one or zero elements)
Return the list, it is already sorted by definition
"""
if len(l) <= 1:
return l
else:
mid = len(l)//2
l1 = l[:mid]
l2 = l[mid:]
l1new = merge_sort_rec(l1)
l2new = merge_sort_rec(l2)
return merge( l1new, l2new )
def merge_sort_it(l):
"""Merge sort iteratively
::
Create a list of lists by placing each element in list l into a list
( lmerge = [ [1], [4], [3], [2] ] )
While there are more than 1 sublists in lmerge:
For each pair of lists l1, l2 in lmerge
Merge l1,l2 and append it to a new list, lmerge2
Replace lmerge with lmerge2
Return the single remaining element (sorted list) in the lmerge
"""
lmerge = []
for item in l:
lmerge.append( [item] )
lmerge2 = []
while len(lmerge) > 1:
for i in range(0,len(lmerge)-1,2):
ltmp = merge( lmerge[i], lmerge[i+1] )
lmerge2.append( ltmp )
if len(lmerge)%2 == 1:
lmerge2.append( lmerge[-1] )
lmerge = lmerge2
lmerge2 = []
return lmerge[0]
def time_func(f, l):
"""This is used for timing the input function on a given list
and print the timing.
"""
start = time.time()
f(l)
end = time.time()
print "Function %s, total: %.4f" %(f.__name__, end-start)
if __name__=="__main__":
x = range(5000)
random.shuffle(x) ##shuffle the list to create an unsorted list
x1 = x[:] ##we are creating copies for ins_sort and sel_sort as these do not return a new list
x2 = x[:]
time_func(ins_sort, x1) ##n squared
time_func(sel_sort, x2) ##n squared
time_func(merge_sort_rec, x) ##n * log n
time_func(merge_sort_it, x) ## n * log n
time_func(list.sort, x) ##internal sort, n * log n
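The `merge` function above removes items with `pop(0)`, and each `pop(0)` on a Python list shifts all remaining elements, so a single merge can cost more than linear time. A possible variant, sketched here under the hypothetical name `merge_by_index`, walks both lists with index counters and leaves the inputs unmodified.

```python
def merge_by_index(l1, l2):
    """Merge two sorted lists into a new sorted list (hypothetical variant)."""
    merged = []
    i, j = 0, 0
    # Repeatedly copy the smaller of the two current items.
    while i < len(l1) and j < len(l2):
        if l1[i] < l2[j]:
            merged.append(l1[i])
            i += 1
        else:
            merged.append(l2[j])
            j += 1
    # At most one of these slices is non-empty.
    merged.extend(l1[i:])
    merged.extend(l2[j:])
    return merged
```

It takes and returns the same kind of values as `merge`, so it could be swapped into `merge_sort_rec` or `merge_sort_it` to compare timings.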
This module draws Sierpinski triangles up to a given depth using the Tkinter module. It illustrates the use of recursion in drawing self-similar patterns in smaller and smaller regions of the larger triangle.
See: http://en.wikipedia.org/wiki/Sierpinski_triangle
Recursive function to draw Sierpinski triangles in chart_1 within coordinates: lowleft, top, lowright.
At each call, the call level is increased. The function ends when maxlevel is reached.
Code:
"""This module draws Sierpinski triangles up to a given depth
using the Tkinter module. It illustrates the use of recursion in
drawing self-similar patterns in smaller and smaller regions of the
larger triangle.
See:
http://en.wikipedia.org/wiki/Sierpinski_triangle
"""
import Tkinter as tk
import math
def sierpinski(chart_1, lowleft, top, lowright, level, maxlevel):
"""Recursive function to draw Sierpinski triangles in chart_1
within coordinates: lowleft, top, lowright.
At each call, the call level is increased. The function ends
when maxlevel is reached.
"""
if level == maxlevel:
return ##Base case to terminate the process.
else:
chart_1.create_polygon([lowleft, top, lowright], fill="red")
leftmid = (lowleft[0]+top[0])/2,(lowleft[1]+top[1])/2
rightmid = (lowright[0]+top[0])/2,(lowright[1]+top[1])/2
bottommid = (lowright[0]+lowleft[0])/2,(lowright[1]+lowleft[1])/2
chart_1.create_polygon([leftmid, rightmid, bottommid], fill="white")
chart_1.update()
##Recursive calls to redraw triangles in three corners of the
##current triangle area
level += 1
sierpinski(chart_1, lowleft, leftmid, bottommid, level,maxlevel)
sierpinski(chart_1, leftmid, top, rightmid, level,maxlevel)
sierpinski(chart_1, bottommid, rightmid, lowright, level,maxlevel)
def restart(chart):
"""Redraws the Sierpinski triangle, increasing the depth by one
at each call.
"""
chart.delete(tk.ALL)
sierpinski(chart, (0,600), (300,600-300*math.sqrt(3)), (600,600), \
0, maxlevel_var[0])
maxlevel_var[0] += 1
if __name__ == "__main__":
root = tk.Tk()
root.title("Sierpinski Recursion Example")
chart_1 = tk.Canvas(root, width=600, height=600, background="white")
chart_1.grid(row=0, column=0)
## Initially max level is 1, which will draw
##a simple triangle with an inverted triangle inside.
maxlevel_var = [1]
restart(chart_1) ## Draw the Sierpinski triangles once
root.frame = tk.Frame(root)
root.frame.button = tk.Button(root.frame,\
text="quit", \
command=lambda:root.destroy())
root.frame.button2 = tk.Button(root.frame, \
text="draw again!", \
command=lambda:restart(chart_1))
root.frame.button.grid()
root.frame.button2.grid()
root.frame.grid()
root.mainloop()
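The recursion can also be followed without opening a window. The sketch below mirrors the structure of `sierpinski` but collects the outer triangles in a list instead of drawing them; `midpoint` and `sierpinski_triangles` are hypothetical names, and the division by 2.0 is an assumption made so that the result is the same under Python 2 and Python 3.

```python
def midpoint(p, q):
    """Midpoint of two (x, y) points."""
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)


def sierpinski_triangles(lowleft, top, lowright, level, maxlevel):
    """Return the outer triangles the recursion would fill, in drawing order."""
    if level == maxlevel:
        return []  # base case: nothing is drawn at maxlevel
    leftmid = midpoint(lowleft, top)
    rightmid = midpoint(lowright, top)
    bottommid = midpoint(lowright, lowleft)
    triangles = [(lowleft, top, lowright)]
    # Same three corner regions as the recursive calls in sierpinski().
    triangles += sierpinski_triangles(lowleft, leftmid, bottommid,
                                      level + 1, maxlevel)
    triangles += sierpinski_triangles(leftmid, top, rightmid,
                                      level + 1, maxlevel)
    triangles += sierpinski_triangles(bottommid, rightmid, lowright,
                                      level + 1, maxlevel)
    return triangles
```

Each level triples the number of calls, so for a maximum level of n the list holds 1 + 3 + ... + 3^(n-1) = (3^n - 1)/2 triangles, for example 13 when n is 3.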
A recursive solution to the lego problem from Homework #4. This solution performs mix and match, but does not check for two different ways to satisfy the 4x2 lego substitution.
Find replacement legos recursively. For the replacement, all the substitutions should be satisfiable. The function returns:
list of substituted legos, list of remaining legos
if a substitution is found. Otherwise, it returns None.
Code:
"""A recursive solution to the lego problem from Homework #4. This
solution performs mix and match, but does not check for two different
ways to satisfy the 4x2 lego substitution.
"""
def read_legos(fname):
"""Util function to read legos from a file. Assumes each line
contains a lego type, comma, number of that lego type.
"""
legos = []
for line in open(fname):
m = line.strip().split(",")
lego = m[0].strip()
cnt = int(m[1])
legos.extend( [lego]*cnt )
return legos
def subs(lego):
"""Returns a substitution for a lego as a list of legos, by finding
the largest possible pieces that taken together make up the input lego.
"""
if lego == '1x1':
return [] ## No substitution exists.
elif lego == '2x1':
return ['1x1']*2
elif lego == '2x2':
return ['2x1']*2
elif lego == '4x2':
return ['4x1']*2
elif lego == '4x1':
return ['2x1']*2
def replacement(lego, legolist):
"""Find replacement legos recursively. For the replacement,
all the substitutions should be satisfiable. The function
returns:
list of substituted legos, list of remaining legos
if a substitution is found. Otherwise, it returns None.
"""
if lego in legolist:
remaining = legolist[:]
remaining.remove(lego)
return [ lego ], remaining ##Searched lego is found, return success
else:
subslegos = subs(lego)
if len(subslegos) == 0:
return None ## No more substitutions, return failure
##Substitutions exist, search for each one separately
##Modify the lego list along the way.
lego1 = subslegos[0]
lego2 = subslegos[1]
result1 = replacement(lego1, legolist)
## If no substitution for the first piece, report failure
if result1 is None:
return None
result2 = replacement(lego2, result1[1])
## If no substitution for the second piece, report failure
if result2 is None:
return None
##If we came here, both pieces have substitutions, report their
##combination
return result1[0] + result2[0], result2[1]
if __name__ == "__main__":
mylegos = read_legos("mylegos.txt")
reqlegos = read_legos("legosrequired.txt")
foundlegos = []
missinglegos = []
for lego in reqlegos:
result = replacement(lego, mylegos)
if result is None: ##No substitutions were found.
missinglegos.append(lego)
else: ##A substitution was found.
match = result[0]
remaining_legos = result[1]
foundlegos.append( [lego, match] )
mylegos = remaining_legos
print "My remaining legos", mylegos
print "Required legos", reqlegos
print "Still missing legos", missinglegos
print
print "Found legos:"
for item in foundlegos:
print item
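The same recursive idea can be restated compactly with a lookup table in place of the if/elif chain in `subs`. This is a sketch for illustration rather than the homework solution itself: `SUBS` and `replacement_sketch` are hypothetical names, and the table simply copies the substitutions listed in `subs`.

```python
# Substitution table mirroring subs(): each lego maps to the two pieces
# that together replace it. '1x1' has no entry because it cannot be split.
SUBS = {
    '2x1': ['1x1', '1x1'],
    '2x2': ['2x1', '2x1'],
    '4x1': ['2x1', '2x1'],
    '4x2': ['4x1', '4x1'],
}


def replacement_sketch(lego, legolist):
    """Return (legos used, legos remaining), or None if lego cannot be built."""
    if lego in legolist:
        remaining = legolist[:]
        remaining.remove(lego)
        return [lego], remaining
    used = []
    remaining = legolist
    # Every piece of the substitution must be satisfiable in turn,
    # each one drawing from what the previous piece left over.
    for piece in SUBS.get(lego, []):
        result = replacement_sketch(piece, remaining)
        if result is None:
            return None
        used = used + result[0]
        remaining = result[1]
    if not used:
        return None  # no substitution exists for this lego
    return used, remaining
```

For instance, with the inventory `['2x1', '1x1', '1x1']`, a request for `'2x2'` is met by one `'2x1'` plus two `'1x1'` pieces, which uses up the whole inventory.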
Finding modes of a list: finding the value that occurs the most.
Design parameters: n = number of items in a list; m = number of unique items in a list; whether the values are integers, floats or strings (floats are not great keys for dictionaries); and whether to return one mode or all modes.
Code:
"""
Finding modes of a list: finding the value that
occurs the most.
Algorithms:
-----------
#. Find a set of all distinct values
Count each one
#. Create a dictionary with values as key,
counts as value
Find the max count
Find the value with max count
#. Pop a value
Find its count
Delete all copies of this value
Repeat until I go through all values and
report value with max count
#. Sort the list
Go through the list once
Count values and keep track of the max count/max value
Design parameters:
-------------------
* n = number of items in a list
* m = number of unique items in a list
* Are the values integer, float, string?
(floats are not great keys for dictionaries)
* return one mode, or all modes?
Good test values:
-------------------
Test for correctness:
----------------------
* [1.001, 1.0010, 1 ] Float vs int
* With one mode
* With multiple modes
* Mode is at the end
* Mode is scattered in the list
Test for performance:
----------------------
* Random lists of numbers
* vary: n, m
* Fix n, change the number of distinct values
* Fix m, change the number of values in list
"""
import time
import random
def test_perf(f,l):
"""Testing the running time of a function."""
start = time.time()
f(l)
end = time.time()
print "Function %s took %.4f seconds" %(f.__name__, end-start)
def test_correctness(f):
"""Testing the correctness of a function."""
l1 = [1,2,2,1]
l2 = [1,1,2,2,2,3,4]
l3 = [1,1,2,2,2]
print "Function", f.__name__
print "Expected mode: 1,2 returned: ", f(l1)
print "Expected mode: 2 returned: ", f(l2)
print "Expected mode: 2 returned: ", f(l3)
def mode_sort(l):
"""Sort the list and go through it once, keeping track
of the most frequent value. As the number of distinct values
increases, more if statements are executed.
Sort: O(nlogn)
Go through the list: O(n)
The most expensive step: O(nlogn), but internal sort is compiled
and the linear step may also be a significant factor.
"""
lc = l[:] ##linear in size of list
lc.sort() ##nlogn
cur = lc[0]
cnt = 1
maxcnt = 0
modeval = [cur]
for item in lc[1:]: ##linear in size of list
if item == cur:
cnt += 1
else:
if cnt > maxcnt:
maxcnt = cnt
modeval = [cur]
elif cnt == maxcnt:
modeval.append(cur)
cur = item
cnt = 1
if cnt > maxcnt:
maxcnt = cnt
modeval = [cur]
elif cnt == maxcnt:
modeval.append(cur)
return modeval
def mode_sort2(l):
"""Slightly optimized version of the sort based algorithm. Minor
adjustments by not creating a copy of a list in the main for
loop. The xrange also iterates over the numbers without creating a
list of n values.
The complexity is the same as mode_sort, hence the savings are
small.
"""
lc = l[:]
lc.sort() ##nlogn
cur = lc[0]
cnt = 1
maxcnt = 0
modeval = [cur]
for i in xrange(1,len(lc)): ##linear in size of list, but no copy is created
item = lc[i]
if item == cur:
cnt += 1
else:
if cnt > maxcnt:
maxcnt = cnt
modeval = [cur]
elif cnt == maxcnt:
modeval.append(cur)
cur = item
cnt = 1
if cnt > maxcnt:
maxcnt = cnt
modeval = [cur]
elif cnt == maxcnt:
modeval.append(cur)
return modeval
def mode_dict(l):
"""Unoptimized dictionary solution. It creates the dictionary
keys on the fly. Checking if a key is in the LIST of keys is
O(m), where m is the number of distinct values in l.
The main for loop is repeated n times, and contains an O(m)
check for the keys. So, the complexity is O(mn). If m is large,
this is similar to O(n^2) which is expensive.
"""
md = {}
for item in l: ## n items in list
if item in md.keys(): ##m distinct values, linear in m
md[item] += 1
else:
md[item] = 1
maxval = max( md.values() ) ##linear in m
modes = []
for key in md.keys(): #linear in m
if md[key] == maxval:
modes.append(key)
return modes
def mode_dict2(l):
"""Optimized version of mode_dict. Create all the keys
first, and then increment the values one by one. The
overall complexity is O(n).
"""
md = {}
for item in set(l): ##O(n)
#go through every item in list, hash to create a set
##as set(l) is compiled, it is also faster for this reason.
md[item] = 0
for item in l: ##go through n items in list
md[item] += 1 ##constant time
maxval = max( md.values() ) ##linear in m
modes = []
for key in md.keys(): #linear in m
if md[key] == maxval:
modes.append(key)
return modes
def mode_set(l):
"""Another solution with complexity O(mn)
set of l: linear in length of l, so O(n)
for each distinct value (m), run count function
Note: count is linear in length of l,
EXCEPT count is compiled, so slightly faster
Total O(mn)
"""
maxcnt = 0
modevals = []
for item in set(l):
cnt = l.count(item)
if cnt > maxcnt:
maxcnt = cnt
modevals = [ item ]
elif cnt == maxcnt:
modevals.append( item )
return modevals
if __name__ == "__main__":
#test_correctness(mode_dict)
n = 1000
m = 900
l = []
for i in range(n):
l.append( random.randint(1,m) )
##Fastest methods are based on sort, and optimized dictionary
test_perf(mode_sort, l)
test_perf(mode_dict, l)
test_perf(mode_dict2, l)
test_perf(mode_set, l)
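The dictionary approach of `mode_dict2` can also be written with `collections.Counter` from the standard library (available from Python 2.7 on). The sketch below is an addition for comparison, not one of the original solutions: `mode_counter` is a hypothetical name, and returning an empty list for an empty input is an assumption, since the functions above do not define that case.

```python
from collections import Counter


def mode_counter(l):
    """Return a list of all modes of l (hypothetical helper)."""
    counts = Counter(l)  # maps each value to its number of occurrences
    if not counts:
        return []  # assumption: an empty list has no mode
    maxcnt = max(counts.values())  # linear in m, the number of distinct values
    return [value for value, cnt in counts.items() if cnt == maxcnt]
```

Like `mode_dict2`, this does one pass over the n items and one pass over the m distinct values, and it could be timed with `test_perf(mode_counter, l)` alongside the other functions.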