Name____________________ net-id _________
Name____________________ net-id _________
Name____________________ net-id _________
Name____________________ net-id _________
slow_fingerp and fast_fingerp do similar tasks: each counts the number of
times every word in the parameter datasource occurs and returns a
structure of pairs. The slow_fingerp function returns a list of lists.
Each inner list is like ["apple", 57], indicating that the word "apple"
occurs 57 times in the datasource passed as a parameter. So the function
might return the list below for a data source that contains the three
words occurring as many times as shown.
[ ["ant", 15], ["bat", 3], ["dog", 8] ]
The function fast_fingerp returns a dictionary; we'll explain that after
you've answered some questions about the code.
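The two return shapes described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the worksheet's actual code: the names count_pairs_list and count_pairs_dict are hypothetical, and both take an already-split list of words rather than a datasource.

```python
def count_pairs_list(words):
    """Hypothetical helper: return [[word, count], ...] as a list of lists."""
    stats = []
    for word in words:
        found = False
        # Scan every pair seen so far, looking for this word.
        for pair in stats:
            if pair[0] == word:
                pair[1] += 1
                found = True
                break
        # First occurrence of the word: start a new pair with count 1.
        if not found:
            stats.append([word, 1])
    return stats


def count_pairs_dict(words):
    """Hypothetical helper: return {word: count} as a dictionary."""
    counts = {}
    for word in words:
        # A dictionary lookup replaces the inner scan used above.
        counts[word] = counts.get(word, 0) + 1
    return counts


words = ["ant", "bat", "ant", "dog", "ant"]
list_result = count_pairs_list(words)
dict_result = count_pairs_dict(words)
print(list_result)
print(dict_result)
```

The list version rescans its pairs for every word, while the dictionary version does one lookup per word, which is one plausible reason the two functions are named slow and fast.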
benchmark?
source = urllib.open(name)
source = open(name)
source.close()
urllib
time
std_time
found
initialized to False in the outer loop that sets
word to each word found in the datasource?
slow_fingerp the first time
a word in the datasource is found.
pair[1] += 1
stats.append([word,1])
word = word.lower
About how long to do melville8.txt based on these numbers?
finger_print2.py the function below returns
the most frequently occurring word/count pair in data,
a list of two-element lists, e.g., [['the',45],['cat',13],['dog',9]]
Which is the best explanation for why this function works, i.e., why it returns the word/count pair that occurs most often?
def max_list(data):
    return sorted([(elt[1],elt[0]) for elt in data])[-1]
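One way to see what max_list computes is to look at the intermediate list it sorts. The sketch below repeats the function and applies it to the example data from the question. Python compares tuples element by element, so swapping each pair to (count, word) makes the sort order by count first. The variable names swapped and result are illustrative only.

```python
def max_list(data):
    # Swap each [word, count] pair to (count, word), sort, take the last.
    return sorted([(elt[1], elt[0]) for elt in data])[-1]


data = [['the', 45], ['cat', 13], ['dog', 9]]

# The same intermediate list that max_list builds internally.
swapped = sorted([(elt[1], elt[0]) for elt in data])
print(swapped)   # [(9, 'dog'), (13, 'cat'), (45, 'the')]

result = max_list(data)
print(result)    # (45, 'the')
```

Note that the value returned is a (count, word) tuple, so the order of the two elements is reversed relative to the pairs in data.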
d that stores a dictionary as shown. What the user
types is shown in italics.
>>> d
{'duke': 50, 'columbia': 30, 'stanford': 20}
>>> d.keys()
['duke', 'columbia', 'stanford']
>>> d.values()
[50, 30, 20]
>>> d.items()
[('duke', 50), ('columbia', 30), ('stanford', 20)]
>>> [x[1] for x in d.items()]
[50, 30, 20]
x[1] in the last line is replaced by
x[0], what is printed?
[20, 30, 50]
['duke', 'duke', 'duke']
['duke', 'columbia', 'stanford']
d['duke'] = 80, what is printed
by the expression d.values()?
[50, 30, 20]
[80, 30, 20]
[50, 80, 30]
for name in d:
    d[name] += 10
print d
{'duke': 90, 'columbia': 40, 'stanford': 30}
{'duke': 90, 'columbia': 30, 'stanford': 20}
{'duke': 90, 'columbia': 40, 'stanford': 20}
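The interpreter session earlier in this section is from Python 2, where keys(), values(), and items() return lists and print is a statement. As a hedged sketch for readers trying the same calls on Python 3 (an assumption; the worksheet itself uses Python 2), those methods return view objects, so list() is needed to get list output. The sketch also relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 on.

```python
d = {'duke': 50, 'columbia': 30, 'stanford': 20}

# In Python 3 these methods return views; wrap them in list() to get lists.
keys = list(d.keys())
values = list(d.values())
items = list(d.items())

# Each item is a (key, value) tuple, so x[1] selects the value.
second = [x[1] for x in d.items()]

print(keys)     # ['duke', 'columbia', 'stanford']
print(values)   # [50, 30, 20]
print(items)    # [('duke', 50), ('columbia', 30), ('stanford', 20)]
print(second)   # [50, 30, 20]
```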