Compsci 6, Dictionary FUN March 15

Name____________________   net-id _________       Name____________________   net-id _________       

Name____________________   net-id _________       Name____________________   net-id _________

[Figure: fingerprint2_py, with labels slow_fingerp, fast_fingerp, and datasource. Sample data shown: the pair ["apple", 57] and the list [ ["ant", 15], ["bat", 3], ["dog", 8] ].]

What line is used to open a file rather than a URL in the function benchmark?
1. source = urllib.open(name)
2. source = open(name)
3. source.close()
What's the name of the module that facilitates finding the current time?
1. urllib
2. time
3. std_time
What is the purpose of the boolean variable found, which is initialized to False in the outer loop that sets word to each word in the datasource?
1. it indicates if the datasource is found online
2. it indicates if the word currently being processed in the loop has been processed before
3. it indicates if the word occurs as the first element of stats
Which line is executed in slow_fingerp the first time a word in the datasource is found?
1. pair[1] += 1
2. stats.append([word,1])
3. word = word.lower
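The questions above refer to a counter that keeps a list of [word, count] pairs and scans it for every word. A minimal sketch of that general pattern follows, reconstructed from the question text rather than copied from the course file. The function name count_with_list and the sample words are assumptions made for illustration, and the real slow_fingerp may differ in its details.

```python
def count_with_list(words):
    """Count words using a list of [word, count] pairs (hypothetical sketch)."""
    stats = []
    for word in words:
        word = word.lower()      # normalize case so "Ant" and "ant" match
        found = False
        for pair in stats:       # linear scan of every pair stored so far
            if pair[0] == word:
                pair[1] += 1
                found = True
                break
        if not found:
            stats.append([word, 1])
    return stats

result = count_with_list(["Ant", "bat", "ant", "dog", "ant"])
print(result)
```

Because the inner loop may look at every stored pair, the work done per word grows with the number of unique words seen so far.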
On ola's laptop it takes 0.16 seconds to process melville.txt, which has 4,103 unique words and about 14,000 total words. If the file is concatenated with a copy of itself so that it is twice as big (available as melville2.txt), it takes 0.34 seconds to process: still 4,103 unique words, but now about 28,000 total words. For a file made of four copies of melville.txt concatenated together (available as melville4.txt), it takes 0.52 seconds: about 56,000 total words, still 4,103 unique words.
About how long would it take to process melville8.txt, based on these numbers?
1. 1 second
2. 2 seconds
3. 3 seconds
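Timings like the ones quoted above can be gathered by reading the clock before and after a call. The sketch below is one hedged way to do that. The helper time_call, the function unique_words, and the synthetic word list are all hypothetical stand-ins (not the course's benchmark function or melville.txt), and the times printed on any machine will differ from those quoted.

```python
import time

def unique_words(words):
    """Collect unique words with a list scan (stand-in for the work being timed)."""
    seen = []
    for w in words:
        if w not in seen:
            seen.append(w)
    return seen

def time_call(func, arg):
    """Return (result, elapsed seconds) for one call of func(arg)."""
    start = time.time()
    result = func(arg)
    elapsed = time.time() - start
    return result, elapsed

vocab = ["w%d" % i for i in range(200)]   # 200 unique synthetic words
base = vocab * 5                          # 1,000 total words
timings = []
for copies in (1, 2, 4):
    data = base * copies                  # same unique words, more total words
    uniq, elapsed = time_call(unique_words, data)
    timings.append((len(data), len(uniq), elapsed))
    print("%d total, %d unique: %.4f seconds" % (len(data), len(uniq), elapsed))
```

Each row keeps the number of unique words fixed while the total grows, which mirrors the melville.txt, melville2.txt, melville4.txt setup.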
In finger_print2.py, the function below returns the most frequently occurring word/count pair in data, a list of two-element lists, e.g., [['the',45],['cat',13],['dog',9]].
Which is the best explanation for why this function works, i.e., why it returns the word/count pair that occurs most often?
```
def max_list(data):
    return sorted([(elt[1],elt[0]) for elt in data])[-1]
```
1. lists are sorted alphabetically by word, sorting puts the last word alphabetically at the end, and this word is returned.
2. lists are sorted lexicographically; since elt[1] is the count and is the first element of each tuple in the list being sorted, the last tuple in the sorted list has the largest number of occurrences, and this last tuple is returned
3. the function returns a tuple that's the most-frequently occurring word because the list passed to the function is already ordered by frequency of occurrence.
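To experiment with how Python orders tuples, the pieces of max_list can be run one step at a time. This is a small sketch with made-up sample data. The variable names are assumptions, and the max call with a key function at the end is an alternative shown for comparison, not the approach used in the course file.

```python
pairs = [["the", 45], ["cat", 13], ["dog", 9], ["zebra", 2]]

# Step 1: flip each [word, count] pair into a (count, word) tuple
flipped = [(elt[1], elt[0]) for elt in pairs]

# Step 2: sort the tuples and look at the whole ordering
ordered = sorted(flipped)
print(ordered)

# Step 3: take the last tuple, as max_list does
last = ordered[-1]
print(last)

# Tuples compare element by element, starting with the first
print((13, "cat") < (45, "the"))

# Alternative (not from the course file): keep the original pair order
best = max(pairs, key=lambda elt: elt[1])
print(best)
```

Note that max_list hands back a (count, word) tuple, while the max call with a key returns the original [word, count] list unchanged.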

Dictionary Basics

The following Python console session illustrates some of the methods that work with dictionaries. The variable d stores a dictionary as shown. What the user types follows the >>> prompt.

>>> d
{'duke': 50, 'columbia': 30, 'stanford': 20}
>>> d.keys()
['duke', 'columbia', 'stanford']
>>> d.values()
[50, 30, 20]
>>> d.items()
[('duke', 50), ('columbia', 30), ('stanford', 20)]
>>> [x[1] for x in d.items()]
[50, 30, 20]

If the x[1] in the last line is replaced by x[0], what is printed?
1. [20, 30, 50]
2. ['duke', 'duke', 'duke']
3. ['duke', 'columbia', 'stanford']
After the user types d['duke'] = 80, what is printed by the expression d.values()?
1. [50, 30, 20]
2. [80, 30, 20]
3. [50, 80, 30]
The code below is executed next (after the value associated with 'duke' is changed to 80). What is printed?
```
for name in d:
    d[name] += 10
print d
```
1. {'duke': 90, 'columbia': 40, 'stanford': 30}
2. {'duke': 90, 'columbia': 30, 'stanford': 20}
3. {'duke': 90, 'columbia': 40, 'stanford': 20}
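The same dictionary operations (testing for a key, assigning, and updating with +=) are enough to count words. The sketch below shows one way to do it. The name count_with_dict and the sample words are assumptions for illustration and are not taken from the course's fast_fingerp.

```python
def count_with_dict(words):
    """Count words using a dictionary mapping word -> count (hypothetical sketch)."""
    counts = {}
    for word in words:
        word = word.lower()
        if word in counts:       # key lookup, no scan over earlier words
            counts[word] += 1
        else:
            counts[word] = 1
    return counts

counts = count_with_dict(["Ant", "bat", "ant", "dog", "ant"])
print(sorted(counts.items()))
```

The items are sorted before printing so the output does not depend on dictionary ordering. The console session above is from Python 2, where keys(), values(), and items() return lists; under Python 3 they return view objects, so wrap them in list() to get the same printed form, and write print(d) instead of print d.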