Compsci 101, Fall 2012, October 17

By entering your name/net-id below you indicate you are in class on October 17 to answer these questions and that you have answered them. Your name should not appear unless you are in class when answering these questions.
Name____________________   net-id _________       Name____________________   net-id _________       

Name____________________   net-id _________       Name____________________   net-id _________       

    DictionaryTimings.py

    In DictionaryTimings.py, two functions, linear and binary, do similar tasks: each counts the number of times every word in the parameter words occurs. Each function returns a list of lists, where each inner list is like ["apple", 57], indicating that the word "apple" occurs 57 times in the list of strings passed as a parameter. So these functions might return the list below for a data source that contains three words occurring as many times as shown.
       [ ["ant", 15], ["bat", 3], ["dog", 8] ]
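
To make the return format concrete, here is a minimal sketch that produces a list of [word, count] pairs from a list of strings. It is not the code of linear or binary from DictionaryTimings.py: the function name count_words is hypothetical, it counts with a dictionary rather than with either approach used in that file, and sorting the result alphabetically is an assumption based on the example above.

```python
def count_words(words):
    """Return [[word, count], ...] for every distinct word in words.

    Hypothetical helper for illustration only; it is not the linear or
    binary function from DictionaryTimings.py.
    """
    counts = {}
    for w in words:
        # Start at 0 the first time a word is seen, then add one.
        counts[w] = counts.get(w, 0) + 1
    # Assumption: pairs are ordered alphabetically by word.
    return [[w, counts[w]] for w in sorted(counts)]


sample = ["bat", "ant", "dog", "ant", "bat", "ant"]
print(count_words(sample))  # [['ant', 3], ['bat', 2], ['dog', 1]]
```

Each inner list can be read as pair[0] for the word and pair[1] for its count, which matches the [ ["ant", 15], ... ] shape shown above.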
    
    

  1. What line is used to open a file rather than a URL in the function tradeoff?

    1. f = urllib.open(datasource)

    2. f = open(datasource)

    3. f.close()

  2. What's the name of the module that facilitates finding the current time?

    1. urllib

    2. time

    3. std_time

  3. What is the purpose of the boolean variable initialized to False in the outer loop of linear?

    1. it indicates if w has been found in data in the inner loop

    2. it indicates if w is the first element of data

    3. it indicates if w occurs in more than one pair of data

  4. Which line is executed in linear the first time a word is found?

    1. elt[1] += 1

    2. data.append([w,1])

    3. found = True

  5. It takes 0.831 seconds to process melville.txt, which has 3,128 unique words and about 14,000 total words, on ola's desktop machine. If this file is concatenated with a copy of itself so it's twice as big (available as melville2.txt), it takes 1.662 seconds to process: still 3,128 unique words but now about 28,000 total words. For a file that's four copies of melville.txt concatenated together (available as melville4.txt), it takes 3.334 seconds: about 56,000 total words, still 3,128 unique words.

    About how long would it take to process melville8.txt, based on these numbers?

    1. 6 seconds

    2. 12 seconds

    3. 18 seconds

  6. In the function binary what is the purpose of the first if statement? (more than one may be correct)

    1. It ensures that checking to see if a word already appears doesn't cause an illegal-index error.

    2. It adds a new word to data that's never been seen when it's the last word alphabetically.

    3. It could be swapped with the elif check and still work.

  7. What is true of the code in the else statement? (more than one may be correct)

    1. It updates the number of occurrences when w has been seen before.

    2. It is not executed if w is the last word in data.

    3. It is executed only when w is the first word in data.