Compsci 6, Timing FUN, February 28

Name____________________   net-id _________       

Name____________________   net-id _________       

Name____________________   net-id _________       

ModeTimings

The questions here are about the code in ModeTimings.py in which there are four functions given to return the most frequenly occurring value in a list of integers. First you'll be asked some questions about the functions, then about how efficient they are.

  1. In mode_compA what's the value assigned to mindex?
    1. The largest value in the list occs (an integer).

    2. The index in occs at which the largest value of occs occurs (an integer).

    3. The number of times that m occurs in occs (an integer).

  2. In mode_compA which is the most relevant as to why the function works correctly.

    1. The number of occurrences of nums[i] is the value stored in occs[i], for 0 <= i < len(nums).

    2. The same number may occur multiple times in nums, but the count of this number only occurs once in occs.

    3. The function max works on integer values, string values, and list values --- i.e., it would return the largest list in a list of lists.

  3. What's the difference between mode_compB and mode_compC (in as few words as possible).
    
    

  4. In mode_compD why is the value mn subtracted from n in the loop and added to mindex in the return statement?

    1. Using mn helps ensure that the code will work with lists of string values as well as lists of int values.

    2. This helps ensure that the lowest value in nums has its count stored in counts[0], i.e., the use of nm maps the lowest value in nums to zero and vice-versa.

    3. The primary purpose of mn is to ensure symmetry with the use of mx.

    Several runs of the program are shown below (from ola's macbook pro):

    Run with n = 10,000 Run with n = 20,000 Run with n = 30,000
    <function mode_compA > mode = 111 time = 3.50 <function mode_compB > mode = 111 time = 3.50 <function mode_compC > mode = 111 time = 3.55 <function mode_compD > mode = 111 time = 0.00 <function mode_compA > mode = 464 time = 13.83 <function mode_compB > mode = 464 time = 14.00 <function mode_compC > mode = 464 time = 14.30 <function mode_compD > mode = 464 time = 0.01 <function mode_compA > mode = 281 time = 31.37 <function mode_compB > mode = 281 time = 32.18 <function mode_compC > mode = 281 time = 33.81 <function mode_compD > mode = 281 time = 0.01

    The function mode_compD has these timings for larger values of n (the number of elements in the list):

    n time
    100,000 0.03
    200,000 0.07
    300,000 0.10
    400,000 0.13
    500,000 0.16

  5. Predict and justify the time for mode_compD and mode_compB for a list with one million elements.
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    


SetTimings

These questions are for the code in SetTimings.py

  1. What does the function create_word do?

    1. Creates a string by shuffling the letters in "etaoins", but returns a list.

    2. Creates a string by shuffling the letters in "etaoins", and returns the string.

    3. Returns the same string every time it is called.

  2. What is the purpose of the slice operator used in diffB?

    1. It ensures that the .count(word) is applied to the part of words before the occurrence of word.

    2. It ensures that the .count(word) is applied to the part of words after the occurrence of word.

    3. It creates a copy of words so that words is not modified by the function diffB.

  3. A run of the program is shown for different sized lists. Predict the time it will take for both diffA and diffB to run on a 20,000 element list, justify your answer. n = 2000 <function diffA > diffs = 1640 time = 0.000 n = 2000 <function diffB > diffs = 1640 time = 0.074 n = 4000 <function diffA > diffs = 2769 time = 0.000 n = 4000 <function diffB > diffs = 2769 time = 0.312 n = 5000 <function diffA > diffs = 3192 time = 0.000 n = 5000 <function diffB > diffs = 3192 time = 0.488 n = 10000 <function diffA> diffs = 4343 time = 0.001 n = 10000 <function diffB> diffs = 4343 time = 1.934

SearchTiming

These questions are for the code in SearchTimings.py

  1. Why is the function linear_search named that?

    1. The code has one loop.

    2. The code does a linear search using the if i in nums check to (possibly) examine all values of nums

    3. The code uses single-letter variable names.

  2. The function bisect.bisect_left returns the least or left-most index where the parameter being searched for does occur, if it occurs at all. If the value is len(nums) the value cannot occur in the list nums. What's the purpose of the compound if statement in binary_search?

    1. It checks whether the value i is not found in list nums.

    2. It checks for whether the value i occurs, but ensures that any index of list nums is valid, i.e., will not generate an index-out-of-bounds exception.

    3. The check for index != len(nums) is un-needed, the code would work without the check.

  3. A run is shown below. How long will it take the linear_search function to search for 20,000 elements in a list of 40,000 elements? Justify. <function linear_search >:size = 10000 search = 5000 found=2500 time = 0.953 <function binary_search >:size = 10000 search = 5000 found=2500 time = 0.013 <function linear_search >:size = 10000 search = 10000 found=5000 time = 2.124 <function binary_search >:size = 10000 search = 10000 found=5000 time = 0.020 <function linear_search >:size = 20000 search = 10000 found=5000 time = 3.820 <function binary_search >:size = 20000 search = 10000 found=5000 time = 0.027 <function linear_search >:size = 20000 search = 20000 found=10000 time = 8.534 <function binary_search >:size = 20000 search = 20000 found=10000 time = 0.043

GCTiming

strand size = 10000 function = <function max_ index > win = 1000 index = 3565 time=1.525 win = 1100 index = 3486 time=1.701 win = 1200 index = 3567 time=1.857 win = 1300 index = 3218 time=1.955 win = 1400 index = 3166 time=2.055 win = 1500 index = 2897 time=2.214 win = 1600 index = 2952 time=2.322 win = 1700 index = 2881 time=2.425 win = 1800 index = 2941 time=2.750 win = 1900 index = 2950 time=2.639 win = 2000 index = 2397 time=3.055 win = 2100 index = 2413 time=3.237 win = 2200 index = 2385 time=3.018 win = 2300 index = 2285 time=3.163 win = 2400 index = 2181 time=3.129 win = 2500 index = 2378 time=3.387 win = 2600 index = 2182 time=3.311 win = 2700 index = 2182 time=3.396 win = 2800 index = 2414 time=3.475 win = 2900 index = 2414 time=3.524 win = 3000 index = 2316 time=3.609 function = <function running_max > win = 1000 index = 3565 time=0.008 win = 1100 index = 3486 time=0.005 win = 1200 index = 3567 time=0.006 win = 1300 index = 3218 time=0.005 win = 1400 index = 3166 time=0.006 win = 1500 index = 2897 time=0.005 win = 1600 index = 2952 time=0.005 win = 1700 index = 2881 time=0.006 win = 1800 index = 2941 time=0.005 win = 1900 index = 2950 time=0.005 win = 2000 index = 2397 time=0.006 win = 2100 index = 2413 time=0.005 win = 2200 index = 2385 time=0.007 win = 2300 index = 2285 time=0.005 win = 2400 index = 2181 time=0.005 win = 2500 index = 2378 time=0.006 win = 2600 index = 2182 time=0.005 win = 2700 index = 2182 time=0.005 win = 2800 index = 2414 time=0.005 win = 2900 index = 2414 time=0.006 win = 3000 index = 2316 time=0.005