Compsci 6, Timing FUN II, March 2
Name____________________ net-id _________
Name____________________ net-id _________
Name____________________ net-id _________
GCTiming
The questions here are about the code in
GCTiming.py in which
two algorithms are used to calculate the largest gc-window
in a strand of DNA (a likely protein-coding region).
The first data is for function max_index
, a plot and fit from
Excel is shown for the data on the right. The data is generated using a
fixed strand size (10000 nucleotides) and varying the windowsize
as shown.
|
strand size = 10000
function =
win = 1000 index = 3565 time=1.525
win = 1100 index = 3486 time=1.701
win = 1200 index = 3567 time=1.857
win = 1300 index = 3218 time=1.955
win = 1400 index = 3166 time=2.055
win = 1500 index = 2897 time=2.214
win = 1600 index = 2952 time=2.322
win = 1700 index = 2881 time=2.425
win = 1800 index = 2941 time=2.750
win = 1900 index = 2950 time=2.639
win = 2000 index = 2397 time=3.055
win = 2100 index = 2413 time=3.237
win = 2200 index = 2385 time=3.018
win = 2300 index = 2285 time=3.163
win = 2400 index = 2181 time=3.129
win = 2500 index = 2378 time=3.387
win = 2600 index = 2182 time=3.311
win = 2700 index = 2182 time=3.396
win = 2800 index = 2414 time=3.475
win = 2900 index = 2414 time=3.524
win = 3000 index = 2316 time=3.609
|
- Which is the best explanation as to why increasing the window-size
increases the running time (see function
max_index
for coding details).
- The loop over
range(0,len(strand)-windowSize+1))
will
iterate more times as the value of windowSize
increases.
- The time to execute the function call
cgratio(strand[i:i+windowSize])
increases
as the value of windowSize
increases.
- The time to pass the value of parameter
windowSize
to
the function max_index
increases as the value
of
windowSize
increases.
- If the window size is fixed, e.g., at 1000 and the size of the
strand increases from 10,000 to 80,000, what do you expect to
happen to the running time of
max_index
?
- The running time will increase linearly, e.g., as the strand size
doubles the running time will double.
- The running time will increas quadratically, e.g., as the strand
size doubles the running time will quadruple.
- The running time depends on the window-size and not on the strand
size, so the running time will remain constant.
The timings below are for a fixed strand size of 10,000 as the
window-size varies for the function running_max
.
function =
win = 1000 index = 3565 time=0.008
win = 1100 index = 3486 time=0.005
win = 1200 index = 3567 time=0.006
win = 1300 index = 3218 time=0.005
win = 1400 index = 3166 time=0.006
win = 1500 index = 2897 time=0.005
win = 1600 index = 2952 time=0.005
win = 1700 index = 2881 time=0.006
win = 1800 index = 2941 time=0.005
win = 1900 index = 2950 time=0.005
win = 2000 index = 2397 time=0.006
win = 2100 index = 2413 time=0.005
win = 2200 index = 2385 time=0.007
win = 2300 index = 2285 time=0.005
win = 2400 index = 2181 time=0.005
win = 2500 index = 2378 time=0.006
win = 2600 index = 2182 time=0.005
win = 2700 index = 2182 time=0.005
win = 2800 index = 2414 time=0.005
win = 2900 index = 2414 time=0.006
win = 3000 index = 2316 time=0.005
- Which best describes the value stored in
counters[i]
?
- The number of occurrences of 'g' or 'c' in
range(0,i)
.
- The number of occurrences of 'g' or 'c' in
range(0,i+1)
.
- The number of occurrences of 'g' or 'c' in
range(i,len(strand))
.
- Which best explains why the running time is roughly constant?
- The running time of the two loops in
running_max
do
not depend on the value of windowSize
at all.
- The first loop in
running_max
does not depend on the
value
of windowSize
and the second loop does roughly the same
amount of "work" as the first loop even though its running
time will depend to some extent on the value
of windowSize
- The running time is not constant, but it's so small that the
timings don't truly show the variation, that's why the times go up and
down as the window size increases.
- True or False: the running time of
running_max
will
increase linearly with a fixed window-size (e.g., 1000) as
the size of the strand varies (e.g., from 10,000 to 100,000).