Compsci 101, Fall 2016, Lab 11
The lab
In this lab you will:
- Experiment with recursion
- Experiment with regular expressions
There is code to snarf for lab 11, that
code is also here.
Getting Credit for Lab 11
To get credit for this lab, you will need to do the following by Sunday
night.
Feedback on UTAs
Login to Sakai and give us feedback on your lab UTAs. There is a link
on the Sakai announcements page to the form for UTA feedback.
Part 1a: Recursion
First discuss the following mystery function (where lines are numbered)
and calls to it.
How many recursive calls are there for each call (the original call is not
a recursive call)? What is the return value in each case?
1 def mystery(x, y):
2 if x <= y:
3 return 1
4 return 2 + mystery(x-1, y+1)
print mystery(5,5)
print mystery(7,4)
print mystery(10,3)
Answer these questions on the online form.
- Which numbered line has the recursive call?
- How many recursive calls are there for mystery(5,5) (do not count
the original call mystery(5,5))?
- What is the output for mystery(5,5)?
- How many recursive calls are there for mystery(7,4) (do not count
the original call mystery(7,4))?
- What is the output for mystery(7,4)?
- How many recursive calls are there for mystery(10,3) (do not count
the original call mystery(10,3))?
- What is the output for mystery(10,3)?
Part 1b : Recursion
Now you will rework one older APT by rewriting it to solve it with
simpler recursion - Acronym
If you have already done it, you could comment out the code you
have and put in new code that uses recursion.
Just use the Test link to test it out since we will not submit this as an
APT.
Answer these questions:
- What is the smaller problem to solve?
- What is the way out to stop the recursion, the smallest problem that
does not use recursion
- What is the code for this function that is recursive?
Part 2: Regular Expressions
Snarf the code for this part of the lab.
In creating regular expressions you'll need to understand these aspects of
regular expressions. You can see more about regular expressions in Python
via Python documentation, e.g.,
https://docs.python.org/2/library/re.html
Answer each question on the online form.
You can run module RegexDemo.py which accepts regex expressions and
prints those words that match the expression. It reads data from a file
that has a lot of words, with one word per line.
For example, entering the regex
^\w\w\wz$
matches 12 words as shown below. There have to be four characters, the
first three are any letter and the fourth letter must be z.
benz
buzz
cruz
fuzz
jazz
katz
lutz
quiz
ritz
salz
suez
whiz
Whereas entering
\w\w\wz$
matches 65 words, the last four of which are shown here. In this case
there are restrictions on the last four characters but none at the
beginning of the string. There must be at least four characters and the
last character must be z.
vasquez
velasquez
waltz
whiz
You'll use expressions provided to you and develop expressions to answer
questions about understanding regular expressions.
|
regex
|
purpose
|
|
.
|
any character
|
|
*
|
zero or more of previous regex
|
|
\w
|
any alphanumeric character (and _)
|
|
+
|
one or more of previous regex
|
|
\s
|
any whitespace character
|
|
*? or +?
|
non-greedy version of either * or +
|
|
\d
|
any digit character
|
|
()
|
tag/group a regular expression
|
|
[]
|
character class, e.g., [A-Z] or [aeiou]
|
|
\1, \2, ..
|
match numbered tagged/grouped regex
|
|
{n}
|
n occurrences of preceding regex
|
|
^
|
beginning of line/string
|
|
[^...]
|
not the characters in the class, e.g., [^aeiou]
|
|
$
|
end of line/string
|
Part 2a) Answer these questions on the online form.
- How many words end in the letter 'a'?
Answer: there are 898 words that end with 'a',
the regex used is a$ and the last word printed is zomba
-
The regex [aeiou]$ indicates there are 6988 words that end with a
vowel. How many words start and end with a vowel?
-
How many words start and end with the same vowel, e.g., extreme and aorta
are two such words?
-
The words obsequious and pharmacopoeia each contain four vowels in a
row. How many words contain four consecutive vowels?
-
'radar' is a
five-letter palindrome, it reads the same from left-to-right as from
right-to-left. The regex
^(\w)(\w)\w\2\1$
shows there are 9 five-letter
palindromes. How many seven letter palindromes are there and what are they?
-
How many seven letter words end with 'ing'?
-
How many seven letter words start with 's' and end with 'ings'?
-
There are 7955 words containing 'in', that match either in or (in) for
example. The word 'maintaining' contains three occurrences of 'in'. How
many words are there that contain three occurrences of 'in'?
-
How many words contain any two letter sequence repeated three times, not
just 'in' as in the previous question. For example, 'anticompetitive' has the
sequence 'ti' repeated.
-
The regular expression ((\w).\2).*\1 matches the 12 words below (and only
these words). Explain why these words match, that is how the regular
expression works. Note that the expression (\w) is the second tagged
expression since the first left parenthesis begins the first tagged
expression -- you count tagged expressions by the order in which the
left-parenthesis appears in reading left-to-right.
amalgamate
amalgamated
amalgamates
amalgamatating
amalgamatation
assesses
dereference
monotonous
monotonously
monotonousness
possesses
sensitivities
-
Make up your own "word puzzle" and enter it in the lab form. Include the question, the answer, and the regex you developed to answer it.
Part 2b)
You'll need to modify RegexDemo.py to also look at the other file
provided, and answer some questions.
- A second file provided is littlebrother.txt - the free text of the
book entitled Little Brother by Cory Doctorow. Modify the main section
of RegexDemo.py to ask the user if they want to read in words.txt or the
book. Then read in the appropriate file. If you read in the book you
should put it into the format of a list of words from the book, not a list of
lines. The file words.txt has one word per line in it but
littlebrother.txt has many words per line.
Enter the main code you wrote in the form.
- Calculate how many unique emails are in the book littlebrother.txt.
Enter the regular expression you used and the number.
- Calculate how many web sites start with http: are in
littlebrother.txt.
Enter the regular expression you used and the number.
- Calculate how many words in littlebrother.txt have a number in them
(examples: 68, 6A, word2).
Enter the regular expression you used and the number.
- Calculate how many numbers are in littlebrother.txt (Don't count words
that have
any symbols that are not digits. 456 is good, but 786-3452 and 6A would not count)
Enter the regular expression you used and the number.
- Calculate how many words in littlebrother.txt have at least one digit
and at least one letter (examples: 63A, word26).
Enter the regular expression you used and the number.
Submit
After submitting the lab form, then submit the code you wrote
with ambient/websubmit. Use lab11 as the submission folder.
Last Lab
This is the last lab of the semester. Celebrate!