Compsci 101, Fall 2016, Lab 11

The lab

In this lab you will:

There is code to snarf for lab 11, that code is also here.

Getting Credit for Lab 11

To get credit for this lab, you will need to do the following by Sunday night.


Feedback on UTAs

Login to Sakai and give us feedback on your lab UTAs. There is a link on the Sakai announcements page to the form for UTA feedback.


Part 1a: Recursion

First discuss the following mystery function (where lines are numbered) and calls to it. How many recursive calls are there for each call (the original call is not a recursive call)? What is the return value in each case?

1  def mystery(x, y):
2      if x <= y:
3          return 1
4      return 2 + mystery(x-1, y+1)

print mystery(5,5)
print mystery(7,4)
print mystery(10,3)

Answer these questions on the online form.

  1. Which numbered line has the recursive call?
  2. How many recursive calls are there for mystery(5,5) (do not count the original call mystery(5,5))?
  3. What is the output for mystery(5,5)?
  4. How many recursive calls are there for mystery(7,4) (do not count the original call mystery(7,4))?
  5. What is the output for mystery(7,4)?
  6. How many recursive calls are there for mystery(10,3) (do not count the original call mystery(10,3))?
  7. What is the output for mystery(10,3)?

Part 1b : Recursion

Now you will rework one older APT by rewriting it to solve it with simpler recursion - Acronym

If you have already done it, you could comment out the code you have and put in new code that uses recursion.

Just use the Test link to test it out since we will not submit this as an APT.

Answer these questions:

  1. What is the smaller problem to solve?
  2. What is the way out to stop the recursion, the smallest problem that does not use recursion
  3. What is the code for this function that is recursive?

Part 2: Regular Expressions

Snarf the code for this part of the lab.

In creating regular expressions you'll need to understand these aspects of regular expressions. You can see more about regular expressions in Python via Python documentation, e.g., https://docs.python.org/2/library/re.html

Answer each question on the online form. You can run module RegexDemo.py which accepts regex expressions and prints those words that match the expression. It reads data from a file that has a lot of words, with one word per line.

For example, entering the regex

^\w\w\wz$

matches 12 words as shown below. There have to be four characters, the first three are any letter and the fourth letter must be z.

benz
buzz
cruz
fuzz
jazz
katz
lutz
quiz
ritz
salz
suez
whiz

Whereas entering

\w\w\wz$

matches 65 words, the last four of which are shown here. In this case there are restrictions on the last four characters but none at the beginning of the string. There must be at least four characters and the last character must be z.

vasquez
velasquez
waltz
whiz

You'll use expressions provided to you and develop expressions to answer questions about understanding regular expressions.

regex purpose
. any character
* zero or more of previous regex
\w any alphanumeric character (and _)
+ one or more of previous regex
\s any whitespace character
*? or +? non-greedy version of either * or +
\d any digit character
() tag/group a regular expression
[] character class, e.g., [A-Z] or [aeiou]
\1, \2, .. match numbered tagged/grouped regex
{n} n occurrences of preceding regex
^ beginning of line/string
[^...] not the characters in the class, e.g., [^aeiou]
$ end of line/string

Part 2a) Answer these questions on the online form.

  1. How many words end in the letter 'a'?

    Answer: there are 898 words that end with 'a', the regex used is a$ and the last word printed is zomba

  2. The regex [aeiou]$ indicates there are 6988 words that end with a vowel. How many words start and end with a vowel?
  3. How many words start and end with the same vowel, e.g., extreme and aorta are two such words?
  4. The words obsequious and pharmacopoeia each contain four vowels in a row. How many words contain four consecutive vowels?
  5. 'radar' is a five-letter palindrome, it reads the same from left-to-right as from right-to-left. The regex
    ^(\w)(\w)\w\2\1$ 
    
    shows there are 9 five-letter palindromes. How many seven letter palindromes are there and what are they?
  6. How many seven letter words end with 'ing'?
  7. How many seven letter words start with 's' and end with 'ings'?
  8. There are 7955 words containing 'in', that match either in or (in) for example. The word 'maintaining' contains three occurrences of 'in'. How many words are there that contain three occurrences of 'in'?
  9. How many words contain any two letter sequence repeated three times, not just 'in' as in the previous question. For example, 'anticompetitive' has the sequence 'ti' repeated.
  10. The regular expression ((\w).\2).*\1 matches the 12 words below (and only these words). Explain why these words match, that is how the regular expression works. Note that the expression (\w) is the second tagged expression since the first left parenthesis begins the first tagged expression -- you count tagged expressions by the order in which the left-parenthesis appears in reading left-to-right.

    amalgamate
    amalgamated
    amalgamates
    amalgamatating
    amalgamatation
    assesses
    dereference
    monotonous
    monotonously
    monotonousness
    possesses
    sensitivities
    
  11. Make up your own "word puzzle" and enter it in the lab form. Include the question, the answer, and the regex you developed to answer it.

Part 2b)

You'll need to modify RegexDemo.py to also look at the other file provided, and answer some questions.

  1. A second file provided is littlebrother.txt - the free text of the book entitled Little Brother by Cory Doctorow. Modify the main section of RegexDemo.py to ask the user if they want to read in words.txt or the book. Then read in the appropriate file. If you read in the book you should put it into the format of a list of words from the book, not a list of lines. The file words.txt has one word per line in it but littlebrother.txt has many words per line.

    Enter the main code you wrote in the form.

  2. Calculate how many unique emails are in the book littlebrother.txt. Enter the regular expression you used and the number.

  3. Calculate how many web sites start with http: are in littlebrother.txt. Enter the regular expression you used and the number.

  4. Calculate how many words in littlebrother.txt have a number in them (examples: 68, 6A, word2). Enter the regular expression you used and the number.
  5. Calculate how many numbers are in littlebrother.txt (Don't count words that have any symbols that are not digits. 456 is good, but 786-3452 and 6A would not count) Enter the regular expression you used and the number.
  6. Calculate how many words in littlebrother.txt have at least one digit and at least one letter (examples: 63A, word26). Enter the regular expression you used and the number.

Submit

After submitting the lab form, then submit the code you wrote with ambient/websubmit. Use lab11 as the submission folder.

Last Lab

This is the last lab of the semester. Celebrate!