Compsci 101, Fall 2014, Transform HOWTO

(with more detail and recommendations on how to start)

Overview

Here is how to get started with this assignment.

This assignment is about looping over lists, reading and writing files, and using functions to transform data from one form to another. It's also about reading code provided to you and trying to fit new code into the existing code. This is a skill that's needed in writing all kinds of programs. You have to do three things:

We have given you a lot of code for this assignment that already does many things for you, such as setting up a nice interface with menus. We have given you four files, but you only need to modify two of them: FileTransform.py and Transforms.py. Even in those files we have already done a lot of the work for you.

Step 1: Run the program to see what it does

Before you make any changes, run FileTransform.py to see what it does (be sure to highlight it in Eclipse before running it). It should do the following:

Step 2: Create a simple data file called simple.txt

It is much easier to work with a simple data file, and you are required to write one as part of this assignment, so start the file now. Put two short sentences in it, on at least two lines. You can add to it later as you work on transformations.

To create the file in the data folder, click on the data folder to highlight it, then select New > Other > General > Untitled Text File.

You may find that this simple data file is the most helpful in debugging your program. You can also create additional files to test your program. Do not run it on large files until you have your program working!
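If you would rather create the file from Python than through the Eclipse menus, a minimal sketch follows. The folder and the two sentences are placeholders, not part of the assignment; the sketch writes into a temporary folder so that it cannot overwrite anything in your project.

```python
import os
import tempfile

# Hypothetical folder standing in for the project's data folder.
data_folder = tempfile.mkdtemp()
path = os.path.join(data_folder, "simple.txt")

# Two short sentences on two lines, as this step suggests.
outfile = open(path, "w")
outfile.write("the cat sat on a mat\n")
outfile.write("it was quite happy\n")
outfile.close()

# Read the file back to confirm what was written.
infile = open(path)
lines = infile.read().splitlines()
infile.close()
```

After this runs, lines holds the two sentences, one string per line of the file.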

Step 3: Make changes in FileTransform.py

This section provides a general overview of FileTransform.py, in which you will modify only two functions: write_words and transform. You should not modify any other functions in this file.

When you run FileTransform.py, in addition to executing the code in this file, it will execute code in the other three files.

You should get these functions working before testing the pig-latin and rot13 functions you will write in Transforms.py that are described in a later step.

Here is what FileTransform.py does. It has four functions in it.

  1. transform_file - This is the main function that starts the program. It has already been written for you; you do not need to modify it.
    def transform_file ():
        """
        do the work for this program: 
          - prompt user for file
          - prompt user for transform function
          - apply transform to each word
          - write transformed data to a file specified by user
        """
        file = Input.choose_file_to_open()
        if file == None:
            return
        words = get_words(file)
        file.close()
        
        func = Input.choose_transform()
        if func == None:
            return
        twords = transform(words, func)
        
        file = Input.choose_file_to_save()
        if file == None:
            return
        write_words(file, twords)
        file.close();
    
    

  2. The function get_words already works. It takes a file that is open for reading, reads it line by line and for each line it converts the line into a list of words on that line. This function returns a list of lists of words as mentioned above.

    Do not modify this function; it already works.

  3. The function transform - You need to modify this function to apply the transformation to each word from the file (see below).
  4. The function write_words - You need to modify this function to write all the transformed words to a file (see below).
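The list-of-lists structure that get_words returns can be illustrated with a small sketch. This is not the provided get_words; the name get_words_sketch is hypothetical, and the sketch assumes that words are separated by whitespace. An in-memory text buffer stands in for a file that is open for reading.

```python
import io

def get_words_sketch(infile):
    """Return a list of lists: one sublist of words per line of infile."""
    all_words = []
    for line in infile:
        # split() with no arguments splits on any run of whitespace
        all_words.append(line.split())
    return all_words

# An in-memory buffer standing in for an open file.
sample = io.StringIO(u"hello big world\nsecond line\n")
words = get_words_sketch(sample)
# words is [['hello', 'big', 'world'], ['second', 'line']]
```

Each sublist corresponds to one line of the input, which is the shape that transform and write_words receive.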

Step 3A: Modify the transform function in FileTransform.py

Here is the current code in this function:

def transform(words, func):
    '''
    apply func to each word in words and return the result
     - words is a list of lists,
       where each sublist represents a line from the original file
     - the result is a list of lists, 
       where in each sublist each word has been 
       transformed by applying func to the word
    '''
    # TODO: change each word in the list of lists, using func to accomplish the change
    # FOR EXAMPLE:
    newWords = words[:]                 # copy list
    newWords[0][0] = func(words[0][0])  # change first word by calling func on it  
    return newWords

Currently this function applies the transformation function func to only the first word in the file. To see that, run FileTransform.py and this time select the simple.txt file you created, then select the Transform to UPPERCASE option. Finally, select a file to write to (again, no file will be written yet, since you have not implemented that).

You should see your simple file printed in the console window with the first word in uppercase, as only the first word was converted to uppercase.

Modify this function to apply the transformation to all the words in words, and return a new list of lists of transformed words.

If your code is correct, you should be able to test it on your simple.txt file and use the UPPERCASE transformation and see all the words in uppercase.
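One way to approach this is sketched below, under some assumptions. Note that words[:] copies only the outer list, so the sublists are shared with the original and assigning into newWords[0][0] also changes words[0][0]. The sketch avoids that by building fresh sublists. The name transform_sketch is hypothetical, and str.upper stands in for whatever function the menu passes as func.

```python
def transform_sketch(words, func):
    """Return a new list of lists with func applied to every word."""
    new_words = []
    for line in words:            # each line is a list of words
        new_line = []
        for w in line:
            new_line.append(func(w))
        new_words.append(new_line)
    return new_words

words = [["hello", "world"], ["second", "line"]]
result = transform_sketch(words, str.upper)
# result is [['HELLO', 'WORLD'], ['SECOND', 'LINE']]; words is unchanged
```

Because every sublist is rebuilt, the original list of lists is left untouched, which makes it easier to compare input and output while debugging.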

Step 3B: Modify the write_words function in FileTransform.py

Whenever you print something to the console, you should also write it to the file open for output. This way the transformed words will be written to the console in Eclipse and saved to a file. The current version of function write_words writes to the console only, but it takes a file parameter that you'll use when you modify the function. This is the code you're given:

    for line in words:
        for w in line:
            print w + " ",
        print

Note the inner loop has a comma in the print statement, that keeps the output on a single line. The print statement after the inner loop moves to the next line, because one sub-list of words, which is one line of the transformed file, has been written completely to the console.

You must modify this function to write the words to a file as well as to the console. To write to a file, use the file's write method, which takes a string as a parameter. Two uses are shown below for a variable named outfile:

    outfile.write(word)
    # to end a line, write the newline character
    outfile.write("\n")

To make sure the output file is completely written, the last line of your code must close the file as below:

    outfile.close()

This will ensure that all writing to the file happens, that the file is flushed and closed properly.

This means that completing write_words requires mirroring the print calls to also write to a file using file.write as follows:
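A minimal sketch of the mirroring idea, under some assumptions: the name write_words_sketch is hypothetical, sys.stdout.write stands in for the print statements so that the sketch runs the same way on Python 2 and Python 3 (its spacing is only roughly what print produces), and an in-memory buffer stands in for the output file. The final outfile.close() that the real function needs is left out here so that the buffer can still be inspected.

```python
import io
import sys

def write_words_sketch(outfile, words):
    """Echo each line of words to the console and to outfile."""
    for line in words:
        for w in line:
            sys.stdout.write(w + " ")   # console copy
            outfile.write(w + " ")      # file copy
        sys.stdout.write("\n")          # end the console line
        outfile.write("\n")             # end the file line

# An in-memory buffer standing in for a file opened for writing.
demo = io.StringIO()
write_words_sketch(demo, [[u"HELLO", u"WORLD"], [u"SECOND", u"LINE"]])
contents = demo.getvalue()
```

Every console write has a matching file write directly beside it, which makes it easy to check that the two outputs stay in step.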

When you have this code correct, you should be able to run FileTransform.py and give the name of a new file to write to. You may then have to refresh the data folder in Eclipse to see the new file.

Step 4: Modify functions in Transforms.py

When editing this file, do not remove the comments we provide. In particular, the GUI menu looks for a comment right after the def statement of each function in Transforms.py, and if you remove that comment the program will not run. It is OK to remove a comment that is beside code you need to replace.

In this part you should add to your simple.txt data file. Put in words that represent all the cases you will need to test for pig-latin, and words you can use to test rot13.

Modify the Python module named Transforms.py. In this module, start by writing the functions transform_pigify and transform_unpigify that each have a single string parameter and return a string that is either the pig-latin equivalent or that is reversed from pig-latin to normal text, respectively. You are given two transform functions, transform_identity and transform_uppercase, that serve as simple examples.
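The general shape of such a function (one string in, one string out, with a comment right after the def line) might look like the sketch below. This is only a guess at the form of the two provided examples; the versions in the Transforms.py you were given are authoritative, and the comment wording here is an assumption.

```python
def transform_identity(word):
    """Leave the word unchanged"""   # comment text is an assumption
    return word

def transform_uppercase(word):
    """Transform to UPPERCASE"""     # comment text is an assumption
    return word.upper()

same = transform_identity("Hello")
upper = transform_uppercase("Hello")
# same is 'Hello' and upper is 'HELLO'
```

Your own transform functions would follow the same pattern: take a single word, build the transformed string, and return it.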

Pig-latin

These are the rules you should use to convert a word into pig-latin. We're using a hyphen to facilitate translating back from pig-latin to English. In creating pig-latin you will not be concerned with punctuation, so treat every character as either a vowel or not-a-vowel, and punctuation falls into the second category.
  1. If a word begins with 'a', 'e', 'i', 'o', or 'u', then append the string "-way" to form the pig-latin equivalent. Examples:
    Word       Pig-latin equivalent
    anchor     anchor-way
    elegant    elegant-way
    oasis      oasis-way
    isthmus    isthmus-way
    only       only-way

  2. If a word begins with a non-vowel (we will call this a consonant, but it could be a number, punctuation, or something else), move the prefix before the first vowel to the end of the word, joined by a hyphen and followed by "ay". Treat 'y' as a vowel, except when it is the first letter of a word, in which case it is a consonant. Examples:
    Word       Pig-latin equivalent
    computer   omputer-cay
    slander    ander-slay
    spa        a-spay
    pray       ay-pray
    yesterday  esterday-yay
    strident   ident-stray
    rhythm     ythm-rhay

  3. Words that begin with 'qu' should be treated as though the 'u' is a consonant, so the whole "qu" is part of the prefix that moves. Examples:
    Word       Pig-latin equivalent
    quiet      iet-quay
    queue      eue-quay
    quay       ay-quay

A few words will not conform to these rules, but the rules should always be used. If a word contains no vowels it should be treated as though it starts with a vowel. For example "zzz" will be translated to "zzz-way".

It is possible that different words will be transformed to the same pig-latin form. For example, "it" is "it-way", but "wit" is also "it-way" using the rules above.
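The rules above can be sketched in code as follows. This is one possible reading of the rules, not the required solution. The names first_vowel_index and pigify_sketch are hypothetical, and comparing letters in lowercase is an assumption, since the rules only mention lowercase vowels.

```python
def first_vowel_index(word):
    """Return the index of the first vowel, or -1 if there is none."""
    lower = word.lower()          # assumption: ignore case when testing
    # Rule 3: a leading 'qu' is skipped as part of the prefix.
    start = 2 if lower.startswith("qu") else 0
    for i in range(start, len(lower)):
        ch = lower[i]
        # 'y' counts as a vowel except as the first letter.
        if ch in "aeiou" or (ch == "y" and i > 0):
            return i
    return -1

def pigify_sketch(word):
    """Return a pig-latin form of word following the three rules."""
    i = first_vowel_index(word)
    if i <= 0:
        # starts with a vowel (i == 0) or has no vowel at all (i == -1)
        return word + "-way"
    return word[i:] + "-" + word[:i] + "ay"

examples = [pigify_sketch(w) for w in ["anchor", "rhythm", "quiet", "zzz"]]
# examples is ['anchor-way', 'ythm-rhay', 'iet-quay', 'zzz-way']
```

Finding the split point in a separate helper keeps the three rules in one place, so each case in the tables above can be checked against a single function.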

ROT13

Write a function named transform_rot13 to use a ROT13 cipher to encode/decode a string, and then use this to encode every word in a file. The function transform_rot13 returns a rotated form of its string parameter:

    def transform_rot13(w):
        s = ""
        # write code to concatenate characters to s
        return s

To convert a letter to its ROT13 equivalent, we suggest using the two strings below, the string find method (which returns an index), and the string indexing operator.

   a = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
   b = "NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm"

For example, the letter 'S' in the first string (labeled a above) is found at index 18. Note that a.find("S") will evaluate to 18. Then b[18] represents the encoding for 'S', namely 'F'. You can reverse the roles of the strings a and b since the ROT13 cipher is symmetric.
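The single-letter lookup just described can be tried out directly. The variable names idx, encoded, decoded and decoded_again are only for illustration.

```python
a = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
b = "NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm"

idx = a.find("S")       # 18: position of 'S' in a
encoded = b[idx]        # 'F': the letter at the same position in b

# Reversing the roles of a and b undoes the encoding...
decoded = a[b.find(encoded)]         # 'S'
# ...and so does applying the same a-to-b lookup a second time.
decoded_again = b[a.find(encoded)]   # 'S'
```

Both ways of decoding give back 'S', which is what it means for the cipher to be symmetric.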

There are other ways to do the ROT13 cipher using the functions chr, ord and the % operator, but the approach suggested by using indexing and the strings above is much easier to get working.
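For comparison, the chr, ord and % approach might look like the sketch below. The name rot13_char_ord is hypothetical, and returning non-letters unchanged is an assumption. Note that uppercase and lowercase letters need separate handling, which is part of why the indexing approach is easier to get working.

```python
def rot13_char_ord(ch):
    """Rotate a single character by 13 places using chr, ord and %."""
    if "a" <= ch <= "z":
        base = ord("a")
    elif "A" <= ch <= "Z":
        base = ord("A")
    else:
        return ch       # assumption: non-letters are left unchanged
    # Shift within the 26-letter alphabet, wrapping around with %.
    return chr((ord(ch) - base + 13) % 26 + base)

checks = [rot13_char_ord(c) for c in "Saz!"]
# checks is ['F', 'n', 'm', '!']
```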

You can identify non-letters by using the return value of the string find method which will be -1 for non-letters. Alternatively you can use the Python string module to identify letters. For example, the code below generates the output that follows it.

    import string
    for a in "ABCDefg123!,#":
        if not a in string.letters:
            print a

OUTPUT
    1
    2
    3
    !
    ,
    #

Note that you must import the string module to use its functions and constants; see the Python string docs for full information on the module.
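Putting the pieces together, a whole-word ROT13 might be sketched as below. The name rot13_sketch is hypothetical, and passing non-letters through unchanged is an assumption. The sketch uses string.ascii_letters, which is available in both Python 2 and Python 3, whereas string.letters exists only in Python 2.

```python
import string

a = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
b = "NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm"

def rot13_sketch(w):
    """Return w with every letter rotated by 13 places."""
    s = ""
    for ch in w:
        if ch in string.ascii_letters:
            s += b[a.find(ch)]   # look the letter up in a, encode with b
        else:
            s += ch              # assumption: keep non-letters as they are
    return s

encoded = rot13_sketch("Hello!")
decoded = rot13_sketch(encoded)
# encoded is 'Uryyb!' and decoded is 'Hello!'
```

Applying the function twice returns the original word, which is a quick check you can run on any word in your simple.txt file.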

Step 5: Now you can try the extra credit if you want.

We will let you figure that out.