Compsci 6, Spring 2011, Transform Howto

Snarfing

This assignment provides some starter files you can use, specifically FileTransform.py and some data files to convert to Piglatin. See snarfing help for information on how to use the Ambient/Snarf functionality in Eclipse. Note that the snarf URL for this semester is http://www.cs.duke.edu/courses/cps006/spring11/snarf

The howto has two parts: information on transforms and information on writing to files.

Overview

This section provides a general overview of FileTransform.py in which you'll write two functions. You can test these functions before testing the pig-latin and rot13 functions you'll write in Transforms.py that are described below.

You can run the program as given and it will print the contents of a file to the Eclipse console window. You'll need to change it to transform the data read from the file and to write the result to a file rather than to the console window. Then you'll need to write new transforms for piglatin and rot13. These are described after the Getting Started section below.

Getting Started

You'll run the provided program FileTransform.py which will make calls to code you write in Transforms.py as described below.

The main function transform_file performs three tasks by calling functions; you will write code that supports the second and third of these tasks.

  1. Get a file to open (using a file dialog)
  2. Get a transform to apply to each word and apply it
  3. Get a file to save transformed data to and save it

Each task/phase of the program is described below:

Open a file

When the program asks for a file, a small rocketship icon will likely appear on your screen. If the file dialog shown below doesn't appear, choose that icon to enable the file dialog:

[Figure: open file dialog]

Choose a transform and transform

You'll choose a file to transform. The data directory that comes with this assignment has three files in it; you can transform those or other files you have. You may have to use the file dialog to navigate to the data directory.

Reading the file creates a list of the lines in the file. Each line is represented by a list of the strings/words on that line. This list of lists is returned by the function get_words, which is already written for you. The code then prompts the user for a transform by displaying a text-based menu in the Eclipse console:

[Figure: transform choice menu]

This transform should be applied to each word in the file being transformed. This means you must write code in the function transform to create a new list of lists, with each word in the parameter words being transformed by applying func. For example, if the parameter words is the two lines from a file represented by the two lists below:

[ ["This", "is", "the", "story"], ["of", "a", "streetcar", "named", "Desire"]]

Then if the parameter func represents piglatin, you'll write code to return this list:

[ ["is-Thay", "is-way", "e-thay", "ory-stay"], ["of-way", "a-way", "eetcar-stray", "amed-nay", "esire-Day"]]

The code you're given in transform returns a copy of the list of words. You should modify it to return a transformed copy.
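As a rough sketch of the idea described above, a nested list comprehension can build the new list of lists. The parameter order (words, then func) is an assumption, and shout is a hypothetical stand-in transform used only for the demonstration; your starter code may be organized differently.

```python
def transform(words, func):
    """Return a new list of lists with func applied to every word.

    Assumes words is a list of lines, each line a list of strings.
    """
    return [[func(word) for word in line] for line in words]


def shout(w):
    # Hypothetical stand-in transform, used only for this demonstration.
    return w.upper()


lines = [["This", "is", "the", "story"],
         ["of", "a", "streetcar", "named", "Desire"]]
result = transform(lines, shout)
print(result)
```

Because the comprehension builds new lists, the original list of lines is left unchanged.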

Writing the file

After the data from the file has been transformed, a new transformed file should be written. The program uses a file-dialog to ask the user for a filename to save the data in.

[Figure: save file dialog]

The program then calls write_words with a file open for writing and a list of transformed data/words. You must complete write_words so that it writes the transformed data to a file.

There's information below on writing to a file. You'll need to test both the code in FileTransform.py that reads and writes a file and the code in Transforms.py that does the transforms.


Transform functions

You'll modify the Python module named Transforms.py. In this module you must ultimately write functions pigify and unpigify. Each has a single string parameter and returns a string: pigify returns the piglatin equivalent, and unpigify converts piglatin back to normal text. You're given one transform function named identity that is a do-nothing transform, leaving the data unmodified (this was named vanilla in the original version of the assignment; the name changed on 2/28/2011).

You'll need to test the module yourself, by running code you write to see that the two functions dealing with piglatin work as intended. Then you'll add these functions as valid transforms in FileTransform.py (see below).
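One possible way to test is a small table of words and expected results, checked in a loop. In the sketch below, the helper name failures is hypothetical, the pigify shown is only a placeholder that appends "-way" to everything, and the expected values are taken from the Piglatin rules tables in this howto.

```python
def pigify(w):
    # Placeholder only: appends "-way" to every word, which is right
    # for words that start with a vowel and wrong for most others.
    return w + "-way"


def failures(func, cases):
    """Return (word, expected, actual) for every case where func is wrong."""
    wrong = []
    for word, expected in cases:
        actual = func(word)
        if actual != expected:
            wrong.append((word, expected, actual))
    return wrong


cases = [("anchor", "anchor-way"),
         ("computer", "omputer-cay"),
         ("quiet", "iet-quay")]

for word, expected, actual in failures(pigify, cases):
    print("%s: expected %s but got %s" % (word, expected, actual))
```

With the placeholder, the loop reports "computer" and "quiet" as failures; an empty report means every case in the table passed.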

My function looks something like this, but the one below doesn't work for every word:

    def pigify(w):
        """
        return a string that's a piglatin form of
        parameter String w
        """
        return w+"-way"

When you're reasonably sure that your Piglatin functions work, you can start working on the rot13 function. This function both encodes and decodes a string, so one function fills two roles. See the rot13 section for details on this encoding.

You'll need to use FileTransform.py to test whether your transforms work with files too. This means you'll need to add your transform functions, e.g., Transforms.pigify, to the list of functions in choose_transform, and you'll need to add a corresponding string so the user can choose the function. Add these to the lists funcs and names, respectively.

Adding Functions for the User to Choose

After you write the pigify, unpigify, and rot13 functions, you'll need to add these functions and a text prompt for each of them to FileTransform.py. Currently that program uses base64 transforms and an identity do-nothing transform (the identity function was named vanilla in the original code). You'll need to add functions and prompts to the code in choose_transform so that the transforms you write can be called to transform data.
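The sketch below illustrates the parallel-list idea only; it is not the code in choose_transform. The list names funcs and names come from this howto, but the helpers menu_text and pick, the menu format, and the placeholder pigify are all assumptions made for illustration.

```python
def identity(w):
    return w


def pigify(w):
    # Placeholder standing in for your finished function in Transforms.py.
    return w + "-way"


# Parallel lists: names[i] is the prompt shown for funcs[i], so every
# function you add needs a string added at the same position.
names = ["identity", "piglatin"]
funcs = [identity, pigify]


def menu_text():
    """Build a numbered menu, one line per transform (hypothetical format)."""
    return "\n".join("%d. %s" % (i + 1, name) for i, name in enumerate(names))


def pick(choice):
    """Return the function for a 1-based menu choice."""
    return funcs[choice - 1]


print(menu_text())
```

Keeping the two lists the same length and in the same order is what makes the user's menu choice map to the right function.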


Part I: Transforms

Piglatin rules

These are the rules you should use to convert a word into piglatin. We're using a hyphen to facilitate translating back from piglatin to English.

In creating piglatin you will not be concerned with punctuation, so treat every character as either a vowel or not-a-vowel, and punctuation falls into the second category.

  1. If a word begins with 'a', 'e', 'i', 'o', or 'u', then append the string "-way" to form the piglatin equivalent. Examples:

    Word        Piglatin Equivalent
    anchor      anchor-way
    elegant     elegant-way
    oasis       oasis-way
    isthmus     isthmus-way
    only        only-way

  2. If a word begins with a non-vowel (we'll call this a consonant, but it could be a digit, punctuation, or something else), move the prefix before the first vowel to the end of the word, followed by "ay", and separate it from the rest of the word with a hyphen. Treat 'y' as a vowel, except when it is the first letter of a word, where it counts as a consonant.

    Word        Piglatin Equivalent
    computer    omputer-cay
    slander     ander-slay
    spa         a-spay
    pray        ay-pray
    yesterday   esterday-yay
    strident    ident-stray
    rhythm      ythm-rhay

  3. Words that begin with 'qu' should be treated as though the 'u' is a consonant.

    Word        Piglatin Equivalent
    quiet       iet-quay
    queue       eue-quay
    quay        ay-quay

The rules give odd results for a few words, but you should apply them anyway. If a word contains no vowels, treat it as though it starts with a vowel: for example, "zzz" is translated to "zzz-way".

It's possible that different words will be transformed to the same piglatin form. For example, "it" is "it-way", but "wit" is also "it-way" using the rules above.
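The three rules above can be sketched as a single scan for the first vowel. This is one possible approach, not the course's reference solution; treating uppercase vowels the same as lowercase ones is an assumption, since the rules list only lowercase letters.

```python
def pigify(w):
    """Return a piglatin form of w, following the three rules above."""
    lower = w.lower()  # assumption: case does not affect what counts as a vowel
    # Rule 3: a leading 'qu' moves as a unit, so start looking after it.
    start = 2 if lower.startswith("qu") else 0
    for i in range(start, len(w)):
        ch = lower[i]
        # 'y' counts as a vowel except as the first letter of the word.
        if ch in "aeiou" or (ch == "y" and i > 0):
            if i == 0:
                return w + "-way"              # Rule 1: starts with a vowel
            return w[i:] + "-" + w[:i] + "ay"  # Rules 2 and 3: move the prefix
    return w + "-way"  # no vowel found: treat as starting with a vowel


for word in ["anchor", "strident", "yesterday", "quiet", "zzz"]:
    print(word + " " + pigify(word))
```

Every word in the tables above can be used as a test case for a function like this one.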


Rot13

You'll write a function named rot13 that uses the ROT13 cipher (see Wikipedia) to encode/decode a string, and then use this to encode every word in a file.

The function rot13 returns a rotated form of its string parameter:

    def rot13(w):
        s = ""
        # write code to concatenate characters to s
        return s

To convert a letter character to its ROT13 equivalent we suggest using these strings, the .find method that returns an index, and the string indexing operator.

   a = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
   b = "NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm"

For example, the letter 'S' in the first string (labeled a above) is found at index 18. Note that a.find("S") will evaluate to 18. Then b[18] represents the encoding for 'S', namely 'F'. You can reverse the roles of the strings a and b since the ROT13 cipher is symmetric.

There are other ways to do the ROT13 cipher using the functions chr, ord and the % operator, but the approach suggested by using indexing and the strings above is much easier to get working.

You can identify non-letters by using the return value of the string find method which will be -1 for non-letters. Alternatively you can use the Python string module to identify letters. For example, the code below generates the output that follows it.

    import string
    for a in "ABCDefg123!,#":
        if a not in string.letters:
            print a

OUTPUT

    1
    2
    3
    !
    ,
    #

Note that you must import the string module to use its functions and constants. See the Python string docs for full information on the module.
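Putting the suggestions in this section together, a minimal sketch of rot13 might look like the one below. It uses the two strings given above (named A and B here, an arbitrary choice) with the .find method and indexing, and it copies non-letters unchanged, which is an assumption about how punctuation and digits should be handled.

```python
A = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
B = "NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm"


def rot13(w):
    """Return w with every letter rotated 13 places."""
    s = ""
    for ch in w:
        index = A.find(ch)   # -1 when ch is not a letter
        if index == -1:
            s += ch          # assumption: non-letters are copied unchanged
        else:
            s += B[index]
    return s


print(rot13("Hello, World!"))         # Uryyb, Jbeyq!
print(rot13(rot13("Hello, World!")))  # Hello, World!
```

Applying the function twice returns the original string, which makes a convenient self-check.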

Writing to Files

Whenever you print something to the console, you should also write it to the file open for output. You do that with the file .write method, which takes a string as a parameter. Two uses are shown below:

    outfile.write(word)
    # to create a line, write the newline character
    outfile.write("\n")

To make sure the output file is completely written, the last line of your code must close the file as below:

    outfile.close()

This ensures that all writing to the file happens and that the file is flushed and closed properly.
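A sketch of how write_words could combine these calls follows. The parameter order (file first, then the list of lines) and the single space between words are assumptions; check the starter code for the actual signature.

```python
def write_words(outfile, words):
    """Write each line of words to outfile, then close the file.

    Assumes outfile is already open for writing and words is a list of
    lines, each line a list of strings.
    """
    for line in words:
        # Assumption: words on a line are separated by one space.
        outfile.write(" ".join(line))
        # To create a line, write the newline character.
        outfile.write("\n")
    # Closing flushes any buffered output to disk.
    outfile.close()
```

Called as write_words(open("out.txt", "w"), data), where out.txt is a hypothetical filename, this writes one output line per inner list.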

Advice

Don't work on the output file code until you've successfully written piglatin to the console. You can always create your own files to test your program; you don't need to run on large files. Create a file just like you create a README, with New>Other>General>Untitled Text File from the Eclipse "new" menu.

Extra Credit