CPS 100E Fall 1999: Reading Expressions

Classwork for October 28, 1999


Prefix, infix, and postfix refer to the location of the operator relative to its operands in an expression. In prefix expressions, the operators come before the operands; programming languages like Lisp use this notation because each operator is actually a function call and its operands are the arguments. In postfix expressions, the operator comes after its operands; many scientific calculators use this notation because there is no ambiguity as to the operator's precedence. Infix expressions, where the operator comes in between the two operands, are perhaps the most problematic type of expressions but the most common though because that is what people learn in math classes.

It turns out that it is easier for a computer to evaluate a postfix expression because it requires no parentheses and evaluates unambiguously in a single left to right pass. For complex equations or those that are going to be evaluated many times, it may be worth the extra effort to convert the expression from infix to postfix before evaluating it. You will use Dijkstra's algorithm to convert a series of expressions from infix to postfix notation.

Setup

  1. Create a directory called expressions2 in your cps100e directory by typing mkdir expressions2
  2. Change into the expressions2 directory by typing: cd expressions2

  3. Copy files to this directory by typing (don't forget the trailing dot): cp ~rcduvall/cps100e/classwork/expressions2/* .

If you copied the files correctly you should see the following files in your directory when you type the Unix command ls

Note that we are providing a data file for you, but you should always create your own that covers all the cases you can think of. We will test your program on a different data file.

Reading to the Stack

Initially, you will write a program that simply echoes the given expression to the user in the reverse order that it was read. You should use a stack to do this because that's exactly what stacks do best --- by pushing each character onto the stack and then, after every character has been pushed, popping each off, you get the characters in reverse order.

For example, given the following input, your program should produce the following output:

a + b            ------------> b + a
a + b * c        ------------> c * b + a
a + b * c / d    ------------> d / c * b + a

To do this, you will modify the template by adding a function ProcessExpression() that processes each expression by taking that expression as a parameter and returning a new expression to be printed out. In this program, expressions are represented as strings. You should call ProcessExpression() for each expression read in.

The header for this function should look like the following:

string ProcessExpression (const string & expression)
// post: return a string that is the reverse of expression

Within this function, you should use a Stack<char> to hold each character of the expression. After every character has been pushed onto the stack, you should then pop each off and add it to the result string. Finally you should return the result string to the calling function.

Here is example pseudocode for the function ProcessExpression():

for each character, c, in expression string
  push c onto stack

clear result string
for each element, e, in stack
  pop e
  concatenate e with result string

return result string

Copy the output of your program into your README file.

Expressions Without Parentheses

Next, you will write a program that converts simple infix expressions, i.e., those that do not have parentheses in them, into postfix expressions. To do this, you will modify the function ProcessExpression() by adding some additional logic that decides when to push each character in the expression on to the stack and when to add it to the resulting string.

For example, given the following input, your program should produce the following output:

a + b            ------------> a b +
a + b * c        ------------> a b c * +
a + b * c / d    ------------> a b c * d / +

At this point, you need to be able to determine if the character read in is an operator or an operand. To simplify this assignment, you can assume that operands and operators will be only one character long. This means for each character you read, you need to make a decision. Of course, you could just write a boolean function that returns true if the character is an operator and false otherwise. However, when translating the expression, the precedence of the operator is important because it determines when to pop the operator (thus ensuring your postfix expression calculates the same result as the infix expression would have). In the second example above, note that the multiplication operator is read in after the addition operator, but printed before it in the final result.

Your program should recognize the standard C++ binary arithmetic operators (*, /, %, + , -), other non-whitespace characters should be treated as operands. Furthermore, you should use the standard C++ operator precedence (if you do not remember it, you can look it up in the Astrachan textbook). You should make a function that, when given a character, returns its precedence as a number. This function should handle operands as well (for example, by giving them a precedence of zero).

The header for this function should look like the following:

int GetPrecedence (char c)
// post: return a number that represents the precedence of c

Within the ProcessExpression() function, you should use a Stack<char> to hold some of the operators in the expressions. When an operator is found in the expression, you will need to check if any other higher precedence (or equal precedence, since operators are evaluated from left to right) operators have already been found. If they have, then they may have been pushed onto the stack and you need to pop them now and add them to the result string because they will need to be evaluated before the current operator in the postfix expression. After that, you should push the current operator on to the stack since you do not know if a higher precedence operator might occur later in the expression.

After every character has been processed, you need to clear off the stack of any operators remaining that need to be evaluated. It just so happens (not entirely by coincidence) that the operators remaining are in the correct order, so just popping each off and adding it to the result string works correctly. Again, you should return the result string to the calling function.

Here is example pseudocode for the revised ProcessExpression():

clear result string
for each character, c, in expression string
  if (c is an operator)
     pop all previously pushed operators with higher or equal precedence and 
       concatenate each with result string
     push c onto stack
  else  // c is an operand
    concatenate c with result string

for each element, e, remaining in stack
  pop e
  concatenate e with result string

return result string

Copy the output of your program into your README file.

Expressions With Parentheses

Finally, you will write a program that converts more complex infix expressions, i.e., simple infix expressions as well as those that have parentheses in them, into postfix expressions. This process is more complex because parentheses change the order of evaluation and thus the order in which you print things out.

For example, given the following input, your program should produce the following output:

a + b            ------------> a b +
(a + b) * c      ------------> a b + c *
(a + b) * c / d  ------------> a b + c * d /

For this part, you need to be able to match beginning and ending parentheses and deal with every operator found between them. To do this, you need to change your ProcessExpression() and GetPrecedence() functions to handle parentheses.

In the ProcessExpression() function, the parentheses will act as brackets that denote a set of operators that must be evaluated before all others. Thus, when you find an open parenthesis, it acts as the beginning of the bracket. Since you are bracketing things that will be stored on the stack, that means you must push it onto the stack as soon as you find it. Then you can proceed as normal, in essence filling the bracket, until a closed parenthesis is found. At this point, every operator pushed after the matching open parenthesis must be popped and concatenated with the result string (i.e., given highest precedence).

In the GetPrecedence() function, you will need to add open and closed parentheses to your set of recognized operators. You will need to determine what precedence to give them to make the ProcessExpression() function work correctly.

Additionally, since you now have at least two situations where you are popping a number of operators off the stack and adding them to the result string, you may want to make a separate function that encapsulates this behavior.

The header for this function should look like the following:

string ClearOperators (Stack & s, int precedence)
// post: return string of operators popped from stack with higher
//       or equal precendence to precedence

The rest of the ProcessExpression() function remains the same.

Here is example pseudocode for the revised ProcessExpression():

clear result string
for each character, c, in expression string
  if (c is an '(')
     push c
  else if (c is an ')')
     pop all previously pushed operators until '(' and 
       concatenate each with result string
  else if (c is an operator)
     pop all previously pushed operators with higher or equal precedence and 
       concatenate each with result string
     push c onto stack
  else c is an operand
    concatenate c with result string

for each element, e, remaining in stack
  pop e
  concatenate e with result string

return result string

Copy the output of your program into your README file.

Extra Credit

Change your implementation of ProcessExpression to return an expression tree, instead of a string, so that the expression typed in can be evaluated. This will require that your function use two stacks, one for the operators and another one for the operands (the current version uses an operator stack, but simply appends the operands to the result string). The operand stack will need to hold pointers to Expressions and will build the tree. Thus, instead of adding an operator to the result string when it is popped, you will need to to create the appropriate subclass of BinaryOperator whose operands are the last two operands added to the operand stack. Then the newly created subclass should be pushed onto the operand stack to become an operand for the next operator. In this way, the tree is constructed such that when the expression is completely processed, the complete tree is the only thing left in the operand stack.

To test your code, you create a new data file that uses single digit numbers, instead of letters, as its operands.

Submit

You should submit your completed classwork electronically by midnight the night before the next class.

Every assignment must have a README file submitted with it (please use all capital letters). Include your name and your partner's names, the date, and an estimate of how long you worked on this work in the README file. You must also include a list of names of all those people with whom you collaborated on the assignment.

To submit this classwork type

submit100e expressions2 README Makefile countw3.cpp players.cpp sports.dat

You'll get a message printed in your terminal window if the submit was successful. To see the files you submitted type submit100e expressions2 without any file names after the assignment name. This should show you a list of the files you submitted for that assignment.


Comments?