Lecture 9: Strings and Loops


The String class

In the last lecture the differences between objects and primitive data types were explored. One of the most widely used types of objects is the one defined by the String class. String literals have already been used in most of the example programs given in lecture. We've also seen string concatenation, where the "+" operator was used to join two strings together. This operator performs arithmetic addition if both of its operands are numeric. When one or both of the operands reference String objects, though, it performs string concatenation (creating a temporary String object to represent a non-String value if needed).
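As a small illustration of how "+" switches between addition and concatenation, consider the sketch below. The class name ConcatDemo and its method names are made up for this example and are not part of the lecture's programs. Because "+" is evaluated left to right, the position of the first String operand matters.

```java
// Hypothetical demo class; the names are illustrative only.
class ConcatDemo {

    // Both operands of the first + are numeric, so 3 + 4 is added first.
    // The sum is then concatenated with the string.
    static String addThenConcat() {
        return 3 + 4 + " total";
    }

    // The first + already has a String operand, so each number is
    // converted to text and appended in turn.
    static String concatOnly() {
        return "total " + 3 + 4;
    }

    public static void main(String[] args) {
        System.out.println(addThenConcat()); // prints: 7 total
        System.out.println(concatOnly());    // prints: total 34
    }
}
```

Adding parentheses, as in "total " + (3 + 4), forces the addition to happen first.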

While we've seen examples of creating String literals by using double quotations, you may have noticed that some characters (like the " symbol itself) are hard to represent in this manner. Java defines escape sequences to represent these special characters. Here is a listing of the Java escape sequences and what they represent:
\b : backspace
\t : tab
\n : newline
\r : carriage return
\" : double quote
\' : single quote
\\ : backslash
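The escape sequences listed above can be tried out with a short program such as the following sketch. The class EscapeDemo and the sample strings are invented for illustration.

```java
// Hypothetical demo class; the names and sample text are illustrative only.
class EscapeDemo {

    // A string containing double quotes, each written as \" in the literal.
    static String quoted() {
        return "She said \"hello\".";
    }

    // A Windows-style path; each backslash in the text is written as \\.
    static String path() {
        return "C:\\temp\\notes.txt";
    }

    public static void main(String[] args) {
        System.out.println(quoted());             // She said "hello".
        System.out.println(path());               // C:\temp\notes.txt
        System.out.println("name\tscore");        // a tab separates the words
        System.out.println("line one\nline two"); // output appears on two lines
    }
}
```

Each escape sequence counts as a single character in the resulting string, even though it takes two characters to type.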

String Objects

String objects are defined by the String class and are instantiated in the same way that all new objects are, with the new operator.

String course = new String("CompSci 006");

As you can see, the String constructor takes a single reference to a String as its argument, which is usually given in the form of a string literal. Many String methods create a new String object that is a slightly modified version of the original object. For instance, the substring(int offset, int endIndex) method returns a portion of the original string starting at the character indexed by offset and going up to the character at endIndex-1. To figure out which character in a string is indexed by a number, count the characters left to right, starting at 0.

String name = new String("lecture4");
String shorterName = name.substring(0,3);

Since the characters in a string are indexed starting at 0, the second line of code above will set the variable shorterName to reference a new String object with the value "lec". The object referenced by name will not be affected. We can depend on this because String objects are immutable. Immutable objects can never have their value(s) changed once they are set. So, the value of the first object, "lecture4", will never be shortened, lengthened, or modified in any way. If we wanted to change the string value that the variable name referenced, we would have to create a new String object and set name to reference it. Note again the difference between objects and variables, which can hold references to objects. A String object cannot be changed once it is created. However, a variable that references that String object can easily change its reference to point to a completely different String object.
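The difference between changing an object and changing a variable's reference can be seen in a short sketch like the one below. The class ImmutableDemo and the helper firstThree are hypothetical names used only for this example.

```java
// Hypothetical demo class; the names are illustrative only.
class ImmutableDemo {

    // Returns the first three characters; the original string is untouched.
    static String firstThree(String text) {
        return text.substring(0, 3);
    }

    public static void main(String[] args) {
        String name = new String("lecture4");
        String shorterName = firstThree(name);

        System.out.println(shorterName); // lec
        System.out.println(name);        // lecture4 (unchanged)

        // The variable can be pointed at a different String object.
        // The original "lecture4" object itself is still not modified.
        name = name.substring(0, 7) + "9";
        System.out.println(name);        // lecture9
    }
}
```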

The Keyboard, Random, and Math classes

Classes like String are thought to be so basic and useful that they are automatically made available for every program that you write. For other sets of external code that you would like to use, the import declaration is used to identify that code. We have already used this declaration in several programs to import the Keyboard class, which is part of the cs1 package. Packages are a means for grouping classes together. Specifics about packages, class libraries - which are sets of classes used to support program development - and the import statement are not something we'll be concerned with in this class (although you can always ask about any topic and I'll explain it or point you to resources for more information).

In previous examples the readInt() method was used to read integer values from standard input (typed from the keyboard). The Keyboard class also provides similar methods for all of the 8 primitive types as well as for strings: readLong(), readDouble(), readString(), etc. Notice, however, that we use these methods differently from how we used the methods of the DecimalFormat class in the last lecture. This is shown in the simple example ClassMeth.java.

DecimalFormat formatter = new DecimalFormat("0.##");

System.out.print("Please enter a decimal value: ");
double value = Keyboard.readDouble();

String result = formatter.format(value);

With the DecimalFormat class, you use the methods of a particular object by giving the name of a variable referencing that object, the dot operator, and then the method name with parameters. In the above example, we have the formatter variable and the format method (with value as a parameter). However, for the readDouble method Keyboard is the name of a class, not a variable. This is because the methods of the Keyboard class were defined as static, meaning that they do not need an instantiation of an object to be used. These methods are called class methods or static methods. They are made this way by using the static reserved word in their method definition. As shown above, static methods are used by giving the appropriate class name to the left of the dot operator (rather than a variable name). We'll see just how this works when we start writing our own classes later in the course.

Another useful class that has many static methods is the Math class. It provides several useful methods for basic mathematical operations. A good resource for finding the method definitions and other general information for a class is its Javadoc page. The page for the Math class gives a good description of everything that makes up the class. Because the page gives so much information, you will have to learn to search for what you need and ignore the rest. The Java API link on the course resources page links to a page where you can find the Javadoc pages for all of the standard class libraries provided for Java.

One last class to mention is the Random class. To use it you need to instantiate a Random object. You would then utilize the methods of that object using the variable referencing it. This works just like with the DecimalFormat objects: a variable name, followed by the dot operator, followed by the method name with the appropriate parameters given.
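The contrast between static methods (Math) and instance methods (Random) might look like the following sketch. The class MathRandomDemo and its helper methods are invented for illustration; Math.sqrt, Math.max, Math.abs, and Random.nextInt are standard library methods.

```java
import java.util.Random;

// Hypothetical demo class; the helper names are illustrative only.
class MathRandomDemo {

    // Static method call: the class name Math goes to the left of the dot.
    static double hypotenuse(double a, double b) {
        return Math.sqrt(a * a + b * b);
    }

    // Instance method call: nextInt is called on a Random object.
    // nextInt(6) returns a value from 0 to 5, so adding 1 gives 1 to 6.
    static int rollDie(Random generator) {
        return generator.nextInt(6) + 1;
    }

    public static void main(String[] args) {
        System.out.println(hypotenuse(3.0, 4.0)); // 5.0
        System.out.println(Math.max(7, 12));      // 12
        System.out.println(Math.abs(-2.5));       // 2.5

        Random generator = new Random();
        System.out.println(rollDie(generator));   // some value from 1 to 6
    }
}
```

Note that Random lives in the java.util package, so it needs an import declaration, while Math is available automatically.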

Program Statements and Flow of Control

After discussing several different object types, we now shift gears to talk about program statements. These important tools direct the flow of control in your program. They give it the ability to do more than just execute code in a straight top-down, line-by-line manner. One way to alter the top-down flow of control is to call a method, which causes the flow to jump to the code for that method. Another is to use program statements to make decisions about what piece of code should be run next.

We've already seen one program statement in Math5.java: the for-loop. The for-loop statement had a header section and a body section. The header section updated a counter variable and ran through the body section of code any number of times. It also made a decision about whether to run that body of code each time or drop out of the loop. This was a restricted use of the for-loop. What if I wanted to have my variable count down instead of up? What if I wanted it to do more than just count up by one? Could I use a different comparison operator than just <=? To work up to all of the capabilities of for-loops, we'll first start out with the most basic program statement: the if statement.

The if statement

The if statement is fairly simple in that it controls a single block of code, deciding if that code should be executed or skipped. The following code example shows this:

   if(value1 > value2)
      System.out.println("value1 is larger");

Here we just make a decision of true or false for the expression value1 > value2. The expression inside the parentheses must always evaluate to a true/false boolean value. This could be done with a boolean variable, or with an expression that evaluates to a true/false value as in the above example. We say that these are boolean expressions simply because they are expressions that evaluate to boolean values. Along with the greater than operator (>), we could also use the operators for less than (<), equal to (==), greater than or equal (>=), and less than or equal (<=). The only tricky operator in this group is the equal to operator (==), since it looks similar to the assignment operator (=). These two operators act quite differently. Using two equals signs compares two values or variables, while the assignment operator takes the value on its right side and assigns it to the variable on its left side. A common mistake is to accidentally use the assignment operator when you meant to compare two operands with the equal to operator.

What happens if the operators I just provided aren't enough? What if you want to execute a line of code only if two different conditions are both met? What if you want at least one of two different conditions to be met? What if you want the negation of some condition, meaning that it didn't evaluate to true? These cases are taken care of by the logical operators for AND (&&), OR (||), and NOT (!). They allow us to create a complicated expression like the following:

   if((value1 == 0) && ((value1 > value2) || !(value1 <= value3)))
      System.out.println("Either value2 or value3 is negative.");
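To see how the combined expression above behaves for different inputs, it can be wrapped in a method and called with a few sample values. The class LogicDemo and the method eitherNegative are hypothetical names chosen for this sketch.

```java
// Hypothetical demo class; the names are illustrative only.
class LogicDemo {

    // True when value1 is zero AND at least one of the other two
    // values is smaller than value1 (that is, negative).
    static boolean eitherNegative(int value1, int value2, int value3) {
        return (value1 == 0) && ((value1 > value2) || !(value1 <= value3));
    }

    public static void main(String[] args) {
        System.out.println(eitherNegative(0, -4, 9));  // true:  value2 is negative
        System.out.println(eitherNegative(0, 4, -9));  // true:  value3 is negative
        System.out.println(eitherNegative(0, 4, 9));   // false: neither is negative
        System.out.println(eitherNegative(1, -4, -9)); // false: value1 is not 0
    }
}
```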

The boolean expressions that we've been creating so far seem to make sense for integer variable comparisons. What if we're dealing with char variables, or floating point values, or String objects? Strings are considered equal if all of their characters are equal. The methods equals and equalsIgnoreCase can be used to test the equality of two different Strings.
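A brief sketch of these comparisons follows. The class EqualityDemo and its helper methods are invented for illustration; equals and equalsIgnoreCase are the standard String methods named above. It also shows why == is usually not what you want for String objects: it compares references, not characters.

```java
// Hypothetical demo class; the helper names are illustrative only.
class EqualityDemo {

    static boolean sameText(String a, String b) {
        return a.equals(b);
    }

    static boolean sameTextAnyCase(String a, String b) {
        return a.equalsIgnoreCase(b);
    }

    public static void main(String[] args) {
        String first = new String("Java");
        String second = new String("Java");

        System.out.println(first == second);                 // false: two different objects
        System.out.println(sameText(first, second));         // true: same characters
        System.out.println(sameText("Java", "java"));        // false: case differs
        System.out.println(sameTextAnyCase("Java", "java")); // true

        // char values are compared by their character codes.
        System.out.println('a' < 'b');                       // true

        // Floating point values can carry small rounding errors,
        // so exact == comparisons may give surprising results.
        System.out.println(0.1 + 0.2 == 0.3);                // false
    }
}
```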

In the first example given previously, we printed out a string literal if value1 was larger than value2. If we also wanted to print out a different statement if value1 was not larger (meaning that value1 <= value2) we could do this using an if-else statement as shown below.

   if(value1 > value2)
      System.out.println("value1 is larger");
   else
      System.out.println("value1 <= value2");

If the condition for the if statement evaluates to true, then the line of code immediately below it is run. Otherwise, the line of code immediately below the else reserved word is run. If we wanted to run more than a single line of code, we would have to use curly braces.

   if(value1 > value2) {
      System.out.println("Which value is larger?");
      System.out.println("value1 is larger.");
   }
   else {
      System.out.println("Actually...");
      System.out.println("value1 <= value2.");
   }

These curly braces create something called a block statement. This is the same as for the body definitions of methods and classes as we've seen earlier. If we leave out the curly braces, odd things can happen.

   if(value1 > value2) {
      System.out.println("Which value is larger?");
      System.out.println("value1 is larger.");
   }
   else
      System.out.println("value1 <= value2.");
      System.out.println("This is always printed.");

Without the curly braces to mark a block statement of grouped code, the last two println statements are separated, even though their indentation suggests otherwise (indentation means nothing to the compiler). If the value1 > value2 expression evaluates to false, only the first line of code immediately following the else reserved word will be run as part of the else. The lines coming after this are unaffected by the if-else statement, so the last println statement will always be run and print the text: This is always printed.

One case where if-else statements can get particularly tricky is with nested if-else statements. This code example illustrates that:

   if (num1 < num2)   
      if(num1 < num3) 
         min = num1;
      else          
         min = num3;
   else              
      if (num2 < num3) 
         min = num2;
      else               
         min = num3;

The goal is to find the minimum of the three "num" variables and assign its value to the min variable. The rule to follow when figuring out which else goes with which if statement is just like the one for nested parentheses or nested curly braces. An else is always matched up with the nearest (in the code above it) unmatched if statement. Comments with matching numbers are used to match up if-else pairs below.

   if (num1 < num2)      // pair 1
      if(num1 < num3)    // pair 2
         min = num1;
      else               // pair 2
         min = num3;
   else                  // pair 1
      if (num2 < num3)   // pair 3
         min = num2;
      else               // pair 3
         min = num3;

While knowing the rule for matching up if-else pairs is important (you should know it) this confusion can be avoided. Just like using parentheses in arithmetic expressions reduces the confusion of operator precedence, curly braces can be used to show which lines of code go with which if or else statement. You're encouraged to try out this if-else code for yourself to be certain that everything makes sense to you.
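As one way of trying this out, the same minimum-of-three logic can be written with explicit curly braces and wrapped in a method. The class MinDemo and the method min3 are hypothetical names for this sketch; the logic mirrors the nested if-else code above.

```java
// Hypothetical demo class; the names are illustrative only.
class MinDemo {

    // Same logic as the nested if-else example, with braces making
    // each if-else pairing explicit.
    static int min3(int num1, int num2, int num3) {
        int min;
        if (num1 < num2) {
            if (num1 < num3) {
                min = num1;
            } else {
                min = num3;
            }
        } else {
            if (num2 < num3) {
                min = num2;
            } else {
                min = num3;
            }
        }
        return min;
    }

    public static void main(String[] args) {
        System.out.println(min3(4, 9, 7)); // 4
        System.out.println(min3(9, 4, 7)); // 4
        System.out.println(min3(9, 7, 4)); // 4
        System.out.println(min3(5, 5, 5)); // 5
    }
}
```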

More operators

Before we move on to discussing loops, it's worthwhile to look at a few more operators that can be useful in loops. First up are the increment (++) and decrement (--) operators. These add 1 to, or subtract 1 from, a variable of either an integer or a floating point data type. Thus, for the variable int value, these two lines of code have the same effect:

value = value + 1;
value++;

These operators can be a handy shortcut, and can also provide one extra bit of functionality. If we happen to be using a variable at the same time that it is being incremented, then we get different values depending on whether the increment operator is placed before or after the variable.

int value = 5;
System.out.println(value++);
System.out.println(value);
System.out.println(++value);
System.out.println(value);

For the above short bit of code, we would get the following lines of output:

5
6
7
7

This occurs because when we use the increment operator after the variable, the use of the variable gets its original value (before the increment). When we use the increment operator before the variable, its operation occurs before the value is taken out of the variable (so we get the value after the increment). The same effect holds for the decrement operator.

Some other useful operators are variations of the assignment operator. For instance, the following lines of code are equivalent:

value *= 3 + 5;
value = value * (3 + 5);

This assignment operator works for multiplication, but there are similar ones for addition and string concatenation (+=), subtraction (-=), division (/=) and many more. In general, you can think of these operators as working like the example equivalent code above. Put parentheses around the expression on the right. Take the "extra" symbol in the assignment operator and move it to the front of the expression on the right. Make a copy of the variable on the left and move it to the front of the expression on the right. One last bit of terminology while we're talking about operators: an operator that acts on one operand is a unary operator, one that acts on two operands is a binary operator, and one that acts on three operands is a ternary operator.
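The rewriting rule described above can be checked with a short sketch. The class CompoundDemo and the method timesEight are made-up names for this example.

```java
// Hypothetical demo class; the names are illustrative only.
class CompoundDemo {

    static int timesEight(int value) {
        value *= 3 + 5;   // same as value = value * (3 + 5)
        return value;
    }

    public static void main(String[] args) {
        System.out.println(timesEight(2)); // 16, not 2 * 3 + 5 = 11

        int count = 10;
        count += 4;   // count is now 14
        count -= 2;   // count is now 12
        count /= 5;   // count is now 2 (integer division)
        System.out.println(count);         // 2

        String greeting = "Hello";
        greeting += ", world";             // creates a new String object
        System.out.println(greeting);      // Hello, world
    }
}
```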

Loops

With the useful operators now covered, we're ready to move on to discussing loops. Each of the three loops we'll cover (while, do-while, and for) is capable of creating the same effect, but each type is better suited to certain situations. The first to be mentioned is the while statement. This works just like an if statement, except that the body of the while statement is executed repeatedly for as long as the condition in its header evaluates to true. This means that the condition is checked once before every iteration of running the code in the body of the while loop. By comparison, an if statement will run its body of code at most once. The while statement's structure makes it ideally suited for cases where we don't know in advance how many times a loop may run. An example of a while statement is given below.

   while(true) {
      System.out.print("Can I stop now?");
      System.out.println(" No. \n");
   }

Note that the boolean expression that is placed inside the while loop's header parentheses can be anything that evaluates to true or false. This includes any of the complicated expressions that can be created by the many operators mentioned earlier in this lecture, or something as simple as the reserved word true.

It is very important when you write loops of your own that you guarantee that the loop is progressing towards the boolean expression in the header becoming false. Since my loop's header will never evaluate to false, my code has a problem called an infinite loop. This is a common error when writing any of the three loop types in code. One other thing to note about loops is that they can be nested (just like with if statements). Consider a case where we actually nest one loop inside another. The inner loop will execute completely for every iteration that the outer loop runs. This can be useful, but layers of nested loops can quickly make for a slow program.
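By way of contrast with an infinite loop, here is a sketch of a while loop that is guaranteed to make progress toward its condition becoming false, along with a pair of nested loops. The class WhileDemo and its methods are hypothetical names for this example.

```java
// Hypothetical demo class; the names are illustrative only.
class WhileDemo {

    // Counts how many times n can be halved before it reaches 0.
    // Each iteration makes n smaller, so the condition n > 0
    // must eventually become false.
    static int halvings(int n) {
        int steps = 0;
        while (n > 0) {
            n = n / 2;
            steps++;
        }
        return steps;
    }

    // Nested loops: the inner loop runs completely for every iteration
    // of the outer loop, so the innermost line runs rows * cols times.
    static int countCells(int rows, int cols) {
        int cells = 0;
        int row = 0;
        while (row < rows) {
            int col = 0;
            while (col < cols) {
                cells++;
                col++;
            }
            row++;
        }
        return cells;
    }

    public static void main(String[] args) {
        System.out.println(halvings(8));      // 4  (8 -> 4 -> 2 -> 1 -> 0)
        System.out.println(halvings(0));      // 0  (the body never runs)
        System.out.println(countCells(3, 4)); // 12
    }
}
```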

The do-while Statement

The second type of loop is made by the do-while statement. This is just like the while statement, except that the do-while statement's condition is checked after every iteration of its body. This means that the body of the do-while statement will always be run at least once. An example of a do-while statement is given below.

   String president = new String("?");
   do {

      System.out.print("Please enter the name of the President of the US: ");
      president = Keyboard.readString();
      System.out.println("So, " + president + " is in office, eh? \n\n");

   } while ( president.equals("bush") );
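Since the example above depends on the course's Keyboard class, here is a separate self-contained sketch of a case where running the body at least once is exactly what is needed. The class DoWhileDemo and the method digitCount are invented for this illustration.

```java
// Hypothetical demo class; the names are illustrative only.
class DoWhileDemo {

    // Counts the decimal digits of a non-negative number.
    // The body must run at least once so that 0 counts as one digit.
    static int digitCount(int number) {
        int digits = 0;
        do {
            digits++;
            number /= 10;
        } while (number > 0);
        return digits;
    }

    public static void main(String[] args) {
        System.out.println(digitCount(0));    // 1
        System.out.println(digitCount(7));    // 1
        System.out.println(digitCount(2024)); // 4
    }
}
```

With a plain while loop and the same condition, the call digitCount(0) would skip the body entirely and report 0 digits.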

The for-loop Statement

The third and final type of loop comes from a statement we've already seen: the for statement. for-loops are ideally suited to executing a loop where you know in advance how many times the loop should run, or can at least approximate that with some sort of variable. As we saw before, the for statement has a header and a body like the other two loop statements. We also learned that its header is divided up into three parts: initialization, condition, and increment. The general form of the header is for (initialization; condition; increment), with the three parts separated by semicolons.

The first part of that header, the initialization, occurs only once and occurs before anything else.

The condition portion of the header is evaluated once before every iteration of the loop. If the condition evaluates to true then the body of the loop is run one more time. If the condition evaluates to false, the for-loop is done and we move on to whatever code happens to follow it.

Finally, the increment portion of the header occurs once after every iteration of the loop. Typically, this portion of the header is used to increment the variable created in the initialization portion (which is then checked each time in the condition portion).

We can now see that for-loops are not just limited to starting some counter at 1 and incrementing it for as long as it is less than or equal (<=) to some threshold stored in a variable. All of the boolean expressions that worked for the condition in if, while, and do-while statements will also work for the condition part of the for statement's header. Also, while the increment part can be a simple addition statement, it could also make use of the more elaborate assignment operators we've seen in this lecture.
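A few variations on the for-loop header are sketched below: counting down, stepping by 2, and doubling with a < condition. The class ForDemo and its methods are hypothetical names chosen for this example.

```java
// Hypothetical demo class; the names are illustrative only.
class ForDemo {

    // Counts down from start to 1, joining the numbers with spaces.
    static String countDown(int start) {
        String result = "";
        for (int i = start; i >= 1; i--) {
            result += i + " ";
        }
        return result.trim();
    }

    // Adds the even numbers from 2 up to limit, stepping by 2.
    static int sumEvens(int limit) {
        int sum = 0;
        for (int i = 2; i <= limit; i += 2) {
            sum += i;
        }
        return sum;
    }

    // Doubles i each time and uses < instead of <=.
    static int powersOfTwoBelow(int limit) {
        int count = 0;
        for (int i = 1; i < limit; i *= 2) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countDown(5));          // 5 4 3 2 1
        System.out.println(sumEvens(10));          // 30
        System.out.println(powersOfTwoBelow(100)); // 7  (1, 2, 4, 8, 16, 32, 64)
    }
}
```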

It is also possible to modify the increment variable within the body of the for-loop. You must take extreme care when doing this, though, because you must always make sure that your loop will eventually terminate by causing the condition portion of the header to evaluate to false. The initialization variable also does not have to be declared inside the header of the for-loop. It can be declared in code preceding the for-loop.

This can be useful because variables declared in the header or body of the for statement are thrown away and lost when the statement finishes executing. To save the variable, and whatever value it might finish the loop holding, we'd need to declare it before the loop. This instance of a variable being thrown away is not unique. Every variable exists only within a certain region of the code (its scope), typically the innermost curly braces that it was declared within. The exact details of this will be covered when we start writing our own classes next week.
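The effect of declaring the loop variable before the loop can be sketched as follows. The class ScopeDemo and the method firstSquareAbove are made-up names for this illustration.

```java
// Hypothetical demo class; the names are illustrative only.
class ScopeDemo {

    // Returns the first value of i (starting at 1) whose square exceeds target.
    // Because i is declared before the loop, it is still available afterwards.
    static int firstSquareAbove(int target) {
        int i;
        for (i = 1; i * i <= target; i++) {
            // Nothing to do in the body; the header does all the work.
        }
        return i; // i is still in scope here

        // Had the header been written as for (int i = 1; ...), the
        // variable would no longer exist at the return statement,
        // and the code would not compile.
    }

    public static void main(String[] args) {
        System.out.println(firstSquareAbove(20)); // 5  (4 * 4 = 16, 5 * 5 = 25)
        System.out.println(firstSquareAbove(0));  // 1
    }
}
```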

Loop Invariants

Here's one more term to help you reason about loops: loop invariants. A loop invariant is a boolean expression that is true each time the loop condition is evaluated. Thinking about one is something you would probably do implicitly anyway. In the simple loops we've seen, the boolean expression in the condition of the while loop plays this role for the loop's body: it is always true when the body starts to run. The same is almost true of the do-while loop, but that loop will always run at least once, so there's a chance that the boolean expression would not be true before that first run of the loop. Finally, for the for loop, the condition is the middle term of the 3 parts of the loop's header: initialization, condition, and increment.

In any case, you can come up with a loop invariant for any loop that repeatedly runs, and this is true for other cases of repeated actions (not just loops in Java code). Being able to identify what the loop invariant of something you're writing should be, or identifying the loop invariant in something somebody else wrote, can be a very useful skill. We just covered simple cases before, where the condition just sort of worked out to be true or false when it was supposed to. In more complex examples, you'd be trying to initialize the loop invariant to be true and then hold it true for a certain number of steps, until some goal is met, etc. How you initially make the loop invariant true can determine the proper initial values for certain variables in the loop condition and body. Through the actions in the body of your loop, certain things may cause that loop invariant to become false. If you want to maintain that loop invariant up until your goal is met, you'll have to do extra work to push the invariant back to true.
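As one possible illustration of setting up, breaking, and restoring an invariant, consider summing the integers from 1 to n. The class InvariantDemo, the method sumTo, and the particular invariant chosen are assumptions made for this sketch, not something taken from the lecture's example programs.

```java
// Hypothetical demo class; the names and the invariant are illustrative only.
class InvariantDemo {

    // Sums the integers 1..n.
    // Invariant: each time the condition is checked, sum equals
    // 1 + 2 + ... + (i - 1), which is (i - 1) * i / 2.
    static int sumTo(int n) {
        int sum = 0;   // makes the invariant true for i = 1 (an empty sum)
        int i = 1;
        while (i <= n) {
            sum += i;  // temporarily breaks the invariant...
            i++;       // ...and this step restores it
        }
        // For n >= 0 the loop exits with i == n + 1,
        // so the invariant gives sum == n * (n + 1) / 2.
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumTo(4));   // 10
        System.out.println(sumTo(100)); // 5050
        System.out.println(sumTo(0));   // 0
    }
}
```

The initial values of sum and i are chosen precisely so that the invariant holds before the first check of the condition.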