CompSci 6 - Classwork 16 - March 25, 2010
10 pts

Working with Sets

Terminology

Definition
A set is an unordered collection of distinct objects. The objects in a set are also called the elements or members of the set.
Common Operations
The union of two sets A and B is the set containing elements that are in A, in B, or in both.
The intersection of two sets A and B is the set containing elements that are in both A and B.
The difference between two sets A and B is the set containing elements that are in A but not B.
The complement of a set A is the set of all elements of the universal set (i.e., the set containing all objects under consideration) that are not in A.
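The four operations defined above can be sketched in Java with the bulk methods of java.util.Set (addAll, retainAll, removeAll). This is only an illustration under stated assumptions: the class name SetOperationsSketch, the method names, and the Integer data are made up for this example and are not part of the provided classwork code.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper class, for illustration only.
class SetOperationsSketch {
    // Each method copies its first argument, so neither input set is changed.
    static Set<Integer> union(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<Integer>(a);
        result.addAll(b);      // everything in a, plus everything in b
        return result;
    }

    static Set<Integer> intersection(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<Integer>(a);
        result.retainAll(b);   // keep only the elements that are also in b
        return result;
    }

    static Set<Integer> difference(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<Integer>(a);
        result.removeAll(b);   // drop the elements that are in b
        return result;
    }

    // The complement of a is the universal set minus a.
    static Set<Integer> complement(Set<Integer> universe, Set<Integer> a) {
        return difference(universe, a);
    }

    public static void main(String[] args) {
        Set<Integer> a = new TreeSet<Integer>(Arrays.asList(1, 2, 3, 4));
        Set<Integer> b = new TreeSet<Integer>(Arrays.asList(3, 4, 5));
        Set<Integer> universe = new TreeSet<Integer>(Arrays.asList(1, 2, 3, 4, 5, 6));

        System.out.println(union(a, b));             // [1, 2, 3, 4, 5]
        System.out.println(intersection(a, b));      // [3, 4]
        System.out.println(difference(a, b));        // [1, 2]
        System.out.println(complement(universe, a)); // [5, 6]
    }
}
```

Note that the complement only makes sense once a universal set has been chosen, which is why it takes that set as an explicit argument here.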

Using Sets

Suppose A is a Set of String objects representing the names of first-year students at Duke and B is a Set of String objects representing students taking CompSci 6 at Duke. Express each of the following sets in terms of operations on sets A and B.

  1. the set of first-year students taking CompSci 6
  2. the set of first-year students who are not taking CompSci 6
  3. the set of students who either are first-year students or are taking CompSci 6
  4. the set of students who either are not in their first year or are not taking CompSci 6

Using TreeSet

Snarf the file 16_sets_cps006_spring10 to get started.

In Java, the TreeSet class is used to model the concept of a set given above. In the remainder of this activity, you will use one or more TreeSet objects to discover several interesting statistics about words in files: how many distinct words exist in a file, how many distinct words appear across two separate files, how many are common to both files, and finally, how many appear in one file but not the other. Sets are a useful tool for exploring these kinds of statistics because each corresponds to a common set operation: creation, union, intersection, and difference, respectively.
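As a minimal sketch of the first statistic (counting distinct words), the example below reads words from a String rather than a file, so it can run on its own. The class name DistinctWordsSketch and the method distinctWords are hypothetical and do not come from the provided code; a word here is assumed to be any run of non-whitespace characters.

```java
import java.util.Scanner;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper class, for illustration only.
class DistinctWordsSketch {
    // Adds every whitespace-separated word to a set; adding a word
    // that is already present leaves the set unchanged.
    static Set<String> distinctWords(String text) {
        Set<String> words = new TreeSet<String>();
        Scanner in = new Scanner(text);
        while (in.hasNext()) {
            words.add(in.next());
        }
        in.close();
        return words;
    }

    public static void main(String[] args) {
        Set<String> words = distinctWords("the cat and the hat and the bat");
        System.out.println(words);        // [and, bat, cat, hat, the]
        System.out.println(words.size()); // 5
    }
}
```

A TreeSet keeps its elements in sorted order, which is why the words print alphabetically rather than in the order they were read.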

There are two classes defined for this activity: SetAlgorithms and SetAlgorithmsTest. The first is the class you will be implementing; the second tests the first by exercising its methods with the JUnit unit testing framework. You should start by reading over the code in SetAlgorithms and making sure you understand it. Note that you cannot access items in a set by index with get and set, as was possible with an ArrayList, so instead you must access each item in turn using a separate Iterator object.
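The following sketch shows one way to visit each item of a set with an Iterator. The class IteratorSketch and its longest method are invented for this example and are not part of SetAlgorithms.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper class, for illustration only.
class IteratorSketch {
    // Returns the longest word in the set, or "" if the set is empty.
    static String longest(Set<String> words) {
        String best = "";
        Iterator<String> it = words.iterator();
        while (it.hasNext()) {          // is there another element?
            String word = it.next();    // fetch it and advance
            if (word.length() > best.length()) {
                best = word;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Set<String> fruit = new TreeSet<String>(Arrays.asList("pear", "fig", "banana"));
        System.out.println(longest(fruit)); // banana
    }
}
```

A for-each loop (for (String word : words)) uses the same Iterator behind the scenes and is often easier to read when you only need to look at each element.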

The output of SetAlgorithmsTest should indicate that two methods that create and print sets are properly implemented. Implementing the set operation methods is your exercise today. After you have verified the current implementation, you should implement the following additional methods and test them by inspecting their output.

  1. Complete the method, union, that creates a new TreeSet that is the union of the two Sets passed as parameters. The union of two sets is the set that contains all of the elements of both sets. This method should not modify either of the sets passed as parameters, but instead create and return a new set. Additionally, it should not use or modify any instance variables.
  2. Complete the method, intersection, that creates a new TreeSet that is the intersection of the two Sets passed as parameters. The intersection of two sets is the set that contains only those elements that are contained in both sets. This method should not modify either of the sets passed as parameters, but instead create and return a new set. Additionally, it should not use or modify any instance variables.
  3. Complete the method, difference, that creates a new TreeSet that is the difference of the two Sets passed as parameters. The difference of two sets is the set that contains only those elements that are in the first set, but not the second set. This method should not modify either of the sets passed as parameters, but instead create and return a new set. Additionally, it should not use or modify any instance variables.
  4. Complete the overloaded method, union, that creates a new TreeSet that is the union of any number of Sets passed as a list. This method should not modify any of the sets passed in the list, but instead create and return a new set. Additionally, it should not use or modify any instance variables.
  5. Complete the overloaded method, intersection, that creates a new TreeSet that is the intersection of any number of Sets passed as a list. This method should not modify any of the sets passed in the list, but instead create and return a new set. Additionally, it should not use or modify any instance variables.
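The requirement to leave the parameters untouched and work on a list of sets can be sketched as follows: copy the first set, then narrow the copy against each remaining set. This is one possible approach, not necessarily the intended solution. The names ListIntersectionSketch and intersectAll are hypothetical, the signature of the real method in SetAlgorithms may differ, and returning an empty set for an empty list is an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper class, for illustration only.
class ListIntersectionSketch {
    static Set<String> intersectAll(List<Set<String>> sets) {
        // Assumption: an empty list yields an empty set.
        if (sets.isEmpty()) {
            return new TreeSet<String>();
        }
        // Start from a copy so the first set in the list is never modified.
        Set<String> result = new TreeSet<String>(sets.get(0));
        for (int i = 1; i < sets.size(); i++) {
            result.retainAll(sets.get(i)); // only the copy shrinks
        }
        return result;
    }

    public static void main(String[] args) {
        List<Set<String>> sets = new ArrayList<Set<String>>();
        sets.add(new TreeSet<String>(Arrays.asList("a", "b", "c")));
        sets.add(new TreeSet<String>(Arrays.asList("b", "c", "d")));
        sets.add(new TreeSet<String>(Arrays.asList("c", "b", "e")));

        System.out.println(intersectAll(sets)); // [b, c]
        System.out.println(sets.get(0));        // [a, b, c], unchanged
    }
}
```

The copy constructor new TreeSet<String>(otherSet) is what keeps the inputs safe: every later change is made to the copy, never to a set the caller passed in.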

Unit Testing

To test your SetAlgorithms class, you're given testing code. This code tests individual methods in your class. Such tests are called unit tests, so you need to use the standard JUnit unit-testing library with the SetAlgorithmsTest.java file to test your implementation.

NOTE: If your program won't run with JUnit, you may need to select JUnit4. Right-Click on the name of the project, select "Build Path", "Add Libraries", select "JUnit", then JUnit4.

To run the tests, first use the Run As option in the Run menu, as shown on the left below. Then select the JUnit option, as shown on the right below. Most of you will have that as the only option.

[Screenshots: the Run As menu (left) and the JUnit option (right)]

There are three test methods in SetAlgorithmsTest: one each for the correctness of the union, intersection, and difference methods.

If the JUnit tests pass, you'll get all green, as shown on the left below. Otherwise you'll get red, as shown on the right below, and an indication of the first test to fail. Fix that, then go on to the remaining tests. The red was obtained from the code you're given. You'll work to make the testing all green.

[Screenshots: JUnit results showing all green (left) and red (right)]

Currently, SetAlgorithmsTest includes one test each for union and intersection and none for difference. Each test uses the JUnit assertEquals method. An assertion is a statement that allows a programmer to state and check assumptions he or she makes in the program. In this case, assertEquals tests whether the third argument, the actual value, is equal to the second argument, the expected value. If the values are not equal (i.e., .equals returns false), then the assertion and the test fail. The first argument is the message to print if the assertion fails.
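To make the argument order concrete without needing the JUnit library, the sketch below defines a small stand-in with the same parameter order as JUnit 4's assertEquals(String, Object, Object): message, expected, actual. The stand-in is hypothetical and much simpler than the real JUnit method; in SetAlgorithmsTest you would call the JUnit version instead.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in, for illustration only; not the JUnit implementation.
class AssertEqualsSketch {
    // Same argument order as JUnit 4: message, expected value, actual value.
    static void assertEquals(String message, Object expected, Object actual) {
        boolean same = (expected == null) ? actual == null : expected.equals(actual);
        if (!same) {
            throw new AssertionError(message + " expected:<" + expected
                    + "> but was:<" + actual + ">");
        }
    }

    public static void main(String[] args) {
        Set<String> expected = new TreeSet<String>(Arrays.asList("a", "b"));
        Set<String> actual = new TreeSet<String>(Arrays.asList("b", "a"));

        // Passes: two sets are equal when they contain the same elements,
        // regardless of the order in which the elements were added.
        assertEquals("same elements", expected, actual);

        actual.add("c");
        try {
            assertEquals("extra element", expected, actual);
        } catch (AssertionError e) {
            System.out.println("assertion failed: " + e.getMessage());
        }
    }
}
```

Because equality is decided by .equals, comparing a set your method returns against a hand-built expected set checks the contents, not whether the two variables refer to the same object.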

In this section, you will add more tests to SetAlgorithmsTest.

  1. Using testUnion and testIntersection as models, add a test to testDifference using the provided test sets, such as test1 and test2.
  2. Add at least one more test to each of the testUnion and testIntersection methods. You will need to create new data sets (like test1) as necessary, filled with whatever data you like, to show that your tests work.
  3. Add assertions to testUnion and testIntersection to test the union and intersection methods on lists of sets.

Submitting

Submit as ClassworkMarch25sets.