The next set of labs considers some of the trade-offs involved with building Boggle® players and game. In particular, this lab examines alternate implementations of a lexicon or word source.
You will write implement and analyze classes that implement the ILexicon
interface. Each lexicon
implementation facilitates looking up a word or a possible prefix
of a word. The classes differ in how efficiently they support these
operations and in how much memory they use.
You are given a JUnit testing class, TestLexicon
, to test your
implementations. In the load
methods you write you
can assume no duplicate words will be inserted via either the Scanner or
ArrayList parameters to the load methods. You are also given a class LexiconBenchmark
you can use
to analyze the empirical performance of your implementations.
You're given an implementation SimpleLexicon
with an O(n) implementation of the method
wordStatus
since the implementation simply does a linear
search over the list of words it stores. Read the
documentation for details of the interface.
You will modify
SimpleLexicon
so that it's wordStatus
method
executes more efficiently, create a BinarySearchLexicon
class, and complete the TrieLexicon
class.
wordStatus
methods of the
SimpleLexicon
class so that it uses the TreeSet.subSet
method that returns a
subset/subtree of the tree in which all the words are stored --- the subset's
first word should be the String whose status is being determined in
wordStatus
and the subset's last word should be
myUpperBound
which is a value larger than any element in the
lexicon's set of words (see the code). Make sure you check to see if the
parameter s
is larger than the upper bound since that will cause
the TreeSet.subset
call to fail. See
the
SortedSet.subset API for details.
BinarySearchLexicon
implementing the ILexicon interface. Store words in a sorted ArrayList
(sort the
ArrayList
after adding all the words to it) and use Collections.binarySearch
to search the list.
Read the
documentation for binarySearch
. Note
that when the index value returned is less than zero the value can be
used to determine where a word should be in a sorted list. For
example, when looking up "ela" the value returned might be
-137. This means that "ela" is not in the lexicon, but if it
were to be inserted it would be at index 136. This also means that if
the word at index 136 does not begin with "ela" then no word in
the lexicon has a prefix of "ela". So, any non-negative value returned
by binarySearch
means the status of a word is
LexStatus.WORD
. If the value returned is negative, one call
of the appropriate String.startsWith()
method can determine
if LexStatus.PREFIX
should be returned (make sure you don't go off the
end of the array o[5~f words in the lexicon when calling
startsWith
).
do, dog, dogma, ,dot, dote, ha, we, wed, weptThe black dot in each node indicates a
isWord
field
having value true
in some node; if there is no black dot
the isWord
field is false
.
The TrieLexicon
class uses
a trie data structure to implement the ILexicon
interface.
wordStatus
method for
TrieLexicon
.
TestLexicon
to use as you develop
your ILexicon
implementations. To test different implementations
simply change the code in the method makeLexicon
to return the
implementation you want to test and run the JUnit tests.
We also provide a benchmarking class LexiconBenchmark
that
facilitates evaluating the efficiency of different implementations as well as
correctness. Confidence in an implementation's correctness is increased if it
returns the same results as other implementations.