From: Jason Grosland <jcg3@cs.duke.edu>
To :
Date: Mon, 8 Mar 1999 16:15:15 -0500
Re: indexing for goofi
On Mon, 8 Mar 1999, David Bartoli wrote:
<snip>
> Now, I have no idea if I am being too picky or if I am noticing a
> difficult specification in this assignment. It just seems to me that a
> lot of hard-coding would have to be used to hack through this situation,
> and I am not sure if it is worth sacrificing the design to account for
> this.
Ok, what you seem to have here is what's called a design decision. The
good news is that you can do whatever you want. The bad news is you can
do whatever you want.
What I'd recommend is thinking about the problem. A hyphen can mark
either a compound word or two separate words. What about words that over-
lap a line boundary (like overlap in this sentence)? Are you going to
handle that?
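To make the line-boundary case concrete, here is a minimal sketch in Python (picked only for illustration, not the assignment's language). The function name join_line_breaks is made up, and it assumes that a line ending in a letter followed by a single hyphen is a word split across lines:

```python
def join_line_breaks(lines):
    """Rejoin words that were split with a trailing hyphen at a line end."""
    joined = []
    carry = ""  # first half of a word split across the previous line break
    for line in lines:
        text = carry + line.strip()
        carry = ""
        # A letter followed by one trailing hyphen is treated as a split word;
        # a trailing "--" (a dash) is left alone.
        if text.endswith("-") and len(text) > 1 and text[-2].isalpha():
            head, _, tail = text.rpartition(" ")
            carry = tail[:-1]  # hold the partial word back, minus the hyphen
            text = head
        if text:
            joined.append(text)
    if carry:
        joined.append(carry)
    return joined
```

Note the catch: a genuine compound that happens to break at its hyphen (well- / known) gets glued into "wellknown". Telling the two cases apart needs a word list or a user option, which is exactly the design decision in question.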
You might write some serious language-parsing code to deal with this sort
of problem, and do something really cool. Or, you could just delegate the
responsibility back to the user (-h allows indexing of hyphenated words, -H
will index the stuff on either side of the hyphen, etc.).
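As a rough sketch of the "let the user decide" route, in Python for illustration: the split_hyphens parameter below stands in for the -h / -H choice, and both the name index_terms and the letters-only word pattern are assumptions, not part of the assignment.

```python
import re

# A word is a run of letters, optionally joined by single internal hyphens.
WORD = re.compile(r"[A-Za-z]+(?:-[A-Za-z]+)*")

def index_terms(text, split_hyphens=False):
    """Return lowercase index terms, keeping or splitting hyphenated words."""
    terms = []
    for match in WORD.finditer(text):
        word = match.group().lower()
        if split_hyphens:
            # -H behaviour: index the pieces on either side of each hyphen
            terms.extend(word.split("-"))
        else:
            # -h behaviour: index the hyphenated word as a single term
            terms.append(word)
    return terms
```

The tokenizing stays in one place, and the policy is a single flag that the command-line parsing can set however it likes.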
For words with apostrophes, how could you handle it? You could build a
map of recognized contractions, and what they expand to. Then, you could
have that as an option-- expand contractions, or not, when indexing.
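One way the contraction map might look, sketched in Python for illustration. The three entries are only examples, and the names CONTRACTIONS and expand are hypothetical:

```python
# Illustrative entries only; a real table would be longer.
CONTRACTIONS = {
    "can't": ["can", "not"],
    "won't": ["will", "not"],
    "don't": ["do", "not"],
}

def expand(word, expand_contractions=True):
    """Return the list of terms to index for a single word."""
    key = word.lower()
    if expand_contractions and key in CONTRACTIONS:
        # Copy the list so callers cannot modify the shared table.
        return list(CONTRACTIONS[key])
    return [key]
```

With the option off, "can't" is indexed as written; with it on, the indexer sees "can" and "not" instead.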
Figuring out if an apostrophe is for possession shouldn't be too tough--
it's always an /'s/ or /s'/, in which case, you probably just want to
index the base word before the apostrophe.
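A small sketch of the possessive rule, again in Python for illustration. The name strip_possessive is made up, and the function assumes any trailing 's or s' is a possessive:

```python
def strip_possessive(word):
    """Return the base word before a possessive apostrophe, if there is one."""
    if word.endswith("'s"):
        return word[:-2]  # "Jason's" -> "Jason"
    if word.endswith("s'"):
        return word[:-1]  # "users'" -> "users"
    return word
```

Contractions such as "it's" also end in 's, so any contraction check would have to run before this one.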
Also think about the fact that most of the common contractions are made up
of short words that you might not want to be indexing anyway.
If you think you can't solve some of these problems without hardcoding
your solution, think about the problem a little longer. If you are stuck,
post a message about what you're thinking of and how you're stuck.
For example, you can get around hardcoding contractions by using a file to
contain all the contractions that you want to cover. Or you can use an
array (const final string[] = { "can't", "won't", ...};) to hold the
contractions-- as long as your code is independent of the size of the
array, you've got easy expandability...
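For the file-based version, a minimal sketch in Python (for illustration only). The "contraction = expansion" line format, the # comment lines, and the name load_contractions are all assumptions:

```python
def load_contractions(lines):
    """Build a contraction table from lines like "can't = can not"."""
    table = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        contraction, _, expansion = line.partition("=")
        table[contraction.strip().lower()] = expansion.split()
    return table
```

It takes any iterable of lines, so an open file object works as well as a list. Nothing in the code depends on how many entries there are, so covering another contraction means adding a line to the file, not touching the code.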
-Jason