Question

For the Porter stemmer rule group shown in (2.1):

(2.1)
Rule          Example
SSES → SS     caresses → caress
IES  → I      ponies   → poni
SS   → SS     caress   → caress
S    →        cats     → cat
What is the purpose of including an identity rule such as SS →SS? Applying just this rule group, what will the following words be stemmed to? circus canaries boss
What rule should be added to correctly stem pony? The stemming for ponies and pony might seem strange. Does it have a deleterious effect on retrieval? Why or why not?

Answer #1

Hello. I have answered your question as you have requested. I have explained it in detail as much as possible. Please do comment if you have any doubt.

======================Solution begins here ==============================

1) What is the purpose of including an identity rule such as SS → SS?

Within a rule group, the Porter stemmer applies only the rule that matches the longest suffix of the word. If the rule SS → SS were not in the group, words ending in ss such as boss, toss and loss would instead match the rule S → (empty) and lose their final s. The identity rule matches these words first and leaves them unchanged. The stemmer never returns a null value for them; the rule simply stops it from stripping an s that is not a plural ending.

Let's take the example of caress and send a few words through step 1a.

Case 1. Rule SS → SS is in the algorithm.

stemmer(caress) --> caress (SS → SS) CORRECT
stemmer(bosses) --> boss (SSES → SS) CORRECT
stemmer(ridiculousness) --> ridiculousness (SS → SS) CORRECT

All three words come out of step 1a as expected. (The -ness ending is handled by a later step of the stemmer, not by this rule group.)

Case 2. Rule SS → SS is not in the algorithm.

Stemming would occur like this.

stemmer(caress) --> cares (S → empty) WRONG
stemmer(bosses) --> boss (SSES → SS) CORRECT
stemmer(ridiculousness) --> ridiculousnes (S → empty) WRONG

Now two of the three words are over-stemmed.

So the bottom line is that we include SS → SS to protect words ending in ss from the S → (empty) rule.
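To see the effect of the identity rule, here is a minimal Python sketch of the rule group in (2.1), assuming the usual Porter convention that only the rule with the longest matching suffix fires. The names `RULES`, `WITHOUT_IDENTITY` and `apply_rule_group` are made up for this illustration and are not part of any library.

```python
# Rule group (2.1) as (suffix, replacement) pairs.
RULES = [("sses", "ss"), ("ies", "i"), ("ss", "ss"), ("s", "")]

# The same group with the identity rule SS -> SS taken out.
WITHOUT_IDENTITY = [rule for rule in RULES if rule != ("ss", "ss")]


def apply_rule_group(word, rules):
    """Apply the single rule whose suffix is the longest match.

    If no rule matches, the word is returned unchanged.
    """
    matching = [(suffix, repl) for suffix, repl in rules if word.endswith(suffix)]
    if not matching:
        return word
    suffix, repl = max(matching, key=lambda rule: len(rule[0]))
    return word[: len(word) - len(suffix)] + repl


for w in ["caress", "boss", "toss"]:
    with_rule = apply_rule_group(w, RULES)
    without_rule = apply_rule_group(w, WITHOUT_IDENTITY)
    print(f"{w}: with SS->SS = {with_rule}, without = {without_rule}")
```

With the identity rule in place each of these words comes back unchanged; without it, the S rule strips the last letter (caress becomes cares).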


2) Applying just this rule group, what will the following words be stemmed to? circus canaries boss

The word circus will be stemmed to circu. It does not end in ss, so the longest matching suffix is s, and step 1a changes s -> _ (here _ means an empty string).

So circus -> circu (s -> _)

The word canaries will be stemmed to canari. The longest matching suffix is ies, and step 1a changes ies -> i.

So canaries -> canari (ies -> i)

The word boss will be stemmed to boss. The longest matching suffix is ss, and step 1a changes ss -> ss, which leaves the word as it is.

So boss -> boss (ss -> ss)


3) What rule should be added to correctly stem pony?

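The rule to add is Y -> I, so that pony is stemmed to poni, the same stem that ponies gets from the rule IES -> I. In the full Porter stemmer this is step 1c, which changes a final y to i when the part of the word before the y contains a vowel.

So pony -> poni ((*v*) Y -> I)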
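The added rule can be sketched in Python as follows. This is only an illustration: the function names are made up, and the vowel test follows Porter's definition as far as I understand it (a, e, i, o, u, plus y when it comes after a consonant).

```python
def contains_vowel(stem):
    """Return True if the stem has a vowel: a, e, i, o, u,
    or a y that comes directly after a consonant."""
    for i, ch in enumerate(stem):
        if ch in "aeiou":
            return True
        if ch == "y" and i > 0 and stem[i - 1] not in "aeiou":
            return True
    return False


def stem_final_y(word):
    """Rule (*v*) Y -> I: replace a final y with i,
    but only when the part before the y contains a vowel."""
    if word.endswith("y") and contains_vowel(word[:-1]):
        return word[:-1] + "i"
    return word


print(stem_final_y("pony"))  # poni
print(stem_final_y("sky"))   # sky (no vowel before the y, so unchanged)
```

The vowel condition is what keeps short words like sky or by from being changed.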

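4) The stemming for ponies and pony might seem strange. Does it have a deleterious effect on retrieval? Why or why not?

No, it will not have any deleterious effect on retrieval. The stem poni is not a real word, but that does not matter, because both ponies and pony are reduced to the same stem. The same stemmer is applied to the documents and to the query, so a query for pony still matches documents that contain ponies, and the other way round.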
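A small sketch of why the odd-looking stem is harmless: the toy index below stems every document term and every query term with the same function, so both word forms land on the same key. Everything here (the `DOCS` data, `stem`, `search`) is invented for illustration, and the y rule uses a simplified vowel check.

```python
from collections import defaultdict

# Rule group (2.1), ordered so the first match is also the longest match.
RULES = [("sses", "ss"), ("ies", "i"), ("ss", "ss"), ("s", "")]


def stem(word):
    """Step 1a rule group followed by a simplified Y -> I rule."""
    word = word.lower()
    for suffix, repl in RULES:
        if word.endswith(suffix):
            word = word[: len(word) - len(suffix)] + repl
            break
    # Simplified vowel test: only a, e, i, o, u count as vowels here.
    if word.endswith("y") and any(ch in "aeiou" for ch in word[:-1]):
        word = word[:-1] + "i"
    return word


# Hypothetical toy collection.
DOCS = {1: "ponies graze", 2: "a pony trots", 3: "cats sleep"}

# Inverted index from stem to the set of document ids containing it.
index = defaultdict(set)
for doc_id, text in DOCS.items():
    for token in text.split():
        index[stem(token)].add(doc_id)


def search(query_term):
    """Stem the query term the same way and look it up."""
    return sorted(index.get(stem(query_term), set()))


print(search("pony"))    # [1, 2]
print(search("ponies"))  # [1, 2]
```

Both queries return the same two documents, because the index never needs the stem to be a dictionary word, only a consistent key.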

==============================Solution ends here========================
