Question 5: Combinatorics (25 marks, 5 marks each with
5b as bonus): Assume there are 100,000 unique web pages*
and answer the following:
5d) Information Retrieval uses preprocessing so that only a small
portion of candidate web pages is considered. For example, usually
only two thousand pages with certain matching keywords can ever be
returned in Google's top results. What is the probability that all
10 relevant pages are on the first page of results drawn from these
2000 candidates, assuming every outcome is equally likely?
5e) How would your answer to 5d change if 100
candidate pages were returned instead of 2000?
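
One way to set up 5d and 5e, offered as a sketch rather than a model answer: treat the first page as a uniformly random set of 20 pages (the page size given in 5a) chosen from the candidates, and assume all 10 relevant pages are among those candidates. The function name and its parameters are hypothetical, introduced only for illustration. Under those assumptions the probability is C(N-10, 10) / C(N, 20), which is the same as C(20, 10) / C(N, 10).

```python
from fractions import Fraction
from math import comb


def prob_all_relevant_on_first_page(candidates, page_size=20, relevant=10):
    """Probability that every relevant page lands on the first page.

    Assumes the first page is a uniformly random subset of `page_size`
    pages out of `candidates`, and that all `relevant` pages are candidates.
    """
    # Favourable subsets: all relevant pages are included, and the
    # remaining slots are filled from the non-relevant candidates.
    favourable = comb(candidates - relevant, page_size - relevant)
    # All equally likely subsets of first-page size.
    total = comb(candidates, page_size)
    return Fraction(favourable, total)


p_2000 = prob_all_relevant_on_first_page(2000)  # setting of 5d
p_100 = prob_all_relevant_on_first_page(100)    # setting of 5e

print(f"5d: {float(p_2000):.3e}")
print(f"5e: {float(p_100):.3e}")
print(f"ratio 5e/5d: {float(p_100 / p_2000):.3e}")
```

Ordering within the first page does not matter here, since the event only asks which pages appear, so counting subsets gives the same result as counting rankings. With fewer candidates the probability rises sharply, because the 10 relevant pages compete with far fewer alternatives for the 20 slots.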
5a) A Google search returns a first page of 20 results, each in a ranked order. How many possible first page rankings are there for the 100,000 web pages?
BONUS: 5b) How many rankings of 20 pages are there if two of the 100,000 pages are duplicates? Hint: You might need to use the rule of sum.
5c) In Information Retrieval, a field of Data-Mining for correctly returning search...
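
For 5a, a minimal sketch of the count, assuming a ranking is an ordered selection of 20 distinct pages from the 100,000 with no page repeated. That is the number of 20-permutations, 100000! / 99980!. The variable names are illustrative only.

```python
from math import perm, prod

TOTAL_PAGES = 100_000
PAGE_SIZE = 20

# Ordered selections of 20 distinct pages: 100000 * 99999 * ... * 99981.
first_page_rankings = perm(TOTAL_PAGES, PAGE_SIZE)

# Cross-check against the explicit falling product.
explicit = prod(TOTAL_PAGES - i for i in range(PAGE_SIZE))

print(first_page_rankings == explicit)
print(len(str(first_page_rankings)), "digits")
```

`math.perm` requires Python 3.8 or later; the explicit product gives the same value on older versions if `prod` is replaced by a loop.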