Question

Assume that you have built a dense B+-tree index on SSN, and the B+-trees leaf nodes contain record ids pointing to data records in data file. Assume 10-byte long. Assume also that you built the tree by using bulk loading so that the nodes at each level were filled up as much as possible. 4. a. b. How many levels does the resulting tree have? For each level of the tree, how many nodes are at that level? How many block I/Os are needed to search for and retrieve a record from the file given its SSN value using the index How many levels would the resulting tree has with all pages 60% full only? c. d. 5. Repeat the above exercise when the B+ tree index is built on NAME. 6. Repeat the exercise when the B+ tree index is built on SSN, and the B+-trees leaf nodes directly contain data records (there is no separate data file)

0 0
Add a comment Improve this question Transcribed image text
Answer #1

Answer:

Answer:

4)

Given: Size of data pointer is 10 bytes

a. We assume each node in B+- tree is of the order of m. SSN being a 9 digit number as a key will take 9 bytes. Hence,

Size of key = 9

Each node in a B+- tree corresponds to a disk block. Let size of the block be B.

Then

m*size_of_pointer+(m-1)*size_of_key <= size_of_disk_block

m*10+(m-1)*9 <= B

10m+9m-9 <=B

19m <=B+9

m <=(B+9)/19

Hence maximum order of B+- tree will be (B+9)/19.

If there are N keys, then a B+- tree of order m can have a maximum of 1+log(m/2)((N+1)/2) levels.

There are approximate 450+ million SSNs i.e > 4.5e+9 SSNs.

Lets assume N = 4.5*10^9

Hence max level number = 1+log(m/2)((N+1)/2)

                                    = 1+log(m/2)((4.5e+9)+1)/2)

                                    = 1+log(m/2)(2.25e+9)

For a typical block size of 512 bytes, we will have m = (512+9)/19 = 27.42 approx. pointers

So max level number will be 1+log(27/2)(2.25e+9) = 9.3955638307 = 9 approx.

2. Assuming order m, as calculated is 27.

Then

Level Nodes Entries Pointers
0 1 26 27
1 27 702 729
2 729 511758 511785
3 511785 261910068030 261910068057
4 261910068057 68596883742550799917710 68596883742550799917737
5 68596883742550799917737 4.7055325e+45 4.7055325e+45
6 4.7055325e+45 2.2142036e+91 2.2142036e+91
7
8

Note: From this calculation, it is clear that a B+- tree of height 5-6 will be enough for the exiting numbers of SSNs

c) Since, each node will contain one datablock, there for atleast 2 I/O (one Input and one Output) will be required for search and retrieval.

d) For 60% full tree, a node has approx. 0.60*27 = 16.2 = 16 approx. data pointers. In this case, tree can have a maximum of 11 levels.

Add a comment
Know the answer?
Add Answer to:
Assume that you have built a dense B+-tree index on SSN, and the B+-tree's leaf nodes...
Your Answer:

Post as a guest

Your Name:

What's your source?

Earn Coins

Coins can be redeemed for fabulous gifts.

Not the answer you're looking for? Ask your own homework help question. Our experts will answer your question WITHIN MINUTES for Free.
Similar Homework Help Questions
  • [Index structure: B+ tree and B tree] (b). B+ tree index structure is said to be...

    [Index structure: B+ tree and B tree] (b). B+ tree index structure is said to be an improvement of B tree index structure. The most important distinction between them is that data record pointers exist in both internal and leaf nodes (i.e., blocks) for a B tree whereas only in the leaf nodes for a B+ tree. Explain why this distinction would make B+ tree a more efficient structure (in terms of index search speed) overall than a B tree...

  • Question 3. (25 points) Consider the B+ tree index shown below. Each intermediate node can hold up to five pointers and four key values. Each leaf can hold up to four records, and leaf nodes are doub...

    Question 3. (25 points) Consider the B+ tree index shown below. Each intermediate node can hold up to five pointers and four key values. Each leaf can hold up to four records, and leaf nodes are doubly linked as usual, although these links are not shown in the figure. If you can borrow from or merge with both siblings, choose the right sibling. Answer the following questions 3080 I1 35425065 9098 12 68 69 70 79 98 99 1001105 30...

  • Multi-Level Indexing. Construct a B+-tree for a data file that contains 10 records with the following key values, respectively: 23, 26,43,77,4,92 The index fan-out p 3, that is, each internal tree no...

    Multi-Level Indexing. Construct a B+-tree for a data file that contains 10 records with the following key values, respectively: 23, 26,43,77,4,92 The index fan-out p 3, that is, each internal tree node of the B+-tree contains at most p-1 = 2 keys and p = 3 pointers; the underlying data file has a blocking factor pI-2, that is, each leaf of the B+-tree contains at most pI-2 data records. Assume that the tree is initially empty and records are added...

  • Problem 1. Consider a disk with block size B=1024 bytes. A block pointer is P =...

    Problem 1. Consider a disk with block size B=1024 bytes. A block pointer is P = 8 bytes long, and a record pointer is R =7 bytes long. A file has r=5,000,000 EMPLOYEE records of fixed- length. Each record has the following fields: NAME (30 bytes), SSN (9 bytes), DEPARTMENTCODE (9 bytes), ADDRESS (35 bytes), PHONE (9 bytes), BIRTHDATE (8 bytes), SEX (1 byte), JOBCODE (4 bytes), SALARY (4 bytes, real number). An additional byte is used as a deletion...

  • Consider a disk with block size = 4096 bytes. A block pointer is 6 bytes and...

    Consider a disk with block size = 4096 bytes. A block pointer is 6 bytes and a record pointer is 8 bytes long. A file has 100,000 records of fixed length. Each record has the following fields and byte size: Name (30), Ssn (9), Dept (9), Address (40), Phone (10), Bdate (8), Sex (1), Job (4) and Salary (4). An additional byte is used as a deletion marked. a. Calculate the record size R in bytes b. Calculate the blocking...

  • Page size is 1K bytes, and the avg record size is 100 bytes. Given 1 million...

    Page size is 1K bytes, and the avg record size is 100 bytes. Given 1 million records in a table, answering the following questions. 1) Suppose the table is stored in a heapfile, how many pages are in this file? 2) We will build an unclustered index B+ tree on this table over attribute A, how many data entries are contained in the leaf level? 3) Suppose A is integer type, what’s the size of a data entry? 4) Suppose...

  • Consider the B+ tree index shown in Figure 1. Each intermediate node can hold up to...

    Consider the B+ tree index shown in Figure 1. Each intermediate node can hold up to five pointers and four key values. Each leaf can hold up to four records, and leaf nodes are doubly linked as usual, although these links are not shown in the figure. Answer the following questions. Show every step of the insertion and explain the process. 34.21-4 ostas 704 94 986 99-100-105 944954964974 LS Figure 1. (1). Insert a record with search key 109 into...

  • (b) You are given the AVL Tree in the figure below. Assume that the nodes are...

    (b) You are given the AVL Tree in the figure below. Assume that the nodes are sorted in alphabetical order. E J B D K A F L H Draw the resulting BST after node E is removed. To construct the new BST replace node E with an appropriate node from the left subtree of E. Do not rebalance the resulting tree. Label each node in the resulting tree with its balance factor. (e) Now rebalance the tree from the...

  • 4. Suppose a B+ tree with 4 levels having exactly 7 keys in each node. There...

    4. Suppose a B+ tree with 4 levels having exactly 7 keys in each node. There is a record for each key 1, 2, . . . , k ? 1, k where k is the number of data records. How many nodes should be examined to find records with keys in the range [82, 113]?

  • 11. [6] We are going to design our B+ Tree to be as optimal as possible...

    11. [6] We are going to design our B+ Tree to be as optimal as possible for our old hard drives (since the management won't buy new ones, those cheapskates! We want to keep the tree as short as we can, and pack each disk block in the filesystem as tightly as possible. We also want to access our data in sorted order for printing out reports, so each leaf node will have a pointer to the next one. see...

ADVERTISEMENT
Free Homework Help App
Download From Google Play
Scan Your Homework
to Get Instant Free Answers
Need Online Homework Help?
Ask a Question
Get Answers For Free
Most questions answered within 3 hours.
ADVERTISEMENT
ADVERTISEMENT