Task1 - B+TREE PAGES 您需要实现三个页面类来存储B+树的数据。
B+Tree Parent Page B+Tree Internal Page B+Tree Leaf Page B+ Tree Parent Page 这是内部页和叶页都继承的父类,它只包含两个子类共享的信息。父页面被划分为如下表所示的几个字段。
B+Tree Parent Page Content Variable Name Size Description page_type_ 4 Page Type (internal or leaf) lsn_ 4 Log sequence number (Used in Project 4) size_ 4 Number of Key & Value pairs in page max_size_ 4 Max number of Key & Value pairs in page parent_page_id_ 4 Parent Page Id page_id_ 4 Self Page Id
你必须在指定文件中实现父页。你只允许修改头文件(src/include/storage/page/b_plus_tree_page.h)和它对应的源文件(src/page/b_plus_tree_page.cpp)。
这里属于白给送分题,把各种set和get函数填空就行。 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 namespace bustub {bool BPlusTreePage::IsLeafPage () const { return page_type_ == IndexPageType::LEAF_PAGE; }bool BPlusTreePage::IsRootPage () const { return parent_page_id_ == INVALID_PAGE_ID; }void BPlusTreePage::SetPageType (IndexPageType page_type) { page_type_ = page_type; }int BPlusTreePage::GetSize () const { return size_; }void BPlusTreePage::SetSize (int size) { size_ = size; }void BPlusTreePage::IncreaseSize (int amount) { size_ += amount; }int BPlusTreePage::GetMaxSize () const { return max_size_; }void BPlusTreePage::SetMaxSize (int size) { max_size_ = size; }int BPlusTreePage::GetMinSize () const { return max_size_ / 2 ; }page_id_t BPlusTreePage::GetParentPageId () const { return parent_page_id_; }void BPlusTreePage::SetParentPageId (page_id_t parent_page_id) { parent_page_id_ = parent_page_id; }page_id_t BPlusTreePage::GetPageId () const { return page_id_; }void BPlusTreePage::SetPageId (page_id_t page_id) { page_id_ = page_id; }void BPlusTreePage::SetLSN (lsn_t lsn) { lsn_ = lsn; }}
B+TREE INTERNAL PAGE 内部页不存储任何实际数据,而是存储有序的m个键条目和m + 1个指针(也称为page_id)。 由于指针的数量不等于键的数量,因此将第一个键设置为无效,并且查找方法应始终从第二个键开始。 任何时候,每个内部页面至少有一半已满。 在删除期间,可以将两个半满页面合并为合法页面,或者可以将其重新分配以避免合并,而在插入期间,可以将一个完整页面分为两部分。
你只能修改头文件(src/include/storage/page/b_plus_tree_internal_page.h) 和对应的源文件(src/page/b_plus_tree_internal_page.cpp).
如图所示,第一个node被设置为无效, 没有key值。 类初始化 1 2 3 4 5 6 7 8 INDEX_TEMPLATE_ARGUMENTS void B_PLUS_TREE_INTERNAL_PAGE_TYPE::Init (page_id_t page_id, page_id_t parent_id, int max_size) { SetPageType (IndexPageType::INTERNAL_PAGE); SetPageId (page_id); SetParentPageId (parent_id); SetSize (0 ); SetMaxSize (max_size); }
基本的数组内部函数 array
表示中间节点的内部数组。first
存key,也就是主键。second
存value,即指向子节点的指针。1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 INDEX_TEMPLATE_ARGUMENTS KeyType B_PLUS_TREE_INTERNAL_PAGE_TYPE::KeyAt (int index) const { return array[index].first; } INDEX_TEMPLATE_ARGUMENTS void B_PLUS_TREE_INTERNAL_PAGE_TYPE::SetKeyAt (int index, const KeyType &key) { array[index].first = key; }INDEX_TEMPLATE_ARGUMENTS int B_PLUS_TREE_INTERNAL_PAGE_TYPE::ValueIndex (const ValueType &value) const { for (int i = 0 ; i < GetSize (); i++) { if (array[i].second == value) { return i; } } return -1 ; } INDEX_TEMPLATE_ARGUMENTS ValueType B_PLUS_TREE_INTERNAL_PAGE_TYPE::ValueAt (int index) const { return array[index].second; }INDEX_TEMPLATE_ARGUMENTS KeyType B_PLUS_TREE_LEAF_PAGE_TYPE::KeyAt (int index) const { return array[index].first; } INDEX_TEMPLATE_ARGUMENTS const MappingType &B_PLUS_TREE_LEAF_PAGE_TYPE::GetItem (int index) { return array[index]; }
Lookup函数(内部节点) 这里是让我们查找并返回指向子页面的子指针(page_id),该页面包含输入“key”。 从第二个键开始搜索(第一个键是无效的) 注意到K(i) <= K < K(i+1)
,只需查找小于等于input的最后一个key即可。 这里用y总的整数二分模板(右边界) ,且要注意key比array中所有key都要小的情况。 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 INDEX_TEMPLATE_ARGUMENTS ValueType B_PLUS_TREE_INTERNAL_PAGE_TYPE::Lookup (const KeyType &key, const KeyComparator &comparator) const { int l = 1 ; int r = GetSize () - 1 ; while (l < r) { int mid = (l + r + 1 ) >> 1 ; if (comparator (KeyAt (mid), key) <= 0 ) { l = mid; } else { r = mid - 1 ; } } if (comparator (KeyAt (l), key) > 0 ) { return array[0 ].second; } return ValueAt (r); }
插入函数 PopulateNewRoot()
填充分裂后新的根节点,只在InsertIntoParent()
(b_plus_tree.cpp)中调用。InsertNodeAfter()
将new_key
和new_value
对插入到值为old_value
的键值对之后。先找到存有old_value
的键值对,数组下标大于等于insert_index的元素整体后移1位,在空位插入新的node。 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 INDEX_TEMPLATE_ARGUMENTS void B_PLUS_TREE_INTERNAL_PAGE_TYPE::PopulateNewRoot (const ValueType &old_value, const KeyType &new_key, const ValueType &new_value) { array[0 ].second = old_value; array[1 ].first = new_key; array[1 ].second = new_value; SetSize (2 ); } INDEX_TEMPLATE_ARGUMENTS int B_PLUS_TREE_INTERNAL_PAGE_TYPE::InsertNodeAfter (const ValueType &old_value, const KeyType & new_key, const ValueType &new_value) { int insert_index = ValueIndex (old_value); insert_index; for (int i = GetSize (); i > insert_index; i--) { array[i] = array[i - 1 ]; } array[insert_index] = MappingType{new_key, new_value}; IncreaseSize (1 ); return GetSize (); }
MoveHalfTo函数 MoveHalfTo()
将old_node的右半部分array复制给new_node。CopyNFrom()
将[items,items+size)复制到当前page的array最后一个键值对之后的空间。找到调用CopyNFrom()
函数的page的array中每个value指向的孩子结点,其父指针更新为调用该函数的page id。 注意,UnpinPage()
的dirty参数要置为true,因为修改了page->data转为node后的ParentPageId
。 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 INDEX_TEMPLATE_ARGUMENTS void B_PLUS_TREE_INTERNAL_PAGE_TYPE::MoveHalfTo (BPlusTreeInternalPage *recipient, BufferPoolManager *buffer_pool_manager) { int start_index = GetMinSize (); int move_num = GetSize () - start_index; recipient->CopyNFrom (array + start_index, move_num, buffer_pool_manager); IncreaseSize (-move_num); } INDEX_TEMPLATE_ARGUMENTS void B_PLUS_TREE_INTERNAL_PAGE_TYPE::CopyNFrom (MappingType *items, int size, BufferPoolManager *buffer_pool_manager) { std::copy (items, items + size, array + GetSize ()); for (int i = GetSize (); i < GetSize () + size; i++) { Page *child_page = buffer_pool_manager->FetchPage (ValueAt (i)); BPlusTreePage *child_node = reinterpret_cast <BPlusTreePage *>(child_page->GetData ()); child_node->SetParentPageId (GetPageId ()); buffer_pool_manager->UnpinPage (child_page->GetPageId (), true ); } IncreaseSize (size); }
Remove函数 Remove(int index)
删除array[index]
,并将array整体向前移动一位填补空缺。RemoveAndReturnOnlyChild()
删除内部页面中唯一的键和值对,并返回值,只能在AdjustRoot()中调用这个方法(在b_plus_tree.cpp中)1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 INDEX_TEMPLATE_ARGUMENTS void B_PLUS_TREE_INTERNAL_PAGE_TYPE::Remove (int index) { IncreaseSize (-1 ); for (int i = index; i < GetSize (); i++) { array[i] = array[i + 1 ]; } } INDEX_TEMPLATE_ARGUMENTS ValueType B_PLUS_TREE_INTERNAL_PAGE_TYPE::RemoveAndReturnOnlyChild () { SetSize (0 ); return ValueAt (0 ); }
MoveAllTo函数 1 2 3 4 5 6 INDEX_TEMPLATE_ARGUMENTS void B_PLUS_TREE_INTERNAL_PAGE_TYPE::MoveAllTo (BPlusTreeInternalPage *recipient, const KeyType &middle_key, BufferPoolManager *buffer_pool_manager) { SetKeyAt (0 , middle_key); recipient->CopyNFrom (array, GetSize (), buffer_pool_manager); SetSize (0 ); }
MoveFirstToEndOf函数 将所有当前page的第一个键值对移动到recipient
最后。 CopyLastFrom()
给array赋值,再修改其子节点的父指针。由于FetchPage()
时pin住了该页面,最后需unpin。 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 INDEX_TEMPLATE_ARGUMENTS void B_PLUS_TREE_INTERNAL_PAGE_TYPE::MoveFirstToEndOf (BPlusTreeInternalPage *recipient, const KeyType &middle_key,BufferPoolManager *buffer_pool_manager) { SetKeyAt (0 , middle_key); recipient->CopyLastFrom (array[0 ], buffer_pool_manager); Remove (0 ); } INDEX_TEMPLATE_ARGUMENTS void B_PLUS_TREE_INTERNAL_PAGE_TYPE::CopyLastFrom (const MappingType &item, BufferPoolManager *buffer_pool_manager) { array[GetSize ()] = item; Page *child_page = buffer_pool_manager->FetchPage (ValueAt (GetSize ())); BPlusTreePage *child_node = reinterpret_cast <BPlusTreePage *>(child_page->GetData ()); child_node->SetParentPageId (GetPageId ()); buffer_pool_manager->UnpinPage (child_page->GetPageId (), true ); IncreaseSize (1 ); }
MoveLastToFrontOf函数 将所有当前page的最后个键值对移动到recipient
头部。 注意需将recipient
整体后移一格。 由于FetchPage()
时pin住了该页面,最后需unpin。 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 INDEX_TEMPLATE_ARGUMENTS void B_PLUS_TREE_INTERNAL_PAGE_TYPE::MoveLastToFrontOf (BPlusTreeInternalPage *recipient, const KeyType &middle_key, BufferPoolManager *buffer_pool_manager) { recipient->SetKeyAt (0 , middle_key); recipient->CopyFirstFrom (array[GetSize () - 1 ], buffer_pool_manager); IncreaseSize (-1 ); } INDEX_TEMPLATE_ARGUMENTS void B_PLUS_TREE_INTERNAL_PAGE_TYPE::CopyFirstFrom (const MappingType &item, BufferPoolManager *buffer_pool_manager) { for (int i = GetSize () - 1 ; i >= 0 ; i--) { array[i + 1 ] = array[i]; } array[0 ] = item; Page *child_page = buffer_pool_manager->FetchPage (ValueAt (0 )); BPlusTreePage *child_node = reinterpret_cast <BPlusTreePage *>(child_page->GetData ()); child_node->SetParentPageId (GetPageId ()); buffer_pool_manager->UnpinPage (child_page->GetPageId (), true ); IncreaseSize (1 ); }