0%

CMU15-445 lab02(Part 1)

Task1 - B+TREE PAGES

您需要实现三个页面类来存储B+树的数据。

  • B+Tree Parent Page
  • B+Tree Internal Page
  • B+Tree Leaf Page

B+ Tree Parent Page

这是内部页和叶页都继承的父类,它只包含两个子类共享的信息。父页面被划分为如下表所示的几个字段。

  • B+Tree Parent Page Content
Variable NameSizeDescription
page_type_4Page Type (internal or leaf)
lsn_4Log sequence number (Used in Project 4)
size_4Number of Key & Value pairs in page
max_size_4Max number of Key & Value pairs in page
parent_page_id_4Parent Page Id
page_id_4Self Page Id
你必须在指定文件中实现父页。你只允许修改头文件(src/include/storage/page/b_plus_tree_page.h)和它对应的源文件(src/page/b_plus_tree_page.cpp)。
  • 这里属于白给送分题,把各种set和get函数填空就行。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
namespace bustub {
/*
* Helper methods to get/set page type
* Page type enum class is defined in b_plus_tree_page.h
*/
bool BPlusTreePage::IsLeafPage() const { return page_type_ == IndexPageType::LEAF_PAGE; }
bool BPlusTreePage::IsRootPage() const { return parent_page_id_ == INVALID_PAGE_ID; }
void BPlusTreePage::SetPageType(IndexPageType page_type) { page_type_ = page_type; }

/*
* Helper methods to get/set size (number of key/value pairs stored in that
* page)
*/
int BPlusTreePage::GetSize() const { return size_; }
void BPlusTreePage::SetSize(int size) { size_ = size; }
void BPlusTreePage::IncreaseSize(int amount) { size_ += amount; }

/*
* Helper methods to get/set max size (capacity) of the page
*/
int BPlusTreePage::GetMaxSize() const { return max_size_; }
void BPlusTreePage::SetMaxSize(int size) { max_size_ = size; }

/*
* Helper method to get min page size
* Generally, min page size == max page size / 2
*/
int BPlusTreePage::GetMinSize() const { return max_size_ / 2; }

/*
* Helper methods to get/set parent page id
*/
page_id_t BPlusTreePage::GetParentPageId() const { return parent_page_id_; }
void BPlusTreePage::SetParentPageId(page_id_t parent_page_id) { parent_page_id_ = parent_page_id; }

/*
* Helper methods to get/set self page id
*/
page_id_t BPlusTreePage::GetPageId() const { return page_id_; }
void BPlusTreePage::SetPageId(page_id_t page_id) { page_id_ = page_id; }

/*
* Helper methods to set lsn
*/
void BPlusTreePage::SetLSN(lsn_t lsn) { lsn_ = lsn; }

} // namespace bustub

B+TREE INTERNAL PAGE

内部页不存储任何实际数据,而是存储有序的m个键条目和m + 1个指针(也称为page_id)。 由于指针的数量不等于键的数量,因此将第一个键设置为无效,并且查找方法应始终从第二个键开始。 任何时候,每个内部页面至少有一半已满。 在删除期间,可以将两个半满页面合并为合法页面,或者可以将其重新分配以避免合并,而在插入期间,可以将一个完整页面分为两部分。

你只能修改头文件(src/include/storage/page/b_plus_tree_internal_page.h) 和对应的源文件(src/page/b_plus_tree_internal_page.cpp).

  • 如图所示,第一个node被设置为无效, 没有key值。

类初始化

1
2
3
4
5
6
7
8
INDEX_TEMPLATE_ARGUMENTS
void B_PLUS_TREE_INTERNAL_PAGE_TYPE::Init(page_id_t page_id, page_id_t parent_id, int max_size) {
SetPageType(IndexPageType::INTERNAL_PAGE);
SetPageId(page_id);
SetParentPageId(parent_id);
SetSize(0);
SetMaxSize(max_size);
}

基本的数组内部函数

  • array表示中间节点的内部数组。
  • first存key,也就是主键。
  • second存value,即指向子节点的指针。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
INDEX_TEMPLATE_ARGUMENTS
KeyType B_PLUS_TREE_INTERNAL_PAGE_TYPE::KeyAt(int index) const {
return array[index].first;
}

INDEX_TEMPLATE_ARGUMENTS
void B_PLUS_TREE_INTERNAL_PAGE_TYPE::SetKeyAt(int index, const KeyType &key) { array[index].first = key; }


INDEX_TEMPLATE_ARGUMENTS
int B_PLUS_TREE_INTERNAL_PAGE_TYPE::ValueIndex(const ValueType &value) const {
for (int i = 0; i < GetSize(); i++) {
if (array[i].second == value) {
return i;
}
}
return -1;
}

INDEX_TEMPLATE_ARGUMENTS
ValueType B_PLUS_TREE_INTERNAL_PAGE_TYPE::ValueAt(int index) const { return array[index].second; }

INDEX_TEMPLATE_ARGUMENTS
KeyType B_PLUS_TREE_LEAF_PAGE_TYPE::KeyAt(int index) const {
return array[index].first;
}

/*
* Helper method to find and return the key & value pair associated with input
* "index"(a.k.a array offset)
*/
INDEX_TEMPLATE_ARGUMENTS
const MappingType &B_PLUS_TREE_LEAF_PAGE_TYPE::GetItem(int index) {
return array[index];
}

Lookup函数(内部节点)

  • 这里是让我们查找并返回指向子页面的子指针(page_id),该页面包含输入“key”。
  • 从第二个键开始搜索(第一个键是无效的)
  • 注意到K(i) <= K < K(i+1),只需查找小于等于input的最后一个key即可。
  • 这里用y总的整数二分模板(右边界),且要注意key比array中所有key都要小的情况。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
INDEX_TEMPLATE_ARGUMENTS
ValueType B_PLUS_TREE_INTERNAL_PAGE_TYPE::Lookup(const KeyType &key, const KeyComparator &comparator) const {
int l = 1;
int r = GetSize() - 1;
while (l < r) {
int mid = (l + r + 1) >> 1;
if (comparator(KeyAt(mid), key) <= 0) {
l = mid;
} else {
r = mid - 1;
}
}
if (comparator(KeyAt(l), key) > 0) {
return array[0].second; // Cannot find satisdied key. The key is smaller than any valid key.
}
return ValueAt(r);
}

插入函数

  • PopulateNewRoot()填充分裂后新的根节点,只在InsertIntoParent()(b_plus_tree.cpp)中调用。
  • InsertNodeAfter()new_keynew_value对插入到值为old_value的键值对之后。
  • 先找到存有old_value的键值对,数组下标大于等于insert_index的元素整体后移1位,在空位插入新的node。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
INDEX_TEMPLATE_ARGUMENTS
void B_PLUS_TREE_INTERNAL_PAGE_TYPE::PopulateNewRoot(const ValueType &old_value, const KeyType &new_key,
const ValueType &new_value) {
array[0].second = old_value;
array[1].first = new_key;
array[1].second = new_value;
SetSize(2);
}

INDEX_TEMPLATE_ARGUMENTS
int B_PLUS_TREE_INTERNAL_PAGE_TYPE::InsertNodeAfter(const ValueType &old_value, const KeyType & new_key, const ValueType &new_value) {
int insert_index = ValueIndex(old_value);
insert_index;
for (int i = GetSize(); i > insert_index; i--) {
array[i] = array[i - 1];
}
array[insert_index] = MappingType{new_key, new_value};
IncreaseSize(1);
return GetSize();
}

MoveHalfTo函数

  • MoveHalfTo()将old_node的右半部分array复制给new_node。
  • CopyNFrom()将[items,items+size)复制到当前page的array最后一个键值对之后的空间。
  • 找到调用CopyNFrom()函数的page的array中每个value指向的孩子结点,其父指针更新为调用该函数的page id。
  • 注意,UnpinPage()的dirty参数要置为true,因为修改了page->data转为node后的ParentPageId
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
INDEX_TEMPLATE_ARGUMENTS
void B_PLUS_TREE_INTERNAL_PAGE_TYPE::MoveHalfTo(BPlusTreeInternalPage *recipient, BufferPoolManager *buffer_pool_manager) {
int start_index = GetMinSize();
int move_num = GetSize() - start_index;
recipient->CopyNFrom(array + start_index, move_num, buffer_pool_manager);
IncreaseSize(-move_num);
}

INDEX_TEMPLATE_ARGUMENTS
void B_PLUS_TREE_INTERNAL_PAGE_TYPE::CopyNFrom(MappingType *items, int size, BufferPoolManager *buffer_pool_manager) {
std::copy(items, items + size, array + GetSize());
for (int i = GetSize(); i < GetSize() + size; i++) {
Page *child_page = buffer_pool_manager->FetchPage(ValueAt(i));
BPlusTreePage *child_node = reinterpret_cast<BPlusTreePage *>(child_page->GetData());
child_node->SetParentPageId(GetPageId());
buffer_pool_manager->UnpinPage(child_page->GetPageId(), true);
}
IncreaseSize(size);
}

Remove函数

  • Remove(int index)删除array[index],并将array整体向前移动一位填补空缺。
  • RemoveAndReturnOnlyChild()删除内部页面中唯一的键和值对,并返回值,只能在AdjustRoot()中调用这个方法(在b_plus_tree.cpp中)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
INDEX_TEMPLATE_ARGUMENTS
void B_PLUS_TREE_INTERNAL_PAGE_TYPE::Remove(int index) {
IncreaseSize(-1);
for (int i = index; i < GetSize(); i++) {
array[i] = array[i + 1];
}
}
/*
* Remove the only key & value pair in internal page and return the value
* NOTE: only call this method within AdjustRoot()(in b_plus_tree.cpp)
*/
INDEX_TEMPLATE_ARGUMENTS
ValueType B_PLUS_TREE_INTERNAL_PAGE_TYPE::RemoveAndReturnOnlyChild() {
SetSize(0);
return ValueAt(0);
}

MoveAllTo函数

  • 将所有键值对移动到recipient
1
2
3
4
5
6
INDEX_TEMPLATE_ARGUMENTS
void B_PLUS_TREE_INTERNAL_PAGE_TYPE::MoveAllTo(BPlusTreeInternalPage *recipient, const KeyType &middle_key, BufferPoolManager *buffer_pool_manager) {
SetKeyAt(0, middle_key);
recipient->CopyNFrom(array, GetSize(), buffer_pool_manager);
SetSize(0);
}

MoveFirstToEndOf函数

  • 将所有当前page的第一个键值对移动到recipient最后。
  • CopyLastFrom()给array赋值,再修改其子节点的父指针。
  • 由于FetchPage()时pin住了该页面,最后需unpin。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
INDEX_TEMPLATE_ARGUMENTS
void B_PLUS_TREE_INTERNAL_PAGE_TYPE::MoveFirstToEndOf(BPlusTreeInternalPage *recipient, const KeyType &middle_key,BufferPoolManager *buffer_pool_manager) {
SetKeyAt(0, middle_key);
recipient->CopyLastFrom(array[0], buffer_pool_manager);
Remove(0);
}

/* Append an entry at the end.
* Since it is an internal page, the moved entry(page)'s parent needs to be updated.
* So I need to 'adopt' it by changing its parent page id, which needs to be persisted with BufferPoolManger
*/
INDEX_TEMPLATE_ARGUMENTS
void B_PLUS_TREE_INTERNAL_PAGE_TYPE::CopyLastFrom(const MappingType &item, BufferPoolManager *buffer_pool_manager) {
array[GetSize()] = item;
Page *child_page = buffer_pool_manager->FetchPage(ValueAt(GetSize()));
BPlusTreePage *child_node = reinterpret_cast<BPlusTreePage *>(child_page->GetData());
child_node->SetParentPageId(GetPageId());
buffer_pool_manager->UnpinPage(child_page->GetPageId(), true);
IncreaseSize(1);
}

MoveLastToFrontOf函数

  • 将所有当前page的最后个键值对移动到recipient头部。
  • 注意需将recipient整体后移一格。
  • 由于FetchPage()时pin住了该页面,最后需unpin。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
INDEX_TEMPLATE_ARGUMENTS
void B_PLUS_TREE_INTERNAL_PAGE_TYPE::MoveLastToFrontOf(BPlusTreeInternalPage *recipient, const KeyType &middle_key,
BufferPoolManager *buffer_pool_manager) {
recipient->SetKeyAt(0, middle_key);
recipient->CopyFirstFrom(array[GetSize() - 1], buffer_pool_manager);
IncreaseSize(-1);
}

/* Append an entry at the beginning.
* Since it is an internal page, the moved entry(page)'s parent needs to be updated.
* So I need to 'adopt' it by changing its parent page id, which needs to be persisted with BufferPoolManger
*/
INDEX_TEMPLATE_ARGUMENTS
void B_PLUS_TREE_INTERNAL_PAGE_TYPE::CopyFirstFrom(const MappingType &item, BufferPoolManager *buffer_pool_manager) {
for (int i = GetSize() - 1; i >= 0; i--) {
array[i + 1] = array[i];
}
array[0] = item;
Page *child_page = buffer_pool_manager->FetchPage(ValueAt(0));
BPlusTreePage *child_node = reinterpret_cast<BPlusTreePage *>(child_page->GetData());
child_node->SetParentPageId(GetPageId());
buffer_pool_manager->UnpinPage(child_page->GetPageId(), true);
IncreaseSize(1);
}