Add: Add index_gt::merge() #572



@kou kou commented Mar 4, 2025

#84

This PR focuses only on index_gt; index_dense_gt is out of scope. If we reach consensus on the implementation approach, we'll be able to implement index_dense_gt::merge() too.

This adds a mutable memory_mapped_file_t, which you can use to create a mutable memory-mapped index. You can then merge multiple indexes into that mutable memory-mapped index without allocating all of the data in RAM.
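For readers unfamiliar with the idea, here is a minimal sketch of a writable file mapping on POSIX systems. It is not the PR's memory_mapped_file_t: the type writable_mapping_t and its members are hypothetical names chosen for illustration, and the sketch assumes a non-zero size and a platform with mmap. It only shows why writes can go to the file through the page cache instead of a heap buffer that holds everything.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper for illustration only; not the type added by this PR.
struct writable_mapping_t {
    int fd = -1;
    char* data = nullptr;
    std::size_t size = 0;

    // Create (or open) the file, grow it to `bytes`, and map it read-write.
    // Assumes `bytes` is non-zero, since mmap rejects zero-length mappings.
    bool open(char const* path, std::size_t bytes) {
        fd = ::open(path, O_RDWR | O_CREAT, 0644);
        if (fd < 0)
            return false;
        if (::ftruncate(fd, static_cast<off_t>(bytes)) != 0) {
            close();
            return false;
        }
        // MAP_SHARED makes stores visible in the underlying file.
        void* mapped = ::mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (mapped == MAP_FAILED) {
            close();
            return false;
        }
        data = static_cast<char*>(mapped);
        size = bytes;
        return true;
    }

    // Ask the kernel to write dirty pages back to the file.
    bool flush() { return data && ::msync(data, size, MS_SYNC) == 0; }

    // Unmap and close; safe to call more than once.
    void close() {
        if (data) {
            ::munmap(data, size);
            data = nullptr;
            size = 0;
        }
        if (fd >= 0) {
            ::close(fd);
            fd = -1;
        }
    }
};
```

With a mapping like this, a merge can write nodes directly into the mapped region and let the operating system decide which pages stay resident.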

@@ -3272,7 +3449,6 @@ class index_gt {

// We are loading an empty index, no more work to do
if (!header.size) {
reset();
Author


This change is not directly related to this PR, but I included it here because I need a similar change in view(). If it should be a separate PR, I'll open one.

We don't need to call reset() here because we don't change anything after the above reset().

Comment on lines -3423 to +3619
serialization_result_t stream_result = load_from_stream(
[&](void* buffer, std::size_t length) {
if (offset + length > file.size())
return false;
std::memcpy(buffer, file.data() + offset, length);
offset += length;
return true;
},
std::forward<progress_at>(progress));

return stream_result;
is_mutable_ = true;
return {};
} else {
serialization_result_t io_result = file.open_if_not();
if (!io_result)
return io_result;

serialization_result_t stream_result = load_from_stream(
[&](void* buffer, std::size_t length) {
if (offset + length > file.size())
return false;
std::memcpy(buffer, file.data() + offset, length);
offset += length;
return true;
},
std::forward<progress_at>(progress));
return stream_result;
}
Author


Only the indentation changed here, because this block was moved into the else branch.
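As a side note on the read callback in this hunk, the sketch below shows the same bounds-checked copy from a mapped buffer in isolation. The type span_reader_t is a hypothetical stand-in, not part of the library. The only deliberate difference from the diff is that the check is written as `length > size - offset` so it cannot wrap around for very large lengths; whether that matters for the real code is an assumption, not something this PR states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for sequential reads from a mapped file.
struct span_reader_t {
    char const* data;
    std::size_t size;
    std::size_t offset;

    span_reader_t(char const* d, std::size_t s) : data(d), size(s), offset(0) {}

    // Copy `length` bytes into `buffer`, or fail without moving the cursor.
    bool read(void* buffer, std::size_t length) {
        // Compare against the remaining bytes so the check cannot overflow.
        if (offset > size || length > size - offset)
            return false;
        std::memcpy(buffer, data + offset, length);
        offset += length;
        return true;
    }
};
```

A lambda such as `[&](void* buffer, std::size_t length) { return reader.read(buffer, length); }` would have the same shape as the callback passed to load_from_stream above.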

if (!header.size) {
reset();
Author


We don't need to call reset() because we don't change anything after the above reset().

@kou kou mentioned this pull request Mar 4, 2025