-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rtree #1063
base: master
Are you sure you want to change the base?
Rtree #1063
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have left some initial meta comments for the Rtree.h and .cpp, so that you can start making this code more readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some additional suggestions for parts of the code.
And of course you can apply similar techniques to get rid of duplications on many more places.
I am now fairly sure, that this whole file can be brought down to at most 328 lines of code to have a precise target:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some further comments for cleaning this up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A rough review of the files directly related to the Rtree
.
There is still a lot of copy-paste-boilerplate-duplicate code that makes it hard to read, please take care of it.
#ifndef QLEVER_NODE_H | ||
#define QLEVER_NODE_H | ||
|
||
#include "./Rtree.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
Move all the Rtree-related stuff into a subdirectory
util/rtree
-
Make the includes absolute (for all files:
include "util/rtree/Rtree.h"
ConstructionNode::ConstructionNode(uint64_t id, OrderedBoxes orderedBoxes) | ||
: RtreeNode{id} { | ||
this->orderedBoxes_ = orderedBoxes; | ||
// calculate the boundingBoxes | ||
this->boundingBox_ = orderedBoxes.GetBoundingBox(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No this->
if there's no technical reason for it (you now use the trailing_) to disambiguate.
ConstructionNode::ConstructionNode(uint64_t id, OrderedBoxes orderedBoxes) | |
: RtreeNode{id} { | |
this->orderedBoxes_ = orderedBoxes; | |
// calculate the boundingBoxes | |
this->boundingBox_ = orderedBoxes.GetBoundingBox(); | |
ConstructionNode::ConstructionNode(uint64_t id, OrderedBoxes orderedBoxes) | |
: RtreeNode{id, orderedBoxes.GetBoundingBox()} , orderedBoxes_{std::move(orderedBoxes)}{} |
* Add all children of a certain node at once. | ||
* This is used when a leaf node is reached. | ||
*/ | ||
if (this->GetOrderedBoxes().WorkInRam()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never use this->
(everywhere)
if (this->GetOrderedBoxes().WorkInRam()) { | ||
for (RTreeValueWithOrderIndex box : | ||
this->GetOrderedBoxes().GetRectanglesInRam()) { | ||
RtreeNode leafNode = RtreeNode(box.id, box.box); | ||
this->AddChild(leafNode); | ||
} | ||
} else { | ||
for (const RTreeValueWithOrderIndex& element : | ||
FileReader(this->GetOrderedBoxes().GetRectanglesOnDisk())) { | ||
RtreeNode leafNode = RtreeNode(element.id, element.box); | ||
this->AddChild(leafNode); | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make the orderedBoxes_
a std::variant<blaInRam, blaOnDisk>
and then use std::visit
here (more typesafety, and less duplicated code).
OrderedBoxes& ConstructionNode::GetOrderedBoxes() { | ||
return this->orderedBoxes_; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can just be called orderedBoxes()
. And consider making the non-const getters private wherever possibly (unless you need them for some reason, I don't yet undderstand enough).
auto match : ctre::range< | ||
R"( *([\-|\+]?[0-9]+(?:[.][0-9]+)?) +([\-|\+]?[0-9]+(?:[.][0-9]+)?))">( | ||
wkt)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I somewhere have some ctre helpers that allow for concatenation of regexes, then you can get rid of the duplication. Also: does the wkt format allow for arbitrary whitespace, or does it have to be the space character?
[[nodiscard]] double MinX() const { return box.min_corner().get<0>(); } | ||
[[nodiscard]] double MaxX() const { return box.max_corner().get<0>(); } | ||
[[nodiscard]] double MinY() const { return box.min_corner().get<1>(); } | ||
[[nodiscard]] double MaxY() const { return box.max_corner().get<1>(); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, you ave this functionality, reuse it (the first MinX
function in this file.).
auto sizeLeft = | ||
(uint64_t)(std::ceil(((double)splitResult.bestIndex - 2.0) / 2.0) * | ||
(double)S); | ||
uint64_t sizeRight = this->size_ - sizeLeft; | ||
uint64_t split0ByteSize = | ||
sizeLeft * (4 * sizeof(double) + sizeof(uint64_t) + 2 * sizeof(uint64_t)); | ||
uint64_t split1ByteSize = sizeRight * (4 * sizeof(double) + sizeof(uint64_t) + | ||
2 * sizeof(uint64_t)); | ||
bool split0InRam = split0ByteSize * 4 < maxBuildingRamUsage; | ||
bool split1InRam = split1ByteSize * 4 < maxBuildingRamUsage; | ||
|
||
if (!split0InRam) { | ||
splitBuffers.rectsD0Split0.rectangles = filePath + ".0.dim0.tmp"; | ||
splitBuffers.rectsD1Split0.rectangles = filePath + ".0.dim1.tmp"; | ||
} | ||
|
||
if (!split1InRam) { | ||
splitBuffers.rectsD0Split1.rectangles = filePath + ".1.dim0.tmp"; | ||
splitBuffers.rectsD1Split1.rectangles = filePath + ".1.dim1.tmp"; | ||
} | ||
|
||
std::pair<BasicGeometry::BoundingBox, BasicGeometry::BoundingBox> | ||
boundingBoxes = | ||
PerformSplit(splitResult, splitBuffers, M, S, maxBuildingRamUsage); | ||
|
||
if (!split0InRam) { | ||
split0.SetOrderedBoxesToDisk(rectsD0Split0, rectsD1Split0, sizeLeft, | ||
boundingBoxes.first); | ||
} else { | ||
split0.SetOrderedBoxesToRam(rectsD0Split0, rectsD1Split0, | ||
boundingBoxes.first); | ||
} | ||
|
||
if (!split1InRam) { | ||
split1.SetOrderedBoxesToDisk(rectsD0Split1, rectsD1Split1, sizeRight, | ||
boundingBoxes.second); | ||
} else { | ||
split1.SetOrderedBoxesToRam(rectsD0Split1, rectsD1Split1, | ||
boundingBoxes.second); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of code duplication for split0
and split1
factor out the current code using one or multiple lambdas, and then just call.
if (splitResult.bestDim == 0) { | ||
pushSmallBoundaries(splitBuffers.rectsD0Split0.rectanglesSmall, | ||
splitBuffers.rectsD0Split1.rectanglesSmall); | ||
|
||
// placeholder, since we need the min and max element of the split in the | ||
otherDimension.smallSplit0 = &splitBuffers.rectsD1Split0.rectanglesSmall; | ||
otherDimension.smallSplit1 = &splitBuffers.rectsD1Split1.rectanglesSmall; | ||
// first two spots | ||
otherDimension.smallSplit0->emplace_back(); | ||
otherDimension.smallSplit0->emplace_back(); | ||
otherDimension.smallSplit1->emplace_back(); | ||
otherDimension.smallSplit1->emplace_back(); | ||
} else { | ||
pushSmallBoundaries(splitBuffers.rectsD1Split0.rectanglesSmall, | ||
splitBuffers.rectsD1Split1.rectanglesSmall); | ||
|
||
// placeholder | ||
otherDimension.smallSplit0 = &splitBuffers.rectsD0Split0.rectanglesSmall; | ||
otherDimension.smallSplit1 = &splitBuffers.rectsD0Split1.rectanglesSmall; | ||
|
||
otherDimension.smallSplit0->emplace_back(); | ||
otherDimension.smallSplit0->emplace_back(); | ||
otherDimension.smallSplit1->emplace_back(); | ||
otherDimension.smallSplit1->emplace_back(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same argument here about code duplication.
if (dim == 0) { | ||
currentSmallList = &splitBuffers.rectsD0Split0.rectanglesSmall; | ||
if (currentSplitInRam || workInRam) { | ||
currentList = &std::get<multiBoxWithOrderIndex>( | ||
splitBuffers.rectsD0Split0.rectangles); | ||
} else { | ||
currentList = &rectanglesOnDiskS0D0Stream.value(); | ||
} | ||
} else { | ||
currentSmallList = &splitBuffers.rectsD1Split0.rectanglesSmall; | ||
if (currentSplitInRam || workInRam) { | ||
currentList = &std::get<multiBoxWithOrderIndex>( | ||
splitBuffers.rectsD1Split0.rectangles); | ||
} else { | ||
currentList = &rectanglesOnDiskS0D1Stream.value(); | ||
} | ||
} | ||
} else { | ||
if (dim == 0) { | ||
currentSmallList = &splitBuffers.rectsD0Split1.rectanglesSmall; | ||
if (currentSplitInRam || workInRam) { | ||
currentList = &std::get<multiBoxWithOrderIndex>( | ||
splitBuffers.rectsD0Split1.rectangles); | ||
} else { | ||
currentList = &rectanglesOnDiskS1D0Stream.value(); | ||
} | ||
} else { | ||
currentSmallList = &splitBuffers.rectsD1Split1.rectanglesSmall; | ||
if (currentSplitInRam || workInRam) { | ||
currentList = &std::get<multiBoxWithOrderIndex>( | ||
splitBuffers.rectsD1Split1.rectangles); | ||
} else { | ||
currentList = &rectanglesOnDiskS1D1Stream.value(); | ||
} | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also a lot of code duplication for the dim0
and dim1
, please remove.
Bachelorprojekt - Noah Nock