You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
While developing a concurrency PM bug detecting tool, we have detected 2 bugs in APEX.
APEX's get operations execute in a lock-free manner, and as far as we are aware, no other mechanism exists to guarantee data retrieved by a get operations is explicitly persisted (via flush fence).
To aid the developers, we provide one of the stack traces for the PM accesses involved in this bug.
In this bug, the payload is written to PM, and is immediately visible to other threads, but not persisted.
Even though these operations are protected by a mutex, the Search operation is not, and a race arises from that scenario.
We recommend either protecting the Search operation with a lock, or ensuring data is persisted before (or simultaneously) being visible to other threads. This can be achieved via non-temporal stores for example. Although care should be taken to ensure atomicity in the case of insertion since the size of the KV pair is greater than 8 bytes.
Similarly to the first bug, but for the key, instead of the value in the KV store.
In this case, the side effects, are different, APEX may return a KV pair that has been inserted, however, after a crash, the pair is no longer present.
On the other hand, APEX may fail to retrieve an erased KV pair before a crash, while it is available after the crash.
This breaks crash-consistency.
While developing a concurrency PM bug detecting tool, we have detected 2 bugs in APEX.
APEX's get operations execute in a lock-free manner, and as far as we are aware, no other mechanism exists to guarantee data retrieved by a get operations is explicitly persisted (via flush fence).
To aid the developers, we provide one of the stack traces for the PM accesses involved in this bug.
using AlexDataNode = alex::AlexDataNode<unsigned long, unsigned long, alex::AlexCompare, my_alloc::allocator<std::pair<unsigned long, unsigned long> >, true>;
using Apex = alex::Apex<unsigned long, unsigned long, alex::AlexCompare, my_alloc::allocator<std::pair<unsigned long, unsigned long> >, true>;
First bug - Update/Insert value races with Search
In this bug, the payload is written to PM, and is immediately visible to other threads, but not persisted.
Even though these operations are protected by a mutex, the Search operation is not, and a race arises from that scenario.
We recommend either protecting the Search operation with a lock, or ensuring data is persisted before (or simultaneously) being visible to other threads. This can be achieved via non-temporal stores for example. Although care should be taken to ensure atomicity in the case of insertion since the size of the KV pair is greater than 8 bytes.
PM stores:
AlexDataNode (/root/apex/src/benchmark/../core/apex_nodes.h:3798)
Apex::update_unsafe (/root/apex/src/benchmark/../core/apex.h:2353)
Apex::update (/root/apex/src/benchmark/../core/apex.h:2342)
AlexDataNode::insert_element_at (/root/apex/src/benchmark/../core/apex_nodes.h:3479)
AlexDataNode::insert (/root/apex/src/benchmark/../core/apex_nodes.h:3262)
Apex::insert_unsafe (/root/apex/src/benchmark/../core/apex.h:1446)
Apex::insert (/root/apex/src/benchmark/../core/apex.h:1417)
PM loads:
AlexDataNode::linear_probing_search_exact (/root/apex/src/benchmark/../core/apex_nodes.h:2933)
AlexDataNode::find_payload (/root/apex/src/benchmark/../core/apex_nodes.h:2217)
Apex::search_unsafe (/root/apex/src/benchmark/../core/apex.h:1380)
Apex::search (/root/apex/src/benchmark/../core/apex.h:1370)
AlexDataNode::linear_probing_search_exact (/root/apex/src/benchmark/../core/apex_nodes.h:2915)
AlexDataNode::find_payload (/root/apex/src/benchmark/../core/apex_nodes.h:2217)
Apex::search_unsafe (/root/apex/src/benchmark/../core/apex.h:1380)
Apex::search (/root/apex/src/benchmark/../core/apex.h:1370)
Second bug - Insert/Erase key races with Search
Similarly to the first bug, but for the key, instead of the value in the KV store.
In this case, the side effects, are different, APEX may return a KV pair that has been inserted, however, after a crash, the pair is no longer present.
On the other hand, APEX may fail to retrieve an erased KV pair before a crash, while it is available after the crash.
This breaks crash-consistency.
PM stores:
AlexDataNode::insert_element_at (/root/apex/src/benchmark/../core/apex_nodes.h:3480)
AlexDataNode::insert (/root/apex/src/benchmark/../core/apex_nodes.h:3262)
Apex::insert_unsafe (/root/apex/src/benchmark/../core/apex.h:1446)
Apex::insert (/root/apex/src/benchmark/../core/apex.h:1417)
AlexDataNode::erase (/root/apex/src/benchmark/../core/apex_nodes.h:3606)
Apex::erase_unsafe (/root/apex/src/benchmark/../core/apex.h:2313)
Apex::erase (/root/apex/src/benchmark/../core/apex.h:2302)
PM load:
alex::AlexCompare::operator()<unsigned long, unsigned long> (/root/apex/src/benchmark/../core/apex_base.h:296)
AlexDataNode::linear_probing_search_exact (/root/apex/src/benchmark/../core/apex_nodes.h:962)
AlexDataNode::find_payload (/root/apex/src/benchmark/../core/apex_nodes.h:2217)
Apex::search_unsafe (/root/apex/src/benchmark/../core/apex.h:1380)
Apex::search (/root/apex/src/benchmark/../core/apex.h:1370)
The text was updated successfully, but these errors were encountered: