Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intention behind fsync_recursive #335

Open
liss-h opened this issue Oct 10, 2024 · 4 comments
Open

Intention behind fsync_recursive #335

liss-h opened this issue Oct 10, 2024 · 4 comments

Comments

@liss-h
Copy link
Contributor

liss-h commented Oct 10, 2024

Hi, there is this function fsync_recursive which walks up the directory tree until it reaches the root and fsyncs everything in between.

inline bool fsync_recursive(const fs::path &path) {
fs::path p(path);
p = fs::canonical(p);
while (true) {
if (!fsync(p.string())) {
return false;
}
if (p == p.root_path()) {
break;
}
p = p.parent_path();
}
return true;
}

This behaviour causes issues on SELinux systems when metall is used as part of a system service. The reason being that SELinux will not allow system services to read files that are not theirs.

If you assume that your data is in /var/local/service/data this causes issues as soon as metall tries to fsync /var/local as it will not have access to it with a properly configured policy (because there is no reason for it to have access there).

So I was wondering what the intention of that function is: Why does it sync all the way towards the root directory? Would it be enough to only fsync up to the datastore root?

Example strace:

open("/var/local/service/data", O_RDONLY|O_LARGEFILE) = 4
 > /usr/local/bin/service(sccp+0x19) [0x147e45f]
 > /usr/local/bin/service(open+0x5e) [0x1469c36]
 > /usr/local/bin/service(metall::mtlldetail::fsync_recursive(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x115) [0x14f9035]
 > /usr/local/bin/service(metall::kernel::mmap_segment_storage<long, unsigned long>::priv_test_file_space_free(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x84) [0x12f6084]
 > /usr/local/bin/service(metall::kernel::mmap_segment_storage<long, unsigned long>::priv_open(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long, void*, bool)+0x5ae) [0x12fcb0e]
 > /usr/local/bin/service(metall::kernel::manager_kernel<unsigned int, 2097152ul>::priv_open(char const*, bool, unsigned long)+0x633) [0x12fba13]
 > /usr/local/bin/service(metall::basic_manager<unsigned int, 2097152ul>::basic_manager(metall::open_only_t, char const*)+0x5e) [0x12fb37e]
 > /usr/local/bin/service(metall_manager* open_impl<metall::open_only_t{}>(char const*)+0x27) [0x12e7057]
fsync(4)                                = 0
 > /usr/local/bin/service(sccp+0x19) [0x147e45f]
 > /usr/local/bin/service(fsync+0x1c) [0x1482e7c]
 > /usr/local/bin/service(metall::mtlldetail::fsync_recursive(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x123) [0x14f9043]
 > /usr/local/bin/service(metall::kernel::mmap_segment_storage<long, unsigned long>::priv_test_file_space_free(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x84) [0x12f6084]
 > /usr/local/bin/service(metall::kernel::mmap_segment_storage<long, unsigned long>::priv_open(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long, void*, bool)+0x5ae) [0x12fcb0e]
 > /usr/local/bin/service(metall::kernel::manager_kernel<unsigned int, 2097152ul>::priv_open(char const*, bool, unsigned long)+0x633) [0x12fba13]
 > /usr/local/bin/service(metall::basic_manager<unsigned int, 2097152ul>::basic_manager(metall::open_only_t, char const*)+0x5e) [0x12fb37e]
 > /usr/local/bin/service(metall_manager* open_impl<metall::open_only_t{}>(char const*)+0x27) [0x12e7057]
--
open("/var/local/service", O_RDONLY|O_LARGEFILE) = 4
 > /usr/local/bin/service(sccp+0x19) [0x147e45f]
 > /usr/local/bin/service(open+0x5e) [0x1469c36]
 > /usr/local/bin/service(metall::mtlldetail::fsync_recursive(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x115) [0x14f9035]
 > /usr/local/bin/service(metall::kernel::mmap_segment_storage<long, unsigned long>::priv_test_file_space_free(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x84) [0x12f6084]
 > /usr/local/bin/service(metall::kernel::mmap_segment_storage<long, unsigned long>::priv_open(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long, void*, bool)+0x5ae) [0x12fcb0e]
 > /usr/local/bin/service(metall::kernel::manager_kernel<unsigned int, 2097152ul>::priv_open(char const*, bool, unsigned long)+0x633) [0x12fba13]
 > /usr/local/bin/service(metall::basic_manager<unsigned int, 2097152ul>::basic_manager(metall::open_only_t, char const*)+0x5e) [0x12fb37e]
 > /usr/local/bin/service(metall_manager* open_impl<metall::open_only_t{}>(char const*)+0x27) [0x12e7057]
fsync(4)                                = 0
 > /usr/local/bin/service(sccp+0x19) [0x147e45f]
 > /usr/local/bin/service(fsync+0x1c) [0x1482e7c]
 > /usr/local/bin/service(metall::mtlldetail::fsync_recursive(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x123) [0x14f9043]
 > /usr/local/bin/service(metall::kernel::mmap_segment_storage<long, unsigned long>::priv_test_file_space_free(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x84) [0x12f6084]
 > /usr/local/bin/service(metall::kernel::mmap_segment_storage<long, unsigned long>::priv_open(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long, void*, bool)+0x5ae) [0x12fcb0e]
 > /usr/local/bin/service(metall::kernel::manager_kernel<unsigned int, 2097152ul>::priv_open(char const*, bool, unsigned long)+0x633) [0x12fba13]
 > /usr/local/bin/service(metall::basic_manager<unsigned int, 2097152ul>::basic_manager(metall::open_only_t, char const*)+0x5e) [0x12fb37e]
 > /usr/local/bin/service(metall_manager* open_impl<metall::open_only_t{}>(char const*)+0x27) [0x12e7057]
--
open("/var/local", O_RDONLY|O_LARGEFILE) = 4
 > /usr/local/bin/service(sccp+0x19) [0x147e45f]
 > /usr/local/bin/service(open+0x5e) [0x1469c36]
 > /usr/local/bin/service(metall::mtlldetail::fsync_recursive(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x115) [0x14f9035]
 > /usr/local/bin/service(metall::kernel::mmap_segment_storage<long, unsigned long>::priv_test_file_space_free(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x84) [0x12f6084]
 > /usr/local/bin/service(metall::kernel::mmap_segment_storage<long, unsigned long>::priv_open(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long, void*, bool)+0x5ae) [0x12fcb0e]
 > /usr/local/bin/service(metall::kernel::manager_kernel<unsigned int, 2097152ul>::priv_open(char const*, bool, unsigned long)+0x633) [0x12fba13]
 > /usr/local/bin/service(metall::basic_manager<unsigned int, 2097152ul>::basic_manager(metall::open_only_t, char const*)+0x5e) [0x12fb37e]
 > /usr/local/bin/service(metall_manager* open_impl<metall::open_only_t{}>(char const*)+0x27) [0x12e7057]
fsync(4)                                = 0
 > /usr/local/bin/service(sccp+0x19) [0x147e45f]
 > /usr/local/bin/service(fsync+0x1c) [0x1482e7c]
 > /usr/local/bin/service(metall::mtlldetail::fsync_recursive(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x123) [0x14f9043]
 > /usr/local/bin/service(metall::kernel::mmap_segment_storage<long, unsigned long>::priv_test_file_space_free(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x84) [0x12f6084]
 > /usr/local/bin/service(metall::kernel::mmap_segment_storage<long, unsigned long>::priv_open(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long, void*, bool)+0x5ae) [0x12fcb0e]
 > /usr/local/bin/service(metall::kernel::manager_kernel<unsigned int, 2097152ul>::priv_open(char const*, bool, unsigned long)+0x633) [0x12fba13]
 > /usr/local/bin/service(metall::basic_manager<unsigned int, 2097152ul>::basic_manager(metall::open_only_t, char const*)+0x5e) [0x12fb37e]
 > /usr/local/bin/service(metall_manager* open_impl<metall::open_only_t{}>(char const*)+0x27) [0x12e7057]
--
open("/var", O_RDONLY|O_LARGEFILE)      = 4
 > /usr/local/bin/service(sccp+0x19) [0x147e45f]
 > /usr/local/bin/service(open+0x5e) [0x1469c36]
 > /usr/local/bin/service(metall::mtlldetail::fsync_recursive(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x115) [0x14f9035]
 > /usr/local/bin/service(metall::kernel::mmap_segment_storage<long, unsigned long>::priv_test_file_space_free(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x84) [0x12f6084]
 > /usr/local/bin/service(metall::kernel::mmap_segment_storage<long, unsigned long>::priv_open(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long, void*, bool)+0x5ae) [0x12fcb0e]
 > /usr/local/bin/service(metall::kernel::manager_kernel<unsigned int, 2097152ul>::priv_open(char const*, bool, unsigned long)+0x633) [0x12fba13]
 > /usr/local/bin/service(metall::basic_manager<unsigned int, 2097152ul>::basic_manager(metall::open_only_t, char const*)+0x5e) [0x12fb37e]
 > /usr/local/bin/service(metall_manager* open_impl<metall::open_only_t{}>(char const*)+0x27) [0x12e7057]
fsync(4)                                = 0
 > /usr/local/bin/service(sccp+0x19) [0x147e45f]
 > /usr/local/bin/service(fsync+0x1c) [0x1482e7c]
 > /usr/local/bin/service(metall::mtlldetail::fsync_recursive(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x123) [0x14f9043]
 > /usr/local/bin/service(metall::kernel::mmap_segment_storage<long, unsigned long>::priv_test_file_space_free(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x84) [0x12f6084]
 > /usr/local/bin/service(metall::kernel::mmap_segment_storage<long, unsigned long>::priv_open(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long, void*, bool)+0x5ae) [0x12fcb0e]
 > /usr/local/bin/service(metall::kernel::manager_kernel<unsigned int, 2097152ul>::priv_open(char const*, bool, unsigned long)+0x633) [0x12fba13]
 > /usr/local/bin/service(metall::basic_manager<unsigned int, 2097152ul>::basic_manager(metall::open_only_t, char const*)+0x5e) [0x12fb37e]
 > /usr/local/bin/service(metall_manager* open_impl<metall::open_only_t{}>(char const*)+0x27) [0x12e7057]

@liss-h
Copy link
Contributor Author

liss-h commented Oct 10, 2024

Somewhat related: std::filesystem::canonical does a readlink on every part of the path /var, /var/local, ... (thats news to me but apparently it does that)

This will also get denied be SELinux

@KIwabuchi
Copy link
Member

So I was wondering what the intention of that function is: Why does it sync all the way towards the root directory? Would it be enough to only fsync up to the datastore root?

If I remember correctly, I found the idea in another file system-related library that was implemented by a filesystem expert.
It is just for making sure to flush the changes to storage. But it is a very cautious strategy.

I think it is okay not to call fsync in parent directories.

Somewhat related: std::filesystem::canonical does a readlink on every part of the path /var, /var/local, ... (thats news to me but apparently it does that)

It sounds like we shouldn't use the fsync_recursive function on SELinux. Let's turn it off on the environment. Do you know how to detect SELinux in C++ code? If it is not easy, we could turn off the feature in any environment.

@liss-h
Copy link
Contributor Author

liss-h commented Oct 11, 2024

It sounds like we shouldn't use the fsync_recursive function on SELinux. Let's turn it off on the environment. Do you know how to detect SELinux in C++ code? If it is not easy, we could turn off the feature in any environment.

I don't think its possible, at least not in a robust way. Do you think it would be enough to fsync up to the datastore base directory?

@KIwabuchi
Copy link
Member

I did some investigation. Calling fsync all up to the root may be overkill. I changed the code to fsync only a parent directory.
The related PR has been merged to the develop: #183.
Let me know if you see any issues still or if the PR introduces a new one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants