kernel: smp: CPU start may result in null pointer access #68388

Merged

dcpleung
Member

Starting up a CPU has been observed to crash other CPUs by dereferencing NULL pointers. This happened on the up_squared board, where there was no way to attach a debugger to verify the cause. It started working again after moving z_dummy_thread_init() before smp_timer_init(), which was the behavior before commit eefaeee, where the issue first appeared. So mimic the old behavior to work around the issue.

Fixes #68115

It is observed that starting up CPU may result in other CPUs
crashing due to de-referencing NULL pointers. Note that this
happened on the up_squared board, but there was no way to
attach debugger to verify. It started working again after
moving z_dummy_thread_init() before smp_timer_init(), which
was the old behavior before commit
eefaeee where the issue
first appeared. So mimic the old behavior to workaround
the issue.

Fixes zephyrproject-rtos#68115

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
@dcpleung dcpleung marked this pull request as ready for review February 1, 2024 00:26
Contributor

@andyross andyross left a comment

No complaints about the change per se, but the symptom is a little scary. Initializing the dummy thread (which is on the stack) later (right before the swap that will use it) is the proximate cause of the bug?!

@dcpleung
Member Author

dcpleung commented Feb 1, 2024

> No complaints about the change per se, but the symptom is a little scary. Initializing the dummy thread (which is on the stack) later (right before the swap that will use it) is the proximate cause of the bug?!

My guess is that this is due to the current thread pointer being NULL at boot; it is only set at the end of z_dummy_thread_init(). But without the ability to attach a debugger to the target, I cannot confirm whether this is really the case.
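
The hypothesis above can be illustrated with a small stand-alone model. This is only a sketch of the suspected ordering hazard, not the Zephyr code: `struct cpu`, `struct thread`, `dummy_thread_init()`, `timer_init()` and `start_cpu()` are hypothetical stand-ins, and the assumption that the timer init step ends up consulting the current thread pointer is exactly the part that could not be confirmed on hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT Zephyr's real types. */
struct thread {
	int id;
};

struct cpu {
	struct thread *current; /* NULL at boot in this model */
};

/* Models the last step of dummy thread init: publish it as "current". */
static void dummy_thread_init(struct cpu *cpu, struct thread *dummy)
{
	assert(cpu != NULL && dummy != NULL);
	dummy->id = 0;
	cpu->current = dummy;
}

/*
 * Models an init step that (by assumption) consults the current thread.
 * Returns false at the point where real code would dereference NULL.
 */
static bool timer_init(const struct cpu *cpu)
{
	if (cpu->current == NULL) {
		return false;
	}
	(void)cpu->current->id; /* safe: current is set */
	return true;
}

/* Runs the two init steps in either order and reports whether it was safe. */
bool start_cpu(bool dummy_first)
{
	struct cpu cpu = { .current = NULL };
	struct thread dummy;
	bool ok;

	if (dummy_first) {
		/* Ordering restored by this change. */
		dummy_thread_init(&cpu, &dummy);
		ok = timer_init(&cpu);
	} else {
		/* Ordering suspected of triggering the crash. */
		ok = timer_init(&cpu);
		dummy_thread_init(&cpu, &dummy);
	}
	return ok;
}
```

In this model `start_cpu(true)` reports a safe start, while `start_cpu(false)` hits the NULL current pointer before the dummy thread has been published. Whether the real failure follows this path remains unverified.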

@dleach02 dleach02 merged commit caacc27 into zephyrproject-rtos:main Feb 5, 2024
27 checks passed
@dcpleung dcpleung deleted the kernel/workaround_smp_start_issue branch February 5, 2024 17:39
Development

Successfully merging this pull request may close these issues.

up_squared: tests/kernel and tests/lib/c_lib/thrd - many tests fail
7 participants