HelixTaskExecutor threads are not named properly after re-initing #2991

Open
klsince opened this issue Jan 10, 2025 · 0 comments
Labels
bug Something isn't working

Comments


klsince commented Jan 10, 2025

Describe the bug

Version: 1.3.1

Normally, the HelixTaskExecutor thread pool names its threads using the following pattern, with a thread id suffix. This makes it easy to tell the threads in the pool apart when debugging.

HelixTaskExecutor-message_handle_thread_31

Recently, we found that these threads were named as shown below after Helix sessions reconnected, for example due to long GC pauses. Because all threads in the pool then share the same name, debugging becomes much harder.

HelixTaskExecutor-message_handle_STATE_TRANSITION

To Reproduce

This should be reproducible by causing a session loss and reconnect, in particular so that the init() method below runs. We saw this INFO log right before the threads got renamed.

// HelixTaskExecutor.java
...
  void init() {
    LOG.info("Init HelixTaskExecutor");
...

Expected behavior

Threads should continue to follow this naming pattern after the HelixTaskExecutor is re-initialized:

HelixTaskExecutor-message_handle_thread_31

Additional context

We may consider refining the logic that re-inits the pool

  void init() {
    LOG.info("Init HelixTaskExecutor");
    ...
      ExecutorService newPool = Executors.newFixedThreadPool(item.threadPoolSize(),
          r -> new Thread(r, "HelixTaskExecutor-message_handle_" + type)); // <-- every thread in the pool gets this same name
    ...
  }

by continuing to use the naming pattern below. I also think we don't even need to reset the thread_uid counter, so it'll be easy to tell when a new thread pool has been created.

"HelixTaskExecutor-message_handle_thread_" + thread_uid.getAndIncrement()));
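
To illustrate the idea, here is a minimal standalone sketch, not the actual Helix code. The class name NamedPoolSketch, the THREAD_UID field, and the two helper methods are hypothetical names made up for this example; only the thread name prefix comes from the pattern above. It assumes a single counter shared by every pool and never reset, so re-creating a pool keeps the names unique.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class NamedPoolSketch {
  // Hypothetical counter, shared by all pools and intentionally never reset on re-init.
  private static final AtomicInteger THREAD_UID = new AtomicInteger(0);

  // Builds a factory that gives each thread a unique numeric suffix.
  static ThreadFactory messageHandleThreadFactory() {
    return r -> new Thread(r,
        "HelixTaskExecutor-message_handle_thread_" + THREAD_UID.getAndIncrement());
  }

  // Stand-in for the pool creation done during init (hypothetical helper).
  static ExecutorService newMessageHandlePool(int threadPoolSize) {
    return Executors.newFixedThreadPool(threadPoolSize, messageHandleThreadFactory());
  }

  public static void main(String[] args) throws Exception {
    Callable<String> whoAmI = () -> Thread.currentThread().getName();

    // First "init": the worker thread gets a name from the shared counter.
    ExecutorService first = newMessageHandlePool(2);
    System.out.println(first.submit(whoAmI).get());
    first.shutdown();

    // Simulated re-init: a new pool keeps counting instead of reusing names.
    ExecutorService second = newMessageHandlePool(2);
    System.out.println(second.submit(whoAmI).get());
    second.shutdown();
  }
}
```

With this approach, the suffix on a thread created after a re-init is higher than any suffix handed out before, which also gives a hint in thread dumps that a new pool was created.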

Thanks!

@klsince klsince added the bug Something isn't working label Jan 10, 2025