-
Notifications
You must be signed in to change notification settings - Fork 643
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix leaked zombie processes when closing surfaces #4605
Conversation
I checked out your branch on my Ubuntu 24.04 and built it ( Log output after pressing
|
Yikes! That’s no good. I think I know what the issue is though. Thanks for checking it out! I’ll try to get a Linux VM set up soon so that I can fix it correctly for both platforms. |
Okay, thanks to @slovdahl's feedback and a fresh look at the code, I realized there's a much simpler fix available, which I've just pushed. I think there's a higher than average risk of future bugs here with any slight changes in this area, so I still think it may be worth a rethink of much of subprocess management code. That said, this PR should fix at least some (possibly all) zombie leak cases without introducing any risk of us getting stuck in a loop and hanging. Note that I still haven't had a chance to test on Linux, so someone should definitely do that before we merge. |
I'm afraid the ghostty process still hangs when pressing |
Hmm, that's surprising. Is it possible that you're still building the previous version of my branch? Note that I did a force-push in order to update this PR, so you may need to do a hard reset or similar (a simple pull likely won't do it). |
Doing a first pull of your repo/branch and build now, let's see. |
When exiting the shell with ctrl+d or
Tried again with gtk-single-instance forced to false, but still the same. And with closing the window through Gnome / window manager:
During the gtk2 era when I used glib, it was also tricky to get threads to nicely quit. Resulted in 'the thread must always exit itself', you can't force-abort it from the main thread, or it will stay behind. But, as I said, that was ages ago, and I hope that modern Zig doesn't have the same issues as an ageing glib version :). |
Yeah, I deleted the local branch before fetching your branch again. Re-did it once more and verified I have d58b435 locally as my HEAD. Noticed something interesting now though. This is But, if I press |
Never tested more tabs, so I don't know the 'original' 1.0.0 / 1.0.1 behaviour. But this branch indeed 'hangs' / stalls when opening multiple tabs and closing one with ctrl+d. Funnily, if I 'split right' for example, it seems to close OK with ctrl+d. I don't know if it's related. But if I have an alacritty terminal running with If I then close the ghostty window with ctrl+d, the waits keep showing. If I type |
Likely unrelated: #3224 |
FWIW, windows close normally with |
Yeah, for me as well. The problem was never with windows not closing. The problem is with the ghostty process not exiting when the shell decides to quit. |
Built myself a debug build. All cases with only a single tab. On
Pressing the X button in the top right corner to close the window:
This PR when pressing
This PR when pressing the X button in the top right corner:
|
Thanks for all the info folks. Clearly I need to get a Linux test setup before proceeding much further. This all works well on MacOS (at least for me). I have some theories about what might be causing the stuck process, but don't want to keep guessing when I can't test it out locally. |
I tried to debut the GTK side of this a little bit. First time I'm touching both Zig and GTK, so bear with me.. 😁 I'm not sure if it's the problem, but one thing that I find a bit strange is that cc'ing @jcollie as the original author of the |
Small progress here: managed to get a linux environment stood up today, and can reproduce the behavior exactly as described above. Ran out of time to debug but should be able to get to that soon. |
787962f
to
37f8741
Compare
I need to boot up my Windows machine to test these changes, since something is obviously amiss there. Will be able to do that tomorrow I hope. |
Can I try on Linux already? Or do you have something like 'just wait and sit till I say it is ready' :). |
No harm in trying. I don't think it fixes anything there, but I never dug deep in to the linux specific issues so it's possible that it does. |
Issue description can be found at #4554, which this fixes.
I'm not thrilled about this solution, but it's the pragmatic option. I reached my wits' end trying to figure out why things weren't quite the same on linux, and we'd need to revisit that if we want to solve this in a different way.
Some notes:
waitpid
on subprocesses that exit independently of ghostty (i.e. they terminate, rather than us terminating them by e.g. closing a tab)std.posix.waitpid
panics ifwaitpid
returnedECHILD
, which happens if the process has already been reaped when we callwaitpid
waitpid
, and so adding in a call towaitpid
caused the IO thread to panic (this is what @slovdahl was seeing below with an older version of my fix, I believe)Details of the fix:
waitpid
, but to avoid panicking on linux, I switched us over to usingstd.c.waitpid
instead ofstd.posix.waitpid
Command.wait
method, and discovered we could simplify it a bitwaitpid
that had been introducedThe only other option (other than getting to the bottom of the linux reaping mystery, which may still be worth doing), is to introduce a comptime check around the additional call to
waitpid
so that it happens on MacOS only. Would be happy to switch to that approach if it's preferred.I definitely would understand if you don't want to merge this as is and would prefer to really get to the bottom of what's happening on linux. I just don't have more time to dedicate to that right now. I could look more next week, but would probably benefit from a second pair of eyes to ensure I'm not missing something obvious.