-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Event queue loops infinitely with warnings in Revolt\EventLoop\TimerQueue #104
Comments
Any specific changes that you made to your application that triggered this? |
Unfortunately nothing that can be easily pinpointed, we're developing the solution quite organically and we've now noticed this issue after stabilizing the solution. We run 25 processes in production with at least 10 of them getting fairly heavy "traffic" (redis events being listened to, I/O is AmPHP RedisClient, fully-asynchronous pgsql, AmPHP HttpServer and HttpClient for aggregating external APIs). And it's still a chance event that we observe one of them hanging like that. We're testing switching to the pecl/ev driver and if we succeed we won't see this show up in production anymore (as far as I can see TimerQueue is used exclusively in StreamSelectDriver). I will try my best to come up with a repeatable stress test directly for the TimerQueue, in isolation. I think this is my best chance of catching the issue. |
I've run as heavy a stress test of TimerQueue as I could think of , it's not failing under any circumstances. Assuming the same TimerCallback doesn't get added twice in the queue, I found no way of triggering the same bug. I haven't tested what happens if the assertion in the insert() method fails (in production we keep zend.assertions disabled). I cannot easily discern a codepath in AbstractDriver + StreamSelectDriver that would make that assertion fail, hence why I haven't tested such a case. In case anybody wants to try, or enhance, the stress-test I've written, I'm providing it here. Feel free to close this issue. use Revolt\EventLoop\Internal\TimerQueue;
use Revolt\EventLoop\Internal\TimerCallback;
$queue = new TimerQueue();
// Global error handler to catch and display warnings
set_error_handler(function ($errno, $errstr, $errfile, $errline) {
global $queue;
echo "ERROR: $errstr in $errfile on line $errline\n";
echo "Stack trace:\n";
debug_print_backtrace();
echo "\n-----\n\n";
var_dump($queue);
// Halt execution on first sign of trouble
die;
});
$extracted = 0;
$removed = 0;
$added = 0;
$lifetimeAdded = 0;
$nextId = 'a';
$callbackFn = fn() => null;
$iterations = 0;
$lastReport = microtime(true);
while (true) {
$iterations++;
// in a loop, insert a bunch of callbacks with expirations in the very near future (0-500ms) or past (0-100ms).
// Out of the bunch, select a few randomly (20% chance) to remove using $queue->remove($callback)
$toRemove = [];
for ($i = random_int(2, 100); $i > 0; $i--) {
$callback = new TimerCallback($nextId++, 0, $callbackFn, microtime(true) + (random_int(-100, 500) / 1000));
// 20% chance to remove the callback
if (random_int(1, 5) === 1) {
$toRemove[] = $callback;
}
$queue->insert($callback);
$added++;
}
foreach ($toRemove as $callback) {
$queue->remove($callback);
$removed++;
}
// // sleep 2ms
// usleep(2000);
// pop all expired callbacks off the queue
while ($callback = $queue->extract(microtime(true))) $extracted++;;
// every 5 seconds, report status
if (microtime(true) - $lastReport > 5) {
$lastReport = microtime(true);
$lifetimeAdded += $added;
echo "Added $added callbacks (total: $lifetimeAdded)\n";
echo "Extracted $extracted callbacks\n";
echo "Removed $removed callbacks\n";
echo "Current iteration: $iterations, nextID: $nextId\n";
// echo $queue->dump_contents();
echo "---------------------------------\n\n";
$added = 0;
$removed = 0;
$extracted = 0;
}
} |
Thank you for the additional details! If you can't run with assertions on production, maybe you can modify the code to run these assertions as normal ifs instead of them being assertions? |
We've switched to the pecl/ev driver on production (which naturally eliminated the problem), I will be pushing a code change to make the driver and zend.assertions configurable, and will switch all our staging environments to use streamselect driver with assertions. The amount of traffic run through staging is significantly lower, so I expect it will take a long time until we observe the bug again. I'll keep you guys posted. |
This is very difficult to reproduce, and I am still actively trying to create a testcase with TimerQueue used in isolation that reproduces the underlying issue otuside the event loop itself. Until then, maybe this is going to be obvious to somebody more knowledgeable of the TimerQueue implementation.
After hours (sometimes days) of running the event loop, there is a chance that the event loop blocks and throws an apparently endless stream of warnings of the following nature:
Warning: Undefined array key 321 in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 108
Warning: Attempt to read property "expiration" on null in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 108
Warning: Undefined array key 321 in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 109
Warning: Undefined array key 322 in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 109
Warning: Undefined array key 322 in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 109
Warning: Attempt to read property "expiration" on null in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 109
Warning: Undefined array key 322 in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 113
Warning: Attempt to read property "expiration" on null in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 113
Warning: Undefined array key 322 in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 129
Warning: Undefined array key 322 in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 130
Warning: Attempt to read property "id" on null in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 130
Warning: Undefined array key 306 in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 143
Warning: Attempt to read property "id" on null in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 144
Warning: Attempt to read property "expiration" on null in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 108
Warning: Attempt to read property "expiration" on null in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 113
Warning: Attempt to read property "expiration" on null in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 68Warning: Attempt to read property "id" on null in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 142
Warning: Undefined array key 306 in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 143
Warning: Attempt to read property "id" on null in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 144
Warning: Attempt to read property "expiration" on null in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 108
Warning: Attempt to read property "expiration" on null in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 113
Warning: Undefined array key 306 in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 143
Warning: Attempt to read property "id" on null in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 144
Warning: Attempt to read property "expiration" on null in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 149
Warning: Attempt to read property "expiration" on null in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 95Warning: Attempt to read property "id" on null in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 133
Warning: Attempt to read property "expiration" on null in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 95
Warning: Attempt to read property "id" on null in /app/vendor/revolt/event-loop/src/EventLoop/Internal/TimerQueue.php on line 133
And then a seemingly endless stream of almost the same, complaining at infinite about array key 306 on line 143, then a bunch of read property on null ('expiration', 'id') on lines 144, 149, 95, 133.
As soon as I, and if, can reliably reproduce this outside of a production environment, I will update. I am currently untangling the code to the best of my abilities.
This looks like a pure code problem, but if it matters:
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
PHP 8.2.27 (cli) (built: Feb 25 2025 02:25:39) (NTS)
Using Revolt/event-loop 1.0.7 (was same on 1.0.6) as part of amphp/amp 3.0.2.
The text was updated successfully, but these errors were encountered: