
MySQL connection exceptions #2038

Open
adrianbj opened this issue Feb 7, 2025 · 1 comment
Comments

adrianbj commented Feb 7, 2025

Hi @ryancramerdesign - I can't find it at the moment, but I know some time ago there was a discussion about PW retrying connections automatically, which led you to add this:

https://github.com/processwire/processwire/blob/f22739a54c8fd8d1ce3e41a9b95e37129f6376b1/wire/core/DatabaseQuery.php#L732-L743

The problem is that I don't think it covers enough scenarios. I recently had a failure with a Digital Ocean managed database connection, even with a standby node in place. Apparently "There was a failover event that involved the underlying node switching to the standby, which may have been triggered by either a degraded primary node or a missed heartbeat signal from the primary."

The error was "PDOException SQLSTATE[08S01]: Communication link failure: 1053 Server shutdown in progress" on $query->execute();

I am not sure exactly what would be involved in handling this, but would it be as simple as removing these checks?

$retry = $code === 2006 || stripos($msg, 'MySQL server has gone away') !== false;
if($retry && $numTries < $options['maxTries']) {
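
For illustration, something along these lines is roughly what I have in mind (a sketch only; the list of "transient" error codes is just my guess, not anything PW currently defines, and it reuses the same $code / $msg / $options / $numTries variables as the snippet above):

$transientCodes = array(
    1053, // Server shutdown in progress (the error from the DO failover)
    2002, // Can't connect to MySQL server
    2006, // MySQL server has gone away
    2013, // Lost connection to MySQL server during query
);
$retry = in_array((int) $code, $transientCodes, true)
    || stripos($msg, 'MySQL server has gone away') !== false
    || stripos($msg, 'Communication link failure') !== false;
if($retry && $numTries < $options['maxTries']) {
    // reconnect and re-run the query, as the existing code already does
}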

Also, is there really a good reason for maxTries to be set at all, let alone to something as low as 3? My worry is that if a database connection stops responding, it could be a substantial period of time before it's available again. In my case the failure hit background tasks, and most of them failed because they were inside a large loop.

I would love it if this could be made more robust so it handles other transient exception types, not just the classic 2006 "gone away" error.

adrianbj (Author) commented:

@ryancramerdesign - a little more info from Digital Ocean.

They promoted my standby node to the new master and spun up a new standby node. They didn't provide an exact time that this took, but apparently it should not have taken more than 8 minutes.

But, they also noted "Note that our platform would trigger a node replacement when there's 180 seconds period of unavailability, so you should account for at least that amount of time when you're setting your retry and timeout values."

So I am not sure exactly what the potential downtime could be, but I do think that PW's retry needs to be more robust. Either that, or I need to add a custom try/catch that doesn't give up after 3 tries and continues the foreach loop once a connection is available again.
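
If it ends up being something I handle on my side, I'm picturing a wrapper roughly like this (a sketch only; the executeWithRetry name, the ~4 minute budget and the message matching are my own assumptions, sized to cover at least the 180-second window they mention, and $items just stands in for whatever the background task loops over):

// Caller-side retry wrapper (not part of the PW core). Keeps retrying
// transient connection failures for up to $maxWait seconds with a capped
// exponential backoff, then rethrows so a genuine outage still surfaces.
function executeWithRetry(callable $task, int $maxWait = 240) {
    $start = time();
    $delay = 2; // seconds between attempts, doubled up to a cap
    while(true) {
        try {
            return $task();
        } catch(\PDOException $e) {
            $msg = $e->getMessage();
            $transient = stripos($msg, 'Communication link failure') !== false
                || stripos($msg, 'server has gone away') !== false
                || stripos($msg, 'Server shutdown in progress') !== false;
            // give up if it doesn't look transient, or we've waited long enough
            if(!$transient || (time() - $start) > $maxWait) throw $e;
            sleep($delay);
            $delay = min($delay * 2, 30);
        }
    }
}

// inside the background task:
foreach($items as $item) {
    executeWithRetry(function() use ($item) {
        // ... run the query for this $item ...
    });
}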

I would love your thoughts on whether you'd be willing to improve this in the core, or whether you think it's something I need to handle myself.

Thanks.
