saptune called too often by agent #363

scmschmidt · 2024-02-02T13:12:37Z

While writing tests for the new saptune checks, I noticed that the default for saptune-discovery-period is 10s, which causes the following problems:

1. A saptune status run can take several seconds, mainly because of the verify run, which can be very time-consuming. Traversing the block devices on large machines has proven to be especially slow. This puts a constant load on the machine, which is not desirable.
2. Currently saptune prevents multiple instances from running in parallel with a simple lock. This means a saptune status blocks other saptune calls. This can not only make working with saptune nearly impossible, but also causes checks to fail if they run during a saptune status triggered by the discovery process. On large machines a discovery could even lock out its own next cycle.

I think it's best to increase saptune-discovery-period to a much higher value like 1800 or 36000.
This solves problem 1, but only mitigates problem 2. A check run can still fail if its saptune call runs into a lock held by discovery.
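To make the load argument concrete, here is a rough back-of-the-envelope sketch. The 5-second runtime for a saptune status is an assumed figure for illustration only (the post just says "several seconds"), and the calculation assumes the period is measured from the start of one run to the start of the next. The function name `busy_fraction` is hypothetical.

```python
# Rough estimate: share of wall-clock time a periodic discovery keeps saptune busy
# (and, with the current locking, blocks other saptune calls).
# ASSUMPTION: a runtime of 5 seconds is illustrative; it is not a measured value.


def busy_fraction(status_runtime_s: float, period_s: float) -> float:
    """Fraction of each period spent inside the discovery run, capped at 1.0."""
    if period_s <= 0:
        raise ValueError("period_s must be positive")
    return min(status_runtime_s / period_s, 1.0)


if __name__ == "__main__":
    # Compare the current default (10s) with the proposed 1800s.
    for period in (10, 1800):
        print(f"period={period}s busy={busy_fraction(5, period):.2%}")
```

Under these assumptions a 10s period keeps saptune busy about half of the time, while an 1800s period reduces that to well under one percent. A run that takes longer than the period caps at 1.0, which corresponds to a discovery locking out its own next cycle.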

@angelabriel is working on a patch for saptune (3.1.2) that limits locking to commands which have the potential to change the system. (We know that locking only the code paths that actually change something, and only while they change it, would be best, but that would require a large code rework which we currently cannot afford.)
A discovery saptune status, as well as the commands currently used in the gatherer (status, solution|note list|verify), would no longer set a lock and would not interfere with each other.
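One way to picture the rule described above is a small classifier that decides whether a command needs the lock. This is only an illustration and not saptune's actual code: the read-only set contains just the commands named in this post, and the names `READ_ONLY` and `needs_lock` are hypothetical. Anything not listed is conservatively treated as potentially changing the system.

```python
# Illustrative only: NOT saptune's implementation.
# Read-only command prefixes, taken from the commands named in this post.
READ_ONLY = {
    ("status",),
    ("solution", "list"),
    ("solution", "verify"),
    ("note", "list"),
    ("note", "verify"),
}


def needs_lock(args: list[str]) -> bool:
    """Return True unless args starts with a known read-only command prefix."""
    return not any(tuple(args[: len(prefix)]) == prefix for prefix in READ_ONLY)
```

With this sketch, `needs_lock(["status"])` is false, while any command outside the list still takes the lock.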

The risk remains that a saptune command executed by the user sets a lock which interferes with discovery and the checks. With some exceptions, those commands are short-lived, so maybe the gatherer could be enhanced so that, if its saptune call runs into a lock, it tries again after a second or two.
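The retry idea could look roughly like the sketch below. It is written in Python purely for illustration and is not the gatherer's code. How saptune reports a held lock (exit code or message) is not stated in this thread, so `is_locked` is a hypothetical predicate the caller has to supply; `run_with_retry`, `attempts`, and `delay_s` are likewise made-up names.

```python
import subprocess
import time
from typing import Callable, Sequence


def run_with_retry(
    cmd: Sequence[str],
    is_locked: Callable[[subprocess.CompletedProcess], bool],
    attempts: int = 3,
    delay_s: float = 2.0,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
    sleep: Callable[[float], None] = time.sleep,
) -> subprocess.CompletedProcess:
    """Run cmd; while is_locked(result) is true, wait delay_s and try again.

    ASSUMPTION: is_locked encodes however a lock is detected; this thread
    does not say how saptune signals it. The last result is always returned,
    even if it still indicates a lock.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    result = run(list(cmd), capture_output=True, text=True)
    for _ in range(attempts - 1):
        if not is_locked(result):
            break
        sleep(delay_s)  # give a short-lived user command time to finish
        result = run(list(cmd), capture_output=True, text=True)
    return result
```

The `run` and `sleep` parameters exist only so the function can be exercised without calling saptune or actually waiting. A small, bounded number of attempts keeps a long-running user command from stalling the gatherer indefinitely.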

dottorblaster · 2024-02-13T13:42:18Z

@scmschmidt we can certainly increase the saptune discovery period; we just need to settle on a specific value that is high enough. 1800-36000 is a pretty broad range, so could you tell us which value we should adopt?

Apart from that, feel free to tell users that every discovery interval is configurable, so they can tailor it to their needs. 👍

1 reply

scmschmidt Feb 13, 2024
Author

It should be 1800 to 3600 (one zero too many), but that is still a range. ;-)
I'd say let's take 1800. Checking every 30 minutes seems reasonable to me.

abravosuse · 2024-02-16T10:25:12Z

@scmschmidt the default saptune discovery period has been changed to 900s (15m). See trento-project/agent#322.
