saptune called too often by agent #363
scmschmidt asked this question in Q&A
-
@scmschmidt for sure we can increase the saptune discovery period to something higher; we just need to settle on a specific, high enough value. 1800-36000 is a pretty broad interval. Could you provide some insight into which value we should adopt? Apart from that, feel free to inform users that every discovery interval is configurable, so they can tailor it to their needs. 👍
-
@scmschmidt the default saptune discovery period has been changed to 900s (15m). See trento-project/agent#322.
-
While writing tests for the new `saptune` checks, I noticed that the default for `saptune-discovery-period` is 10s, which causes the following problems:

1. A `saptune status` can run for several seconds, mainly due to the verify run, which can be very time-consuming. Traversing the block devices on large machines has proven to be a real time eater. This causes a constant load on the machine, which is not desired.
2. Currently `saptune` prevents multiple instances from running in parallel by simple locking. This means a `saptune status` blocks other `saptune` calls. This not only can make working with `saptune` nearly impossible, but also causes checks to fail if they run during a `saptune status` triggered by the discovery process. On large machines, a discovery could even lock out its own next cycle.

I think it's best to increase `saptune-discovery-period` to a much higher value like 1800 or 36000.

This solves problem 1, but only mitigates problem 2. It can still happen that a check run fails if its `saptune` call runs into a lock caused by discovery.

@angelabriel is working on a patch for `saptune` (3.1.2) where locking is limited to commands that have the potential to change the system. (We know that locking only the code paths that really change something, and only for the duration of the change, would be best, but this requires a large code rework which we currently cannot afford.)

A discovery `saptune status`, as well as the commands currently used in the gatherer (`status`, `solution|note list|verify`), would then no longer set a lock and would not interfere with each other.

The risk remains that a `saptune` command executed by the user sets a lock which interferes with discovery and the checks. With some exceptions, those commands are short-lived, so maybe the gatherer could be enhanced so that if its `saptune` call runs into a lock, it tries again after a second or two.
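To make the retry idea a bit more concrete, here is a rough sketch in Python (not the gatherer's actual code). It assumes that a lock can be recognized from the exit code of the `saptune` call; the value `ASSUMED_LOCK_EXIT_CODE`, the function name `run_with_lock_retry`, and the retry count and delay are all placeholders for illustration, not documented saptune or agent behaviour.

```python
import subprocess
import time
from typing import Callable, Sequence

# ASSUMPTION: the exit code saptune returns when it hits its lock is not
# stated in this discussion; 11 is a placeholder, not a documented value.
ASSUMED_LOCK_EXIT_CODE = 11


def run_with_lock_retry(
    cmd: Sequence[str],
    retries: int = 3,
    delay: float = 2.0,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
    sleep: Callable[[float], None] = time.sleep,
    lock_exit_code: int = ASSUMED_LOCK_EXIT_CODE,
) -> subprocess.CompletedProcess:
    """Run cmd; while it reports a lock, wait `delay` seconds and retry.

    At most `retries` additional attempts are made. The last result is
    returned either way, so the caller still sees a persistent lock.
    """
    result = run(cmd, capture_output=True, text=True)
    for _ in range(retries):
        if result.returncode != lock_exit_code:
            break  # success or an unrelated error: retrying will not help
        sleep(delay)
        result = run(cmd, capture_output=True, text=True)
    return result
```

A caller would use it as `run_with_lock_retry(["saptune", "status"])`. The `run` and `sleep` parameters exist only so the behaviour can be tested without a real `saptune` and without waiting; the retries are bounded on purpose, so a long-running user command cannot stall the gatherer indefinitely.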