-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add log_level integration test and reduce order dependence between load and log_level #1344
Conversation
As the issue fixed here affects developer/admin usability rather than data collection, I don't propose to back-cherry-pick it to 4.4.x. |
@nichamon, could you please review this? |
The code looks okay. A downside is that there may not be any I am running it against the |
@baallan the reason there is an order dependence is because the plugin decides what the log name is. I'm struggling with creating logs for plugins that are not loaded yet. What is the problem exactly with setting the log level after the plugin is loaded and the name is known and you can get a proper error message if the configuration writer finger fumbles the log name? |
@tom95858 Another of the problems is that log contexts with dynamically determined names like a plugin store.store_csv.$context might want to be configured as part of the plugin configuration that happens between load and config (where $context is finally registered only after the first set of type 'context' arrives). The user knows the exact name they care about but the plugin only knows naming construction patterns. Finally, least-surprise suggests that we follow the practice in other tools that provide context dependent logging; e.g. log4c/log4j let the end user write a configuration file parsed at global log init, not after context registration. This decouples The finger-fumble problem exists but is trivially handled. log_status clearly displays "(pending registration)" for any plugin not yet loaded or any name that is misspelled. For example a common error will be for someone to set the log level of a plugin with
As we do not in any other area of the C-based command language provide typo-detection messages of the sort "X123 not defined; did you mean item X124?" I don't see why we would do it here. If a core fuzzy-matching library was added, it could of course be supported in many areas of the interface including log config. However, if you want a WARNING or INFO message emitted when log level is set for a pending name, i'll be happy to add that. This would be equivalent to the message GDB gives when you set a breakpoint to a function not yet loaded. |
What is hard about this?
You're not defining open() or printf() here @baallan which is where this quote comes from, these are application level interfaces and ironically this change makes them more unpredictable not less.
A person would have to understand what "pending registration" means. It's silly, they won't.
I'm sorry, I can't parse this part in any way. |
How do we feel about adding a log-level option to the load command? That way we could get what plugin writers want/need and not have the possibility of waif logs that are not associated with any plugin. |
The concern for "waif's" is massively overblown and customizing old commands to accommodate new orthogonal features seems like the wrong way to go. Separation of concerns (what logging i want vs what test i am applying it to) seems really obvious here. Also, this moves us for logging into the declarative direction in all but the most verbose debugging scenarios. In fact I dislike 'extending load' intensely. Here's my motivating use-case for this dislike:
Similarly, we can in tutorials publish a variety of log level setting prologs for different purposes that users can then apply to any scenario they want to investigate. e.g.
|
I love making the log struct opaque. One thing I'm not following in the "(pending registration)" part. It seems at first glance that the only way that any "subsystem" can log messages is through the ovis_log_t handle. The only way to get an ovis_log_t handle is through ovis_log_register(), at which time "desc" is set. Under what conditions would "(pending registration)" be displayed? Do we need that at all? |
He's using at as a "sentinel" to determine when/if the log has been pre-registered or registered. I think this is something called "data coupling" which is, in general, highly discouraged. This is still top of tree, so we have time to refactor. |
@morrone "(pending registration)" means no plugin or framework has yet called register(context_name, context_description). I have provided usage examples above where the configurer of ldmsd would want to set the log configuration for a given plugin before the plugin is loaded. Allowing level setting only after the plugin is loaded (a) prevents controlling the logging that may happen during loading, (b) forces modifications of a configuration script to be debugged into that script after the load statement, introducing the possibility of other inadvertent changes to the configuration being debugged. (Pending registration) is the place-holder for the description that the plugin will provide. It is displayed by log_status which lists all the registered (or with this change the expected) log names and descriptions. |
That seems like a problematic design in the existing log code. How would a mere human ever know what to put in the "name=" field for the log command then, if the plugin can make that anything it wants? That is inconsistent with existing command design if the name isn't the same name used in the other commands. I think I agree that this points to us needing to allow passing a log level in as part of "load". I certainly sympathize with the desire to set log levels at the top of the file, but this is maybe one of the many down sides of us having an imperative configuration system rather than having a declarative configuration file. Perhaps more global log level setting (with proper checks to ensure no dangling log levels) can be part of the declarative configuration format that we are going to get from maestro? |
Oh, well in that case I think it would be more straight-forward to just add a:
to the struct. That is far more clear to the code reviewer what you are trying to accomplish. As currently written, that says to me "we expect log messages to be sent to this before registration happens". Having a separate flag in the struct that gates usage based on whether or not it is registered makes things far clearer. Sure, the code that handles log_status needs to be updated to say "(Pending registration)" when a log struct is not yet registered, but you are genuinely adding a new state of existence to that struct, so making that explicit makes things clearer, I think. |
There is a much more complicated implementation I could propose for a separate "preregistration" list. Such a list would collect all the log_level settings for names that cannot be permanently resolved at the time the statement is parsed/first processed. No new command-line syntax needed; just semantics extended to make log configuration much more flexible. This list would actually permanently include any regex name policies as well, since it is almost certain that some plugins or some contexts will be registered after regex policies are defined; it would be similar to the repeated handling of a regex matching list for producers. There could still be a typo in the name during a preregistration, but we would separately list preregistration & regex log_level rules from the actually running logs. |
@morrone On the topic of "declarative" or not, a very common debugging log scenario is something like:
where we don't want per-sample info in bulk. We could expand the log_level API to have not just a plugin context but a second function context.
but this then adds comparison logic and another argument for the action to the log call processing . |
@baallan, this is getting a bit far afield. All we want to do is log messages from init(), correct? |
@morrone, there are commands to query the registered logs. And the log naming is structured, so I can say "log_level name=sampler.*" and select all sampler messages, etc... I think this whole thing stems from the multi-config changes that added an init function and @baallan wants to be able to control how messages are emitted by that function. This is why I suggested amending the load command to specify the initial log level. You could change it globally too, by setting a log level for the default log, but then you would get a flood of messages that you may not be interested in. As I said above, the design case was to allow the user to issue a query command to see what logs are available and then use the log_level command to select what they were interested in changing. The man page would/should also specify what the log name is for each plugin. With this change, anyone can submit a log_level request and create a log that may never be used (without error) which is why @nichamon and I are skeptical of this change. Also, they may request to see debug messages from an init function and never know that they had specified the wrong log name. At this point, I think we should revert this change and replace it with the extra option(s) for the load command. @nichamon, what do you think? |
I don't think this has landed yet. Nothing to revert, unless I'm missing something? |
Sorry, I meant re-factor... |
Even if we mix in log_level with load, that doesn't fix the problem that the current implementation for wildcard log name matching only applies to logs that are already registered. (Admittedly, this patch as-is also doesn't fix the regex registration problem-- I am trying to first get the load debugging problem fixed and get the precedent set that we need forward-looking log level setting policy conceptually). If folks generally prefer a one-off solution for load and for Mon to bring the regex policy processing up to the same methodology we have for producer/updtr matching, that would be adequate. |
Firstly, it's important to clarify that the pull request aims to enable applications to set the log level of a non-existing log subsystem. @baallan initiated the patch because he has a specific use case: he wants to observe DEBUG messages from a particular plugin during its loading and configuring stages, and then change the level to ERROR after the plugin has been configured. It's crucial to note that the reason for the order dependency is not because plugins can name their log subsystems arbitrarily. Rather, it's because Another crucial point is that LDMSD plugins cannot arbitrarily name their log subsystems. A plugin's log subsystem must align with |
To address @baallan's use case of setting the log level of a specific plugin during both load and config times, different from the global log level, there are two proposed solutions:
In my view, I'm not particularly supportive of either option. Firstly, regarding option 1), I'm concerned that it could potentially lead to more errors in LDMSD configuration. While LDMSD might not rigorously check for typos in all instances, I don't believe we should introduce additional opportunities for typos to occur. As for option 2), modifying the plugin interface will effects on all plugin implementations, which leads to a huge change. |
I believe that the primary use case for changing the log level before and after configuring a plugin occurs mainly during the testing stage of a release or when troubleshooting issues arises. An alternative approach to achieving what Ben desires is to initially set the global log level to DEBUG (or the most verbose level they want to see log messages) and then adjust it at the end of the configuration file. While this method would result in a higher volume of log messages, it offers a workaround. By utilizing tools like |
@baallan I just reread issue #1343 and your example in #1342. I want to understand your use case and the problems you encountered, which led you to require more verbose logging during the loading process and also this patch. From this, I inferred the following:
Please confirm if my understanding is accurate. |
@nichamon That is partially correct. In addition to the not loading example you correctly extracted, there is information I want logged at successful load time rather than waiting for config() or sample(). |
What Ben, please be specific. |
The specifics are highly plugin and developer dependent; they're messages to the debug or warning log that go beyond the standard plugin loading messages from the framework about finding the library. So how is it relevant to this discussion? An obvious example would be something like "resource xyz does not exist; this plugin cannot do anything useful on this node. if it should, you may need to 'insmod xyz' before starting ldmsd". |
I apologize, but that is not an example, it's a vague statement about a presumed problem. What is the name of the sampler? What is the resource in question? From which function/context, etc... do you want this message emitted from? |
@tom95858 a substantial list of samplers (tx2mon, the various permutations on samplers of IB, omnipath, infiniband, slingshot, and lustre) would be improved by putting appropriate diagnostics in the load() implementation instead of delaying to config(). This is even more true for the multi-instance samplers where the problem is with the plugin/system combination and not with a specific configuration of the plugin. |
@nichamon, you mentioned:
That (type.name) was simply an opinion and a recommendation; it was not something broadly agreed to and it is not something enforceable when the registration name is simply a string. For example (type.name.instance_name) names are needed for filtering logs from multi-instance plugins. Because the plugin author may determine a need for further named log subcontexts for more complicated components (like samplers that also handle streams and sets or like transform plugins or like whatever else we cannot foresee), we have no business describing (type.name) as the only allowable context name. But this is far afield from the primary discussion which is that configuring a log context should not be tied to that context already having been defined (and that 'load' is the poster-child example of why they should not be sequentially tied). |
@baallan, I am closing this request. Please move this to a discussion and we will continue there. |
fixes #1343 (excess order dependence between load and log_level).
updates the header documentation for ovis_log_set_level_by_name.
Make the ovis_log_t pointer opaque outside ovis_log.c
illustrates #1342 (excess reordering of log_level statements breaks debugging scenario log flow management).
will generate a FAIL message at the end due to reordering of log_level commands in the input script.