Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HPCC-33142 Distinguish between id and sequence number in SysInfoLogger #19367

Merged
merged 1 commit into from
Jan 14, 2025

Conversation

shamser
Copy link
Contributor

@shamser shamser commented Dec 18, 2024

Use id or msgid for unique SysInfoLogger message identifier Use SeqNum for the sequence number used to the generated ensure id/msgid is unique

Type of change:

  • This change is a bug fix (non-breaking change which fixes an issue).
  • This change is a new feature (non-breaking change which adds functionality).
  • This change improves the code (refactor or other change that does not change the functionality)
  • This change fixes warnings (the fix does not alter the functionality or the generated code)
  • This change is a breaking change (fix or feature that will cause existing behavior to change).
  • This change alters the query API (existing queries will have to be recompiled)

Checklist:

  • My code follows the code style of this project.
    • My code does not create any new warnings from compiler, build system, or lint.
  • The commit message is properly formatted and free of typos.
    • The commit message title makes sense in a changelog, by itself.
    • The commit is signed.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly, or...
    • I have created a JIRA ticket to update the documentation.
    • Any new interfaces or exported functions are appropriately commented.
  • I have read the CONTRIBUTORS document.
  • The change has been fully tested:
    • I have added tests to cover my changes.
    • All new and existing tests passed.
    • I have checked that this change does not introduce memory leaks.
    • I have used Valgrind or similar tools to check for potential issues.
  • I have given due consideration to all of the following potential concerns:
    • Scalability
    • Performance
    • Security
    • Thread-safety
    • Cloud-compatibility
    • Premature optimization
    • Existing deployed queries will not be broken
    • This change fixes the problem, not just the symptom
    • The target branch of this pull request is appropriate for such a change.
  • There are no similar instances of the same problem that should be addressed
    • I have addressed them here
    • I have raised JIRA issues to address them separately
  • This is a user interface / front-end modification
    • I have tested my changes in multiple modern browsers
    • The component(s) render as expected

Smoketest:

  • Send notifications about my Pull Request position in Smoketest queue.
  • Test my draft Pull Request.

Testing:

Copy link

Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-33142

Jirabot Action Result:
Workflow Transition To: Merge Pending
Updated PR

Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shamser - looks sensible, but spotted one minor (not-new) issue and I suspect worth defining the other masks and commenting on the bit space and how it's used.

#define ATTR_TIMESTAMP "@ts"
#define ATTR_AUDIENCE "@audience"
#define ATTR_CLASS "@class"
#define ATTR_CODE "@code"
#define ATTR_HIDDEN "@hidden"
#define ATTR_SOURCE "@source"

#define NON_SYSLOGMSG_MASK 0x8000000000
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trivial: static constexpr unsigned __int64 nonSysInfoLogMsgMask = 0x8000000000; .. would be more modern/compiler friendly. Or "sysLogInfoIsStaticMask"?

@@ -89,4 +89,8 @@ SYSINFO_API unsigned deleteLogSysInfoMsg(IConstSysInfoLoggerMsgFilter * msgFilte
SYSINFO_API bool deleteLogSysInfoMsg(unsigned __int64 msgId, const char *source=nullptr);
SYSINFO_API unsigned deleteOlderThanLogSysInfoMsg(bool visibleOnly, bool hiddenOnly, unsigned year, unsigned month, unsigned day, const char *source=nullptr);

/* makeMessageId - Create a message id from date, sequence number and nonSysInfoLogMsg flag to uniquely identify a message
- nonSysInfoLogMsg flag is used to identify whether or not the message managed by SysInfoLogger */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trivial:

whether or not the message managed by SysInfoLogger
whether or not the message is managed by SysInfoLogger

{
return id<<21 | year<<9 | month<<5 | day;
return (nonSysInfoLogMsg?NON_SYSLOGMSG_MASK:0) | seqN<<21 | year<<9 | month<<5 | day;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not new/minor: bits 9-20 inclusive reserved for year (12 bits), so max year is 4095?
Not that it's ever going to an issue :) , but potentially confusing to reader, should cope with any 4 digit year, and needs 14 bits for that. so would have to be seqN<<23 I think.

Arguably might be better to more explicitly acknowledge that the sequence number is 32bit only and shift it to the top 4 bytes, and use the spare bits in the lower unsigned space for everything else:

    return seqN<<32 | (nonSysInfoLogMsg?nonSysInfoLogMsgMask:0) | year<<9 | month<<5 | day;

where nonSysInfoLogMsgMask uses (e.g.) 23th bit

Probably worth adding constexpr for day/month/year masks, with comments to bit range for clarify.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I'm not sure that max year 4095 is confusing. 9999 is another arbitrary upper limit just like 4095. (It is just as valid to have year 10000 year 9999!)
  2. If supporting all 4 digit years is important, we need the nonSysInfoLogMsgMask at 24 bit (not bit 23)
  3. If we have the nonSysInfoLogMsgMask as the 32 bit, it would allow other bits between 25-31 to be be available for other uses.

@jakesmith

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're right 9999 is just as arbitrary - I think as a long as it's clear. Add comments that using 12 bits and what upper limit is and it'll be fine (I think supporting 0-9999 would be conventional though).

If supporting all 4 digit years is important, we need the nonSysInfoLogMsgMask at 24 bit (not bit 23)

yes 24th bit:

0-4 = day
5-8 = month
9-22 = if 14 bits for year [ currently 9-20 (12 bits) ]
23-31 = for other meta, e.g. 1 bit to signify static messages.
32-63 = upper 32 bits for sequence number.

would be worth adding this as a comment block (near the mask definition) and/or adding masks for other for others together with the comments.

@shamser shamser requested a review from jakesmith December 23, 2024 16:26
{
return id<<21 | year<<9 | month<<5 | day;
// bits 0-4 = day, bits 5-8 = month, bits 9-23 = year
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these ranges are inclusive. 9-23 = 15 bits - allowing year range of 0 - 32767.

Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shamser - looks good. 1 comment (https://github.com/hpcc-systems/HPCC-Platform/pull/19367/files#r1895937587) re. year range.

Please squash.

@shamser shamser force-pushed the issue33142 branch 2 times, most recently from ec55bcd to 968121e Compare December 23, 2024 16:54
Copy link
Member

@ghalliday ghalliday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shamser one question.

// bit 23-30 unused
// bit 31 = (flag) unset means SysInfoLogger managed message, set means message managed elsewhere
// bits 32-63 = sequence number
return seqN<<32 | (nonSysInfoLogMsg?nonSysInfoLogMsgMask:0) | year<<9 | month<<5 | day;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work? seqN is an unsigned seqn<<32 will be 0 (or undefined).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works. Should I make all the vars 64bit to make it clearer?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what you mean by it works. It didn't even compile on my machine in debug when I pasted it into the code generator:

: error: left shift count >= width of type [-Werror=shift-count-overflow]

It looks correct with the extra cast.

@ghalliday
Copy link
Member

Also should the commit message be "sequence" rather than "sequency"?

@ghalliday
Copy link
Member

reflection: Some unit tests would have caught the issue. Are there any unit tests for this set of functionality?

Use id or msgid for unique SysInfoLogger message identifier
Use SeqNum for the sequence number used to the generated ensure id/msgid is unique

Signed-off-by: Shamser Ahmed <shamser.ahmed@lexisnexis.com>
@shamser shamser changed the title HPCC-33142 Distinguish between id and sequency number in SysInfoLogger HPCC-33142 Distinguish between id and sequence number in SysInfoLogger Jan 10, 2025
@shamser shamser requested a review from ghalliday January 10, 2025 15:25
Copy link
Member

@ghalliday ghalliday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change looks good. @shamser How could the problem have been caught by the unit tests?

@ghalliday ghalliday merged commit 5c4989b into hpcc-systems:master Jan 14, 2025
52 checks passed
Copy link

Jirabot Action Result:
Added fix version: 9.10.0
Workflow Transition: 'Resolve issue'

@shamser
Copy link
Contributor Author

shamser commented Jan 14, 2025

the problem have been caught by the unit tests

It would have been caught by the unit test: DaliSysInfoLoggerTester. However, DaliSysInfoLoggerTester is disabled in the smoke tests because Dali is not available in the workflow that executes unittests. Should dali access be enabled for smoke testing unit tests? @ghalliday

@jakesmith
Copy link
Member

the problem have been caught by the unit tests

It would have been caught by the unit test: DaliSysInfoLoggerTester. However, DaliSysInfoLoggerTester is disabled in the smoke tests because Dali is not available in the workflow that executes unittests. Should dali access be enabled for smoke testing unit tests? @ghalliday

I'm slightly surprised there was haven't got a framework for this already.
I think it is reasonable for some unittests to involve other components like Dali - but aren't necessarily the whole environment, and don't require ECL queries.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants