HPCC-32041 WsStore support invalid IPT paths #18763

@rpastrana rpastrana commented Jun 13, 2024

  • Encodes IPT path components
  • Decodes Key metadata

Type of change:

  • This change is a bug fix (non-breaking change which fixes an issue).
  • This change is a new feature (non-breaking change which adds functionality).
  • This change improves the code (refactor or other change that does not change the functionality)
  • This change fixes warnings (the fix does not alter the functionality or the generated code)
  • This change is a breaking change (fix or feature that will cause existing behavior to change).
  • This change alters the query API (existing queries will have to be recompiled)

Checklist:

  • My code follows the code style of this project.
    • My code does not create any new warnings from compiler, build system, or lint.
  • The commit message is properly formatted and free of typos.
    • The commit message title makes sense in a changelog, by itself.
    • The commit is signed.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly, or...
    • I have created a JIRA ticket to update the documentation.
    • Any new interfaces or exported functions are appropriately commented.
  • I have read the CONTRIBUTORS document.
  • The change has been fully tested:
    • I have added tests to cover my changes.
    • All new and existing tests passed.
    • I have checked that this change does not introduce memory leaks.
    • I have used Valgrind or similar tools to check for potential issues.
  • I have given due consideration to all of the following potential concerns:
    • Scalability
    • Performance
    • Security
    • Thread-safety
    • Cloud-compatibility
    • Premature optimization
    • Existing deployed queries will not be broken
    • This change fixes the problem, not just the symptom
    • The target branch of this pull request is appropriate for such a change.
  • There are no similar instances of the same problem that should be addressed
    • I have addressed them here
    • I have raised JIRA issues to address them separately
  • This is a user interface / front-end modification
    • I have tested my changes in multiple modern browsers
    • The component(s) render as expected

Smoketest:

  • Send notifications about my Pull Request position in Smoketest queue.
  • Test my draft Pull Request.

Testing:

Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-32041

Jirabot Action Result:
Workflow Transition: Merge Pending
Updated PR

@rpastrana rpastrana requested a review from ghalliday June 14, 2024 13:08
Member

@ghalliday ghalliday left a comment

I am not sure that it is necessary to encode the namespaces - they are explicitly under our control, and it would always report an error if an application had used an invalid value. We should be encoding the key though - as you have.

I would also look at using encodePTreeNameUtf8Char to encode the user name. It would need a slight modification to also encode an @ at the start of the string (this would be simple to implement by passing in first as a parameter and taking a little care inside the function).
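A minimal sketch of the "pass in first as a parameter" idea. It assumes a made-up escape format (`_` followed by two hex digits) and hypothetical helper names (`appendEscaped`, `encodeNameChar`, `encodeName`); it does not reproduce the real signature or output format of encodePTreeNameUtf8Char.

```cpp
#include <cassert>  // only needed for the usage checks shown after this block
#include <cctype>
#include <string>

// Hypothetical escape format for illustration: '_' plus two uppercase hex
// digits. This is an assumption, NOT the format HPCC's encoder produces.
static void appendEscaped(std::string &out, unsigned char c)
{
    static const char hex[] = "0123456789ABCDEF";
    out += '_';
    out += hex[c >> 4];
    out += hex[c & 0x0F];
}

// Encode one character. 'first' is true only for the first character of the
// name, so a leading '@' is escaped while an embedded '@' is left alone.
static void encodeNameChar(std::string &out, unsigned char c, bool first)
{
    bool plain = std::isalnum(c) || c == '.' || c == '-' || (c == '@' && !first);
    if (plain)
        out += static_cast<char>(c);
    else
        appendEscaped(out, c);
}

// Encode a whole name, tracking whether we are still on the first character.
std::string encodeName(const std::string &name)
{
    std::string out;
    bool first = true;
    for (unsigned char c : name)
    {
        encodeNameChar(out, c, first);
        first = false;
    }
    return out;
}
```

Under these assumptions, `assert(encodeName("@user") == "_40user");` and `assert(encodeName("us@er") == "us@er");` both hold: only the leading @ is rewritten.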

@rpastrana
Member Author

> I am not sure that it is necessary to encode the namespaces - they are explicitly under our control, and it would always report an error if an application had used an invalid value. We should be encoding the key though - as you have.

> I would also look at using encodePTreeNameUtf8Char to encode the user name. It would need a slight modification to also encode an @ at the start of the string (would be simple to implement by passing in first as a parameter and a little care inside the function)

The namespaces are provided by the caller. We're unlikely to encounter issues, but we might as well sanitize all user input.
What's the motivation for looking into encodePTreeNameUtf8Char? The function used now ultimately calls encodePTreeNameUtf8Char anyway.

Also, I need to decode the key names in a couple of the wsstore methods as well...

@rpastrana rpastrana force-pushed the HPCC-32041-WsStoreHandleInvalidIPTPaths branch from d1a442f to 52f90db Compare June 14, 2024 18:47
@rpastrana rpastrana requested a review from ghalliday June 14, 2024 18:48
@rpastrana
Member Author

All pre-existing use cases succeeded; added new tests to cover encodable IPT path components.

@rpastrana
Member Author

corresponding junit tests: hpcc-systems/hpcc4j#717

Member

@ghalliday ghalliday left a comment

I would be inclined to merge (with a fix for encoding ns) and ignore the problem of user names starting with an @.

This file would benefit from a future PR that commoned up duplicate code. It would reduce the size and make changes like this easier to review.


-xpath.appendf("/%s/%s", ns, key);
+xpath.appendf("/%s/%s", ns, encodedKey.str());
Member

name space not encoded.
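One way to avoid missing a component is to assemble the path in a single helper that encodes every part. The sketch below is only an illustration: `encodeComponent` is a hypothetical stand-in for encodePTreeName with a made-up `_XX` hex escape, and `buildKeyPath` is an assumed helper name, not a function in this PR.

```cpp
#include <cassert>  // only needed for the usage check shown after this block
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the real encoder: anything other than
// alphanumerics, '.' and '-' becomes '_' plus two hex digits (an assumption).
std::string encodeComponent(const std::string &in)
{
    std::string out;
    for (unsigned char c : in)
    {
        if (std::isalnum(c) || c == '.' || c == '-')
        {
            out += static_cast<char>(c);
        }
        else
        {
            char buf[4]; // '_' + two hex digits + terminating NUL
            std::snprintf(buf, sizeof(buf), "_%02X", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}

// The only place that builds the path, so the store name, namespace and key
// are always encoded the same way on both the set and the fetch side.
std::string buildKeyPath(const std::string &store,
                         const std::string &ns,
                         const std::string &key)
{
    return "/" + encodeComponent(store) +
           "/" + encodeComponent(ns) +
           "/" + encodeComponent(key);
}
```

With this scheme, `assert(buildKeyPath("My Store", "n/s", "k") == "/My_20Store/n_2Fs/k");` holds, and no call site can pass a raw namespace by accident.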

@ghalliday
Member

@rpastrana this is waiting for you - one case of a namespace not being encoded (so it could be inconsistent)

@rpastrana
Member Author

> @rpastrana this is waiting for you - one case of a namespace not being encoded (so it could be inconsistent)

Thanks. This is on my radar. How important is the leading '@' problem you mentioned? That fix involves heavily used logic and will take longer to test.

@rpastrana
Member Author

> @rpastrana this is waiting for you - one case of a namespace not being encoded (so it could be inconsistent)

Thanks. This is on my radar. How important is the leading '@' problem you mentioned? That fix involves heavily used logic and will take longer to test.

UPDATE, just noticed your comment on this matter...

@rpastrana rpastrana requested a review from ghalliday June 28, 2024 18:11
@rpastrana
Member Author

Updated test case: hpcc-systems/hpcc4j#717

Member

@ghalliday ghalliday left a comment

Missed another case.

@@ -189,34 +213,42 @@ bool CDALIKVStore::fetchKeyProperty(StringBuffer & propval , const char * storen
if (isEmptyString(storename))
throw MakeStringException(-1, "DALI Keystore fetchKeyProperty(): Store name not provided");

StringBuffer encodedStoreName;
encodePTreeName(encodedStoreName, storename);
Member

This isn't being used in line 230

@rpastrana rpastrana requested a review from ghalliday July 9, 2024 22:02
@rpastrana
Member Author

@ghalliday good catch. Fixed, took another look and hopefully didn't miss any others.

Member

@ghalliday ghalliday left a comment

@rpastrana unfortunately a couple of further issues.

One general comment - not to be addressed in this PR - is that the code would be clearer if the code that encodes the values was adjacent to the code that uses them.

for (unsigned i = 0; i < keys.length(); i++)
{
StringBuffer decoded;
decodePtreeName(decoded, keys.item(i));
Member

These have already been decoded within the function - so they will currently be decoded twice. That could potentially cause problems with some pathological key names.
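To illustrate the double-decoding hazard, here is a small sketch under an assumed escape scheme (`_` plus two hex digits, with `_` itself escaped). `encodeKey` and `decodeKey` are hypothetical helpers, not the real encodePTreeName/decodePtreeName, so the exact pathological names will differ in practice.

```cpp
#include <cassert>  // only needed for the usage checks shown after this block
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical encoder: escapes everything except alphanumerics, '.' and '-'.
std::string encodeKey(const std::string &in)
{
    std::string out;
    for (unsigned char c : in)
    {
        if (std::isalnum(c) || c == '.' || c == '-')
        {
            out += static_cast<char>(c);
        }
        else
        {
            char buf[4]; // '_' + two hex digits + terminating NUL
            std::snprintf(buf, sizeof(buf), "_%02X", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}

// Hypothetical decoder: turns each "_XX" hex escape back into one character
// and copies everything else through unchanged.
std::string decodeKey(const std::string &in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        bool escaped = in[i] == '_' && i + 2 < in.size()
            && std::isxdigit(static_cast<unsigned char>(in[i + 1]))
            && std::isxdigit(static_cast<unsigned char>(in[i + 2]));
        if (escaped)
        {
            out += static_cast<char>(std::stoi(in.substr(i + 1, 2), nullptr, 16));
            i += 2; // skip the two hex digits just consumed
        }
        else
        {
            out += in[i];
        }
    }
    return out;
}
```

Take a user key that happens to look like an escape, such as `x_41`. It is stored as `x_5F41`; decoding once returns `x_41` as intended, but decoding a second time turns it into `xA`. In other words, `assert(decodeKey(encodeKey("x_41")) == "x_41");` holds, while `decodeKey(decodeKey(encodeKey("x_41")))` is `"xA"`.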

kvpair->setKey(attributes->queryName());
//it's possible this has been encoded, so decode it
StringBuffer decoded;
decodePtreeName(decoded, attributes->queryName());
Member

The attribute names have not been encoded - so there should not be matching code to decode them.

Member Author

The IPT attributes iterator contains all keys under a particular namespace branch, those keys would have been encoded in the set method. I believe decoding is necessary here.

Member

When I read the code it looked like the attributes would be things like DALI_KVSTORE_CREATEDTIME_ATT which have not been encoded (and are also not user specified).

Member

@rpastrana you haven't responded to this comment. I can merge as-is because the decoding will have no effect, but I also don't think it is ever necessary.

Member Author

no need to decode. Will revert

Comment on lines +413 to +407
StringBuffer decoded;
decodePtreeName(decoded, name.str());

kvpair->setKey(decoded.str());
Member

Observation: Not worth changing - but this breaks the encapsulation - the fact that the keys are encoded shouldn't really need to be known by the calling code.

Member Author

Agreed, I do wish the encoding/decoding could have been hidden from the client (ws_store) in this case.
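For illustration only, a sketch of what hiding the encoding behind the store interface could look like. `KeyStore`, `encodeKey` and `decodeKey` are hypothetical names, a `std::map` stands in for the Dali property tree, and the `_XX` escape format is an assumption rather than HPCC's actual encoding.

```cpp
#include <cassert>  // only needed for the usage checks shown after this block
#include <cctype>
#include <cstdio>
#include <map>
#include <string>
#include <vector>

// Hypothetical encoder/decoder pair using a made-up "_XX" hex escape.
std::string encodeKey(const std::string &in)
{
    std::string out;
    for (unsigned char c : in)
    {
        if (std::isalnum(c) || c == '.' || c == '-')
        {
            out += static_cast<char>(c);
        }
        else
        {
            char buf[4]; // '_' + two hex digits + terminating NUL
            std::snprintf(buf, sizeof(buf), "_%02X", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}

std::string decodeKey(const std::string &in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        bool escaped = in[i] == '_' && i + 2 < in.size()
            && std::isxdigit(static_cast<unsigned char>(in[i + 1]))
            && std::isxdigit(static_cast<unsigned char>(in[i + 2]));
        if (escaped)
        {
            out += static_cast<char>(std::stoi(in.substr(i + 1, 2), nullptr, 16));
            i += 2;
        }
        else
        {
            out += in[i];
        }
    }
    return out;
}

// Callers only ever see raw key names; encoding happens on the way in and
// decoding happens exactly once on the way out.
class KeyStore
{
public:
    void set(const std::string &key, const std::string &value)
    {
        entries[encodeKey(key)] = value;
    }

    bool fetch(const std::string &key, std::string &value) const
    {
        auto it = entries.find(encodeKey(key));
        if (it == entries.end())
            return false;
        value = it->second;
        return true;
    }

    std::vector<std::string> listKeys() const
    {
        std::vector<std::string> keys;
        for (const auto &entry : entries)
            keys.push_back(decodeKey(entry.first));
        return keys;
    }

private:
    std::map<std::string, std::string> entries; // keyed by the encoded name
};
```

A caller would then write `store.set("encod@ble", "whatever")` and get `encod@ble` back from `listKeys()` without ever seeing the encoded form.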

@rpastrana
Member Author

No longer re-decoding.

@rpastrana
Member Author

Added new junit test. Results:
Fetching all WsCli@ntT_estStore.Junit_t@estskeys...
All Keys: {a=ddfa, ecl.playground.sample.default=Java Simple, global.test=success, files.rowperpage.default=50, encod@ble=whatever}
Fetching encoded key attributes: WsCli@ntT_estStore.Junit_t@ests.encod@ble
Key Metadata: {@CreateTime=2024-07-11T14:51:17, @Createuser=Juni@tUser}
Fetching encoded key: WsCli@ntT_estStore.Junit_t@ests.encod@ble
Key/value: whatever

@rpastrana rpastrana requested a review from ghalliday July 11, 2024 14:55
Member

@ghalliday ghalliday left a comment

@rpastrana I think there is still some unnecessary decoding of attribute names, but I would merge as is since it has no ill effect (just confusing because not needed).
Please squash.

- Encodes all user provided IPT path components
- Decodes all IPT paths provided to user

Signed-off-by: Rodrigo Pastrana <rodrigo.pastrana@lexisnexisrisk.com>
@rpastrana rpastrana force-pushed the HPCC-32041-WsStoreHandleInvalidIPTPaths branch from e1e785d to a55eb1c Compare July 16, 2024 14:10
@ghalliday ghalliday merged commit 4280470 into hpcc-systems:candidate-9.4.x Jul 17, 2024
48 checks passed