OF-2628: Avoid polynomial regular expression used on uncontrolled data #2221

guusdk · 2023-07-18T13:53:32Z

Prevent a regular expression that is evaluated against user-provided values from taking exceptionally long, or consuming excessive resources, to complete.

This commit replaces the regular expression with a solution that doesn't use regular expressions. Arguably, the complexity of this code does not change much.
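
For illustration only, a regex-free check for illegal numeric character references could look roughly like the sketch below. This is not the code from this PR: the class name `CharRefScan` and both helper methods are hypothetical, and the set of legal code points is taken from the `Char` production of XML 1.0. Each character is visited at most once, so the running time stays linear in the input length.

```java
// Hypothetical sketch, not the Openfire implementation.
class CharRefScan {

    // Legal code points according to the XML 1.0 'Char' production.
    static boolean isLegalXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Value of an ASCII digit in the given radix (10 or 16), or -1 if it is not one.
    static int asciiDigit(char c, int radix) {
        int d;
        if (c >= '0' && c <= '9') d = c - '0';
        else if (c >= 'a' && c <= 'f') d = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') d = c - 'A' + 10;
        else return -1;
        return d < radix ? d : -1;
    }

    // Returns true if the text contains a well-formed numeric character
    // reference (&#N; or &#xH;) that resolves to an illegal XML character.
    static boolean hasIllegalCharacterReferences(String text) {
        int i = text.indexOf("&#");
        while (i >= 0) {
            int pos = i + 2;
            int radix = 10;
            if (pos < text.length() && text.charAt(pos) == 'x') {
                radix = 16;
                pos++;
            }
            int digitsStart = pos;
            int value = 0;
            while (pos < text.length()) {
                int d = asciiDigit(text.charAt(pos), radix);
                if (d < 0) break;
                // Stop accumulating once the value is out of range, to avoid overflow.
                if (value <= 0x10FFFF) value = value * radix + d;
                pos++;
            }
            boolean wellFormed = pos > digitsStart
                && pos < text.length() && text.charAt(pos) == ';';
            if (wellFormed && !isLegalXmlChar(value)) return true;
            // Resume after the characters already consumed; nothing is rescanned.
            i = text.indexOf("&#", pos);
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasIllegalCharacterReferences("a&#x41;b"));
        System.out.println(hasIllegalCharacterReferences("a&#0;b"));
    }
}
```

The sketch ignores malformed or unterminated references rather than rejecting them; whether that is the desired behaviour is an assumption made here, not something stated in this PR.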

guusdk · 2023-07-18T13:54:33Z

Does this solution cover all edge cases, and is its performance roughly on par with the original?

Fishbowler · 2023-07-18T14:35:22Z

I wonder if we can test efficacy AND performance with a unit test pre- and post-change

GregDThomas

LGTM, and is better in that it allows hex char references.

I can't offer any opinion on performance, other than perhaps suggest it probably doesn't matter too much unless it's awful, which I doubt?

Fishbowler · 2023-07-21T21:36:10Z

Added some unit tests and ran them on main and on this branch a bunch of times via mvn -Dtest=XMLLightweightParserTest#testHasIllegalCharacterReferences* test - sometimes one is faster, sometimes the other. There's no perceivable difference.

Fishbowler · 2023-07-21T21:37:19Z

I won't merge until Guus gives my unit tests a look. They're really very brief...

GregDThomas · 2023-07-21T21:42:53Z

Might want to replace @Test with @RepeatedTest(10_000) - or some other number, but they look pretty quick to run - to get a bigger sample for timing purposes.

guusdk · 2023-07-22T06:07:54Z

The original should also have allowed for hex-based character references - but the fact that that wasn't clear is probably a good indication that this change isn't the worst, from a readability perspective.

As for the test:

/me points at org.jivesoftware.openfire.nio.XmlNumericCharacterReferenceTest
Although it's good to get some anecdotal evidence, I don't think we should use unit testing for automated load comparison, to be honest.

Fishbowler · 2023-07-25T08:17:08Z

Thanks Guus. I totally didn't spot those, and they're much better functional coverage. Reverted.

I agree with not using unit tests for automated load comparison in the general case. It requires constrained and understood code paths and an environment where the performance variables are somewhat understood. Understanding the results is also important (e.g. a single run with a 10% difference isn't interesting, but a consistent 10% change in execution time very much is).
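
To make the "single run versus consistent change" point concrete, here is a minimal sketch of collecting several timing samples and comparing medians instead of individual runs. The class `TimingSketch` and its methods are hypothetical and were not part of this PR. Naive timing like this ignores JIT warm-up and other JVM effects, so treat the numbers as anecdotal at best.

```java
import java.util.Arrays;

// Hypothetical helper for anecdotal timing comparisons; not part of Openfire.
class TimingSketch {

    // Runs the task repeatedly and records the elapsed nanoseconds of each run.
    static long[] sample(Runnable task, int runs) {
        long[] elapsed = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            elapsed[i] = System.nanoTime() - start;
        }
        return elapsed;
    }

    // Median of the samples; less sensitive to outliers than a single run or the mean.
    static long median(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2;
    }

    // Relative change of the candidate against the baseline (0.10 means 10% slower).
    static double relativeChange(long baseline, long candidate) {
        return (candidate - baseline) / (double) baseline;
    }

    public static void main(String[] args) {
        // Two placeholder workloads standing in for the old and new implementations.
        long before = median(sample(() -> "a&#65;b".indexOf("&#"), 1_000));
        long after = median(sample(() -> "a&#65;b".contains("&#"), 1_000));
        System.out.println("median before: " + before + " ns, after: " + after + " ns");
    }
}
```

Repeating the whole comparison several times and checking that `relativeChange` keeps the same sign and a similar magnitude is what separates a consistent change from noise.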

guusdk requested review from GregDThomas and Fishbowler July 18, 2023 13:53

GregDThomas approved these changes Jul 21, 2023

OF-2628: Add unit tests

dc11312

Fishbowler approved these changes Jul 21, 2023

Revert "OF-2628: Add unit tests"

1763634

This reverts commit dc11312.

Fishbowler merged commit 9fea45d into igniterealtime:main Jul 25, 2023
5 checks passed
