Use correct regexp lowercase \z terminator #335

Closed
wants to merge 1 commit into from

Contributor

@johnnyshields johnnyshields commented Sep 1, 2024

In VALID_DECIMAL128_STRING_REGEX, a recent commit used \Z (uppercase). \z (lowercase) is probably the correct anchor: the current regex accepts "21.23\n" as a valid value.

$ Matches the end of a line.

\z Matches the end of the string.

\Z Matches the end of the string unless the string ends with a "\n", in which case it matches just before the "\n".

https://ruby-doc.com/docs/ProgrammingRuby/html/language.html#UJ
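A minimal sketch of how the three anchors treat a trailing newline. The pattern here is a simplified stand-in chosen for illustration, not the actual VALID_DECIMAL128_STRING_REGEX:

```ruby
# Simplified stand-in pattern (an assumption for illustration, not the
# library's actual regex).
value = "21.23\n"

dollar  = value.match?(/\A\d+\.\d+$/)   # $ matches at the end of a line, before "\n"
upper_z = value.match?(/\A\d+\.\d+\Z/)  # \Z matches just before a final "\n"
lower_z = value.match?(/\A\d+\.\d+\z/)  # \z matches only at the true end of the string

puts "$  accepts trailing newline: #{dollar}"
puts "\\Z accepts trailing newline: #{upper_z}"
puts "\\z accepts trailing newline: #{lower_z}"
```

Only the \z variant rejects the value with the trailing newline.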

@jamis
Contributor

jamis commented Sep 4, 2024

Hey @johnnyshields -- thanks for the PR. Is the concern that a string with a newline matches, and you believe it shouldn't?

Looking at the original regex (prior to that commit you mentioned), it used /^...$/ anchors, and also matches the newline:

"21.23\n" =~ /^[\-\+]?(\d+(\.\d*)?|\.\d+)(E[\-\+]?\d+)?$/i # => 0

The \Z anchor preserves this behavior, whereas (as you noted) the lowercase \z does not. It should be noted that Float("21.23\n") also succeeds, so allowing a trailing newline has some precedent in Ruby itself.
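The comparison above can be sketched as follows. The body of the pattern is copied from the original regex quoted in this comment; pairing \Z and \z with a leading \A is an assumption, since the thread does not show the current regex in full:

```ruby
# Body of the original regex quoted above, without anchors.
PATTERN = '[\-\+]?(\d+(\.\d*)?|\.\d+)(E[\-\+]?\d+)?'

# Original line anchors, as quoted in the comment.
original = Regexp.new("^#{PATTERN}$", Regexp::IGNORECASE)
# String anchors; the leading \A is assumed for illustration.
upper_z  = Regexp.new("\\A#{PATTERN}\\Z", Regexp::IGNORECASE)
lower_z  = Regexp.new("\\A#{PATTERN}\\z", Regexp::IGNORECASE)

value = "21.23\n"

original_result = original =~ value  # matches at index 0
upper_z_result  = upper_z =~ value   # matches at index 0, same as the original
lower_z_result  = lower_z =~ value   # nil, the trailing newline is rejected

# Kernel#Float tolerates the trailing newline as well.
float_result = Float(value)

p original_result, upper_z_result, lower_z_result, float_result
```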

If I'm misunderstanding your concern, please let me know. Otherwise, I'm inclined to let the current implementation stand.

It should be noted in passing that the regex was changed at the recommendation of the CodeQL static analyzer, which offered \Z as the recommended alternative to $. I had no compelling reason to prefer one over the other, so opted to appease CodeQL.
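The thread does not quote the CodeQL message, so the following is an assumption about the likely motivation: in Ruby, ^ and $ match at line boundaries, so a multi-line string can pass a line-anchored check if any single line matches. A small sketch with a simplified, hypothetical pattern:

```ruby
# Simplified stand-in pattern (hypothetical, not the library's regex).
multiline = "not a number\n21.23\nalso not a number"

line_anchored   = /^\d+\.\d+$/    # anchors apply to each line
string_anchored = /\A\d+\.\d+\Z/  # anchors apply to the whole string

line_result   = line_anchored.match?(multiline)    # true: the middle line matches
string_result = string_anchored.match?(multiline)  # false: the whole string must match

puts "line anchors accept multi-line input:   #{line_result}"
puts "string anchors accept multi-line input: #{string_result}"
```

Both \Z and \z close this gap; they differ only in whether a single trailing newline is tolerated.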

@johnnyshields
Contributor Author

If being permissive of the trailing newline is intentional, then fine to keep it. (In general \z is preferable to \Z; in the vast majority of regexp use cases one does not intend to allow trailing newlines.)

@jamis jamis closed this Sep 5, 2024