í is not considered a valid character in the email address #457

Open
yash-toddleapp opened this issue Jun 27, 2024 · 3 comments
Comments

@yash-toddleapp
Contributor

We have an email address, joemartínez@google.com, which is a valid address. However, MailChecker.isValid("joemartínez@google.com") returns false for it, because the regex you're using doesn't match the í.
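To illustrate the failure mode, here is a minimal sketch. The character class below is a hypothetical ASCII-only class for an unquoted local part, assumed for illustration and not copied from the library's source; `isAsciiLocalPart` is likewise an invented helper name.

```javascript
// Hypothetical ASCII-only class for an unquoted local part (an assumption,
// not the library's actual pattern). Every range ends at or below \x7E.
const asciiLocalPart = /^[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E]+$/;

// Illustrative helper: true when every character falls inside the class.
function isAsciiLocalPart(localPart) {
  return asciiLocalPart.test(localPart);
}

// "í" is U+00ED, which lies above \x7E, so no range in the class covers it.
console.log("\u00ED".codePointAt(0).toString(16)); // ed

console.log(isAsciiLocalPart("joemartinez"));      // true
console.log(isAsciiLocalPart("joemart\u00EDnez")); // false
```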

@lyrixx
Collaborator

lyrixx commented Jun 28, 2024

Would you mind working on a PR?

@yashptel

Would you mind working on a PR?

Sure, this is basically what I changed to fix it. Let me know what you think of the solution.
image

  • I tested with the few languages that I know

  • Go seems to use a different regex, which works as expected without any changes

  • JavaScript and Node.js work as expected

  • Python would need to switch from re to regex for Unicode character support

  • The regex is below if you want to test it

/^(?!(?:(?:\x22?\x5C[\x00-\x7E]\x22?)|(?:\x22?[^\x5C\x22]\x22?)){255,})(?!(?:(?:\x22?\x5C[\x00-\x7E]\x22?)|(?:\x22?[^\x5C\x22]\x22?)){65,}@)(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E\p{L}\p{N}]+)|(?:\x22(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x21\x23-\x5B\x5D-\x7F]|(?:\x5C[\x00-\x7F]))*\x22))(?:\.(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E\p{L}\p{N}]+)|(?:\x22(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x21\x23-\x5B\x5D-\x7F]|(?:\x5C[\x00-\x7F]))*\x22)))*@(?:(?:(?!.*[^.]{64,})(?:(?:(?:xn--)?[a-z0-9]+(?:-[a-z0-9]+)*\.){1,126}){1,}(?:(?:[a-z][a-z0-9]*)|(?:(?:xn--)[a-z0-9]+))(?:-[a-z0-9]+)*)|(?:\[(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|(?:(?!(?:.*[a-f0-9][:\]]){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|(?:(?!(?:.*[a-f0-9]:){5,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?)))?(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:\.(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))\]))$/u
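In JavaScript, the `\p{L}` and `\p{N}` property escapes in the pattern above only take effect when the trailing `u` flag is present. The sketch below isolates the unquoted local-part character class from that pattern and compiles it with and without the flag; `matchesLocalPart` is a hypothetical helper name used only for this illustration.

```javascript
// Unquoted local-part class taken from the pattern above, anchored on its own.
const localPartAtom = /^[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E\p{L}\p{N}]+$/u;

// Illustrative helper: recompile the same source with the given flags.
function matchesLocalPart(localPart, flags) {
  return new RegExp(localPartAtom.source, flags).test(localPart);
}

const sample = "joemart\u00EDnez"; // "joemartínez" with a precomposed í

// With "u", \p{L} means "any Unicode letter", so í is accepted.
console.log(matchesLocalPart(sample, "u")); // true

// Without "u", \p is just an escaped "p" and the braces are literal
// characters, so the class gains no Unicode letters and í is rejected.
console.log(matchesLocalPart(sample, ""));  // false
```

If the flag were dropped when porting the pattern, the regex would still compile in JavaScript but silently fall back to ASCII-only behaviour, which is worth guarding against with a test.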

@YasharF
Contributor

YasharF commented Mar 2, 2025

Is í the only character affected in this case, or is it one of a larger set of characters with the same problem? Either way, it might be a good idea to add or expand the unit tests to cover such cases (if the RFC allows them) as part of the fix.
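As a rough sketch of what such tests could look like, the table below runs several non-ASCII local parts through the unquoted local-part character class from the regex proposed earlier in this thread. The sample inputs, the `cases` table and the `runCases` helper are all assumptions made for illustration; they are not taken from the project's test suite, and whether each input should be accepted is a policy decision for the maintainers.

```javascript
// Unquoted local-part class from the regex proposed in this thread.
const localPartAtom = /^[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E\p{L}\p{N}]+$/u;

// Hypothetical sample inputs, written with escapes so the code points are explicit.
const cases = [
  { input: "joemart\u00EDnez", expected: true },         // precomposed í (NFC)
  { input: "pe\u00F1a", expected: true },                // ñ
  { input: "m\u00FCller", expected: true },              // ü
  { input: "stra\u00DFe", expected: true },              // ß
  { input: "\u4E2D\u6587", expected: true },             // CJK letters
  { input: "\u0438\u0432\u0430\u043D", expected: true }, // Cyrillic letters
  { input: "joemarti\u0301nez", expected: false },       // i + combining acute (NFD)
  { input: "joe martinez", expected: false },            // space is outside the class
];

// Illustrative helper: evaluate every case and record the actual result.
function runCases() {
  return cases.map(({ input, expected }) => ({
    input,
    expected,
    actual: localPartAtom.test(input),
  }));
}

const failures = runCases().filter((r) => r.actual !== r.expected);
console.log(failures.length === 0 ? "all cases behave as listed" : failures);
```

The decomposed case is the notable one: the combining acute accent is a mark rather than a letter, so it falls outside `\p{L}` and `\p{N}`. Normalizing input with `String.prototype.normalize("NFC")` before matching is one possible way to treat both forms of í the same.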
