Split by comma + handle nested parentheses #16

Merged 2 commits on Feb 22, 2024

Conversation

Vuizur (Contributor) commented Feb 21, 2024

This is the proper version of xxyzz/WordDumb#187.
I think it's an improvement. There may be more heuristics worth applying here.

I also found that the current regex does not handle nested parentheses: I came across some glosses containing them in the English Wiktionary.
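To illustrate the nested-parentheses problem, here is a minimal sketch, not the code from this PR. The function name `strip_nested_parentheses` and the sample gloss are hypothetical, and the lazy pattern `\(.*?\)` is only an assumed example of a regex that is not nesting-aware; the project's actual regex may differ. The sketch tracks nesting depth with a counter instead of relying on a single pattern.

```python
import re


def strip_nested_parentheses(text: str) -> str:
    """Drop every parenthesized span, including nested ones (hypothetical helper)."""
    kept = []
    depth = 0  # how many "(" are currently open
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth > 0:
                depth -= 1
            else:
                kept.append(ch)  # unmatched ")": keep it as ordinary text
        elif depth == 0:
            kept.append(ch)
    # Collapse the whitespace left behind by removed spans.
    return " ".join("".join(kept).split())


gloss = "a small (usually (but not always) red) fruit"

# Assumed non-nesting-aware pattern: stops at the first ")" it sees.
naive = re.sub(r"\(.*?\)", "", gloss)

print(naive)                            # leaves " red)" behind
print(strip_nested_parentheses(gloss))  # a small fruit
```

The counter approach is linear in the length of the gloss and leaves stray closing parentheses untouched rather than guessing at their meaning.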

Vuizur and others added 2 commits February 22, 2024 18:30
Only keep the text before the first comma; text after a comma is usually
not a complete sentence.
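The heuristic described in this commit message can be sketched as follows. This is an illustration under assumptions, not the merged implementation: the function name `text_before_first_comma` and the sample gloss are hypothetical.

```python
def text_before_first_comma(gloss: str) -> str:
    """Keep only the part of a gloss before its first comma (hypothetical helper)."""
    # str.partition returns the whole string as the head when no comma is present.
    head, _sep, _rest = gloss.partition(",")
    return head.strip()


print(text_before_first_comma("a large feline, often kept as a pet"))  # a large feline
print(text_before_first_comma("a large feline"))                       # a large feline
```

A real implementation would probably remove parenthesized text first, so that a comma inside parentheses does not cut the gloss short.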
@xxyzz xxyzz merged commit cdec50a into xxyzz:master Feb 22, 2024
19 checks passed
xxyzz (Owner) commented Feb 22, 2024

Thanks for your contribution! I changed remove_parentheses() a bit and added some tests.
