How to set a ignore area #572

AkonXI · 2024-11-26T05:51:27Z

I want to compare rich text against plain text and get each character's index in the rich text. As the picture shows, HTML tags affect the comparison results. How can I set a regex such as /<(\S{1,}?)[^>]*>|<\/\1>/gms to ignore these tags while keeping their positions in the string, or to treat each HTML tag as a single unit?


ExplodingCabbage · 2025-02-17T12:18:43Z

Sounds like you'd want to write your own tokenizer (maybe treating each tag, attribute, or text node as a token), tokenize the strings you're comparing, and pass the resulting token arrays to diffArrays. I'm not 100% sure I understand what you want to achieve from the description, but I suspect it could be done with a custom tokenizer as described in the second paragraph of "Defining custom diffing behaviors".
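A minimal sketch of such a tokenizer, assuming it is enough to treat each whole tag as one token and each other character as its own token. The names `tokenizeHtml` and `textTokens` are hypothetical helpers, not part of jsdiff, and the pattern is deliberately simple: it does not handle comments, `>` inside attribute values, or HTML entities.

```javascript
// Hypothetical helper (not a jsdiff API): split an HTML string into tokens,
// recording each token's start offset in the original string.
function tokenizeHtml(html) {
  const tokens = [];
  // Either a whole tag, a single non-"<" character, or a stray "<".
  const pattern = /<[^>]*>|[^<]|</gu;
  for (const match of html.matchAll(pattern)) {
    const text = match[0];
    tokens.push({
      text,
      index: match.index, // UTF-16 offset into the original string
      isTag: text.length > 1 && text.startsWith("<"),
    });
  }
  return tokens;
}

// Hypothetical helper: keep only the non-tag tokens, so tags are ignored in
// the comparison while every character still remembers its rich-text index.
function textTokens(html) {
  return tokenizeHtml(html).filter((token) => !token.isTag);
}

const rich = "<b>Hi</b> there";
const visible = textTokens(rich);
console.log(visible.map((t) => t.text).join("")); // the text with tags skipped
console.log(visible.map((t) => t.index));         // rich-text offset of each character
```

The array from `textTokens(rich)` and a per-character array built from the plain string could then be passed to diffArrays. Because the tokens here are objects, they would need to be compared by their `text` field, for instance through the `comparator` option if your jsdiff version supports it, or by diffing the mapped `text` values and looking the indices up afterwards. To treat tags as whole units instead of ignoring them, pass the unfiltered output of `tokenizeHtml`.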

ExplodingCabbage added the question label Feb 17, 2025

ExplodingCabbage closed this as completed Feb 17, 2025
