v0.4.1 Phrasesearch Performance Improvements

gandersen101 released this 31 Jan 00:04
  • Spaczz's phrase searching algorithm has been further optimized so both the FuzzyMatcher and SimilarityMatcher should run considerably faster.
  • The FuzzyMatcher and SimilarityMatcher now include a thresh parameter that defaults to 100. If flex > 0 and a match's ratio is >= thresh during the initial scan of the document, optimization is skipped for that match. With the default of 100, only perfect matches skip optimization, since they don't need it.
  • flex now defaults to len(pattern) // 2. This creates a more meaningful difference between the "default" and "max" flex settings for longer patterns.
  • Code updated for PEP 585 (built-in generic types in type hints).
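
The new flex default and the thresh check described above can be sketched in a few lines of plain Python. This is an illustration of the rules as stated in these notes, not spaczz's actual implementation: the helper names default_flex and should_optimize are hypothetical, the pattern length is passed in as a plain integer, and treating flex == 0 as "never optimize" is an assumption.

```python
def default_flex(pattern_len: int) -> int:
    """Default flex per these notes: len(pattern) // 2 (hypothetical helper)."""
    return pattern_len // 2


def should_optimize(ratio: int, flex: int, thresh: int = 100) -> bool:
    """Return True if an initial-scan match would go on to optimization.

    Hypothetical helper. Per the notes, optimization is skipped when
    flex > 0 and ratio >= thresh. Assumption: with flex == 0 there is
    nothing to adjust, so optimization is skipped as well.
    """
    return flex > 0 and ratio < thresh


if __name__ == "__main__":
    # Default flex for a few pattern lengths.
    for n in (1, 2, 5, 10):
        print(f"len(pattern)={n:>2} -> default flex={default_flex(n)}")

    # With the default thresh of 100, only a perfect match skips optimization.
    print(should_optimize(ratio=100, flex=2))             # perfect match
    print(should_optimize(ratio=90, flex=2))              # imperfect match
    print(should_optimize(ratio=90, flex=2, thresh=85))   # lowered thresh
```

Lowering thresh trades some boundary accuracy for speed, since more initial-scan matches are accepted without optimization.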