Describe the bug
When using dynamic summaries with bolding in a Vespa Docker container, words containing non‑ASCII characters (e.g. lowercase "ç", "ö", "ü") are tokenized and highlighted incorrectly. The highlight tags split the word at the non‑ASCII character instead of covering the matched term, e.g.
"text": "<hi>Aksala</hi>ç<hi>larla</hi>lar"
To Reproduce
Feed the test document:
vespa document put doc.json
Run a query that triggers dynamic summary generation:
vespa query "select * from docs where userQuery(@q)" tracelevel=5 query="Aksalaçlarla" ranking.profile="bm25" > output.txt
Expected behavior
Dynamic summary generation should correctly process and highlight full tokens containing non‑ASCII characters without splitting them.
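To make the expectation checkable, here is a minimal Python sketch that inspects a returned `text` field and reports any unhighlighted letters wedged between two `<hi>` spans inside the same word. The helper names `highlight_spans` and `intra_word_gaps` are made up for this illustration and are not part of Vespa or pyvespa; the sample string is the value returned for the query in this report.

```python
import re

HI_TAG = re.compile(r"</?hi>")

def highlight_spans(text):
    """Return (plain_text, spans): the text without <hi> tags and the
    (start, end) code point offsets of each highlighted region in it."""
    plain, spans = [], []
    pos = 0        # current offset in the tag-free text
    start = None   # offset where the currently open <hi> began
    last = 0       # end of the previous tag in the tagged text
    for m in HI_TAG.finditer(text):
        chunk = text[last:m.start()]
        plain.append(chunk)
        pos += len(chunk)
        if m.group() == "<hi>":
            start = pos
        elif start is not None:
            spans.append((start, pos))
            start = None
        last = m.end()
    plain.append(text[last:])
    return "".join(plain), spans

def intra_word_gaps(text):
    """Unhighlighted runs of letters/digits sitting between two
    consecutive highlight spans, i.e. highlights that split a word."""
    plain, spans = highlight_spans(text)
    gaps = []
    for (_, end), (nxt, _) in zip(spans, spans[1:]):
        gap = plain[end:nxt]
        if gap.isalnum():  # empty or whitespace/punctuation gaps are ignored
            gaps.append(gap)
    return gaps

if __name__ == "__main__":
    observed = "<hi>Aksala</hi>\u00e7<hi>larla</hi>lar"
    print(intra_word_gaps(observed))  # ['ç']
```

A summary whose highlight covers the matched term in a single span, for example `<hi>Aksalaçlarla</hi>lar`, yields an empty list with this check.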
Environment:
OS: Docker container, deployed with pyvespa
Infrastructure: self-hosted
Docker image: vespaengine/vespa:latest
Vespa version
Vespa CLI version 8.482.31 (the trace output below reports Vespa version 8.471.25)
Trace output
{
"trace": {
"children": [
{
"message": "Using query profile 'default' of type 'root'"
},
{
"message": "Resolved properties:\ntracelevel: 5 (from request)\nyql: select * from docs where userQuery(@q) (from request)\nranking.profile: bm25 (from request)\ntimeout: 10000 (from request)\nquery: Aksalaçlarla (from request)\nmodel.locale: tr (from request)\nmaxHits: 200 (from query profile 'default')\n"
},
{
"message": "Invoking chain 'vespa' [com.yahoo.prelude.statistics.StatisticsSearcher@native -> com.yahoo.prelude.querytransform.PhrasingSearcher@vespa -> ... -> federation@native]"
},
{
"children": [
{
"message": "Invoke searcher 'com.yahoo.prelude.statistics.StatisticsSearcher in native'"
},
{
"message": "Invoke searcher 'com.yahoo.prelude.querytransform.PhrasingSearcher in vespa'"
},
{
"message": "Invoke searcher 'com.yahoo.prelude.searcher.FieldCollapsingSearcher in vespa'"
},
{
"message": "Invoke searcher 'com.yahoo.search.yql.MinimalQueryInserter in vespa'"
},
{
"message": "Query parsed to: select * from sources * where weakAnd(default contains \"Aksala\\u00E7larla\") timeout 10000"
},
{
"message": "YQL query parsed: [select * from docs where weakAnd(default contains \"Aksala\\u00E7larla\") timeout 10000]"
},
{
"message": "Invoke searcher 'com.yahoo.search.yql.FieldFilter in vespa'"
},
{
"message": "Invoke searcher 'com.yahoo.prelude.searcher.JuniperSearcher in vespa'"
},
{
"message": "Invoke searcher 'com.yahoo.prelude.searcher.PosSearcher in vespa'"
},
{
"message": "Invoke searcher 'com.yahoo.prelude.semantics.SemanticSearcher in vespa'"
},
{
"message": "Invoke searcher 'com.yahoo.search.grouping.GroupingQueryParser in vespa'"
},
{
"message": "Invoke searcher 'com.yahoo.prelude.searcher.BlendingSearcher in vespa'"
},
{
"message": "Invoke searcher 'com.yahoo.search.querytransform.WeakAndReplacementSearcher in vespa'"
},
{
"message": "Invoke searcher 'com.yahoo.search.searchers.OpportunisticWeakAndSearcher in vespa'"
},
{
"message": "Invoke searcher 'federation in native'"
},
{
"message": "Federating to [encodingtesting_content]"
},
{
"children": [
{
"message": "Invoke searcher 'com.yahoo.search.querytransform.NGramSearcher in encodingtesting_content'"
},
{
"message": "Rewritten to n-gram matching: [select * from docs where weakAnd((default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, implicitTransforms: false}\"Aksal\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, implicitTransforms: false}\"ksala\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, implicitTransforms: false}\"sala\\u00E7\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, implicitTransforms: false}\"ala\\u00E7l\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, implicitTransforms: false}\"la\\u00E7la\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, implicitTransforms: false}\"a\\u00E7lar\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, implicitTransforms: false}\"\\u00E7larl\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, implicitTransforms: false}\"larla\"))) timeout 9991]"
},
{
"message": "Invoke searcher 'com.yahoo.search.querytransform.DefaultPositionSearcher in encodingtesting_content'"
},
{
"message": "Invoke searcher 'com.yahoo.prelude.querytransform.CJKSearcher in encodingtesting_content'"
},
{
"message": "Invoke searcher 'com.yahoo.prelude.querytransform.LiteralBoostSearcher in encodingtesting_content'"
},
{
"message": "Invoke searcher 'com.yahoo.search.querytransform.RangeQueryOptimizer in encodingtesting_content'"
},
{
"message": "Invoke searcher 'com.yahoo.search.querytransform.SortingDegrader in encodingtesting_content'"
},
{
"message": "Invoke searcher 'com.yahoo.search.searchers.QueryValidator in encodingtesting_content'"
},
{
"message": "Invoke searcher 'com.yahoo.search.grouping.GroupingValidator in encodingtesting_content'"
},
{
"message": "Invoke searcher 'com.yahoo.search.querytransform.WandSearcher in encodingtesting_content'"
},
{
"message": "Invoke searcher 'com.yahoo.prelude.querytransform.RecallSearcher in encodingtesting_content'"
},
{
"message": "Invoke searcher 'com.yahoo.search.searchers.ValidateNearestNeighborSearcher in encodingtesting_content'"
},
{
"message": "Invoke searcher 'com.yahoo.search.searchers.ValidateMatchPhaseSearcher in encodingtesting_content'"
},
{
"message": "Invoke searcher 'com.yahoo.search.searchers.ValidateFuzzySearcher in encodingtesting_content'"
},
{
"message": "Invoke searcher 'com.yahoo.search.yql.FieldFiller in encodingtesting_content'"
},
{
"message": "Invoke searcher 'com.yahoo.search.searchers.InputCheckingSearcher in encodingtesting_content'"
},
{
"message": "Invoke searcher 'com.yahoo.search.significance.SignificanceSearcher in encodingtesting_content'"
},
{
"message": "Invoke searcher 'com.yahoo.prelude.querytransform.StemmingSearcher in encodingtesting_content'"
},
{
"message": "Stemming with language TURKISH"
},
{
"message": "Stemming: [select * from docs where weakAnd((default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, implicitTransforms: false}\"Aksal\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, implicitTransforms: false}\"ksala\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, implicitTransforms: false}\"sala\\u00E7\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, implicitTransforms: false}\"ala\\u00E7l\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, implicitTransforms: false}\"la\\u00E7la\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, implicitTransforms: false}\"a\\u00E7lar\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, implicitTransforms: false}\"\\u00E7larl\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, implicitTransforms: false}\"larla\"))) timeout 9991]"
},
{
"message": "Invoke searcher 'com.yahoo.prelude.querytransform.NormalizingSearcher in encodingtesting_content'"
},
{
"message": "Invoke searcher 'com.yahoo.search.querytransform.VespaLowercasingSearcher in encodingtesting_content'"
},
{
"message": "Lowercasing: [select * from docs where weakAnd((default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false}\"aksal\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false}\"ksala\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false}\"sala\\u00E7\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false}\"ala\\u00E7l\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false}\"la\\u00E7la\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false}\"a\\u00E7lar\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false}\"\\u00E7larl\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false}\"larla\"))) timeout 9991]"
},
{
"message": "Invoke searcher 'com.yahoo.prelude.searcher.ValidateSortingSearcher in encodingtesting_content'"
},
{
"message": "Invoke searcher 'com.yahoo.search.querytransform.BooleanSearcher in encodingtesting_content'"
},
{
"message": "BooleanSearcher: Nothing added to query"
},
{
"message": "Invoke searcher 'com.yahoo.search.grouping.vespa.GroupingExecutor in encodingtesting_content'"
},
{
"message": "Invoke searcher 'com.yahoo.prelude.searcher.ValidatePredicateSearcher in encodingtesting_content'"
},
{
"message": "Invoke searcher 'com.yahoo.search.searchers.ContainerLatencySearcher in encodingtesting_content'"
},
{
"message": "Invoke searcher 'com.yahoo.prelude.cluster.ClusterSearcher in encodingtesting_content'"
},
{
"message": "encodingtesting_content.num0 search to dispatch: query=[WEAKAND(100) (AND aksal ksala salaç alaçl laçla açlar çlarl larla)] timeout=9991ms offset=0 hits=10 rankprofile[bm25] groupingSessionCache=true sessionId=d66de3e4-f2f0-4544-ba8f-1cf83f9f6d4c.1740125444707.3.bm25 grouping=0 : restrict=[docs]"
},
{
"message": "Current state of query tree: WEAKAND[N=100]{\n AND{\n WORD[fromSegmented=false index=\"\" origin=\"(0 12)\" segmentIndex=0 stemmed=false uniqueID=1 words=true]{\n \"aksal\"\n }\n WORD[fromSegmented=false index=\"\" origin=\"(0 12)\" segmentIndex=0 stemmed=false uniqueID=2 words=true]{\n \"ksala\"\n }\n WORD[fromSegmented=false index=\"\" origin=\"(0 12)\" segmentIndex=0 stemmed=false uniqueID=3 words=true]{\n \"salaç\"\n }\n WORD[fromSegmented=false index=\"\" origin=\"(0 12)\" segmentIndex=0 stemmed=false uniqueID=4 words=true]{\n \"alaçl\"\n }\n WORD[fromSegmented=false index=\"\" origin=\"(0 12)\" segmentIndex=0 stemmed=false uniqueID=5 words=true]{\n \"laçla\"\n }\n WORD[fromSegmented=false index=\"\" origin=\"(0 12)\" segmentIndex=0 stemmed=false uniqueID=6 words=true]{\n \"açlar\"\n }\n WORD[fromSegmented=false index=\"\" origin=\"(0 12)\" segmentIndex=0 stemmed=false uniqueID=7 words=true]{\n \"çlarl\"\n }\n WORD[fromSegmented=false index=\"\" origin=\"(0 12)\" segmentIndex=0 stemmed=false uniqueID=8 words=true]{\n \"larla\"\n }\n }\n}\n"
},
{
"message": "YQL+ representation: select * from docs where weakAnd((default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false, id: 1}\"aksal\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false, id: 2}\"ksala\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false, id: 3}\"sala\\u00E7\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false, id: 4}\"ala\\u00E7l\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false, id: 5}\"la\\u00E7la\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false, id: 6}\"a\\u00E7lar\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false, id: 7}\"\\u00E7larl\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false, id: 8}\"larla\"))) timeout 9991"
},
{
"message": "Dispatching to search node in cluster = dispatcher.encodingtesting_content key = 0 hostname = app path = 0 in group 0 statusIsKnown = true working = true activeDocs = 1 targetActiveDocs = 1"
},
{
"message": "Sending search request with jrt/protobuf to node with dist key 0"
},
{
"message": [
{
"start_time": "2025-02-21 08:10:44.711 UTC",
"traces": [
{
"timestamp_ms": 0.556817,
"event": "searching for 10 hits at offset 0",
"tag": "query_start"
},
{
"traces": [
{
"timestamp_ms": 0.603174,
"event": "Start query setup"
},
{
"timestamp_ms": 0.607503,
"event": "Deserialize and build query tree"
},
{
"timestamp_ms": 0.716088,
"event": "Build query execution plan"
},
{
"timestamp_ms": 0.846941,
"event": "Optimize query execution plan"
},
{
"timestamp_ms": 0.877587,
"event": "Perform dictionary lookups and posting lists initialization"
},
{
"timestamp_ms": 0.879888,
"event": "Prepare shared state for multi-threaded rank executors"
},
{
"timestamp_ms": 0.891868,
"event": "Complete query setup"
}
],
"timestamp_ms": 0.893241,
"tag": "query_setup"
},
{
"timestamp_ms": 1.211605,
"tag": "query_execution",
"threads": [
{
"traces": [
{
"timestamp_ms": 0.918797,
"event": "Start MatchThread::run"
},
{
"timestamp_ms": 1.101065,
"event": "Start match and first phase rank"
},
{
"timestamp_ms": 1.168828,
"event": "Create result set"
},
{
"timestamp_ms": 1.193982,
"event": "Wait for result processing token"
},
{
"timestamp_ms": 1.195625,
"event": "Start result processing"
},
{
"timestamp_ms": 1.199151,
"event": "Start thread merge"
},
{
"timestamp_ms": 1.199765,
"event": "MatchThread::run Done"
}
]
}
]
},
{
"timestamp_ms": 1.253374,
"event": "returning 1 hits from total 1",
"tag": "query_reply"
}
],
"distribution-key": 0,
"document-type": "docs",
"duration_ms": 1.257014
}
]
},
{
"message": "encodingtesting_content.num0 dispatch response: Result (1 of total 1 hits)"
},
{
"message": "encodingtesting_content.num0 returns:\n #: 0, relevancy: 2.3014565796142468, source: encodingtesting_content.num0, uri: null\n"
},
{
"message": "Return searcher 'com.yahoo.prelude.cluster.ClusterSearcher in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.search.searchers.ContainerLatencySearcher in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.prelude.searcher.ValidatePredicateSearcher in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.search.grouping.vespa.GroupingExecutor in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.search.querytransform.BooleanSearcher in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.prelude.searcher.ValidateSortingSearcher in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.search.querytransform.VespaLowercasingSearcher in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.prelude.querytransform.NormalizingSearcher in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.prelude.querytransform.StemmingSearcher in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.search.significance.SignificanceSearcher in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.search.searchers.InputCheckingSearcher in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.search.yql.FieldFiller in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.search.searchers.ValidateFuzzySearcher in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.search.searchers.ValidateMatchPhaseSearcher in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.search.searchers.ValidateNearestNeighborSearcher in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.prelude.querytransform.RecallSearcher in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.search.querytransform.WandSearcher in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.search.grouping.GroupingValidator in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.search.searchers.QueryValidator in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.search.querytransform.SortingDegrader in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.search.querytransform.RangeQueryOptimizer in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.prelude.querytransform.LiteralBoostSearcher in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.prelude.querytransform.CJKSearcher in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.search.querytransform.DefaultPositionSearcher in encodingtesting_content'"
},
{
"message": "Return searcher 'com.yahoo.search.querytransform.NGramSearcher in encodingtesting_content'"
},
{
"message": "GroupingExecutor.fill(null) = {[null]}"
},
{
"message": "encodingtesting_content.num0 fill to dispatch: query=[WEAKAND(100) (AND aksal ksala salaç alaçl laçla açlar çlarl larla)] timeout=9991ms offset=0 hits=10 rankprofile[bm25] groupingSessionCache=true sessionId=d66de3e4-f2f0-4544-ba8f-1cf83f9f6d4c.1740125444707.3.bm25 grouping=0 : restrict=[docs] summary=[null]"
},
{
"message": "Current state of query tree: WEAKAND[N=100]{\n AND{\n WORD[fromSegmented=false index=\"\" origin=\"(0 12)\" segmentIndex=0 stemmed=false uniqueID=1 words=true]{\n \"aksal\"\n }\n WORD[fromSegmented=false index=\"\" origin=\"(0 12)\" segmentIndex=0 stemmed=false uniqueID=2 words=true]{\n \"ksala\"\n }\n WORD[fromSegmented=false index=\"\" origin=\"(0 12)\" segmentIndex=0 stemmed=false uniqueID=3 words=true]{\n \"salaç\"\n }\n WORD[fromSegmented=false index=\"\" origin=\"(0 12)\" segmentIndex=0 stemmed=false uniqueID=4 words=true]{\n \"alaçl\"\n }\n WORD[fromSegmented=false index=\"\" origin=\"(0 12)\" segmentIndex=0 stemmed=false uniqueID=5 words=true]{\n \"laçla\"\n }\n WORD[fromSegmented=false index=\"\" origin=\"(0 12)\" segmentIndex=0 stemmed=false uniqueID=6 words=true]{\n \"açlar\"\n }\n WORD[fromSegmented=false index=\"\" origin=\"(0 12)\" segmentIndex=0 stemmed=false uniqueID=7 words=true]{\n \"çlarl\"\n }\n WORD[fromSegmented=false index=\"\" origin=\"(0 12)\" segmentIndex=0 stemmed=false uniqueID=8 words=true]{\n \"larla\"\n }\n }\n}\n"
},
{
"message": "YQL+ representation: select * from docs where weakAnd((default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false, id: 1}\"aksal\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false, id: 2}\"ksala\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false, id: 3}\"sala\\u00E7\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false, id: 4}\"ala\\u00E7l\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false, id: 5}\"la\\u00E7la\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false, id: 6}\"a\\u00E7lar\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false, id: 7}\"\\u00E7larl\") AND default contains ({origin: {original: \"Aksala\\u00E7larla\", offset: 0, length: 12}, normalizeCase: false, implicitTransforms: false, id: 8}\"larla\"))) timeout 9991"
},
{
"message": "Sending 1 summary fetch requests with jrt/protobuf"
},
{
"message": "ProtoBuf: Resending query during document summary fetching"
}
]
},
{
"message": "Got 1 hits from source:encodingtesting_content"
},
{
"message": "Return searcher 'federation in native'"
},
{
"message": "Return searcher 'com.yahoo.search.searchers.OpportunisticWeakAndSearcher in vespa'"
},
{
"message": "Return searcher 'com.yahoo.search.querytransform.WeakAndReplacementSearcher in vespa'"
},
{
"message": "Blended result returns:\n #: 0, relevancy: 2.3014565796142468, source: encodingtesting_content, uri: null\n"
},
{
"message": "Return searcher 'com.yahoo.prelude.searcher.BlendingSearcher in vespa'"
},
{
"message": "Return searcher 'com.yahoo.search.grouping.GroupingQueryParser in vespa'"
},
{
"message": "Return searcher 'com.yahoo.prelude.semantics.SemanticSearcher in vespa'"
},
{
"message": "Return searcher 'com.yahoo.prelude.searcher.PosSearcher in vespa'"
},
{
"message": "Return searcher 'com.yahoo.prelude.searcher.JuniperSearcher in vespa'"
},
{
"message": "Return searcher 'com.yahoo.search.yql.FieldFilter in vespa'"
},
{
"message": "Return searcher 'com.yahoo.search.yql.MinimalQueryInserter in vespa'"
},
{
"message": "Return searcher 'com.yahoo.prelude.searcher.FieldCollapsingSearcher in vespa'"
},
{
"message": "Return searcher 'com.yahoo.prelude.querytransform.PhrasingSearcher in vespa'"
},
{
"message": "Return searcher 'com.yahoo.prelude.statistics.StatisticsSearcher in native'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.prelude.statistics.StatisticsSearcher in native'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.prelude.querytransform.PhrasingSearcher in vespa'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.prelude.searcher.FieldCollapsingSearcher in vespa'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.search.yql.MinimalQueryInserter in vespa'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.search.yql.FieldFilter in vespa'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.prelude.searcher.JuniperSearcher in vespa'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.prelude.searcher.PosSearcher in vespa'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.prelude.semantics.SemanticSearcher in vespa'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.search.grouping.GroupingQueryParser in vespa'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.prelude.searcher.BlendingSearcher in vespa'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.search.querytransform.WeakAndReplacementSearcher in vespa'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.search.searchers.OpportunisticWeakAndSearcher in vespa'"
},
{
"message": "Invoke fill(null) on searcher 'federation in native'"
},
{
"children": [
{
"message": "Invoke fill(null) on searcher 'com.yahoo.search.querytransform.NGramSearcher in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.search.querytransform.DefaultPositionSearcher in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.prelude.querytransform.CJKSearcher in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.prelude.querytransform.LiteralBoostSearcher in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.search.querytransform.RangeQueryOptimizer in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.search.querytransform.SortingDegrader in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.search.searchers.QueryValidator in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.search.grouping.GroupingValidator in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.search.querytransform.WandSearcher in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.prelude.querytransform.RecallSearcher in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.search.searchers.ValidateNearestNeighborSearcher in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.search.searchers.ValidateMatchPhaseSearcher in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.search.searchers.ValidateFuzzySearcher in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.search.yql.FieldFiller in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.search.searchers.InputCheckingSearcher in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.search.significance.SignificanceSearcher in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.prelude.querytransform.StemmingSearcher in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.prelude.querytransform.NormalizingSearcher in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.search.querytransform.VespaLowercasingSearcher in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.prelude.searcher.ValidateSortingSearcher in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.search.querytransform.BooleanSearcher in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.search.grouping.vespa.GroupingExecutor in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.prelude.searcher.ValidatePredicateSearcher in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.search.searchers.ContainerLatencySearcher in encodingtesting_content'"
},
{
"message": "Invoke fill(null) on searcher 'com.yahoo.prelude.cluster.ClusterSearcher in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.prelude.cluster.ClusterSearcher in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.search.searchers.ContainerLatencySearcher in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.prelude.searcher.ValidatePredicateSearcher in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.search.grouping.vespa.GroupingExecutor in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.search.querytransform.BooleanSearcher in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.prelude.searcher.ValidateSortingSearcher in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.search.querytransform.VespaLowercasingSearcher in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.prelude.querytransform.NormalizingSearcher in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.prelude.querytransform.StemmingSearcher in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.search.significance.SignificanceSearcher in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.search.searchers.InputCheckingSearcher in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.search.yql.FieldFiller in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.search.searchers.ValidateFuzzySearcher in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.search.searchers.ValidateMatchPhaseSearcher in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.search.searchers.ValidateNearestNeighborSearcher in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.prelude.querytransform.RecallSearcher in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.search.querytransform.WandSearcher in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.search.grouping.GroupingValidator in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.search.searchers.QueryValidator in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.search.querytransform.SortingDegrader in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.search.querytransform.RangeQueryOptimizer in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.prelude.querytransform.LiteralBoostSearcher in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.prelude.querytransform.CJKSearcher in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.search.querytransform.DefaultPositionSearcher in encodingtesting_content'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.search.querytransform.NGramSearcher in encodingtesting_content'"
}
]
},
{
"message": "Return fill(null) on searcher 'federation in native'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.search.searchers.OpportunisticWeakAndSearcher in vespa'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.search.querytransform.WeakAndReplacementSearcher in vespa'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.prelude.searcher.BlendingSearcher in vespa'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.search.grouping.GroupingQueryParser in vespa'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.prelude.semantics.SemanticSearcher in vespa'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.prelude.searcher.PosSearcher in vespa'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.prelude.searcher.JuniperSearcher in vespa'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.search.yql.FieldFilter in vespa'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.search.yql.MinimalQueryInserter in vespa'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.prelude.searcher.FieldCollapsingSearcher in vespa'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.prelude.querytransform.PhrasingSearcher in vespa'"
},
{
"message": "Return fill(null) on searcher 'com.yahoo.prelude.statistics.StatisticsSearcher in native'"
},
{
"message": "Query time query 'WEAKAND(100) Aksalaçlarla': 16 ms"
},
{
"message": "Summary fetch time query 'WEAKAND(100) Aksalaçlarla': 6 ms"
},
{
"message": "Vespa version: 8.471.25"
}
]
}
]
},
"root": {
"id": "toplevel",
"relevance": 1.0,
"fields": {
"totalCount": 1
},
"coverage": {
"coverage": 100,
"documents": 1,
"full": true,
"nodes": 1,
"results": 1,
"resultsFull": 1
},
"children": [
{
"id": "id:benchmark:docs::gibberish",
"relevance": 2.3014565796142468,
"source": "encodingtesting_content",
"fields": {
"sddocname": "docs",
"text": "<hi>Aksala</hi>ç<hi>larla</hi>lar",
"documentid": "id:benchmark:docs::gibberish",
"doc_id": "gibberish"
}
}
]
}
}
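The query side of the trace looks correct: the n-gram rewrite and lowercasing steps produce eight 5-character terms that keep "ç" intact. The following Python sketch reproduces those terms as a sanity check. The gram size of 5 is inferred from the terms in the trace, not read from the schema, and `char_ngrams` is a hypothetical helper, not a Vespa API.

```python
def char_ngrams(term, n):
    """All overlapping substrings of length n, counted in code points.
    A term no longer than n is returned as a single gram."""
    if len(term) <= n:
        return [term]
    return [term[i:i + n] for i in range(len(term) - n + 1)]

# Query term from the report; \u00e7 is the precomposed "ç" shown in the trace.
query = "Aksala\u00e7larla"

# Gram size 5 is an assumption inferred from the trace output.
grams = char_ngrams(query.lower(), 5)
print(grams)

# The trace reports "length: 12" for the original term, which matches the
# code point count. The UTF-8 encoding is one byte longer because "ç"
# takes two bytes.
print(len(query), len(query.encode("utf-8")))
```

The grams match the terms dispatched to the content node (`aksal ksala salaç alaçl laçla açlar çlarl larla`), so the split appears only in the summary returned in `root.children[0].fields.text`. The code point versus byte length difference may be worth keeping in mind when checking how offsets are computed on the summary side, though the trace alone does not show the cause.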