This repository has been archived by the owner on Feb 17, 2024. It is now read-only.

IndexNotFoundException #192

Open
StackTraceYo opened this issue Nov 4, 2022 · 1 comment
Comments

@StackTraceYo

hello - I'm attempting to use this library (v0.2) on YARN, with my driver running on the cluster.

I am encountering the following exception:

```
Caused by: org.apache.lucene.index.IndexNotFoundException: no segments* file found in MMapDirectory@/local/hadoop/disksdl/yarn/nodemanager/usercache/spotci/appcache/application_1617967855014_1171701/container_e136_1617967855014_1171701_02_000001/tmp/spark-search/application_1617967855014_1171701-sparksearch-rdd0-index-3 lockFactory=org.apache.lucene.store.NoLockFactory@4a1941a4: files: []
	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:715)
	at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:84)
	at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:64)
```

I'm wondering if there is any guidance on where to start looking into why the index directory would be empty?

thanks

@StackTraceYo
Author

StackTraceYo commented Nov 15, 2022

@phymbert any input would be appreciated.

I also sometimes see a similar error, but in that case there are other files present in the index:

```
org.apache.lucene.index.IndexNotFoundException: no segments* file found in MMapDirectory@/local/hadoop/disksdl/yarn/nodemanager/usercache/spotci/appcache/application_1617967855014_1203855/container_e136_1617967855014_1203855_02_000001/tmp/spark-search/application_1617967855014_1203855-sparksearch-rdd0-index-0 lockFactory=org.apache.lucene.store.NoLockFactory@2a910ebf: files: [_0.fdt, _0_Lucene84_0.tip]
```

I'm wondering if it has to do with how I'm loading queries. I have queries saved as Parquet and load them with the standard `spark.read.parquet`, so it's a Dataset, then call:

```scala
searchRDD.searchJoinQuery[SearchQuery](
  queries.rdd,
  queryBuilder = queryStringBuilder(_.q),
  topKByPartition,
  minScore
)
```

`SearchQuery` is just a case class wrapper around a string.
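For reference, a minimal sketch of what this wrapper might look like. Only the name `SearchQuery` and the field `q` come from the snippet above; the `extractQuery` helper is hypothetical, illustrating what the `queryStringBuilder(_.q)` argument presumably does (map each query row to its raw Lucene query string):

```scala
// Sketch only: "SearchQuery" and the field "q" are from the call above;
// everything else here is an assumption for illustration.
case class SearchQuery(q: String)

// Hypothetical stand-in for queryStringBuilder(_.q): pull the raw
// query string out of each row of the query Dataset.
def extractQuery(sq: SearchQuery): String = sq.q
```

Rows read back via `spark.read.parquet(...).as[SearchQuery]` should deserialize into the same shape as ones built with `spark.createDataset(Seq(...))`, which is what makes the difference in behavior surprising.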

If I create a Dataset directly by passing a sequence, e.g. `spark.createDataset(Seq(..))`, I don't see this problem.
