
Koode / Kooder


UTF8 encoding is longer than the max length 32766

Status: Done
Type: Bug (reported by the repository owner)
Created: 2021-04-22 08:46

Exception thrown while building the index:

java.lang.IllegalArgumentException: Document contains at least one immense term in field="source" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped.  Please correct the analyzer to not produce such terms.  The prefix of the first immense term is: '[49, 52, 55, 9, 48, 9, 49, 9, 51, 9, 55, 9, 55, 9, 55, 9, 52, 55, 9, 49, 52, 55, 9, 49, 52, 55, 9, 49, 52, 55]...', original message: bytes can be at most 32766 in length; got 58299
	at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:981)
	at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:524)
	at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:488)
	at org.apache.lucene.index.DocumentsWriterPerThread.updateDocuments(DocumentsWriterPerThread.java:208)
	at org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:415)
	at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1471)
	at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1757)
	at com.gitee.kooder.code.CodeFileTraveler.updateDocument(CodeFileTraveler.java:58)
	at com.gitee.kooder.code.GitRepositoryProvider.addFileToDocument(GitRepositoryProvider.java:291)
	at com.gitee.kooder.code.GitRepositoryProvider.indexAllFiles(GitRepositoryProvider.java:261)
	at com.gitee.kooder.code.GitRepositoryProvider.pull(GitRepositoryProvider.java:180)
	at com.gitee.kooder.indexer.FetchTaskThread.handleCodeTask(FetchTaskThread.java:148)
	at com.gitee.kooder.indexer.FetchTaskThread.lambda$handleTasks$2(FetchTaskThread.java:110)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
	at com.gitee.kooder.indexer.FetchTaskThread.handleTasks(FetchTaskThread.java:106)
	at com.gitee.kooder.indexer.FetchTaskThread.lambda$run$0(FetchTaskThread.java:78)
	at com.gitee.kooder.utils.BatchTaskRunner.compute(BatchTaskRunner.java:56)
	at java.base/java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
	at java.base/java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:408)
	at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:736)
	at com.gitee.kooder.utils.BatchTaskRunner.execute(BatchTaskRunner.java:50)
	at com.gitee.kooder.indexer.FetchTaskThread.lambda$run$1(FetchTaskThread.java:78)
	at com.gitee.kooder.utils.BatchTaskRunner.compute(BatchTaskRunner.java:56)
	at java.base/java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1016)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1665)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1598)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)
Caused by: org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException: bytes can be at most 32766 in length; got 58299
	at org.apache.lucene.util.BytesRefHash.add(BytesRefHash.java:270)
	at org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:177)
	at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:971)
	... 29 more
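
The byte prefix quoted in the exception is plain ASCII, so it can be decoded to see what kind of term was rejected. The sketch below is only an illustration and not code from Kooder: the class `ImmenseTermCheck` and its methods are hypothetical names, and the constant simply repeats the 32766-byte limit from the message (the same value as Lucene's `IndexWriter.MAX_TERM_LENGTH`). The decoded prefix contains tab characters, which suggests the analyzer emitted a long tab-separated run of numbers as a single term.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Kooder or Lucene.
class ImmenseTermCheck {
    // Limit quoted in the exception message ("max length 32766").
    static final int MAX_TERM_BYTES = 32766;

    // Turn the decimal byte list from the error message back into text.
    static String decodePrefix(int[] codes) {
        byte[] bytes = new byte[codes.length];
        for (int i = 0; i < codes.length; i++) {
            bytes[i] = (byte) codes[i];
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // True if the token's UTF-8 encoding is longer than the limit.
    static boolean isImmense(String token) {
        return token.getBytes(StandardCharsets.UTF_8).length > MAX_TERM_BYTES;
    }

    public static void main(String[] args) {
        // Prefix copied from the exception message above.
        int[] prefix = {49, 52, 55, 9, 48, 9, 49, 9, 51, 9, 55, 9, 55, 9, 55, 9,
                        52, 55, 9, 49, 52, 55, 9, 49, 52, 55, 9, 49, 52, 55};
        String text = decodePrefix(prefix);
        // prints 147\t0\t1\t3\t7\t7\t7\t47\t147\t147\t147
        System.out.println(text.replace("\t", "\\t"));

        // A stand-in token of the size reported in the trace (58299 bytes).
        System.out.println(isImmense("7".repeat(58299))); // prints true
        System.out.println(isImmense(text));              // prints false
    }
}
```

A check like `isImmense` could in principle be applied to a token before it is handed to the index, so that oversized values are skipped, truncated, or split. Which of those fits Kooder's `source` field is a design decision this report does not settle.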

Comments (0)

红薯 created the bug
红薯 set the linked repository to Koode/Kooder
红薯 changed the task status from "Pending confirmation" to "Done" via koode/kooder commit d6e7ddb