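I am upgrading to Solr 4.1 and am having trouble retrieving position and offset information with the new API. My index consists of a single document with one field containing the string "one quick brown fox jumped over an lazy dog". I am querying my index for 'one' and trying to retrieve the positions and offsets corresponding to 'one'.
Here is the code snippet: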
Terms terms = reader.getTermVector(docId, fieldName);
TermsEnum termsEnum = terms.iterator(TermsEnum.EMPTY);
BytesRef term;
while ((term = termsEnum.next()) != null) {
    String docTerm = term.utf8ToString();
    DocsAndPositionsEnum docPosEnum = termsEnum.docsAndPositions(null, null, DocsAndPositionsEnum.FLAG_OFFSETS);
    // Check if the current term is the same as the query term and if so
    // retrieve all positions (can be multiple occurrences of a term in a field) corresponding to the term
    if (queryTerms.contains(docTerm)) {
        int position;
        while ((position = docPosEnum.nextPosition()) != -1) {
            int start = docPosEnum.startOffset();
            int end = docPosEnum.endOffset();
            // Store start, end and position in a list
        }
    }
}
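The inner while loop is not correct. Any pointers on how to iterate over all positions in a DocsAndPositionsEnum would be greatly appreciated.
Best answer
Here is what worked for me. The enum first has to be advanced to the document with nextDoc(), and nextPosition() is then called exactly freq() times instead of looping until it returns -1: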
Terms terms = reader.getTermVector(docId, fieldName);
TermsEnum termsEnum = terms.iterator(TermsEnum.EMPTY);
BytesRef term;
while ((term = termsEnum.next()) != null) {
    String docTerm = term.utf8ToString();
    // Check if the current term is the same as the query term and if so
    // retrieve all positions (can be multiple occurrences of a term in a field) corresponding to the term
    if (queryTerms.contains(docTerm)) {
        DocsAndPositionsEnum docPosEnum = termsEnum.docsAndPositions(null, null, DocsAndPositionsEnum.FLAG_OFFSETS);
        // Move onto the document before reading any positions
        docPosEnum.nextDoc();
        // Retrieve the term frequency in the current document
        int freq = docPosEnum.freq();
        for (int i = 0; i < freq; i++) {
            int position = docPosEnum.nextPosition();
            int start = docPosEnum.startOffset();
            int end = docPosEnum.endOffset();
            // Store start, end and position in a list
        }
    }
}
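To sanity-check what the loop above should store for the example document, here is a small plain-Java sketch with no Lucene dependency. It only approximates a term vector: it assumes naive whitespace tokenization with no stop-word removal or lowercasing, so a real analyzer may produce different positions. The names `TermOccurrences`, `Occurrence`, and `find` are hypothetical and are not part of Lucene or Solr.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: computes the position and
// character offsets of each occurrence of a term, assuming plain
// whitespace tokenization.
class TermOccurrences {

    // One occurrence of a term in the text.
    static final class Occurrence {
        final int position; // zero-based token index
        final int start;    // character offset of the first character
        final int end;      // character offset one past the last character

        Occurrence(int position, int start, int end) {
            this.position = position;
            this.start = start;
            this.end = end;
        }
    }

    static List<Occurrence> find(String text, String queryTerm) {
        List<Occurrence> hits = new ArrayList<>();
        int position = 0;
        int i = 0;
        int n = text.length();
        while (i < n) {
            // Skip whitespace between tokens.
            while (i < n && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i >= n) {
                break;
            }
            // Scan to the end of the current token.
            int start = i;
            while (i < n && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (text.substring(start, i).equals(queryTerm)) {
                hits.add(new Occurrence(position, start, i));
            }
            position++;
        }
        return hits;
    }
}
```

Calling `TermOccurrences.find("one quick brown fox jumped over an lazy dog", "one")` under these assumptions yields a single occurrence at position 0 with offsets 0 to 3, which is a handy reference when comparing against what the term vector loop collects.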
https://stackoverflow.com/questions/15370652/