[sword-devel] Searching and Lucene thoughts

DM Smith dmsmith555 at yahoo.com
Tue Mar 1 19:31:27 MST 2005


When the index is built, Lucene sees each verse as a separate document. 
When a document is added to the index, Lucene gives it a number one 
higher than the last document added. When a hit is returned from Lucene, 
one of the fields of that hit is that number (called the document id). 
As long as documents are added in a meaningful order and as long as 
documents are not removed, these ids can be relied upon to reflect that 
order.

Lucene does not regard adjacency of documents as anything special.

So the short answer to your question is: No. We have to implement 
adjacency outside of Lucene.

Lucene cannot find documents that have Moses within 5 documents of 
documents that have Aaron. It can find documents with Moses and it can 
find documents with Aaron, and our program can then figure out whether 
these are close enough.
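
To make the "our program can figure it out" part concrete, here is a rough 
sketch in Python (not Sword or JSword code). It assumes we already have the 
sorted document ids from two separate searches. The function name and the 
ids are made up for illustration.

```python
from bisect import bisect_left

def near_hits(ids_a, ids_b, max_distance=5):
    """Return (a, b) pairs of document ids no more than max_distance apart.

    ids_a and ids_b are sorted lists of document ids, standing in for the
    hits of two separate searches (one per search word).
    """
    pairs = []
    for a in ids_a:
        # Skip straight to the first id in ids_b that is >= a - max_distance.
        start = bisect_left(ids_b, a - max_distance)
        for b in ids_b[start:]:
            if b > a + max_distance:
                break  # ids_b is sorted, so nothing later can be close enough
            pairs.append((a, b))
    return pairs

# Hypothetical document ids, not taken from a real index.
moses_hits = [10, 40, 100]
aaron_hits = [12, 46, 300]
print(near_hits(moses_hits, aaron_hits))
```

This only works if the document ids follow verse order, which is why the 
order in which verses are added to the index matters.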

Let's look for 1Cor 15:54-57, but remembering the verse as something like: 
sin has been swallowed up in victory through Christ.
This is a melding of all these verses. Finding this would be difficult.
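
One way the program might attack this outside of Lucene is to search for 
each remembered word on its own and then look for the narrowest run of 
consecutive verses that covers all of them. The Python below is only a 
sketch of that idea. The function name, the max_span limit and the 
document ids are assumptions for illustration, and it would still miss a 
case where a remembered word does not appear in the text at all.

```python
def smallest_verse_range(hits_by_word, max_span=4):
    """Return the narrowest (first_id, last_id) range covering every word.

    hits_by_word maps each search word to a sorted list of document ids.
    max_span is the largest number of consecutive verses allowed.
    Returns None when no range of at most max_span verses has all words.
    """
    # Flatten the hits into (doc_id, word) events in document order.
    events = sorted((doc_id, word)
                    for word, ids in hits_by_word.items()
                    for doc_id in ids)
    needed = len(hits_by_word)
    counts = {}   # how many times each word occurs in the current window
    best = None
    left = 0
    for doc_id, word in events:
        counts[word] = counts.get(word, 0) + 1
        # While the window holds every word, record it and shrink from the left.
        while len(counts) == needed:
            first_id = events[left][0]
            span = doc_id - first_id + 1
            if span <= max_span and (best is None or span < best[1] - best[0] + 1):
                best = (first_id, doc_id)
            left_word = events[left][1]
            counts[left_word] -= 1
            if counts[left_word] == 0:
                del counts[left_word]
            left += 1
    return best

# Hypothetical document ids, not taken from a real index.
hits = {"sin": [500, 900], "swallowed": [498], "victory": [498, 501, 950]}
print(smallest_verse_range(hits))
```

The result is a pair of document ids, which the program would then have to 
map back to a verse range.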

By the way, try it in Sword: 1Cor 15:57 comes up as the first answer when 
using Lucene. All the other search methods come up with no answer.


Chris Little wrote:

> Any chance we can get verse-crossing search hits with Lucene? What I 
> mean is, suppose a person is searching for something and knows a few 
> different words in the phrase, but the phrase itself crosses a verse 
> boundary. Can we make this work at all, where the search returns a 
> range of verses that contain the search term? Does Lucene have a 
> concept of verses being contiguous parts of a single larger document?
>
> --Chris
>
> _______________________________________________
> sword-devel mailing list
> sword-devel at crosswire.org
> http://www.crosswire.org/mailman/listinfo/sword-devel
>

