[sword-devel] Fast search -- some ideas

David Burry sword-devel@crosswire.org
Thu, 31 Aug 2000 16:24:23 -0700

At 11:15 AM 8/31/2000 -1000, Brandon Staggs wrote:
>> >> Proximity: ...
>> > As might be guessed from my earlier comments, this is an area
>> > where I have given a lot of thought to what is involved. :-)
>> I would gladly learn some more about this. This is where I believe
>> the real value of powerful searching can come in.
>For what it's worth, proximity searching where the user says "these words
>within X verses of each other" should be fairly trivial to implement, once
>you have your bitmaps set. Instead of ANDing the two bitmaps (if they are
>searching for two words), all you have to do is iterate through each
>position, then look for a corresponding position in the other maps within X
>positions. Then set up a new bitmap and flip all the bits in that range True
>where necessary.
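
A rough sketch of the scan described above, as I read it. It assumes one boolean per verse in each bitmap, and it takes "flip all the bits in that range" to mean marking every verse between the two hits; the function name and that interpretation are my assumptions, not anything confirmed in the thread.

```python
def proximity_by_scanning(a, b, x):
    """a, b: lists of booleans, one entry per verse.
    x: maximum distance in verses between a hit in a and a hit in b.
    Returns a new list marking the verses spanned by each nearby pair."""
    n = len(a)
    result = [False] * n
    for i in range(n):
        if not a[i]:
            continue
        # Look at every position of b within x verses of this hit in a.
        lo, hi = max(0, i - x), min(n - 1, i + x)
        for j in range(lo, hi + 1):
            if b[j]:
                # Assumed reading: mark the whole span between the two hits.
                for k in range(min(i, j), max(i, j) + 1):
                    result[k] = True
    return result
```

For example, with a hit for the first word in verse 2 and a hit for the second word in verse 4, `x = 2` marks verses 2 through 4, while `x = 1` marks nothing.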

What if you used bit shifts of one word's bitmap to create a proximity mask, then ANDed that proximity mask with the original bitmap of the other word to get the proximity matches?  Bitwise operations should in theory be much faster than iterating or scanning to look for something.  I'm still not sure of the quickest way to do bitwise operations on very large bitmaps (say, 100K for a bitmap of every word-proximity instance of a certain word in the Bible, instead of the smaller 4K one required for just verse operations), but I'll read the book, I'll read the book!  ;o)
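
To make the shift idea concrete, here is a minimal sketch, assuming each bitmap is held as one arbitrary-size Python integer with bit i standing for position i (verse or word). The names `proximity_mask` and `proximity_matches` are made up for illustration, and the sketch ignores book and chapter boundaries.

```python
def proximity_mask(bitmap, x, n_bits):
    """Set every bit that lies within x positions of a set bit in bitmap."""
    full = (1 << n_bits) - 1          # keeps shifted bits inside the bitmap
    mask = bitmap
    for s in range(1, x + 1):
        # OR in the bitmap shifted s positions in each direction.
        mask |= (bitmap << s) | (bitmap >> s)
    return mask & full


def proximity_matches(a, b, x, n_bits):
    """Bits of b that fall within x positions of some set bit of a."""
    return proximity_mask(a, x, n_bits) & b
```

With the first word at position 2 and the second word at positions 4 and 7 of an 8-bit map, `proximity_matches(1 << 2, (1 << 4) | (1 << 7), 2, 8)` keeps only bit 4. This costs 2*X shifts over the whole bitmap, so whether it actually beats scanning for large X or for sparse bitmaps is something I would still want to measure.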

>If you want to offer "within X words" or "within X sentences" or "within X
>paragraphs" or what have you, then you need to get more complex and pick up
>that _Managing Gigabytes_ book.

Just ordered!  Thank you all very much!

>But the reality is that most people are fine doing a simple AND search
>within verses, and this is by far the greatest use of the searching function
>in Bible software.

True, but I'm interested in doing much more as soon as I learn how to make it fast and efficient... ;o)