[sword-devel] indexed search discrepancy (and sword 1.6.0+dfsg-2)

Jonathan Marsden jmarsden at fastmail.fm
Sat Aug 29 23:34:15 MST 2009

Matthew Talbert wrote:

> OK, here are results. All tests are done with my previous changes; the
> only difference is the first index has stop words, the second doesn't.

> Module   With stop words   Without stop words
> KJV      7.3MB             6.3MB
> Finney   654KB             518KB
> ESV      5.9MB             5.0MB

So roughly 20% extra.  I see no reason not to go for it -- but then, I'm
a desktop user with a monstrous 640GB hard drive :)  Are there
situations and systems where this would be a significant issue?
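
As a rough check of that figure, here is a small Python sketch that works out the overhead per module from the sizes Matthew quoted. The helper name overhead_percent is just illustrative, and 1 MB is taken as 1024 KB (the ratio does not depend on that choice).

```python
# Index sizes quoted above, in KB: (with stop words, without stop words).
# Assumption: 1 MB = 1024 KB; this does not affect the percentages.
sizes = {
    "KJV": (7.3 * 1024, 6.3 * 1024),
    "Finney": (654, 518),
    "ESV": (5.9 * 1024, 5.0 * 1024),
}


def overhead_percent(with_stop, without_stop):
    """Extra space used by the index that keeps stop words, in percent."""
    return (with_stop - without_stop) / without_stop * 100


overheads = {name: overhead_percent(w, wo) for name, (w, wo) in sizes.items()}
average = sum(overheads.values()) / len(overheads)

for name, pct in overheads.items():
    print(f"{name}: {pct:.1f}% extra")
print(f"average: {average:.1f}% extra")
```

That comes out at about 16% for KJV, 26% for Finney and 18% for ESV, so "roughly 20%" is a fair summary, with the smaller module paying proportionally more.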

> For those wondering why a search for "the lord" doesn't segfault, it's
> only when you search for a stop word alone that there is a segfault.
> If you want to talk about confusing users, the current system would
> seem illogical (I searched for "god is" and got nothing??).

Agreed.  Unless the 20% extra space requirement is really an issue in
some circumstances, it looks like the right approach would be to just
index everything, and so get more correct search results.
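
To illustrate why leaving stop words out of the index can give surprising results, here is a toy inverted index in Python. It is only a sketch under assumptions: the stop list, the sample strings and the simple AND search are all made up for illustration, and this is not how SWORD or CLucene actually processes queries.

```python
STOP_WORDS = {"the", "is", "a", "of", "and"}  # hypothetical stop list

# Sample strings for illustration only, keyed by made-up identifiers.
VERSES = {
    "v1": "god is love",
    "v2": "the lord is my shepherd",
    "v3": "in the beginning god created the heaven and the earth",
}


def build_index(verses, skip_stop_words):
    """Map each word to the set of verse keys that contain it."""
    index = {}
    for key, text in verses.items():
        for word in text.split():
            if skip_stop_words and word in STOP_WORDS:
                continue  # stop words never reach the index
            index.setdefault(word, set()).add(key)
    return index


def search_all(index, query):
    """Return keys of verses containing every query word (simple AND search)."""
    postings = [index.get(word, set()) for word in query.split()]
    return set.intersection(*postings) if postings else set()


full_index = build_index(VERSES, skip_stop_words=False)
trimmed_index = build_index(VERSES, skip_stop_words=True)

print(search_all(full_index, "god is"))     # {'v1'}
print(search_all(trimmed_index, "god is"))  # set()
```

In this toy model the word "is" has no entries in the trimmed index, so a query that requires it matches nothing, even though "god" alone would still match. Indexing everything avoids that class of mismatch at the cost of the extra space.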


