[sword-devel] indexed search discrepancy (and sword 1.6.0+dfsg-2)
jmarsden at fastmail.fm
Sat Aug 29 18:16:11 MST 2009
Matthew Talbert wrote:
> I'm attaching a patch to fix several issues with indexed search.
Cool! I hope they make it into SWORD.
IANASD (I am not a SWORD developer!), but it is usually easier on the
person checking and committing changes from patches if the solution to
each issue, and each additional enhancement, is provided as a separate,
independent patch. That way, if there are any doubts or concerns about
one part of the set of patches, the "good stuff" can still be easily
applied, leaving the rest for further study or fixup.
> Issue 1: large text fields weren't getting indexed due to a low MAX_CONV_SIZE
> Resolution: change MAX_CONV_SIZE to 1024 * 1024, and add call to
> writer to boost its maximum field size
This one seems to me a clear win, a "just do it" type of change.
> Issue 2: search causes segfault when searching for stop words
> Resolution: set analyzer stop words to NULL for both index
> creation and search. Possibly this would only have to be set for
> search, and left on to lower the index size.
The "possibly" worries me a bit :) Do we need to test with and without
the stopwords at index creation time, and see how much index size is
affected? Have you already done any testing along those lines?
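For a rough feel of what is at stake before building real indexes both
ways, something like the toy sketch below might help. It does not use
CLucene at all: countIndexedTokens and the stop list passed to it are
hypothetical names made up for illustration, and it only counts how many
tokens would be posted with and without stop-word filtering. A real
answer would still come from comparing the on-disk size of two actual
indexes.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <sstream>
#include <string>

// Toy stand-in for an analyzer (not CLucene): lower-cases each
// whitespace-separated word and counts the ones that would be indexed.
// `stopWords` is a hypothetical stop list; pass an empty set to
// disable filtering.
std::size_t countIndexedTokens(const std::string &text,
                               const std::set<std::string> &stopWords) {
    std::istringstream in(text);
    std::string word;
    std::size_t count = 0;
    while (in >> word) {
        for (char &c : word)
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        if (stopWords.count(word) == 0)
            ++count;  // this token would get a posting in the index
    }
    return count;
}
```

Running this over a module's text once with an empty set and once with
the stop list gives a crude upper bound on how many postings the stop
words account for.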
> Issue 3: index causes segfault *after indexing* when module location
> isn't writable.
> Resolution: check the return value of
> FileMgr::createParent(target + "/dummy"); if return value is -1, abort
Looks like a clear win to me.
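As I read the description, the shape of the fix is roughly the sketch
below. This is not the patch itself and not SWORD code:
createIndexChecked is a hypothetical wrapper, the createParent argument
merely stands in for FileMgr::createParent (assumed, per the description
above, to return -1 on failure), and doIndex stands in for the actual
indexing work.

```cpp
#include <functional>
#include <string>

// Hypothetical wrapper, not SWORD code. `createParent` stands in for
// FileMgr::createParent and is assumed to return -1 on failure.
// `doIndex` stands in for the real indexing work.
// Returns 0 on success, -1 if the target location could not be prepared.
int createIndexChecked(
        const std::string &target,
        const std::function<int(const std::string &)> &createParent,
        const std::function<void(const std::string &)> &doIndex) {
    if (createParent(target + "/dummy") == -1) {
        return -1;  // abort before any indexing is attempted
    }
    doIndex(target);
    return 0;
}
```

The point is simply that the failure is detected up front, so no
indexing work is done against a location that cannot be written.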
> In addition, this patch adds fields for footnotes, morphology, and
> headers. I *really* would like to see this added to the default
> indexing. ...
> ... nor was I entirely comfortable with the code I had written, ...
Sounds like a "needs further study" idea, to me?
I'm about to create a new SWORD 1.6.0+dfsg-2 package with USBINARY
defined (so we can handle encrypted modules -- quite a regression to
break those, and almost certainly my fault -- oops!). I'll look at
adding in your fixes for Issues 1 and 3, and the search-time part of
your fix for Issue 2, and see what I think about the result... I'm not
going to even look at the "adds fields" part, which is an enhancement
rather than a straight bug fix, until we get some feedback from the real
SWORD developers on that :)