[bt-devel] Fwd: Re: clucene crash when searching

Troy A. Griffitts scribe at crosswire.org
Tue Nov 18 13:43:25 MST 2008


Dear all,

You all know I am biased, so please expect a biased email to follow:

I have been unaware of any specific problems or requests from the BT 
team regarding our search facilities.  If we do not expose features the 
BT team needs, they are welcome to submit a well-thought-out patch which 
extends our search architecture, or at least to post their needs and use 
cases on sword-devel so the community can assess them and possibly help 
with a contribution.

One example would be the ability to supply a destination path for the 
lucene index-- presumably per user? But why?  If the module was 
installed by the user, the index will be in a writable location; if not, 
the one who installed the module (root, to /usr/share/sword?) should 
create the index if it is desired, IMO.

Not using the built-in SWORD engine has these issues, in my opinion:

1. My most recent experience with BibleTime was very unpleasant, as I 
wanted to search a particular module which I very seldom use, but had to 
create an index first.  I am very willing to take a 5-second search time 
hit and save drive space for these modules-- especially if I have 200+ 
modules installed.

2. The SWORD engine implements Bible-specific search types: Strongs, 
Morph, Footnotes, non-accented searches for ancient languages, stripping 
of annotation for ancient papyri transcription, etc.  The BT team will 
have to think through all of these issues and reimplement those it 
considers valuable.

3. It will be hard for people to support BT as a cross-platform generic 
SWORD frontend (Windows, etc.) if it doesn't benefit from our work in 
the search code.

4. Indexes are not shared between SWORD frontends.

5. And, of course, the community arguments which you are all used to 
hearing: none of our other frontends will benefit from your ideas and 
work, and vice versa, etc.


		-Troy.





Martin Gruner wrote:
> Hi Eeli.
> 
> Unfortunately I do not agree with you, as you may expect.
> 
> Regarding Sword, there is no work going on for the search engine(s), and the 
> clucene implementation of Sword has (at least) the same problems our 
> implementation has. I share your attitude toward collaborating with Sword, 
> but am still pessimistic about the speed of its development.
> 
> I do not see that we would gain much by adding support for Sword's non-indexed 
> search engines, except for the ability to search for phrases.
> 
> Searching in BT should be simple and consistent. That means that we should 
> not, in my opinion, offer different search syntaxes to the user. Maybe one 
> exception: a regexp-based search for power users, but all "normal" users 
> should have one single search to work with (from a user's point of view).
> 
> Using a search engine that works with modern, index-based technologies is the 
> proper way to go.
> 
>>> The problem is that CLucene is almost unmaintained and crashes on certain
>>> kinds of systems (we got reports about crashes on BSD, for example). Java
>>> Lucene is much, much better. We had hoped that CLucene would develop like
>>> JLucene, but sadly it didn't work out that way...
>> Ok. That's one more reason to support many engines.
> 
> No. It is one reason to choose the search engine wisely.
> 
> My suggestion would be to talk about the search engine we do use, clucene. I 
> just checked - they recently released a bugfix version, 0.9.21, as well as 
> 0.9.23, a beta-quality preview release of their next development branch, 
> which is supposed to improve Lucene compatibility/feature coverage. Ben also 
> told me that he was going to implement the wildcard operator at the beginning 
> of words (like "*minded").
> But nobody can say how long this will take. So we may want to use another open 
> source search engine which suits our needs better.
> 
> We could start a wiki page listing the specific problems that we see with 
> clucene, and investigate whether they can be solved. At the same time we can 
> collect information about other search engines in a matrix of the 
> features/properties that we need. Maybe we will come up with something better, 
> more stable and feature-rich than clucene?
> 
> A major problem that I see: What about our release roadmap? We should not 
> start changing the search engine in the 1.7.x branch/release cycle. I'm 
> unhappy with the status quo: we cannot stay in a beta state for a long time 
> and continue changing the internals of our software. We should release FIRST, 
> and THEN start making major changes.
> 
> What do you think, Eeli, and all others?
> 
> God bless,
> 
> mg
> 
> 
> 



