[sword-devel] CLucene and Sword
Troy A. Griffitts
scribe at crosswire.org
Fri May 25 11:00:33 MST 2007
Have a look at the source for sword/utilities/mkfastmod.cpp and
Checking whether or not the indexes are created is the most confusing
part. Originally, the plan was to let the SWModule::search method
report whether or not a search was supported for the search type
requested. So, if you called SWModule::search requesting CLucene type,
and passed a bool * as justCheckIfSupported, it would set your bool to
true if the indexes were created, and false otherwise. This would
allow search engine plugins to create different indexes depending on
the search string features passed in and such. There are also routines
to see whether a driver is even compiled with code which CAN create a
fast index, and whether it HAS created the index.
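To make the probe pattern above concrete, here is a minimal sketch. It does NOT use the real SWORD headers: FakeModule, SEARCHTYPE_INDEXED and indexedSearchAvailable are hypothetical stand-ins invented for illustration, and the value -4 for the indexed search type is an assumption. Check sword/include/swsearchable.h for the actual signature and constants.

```cpp
#include <string>

// ASSUMPTION: stand-in constant for the indexed (CLucene) search type.
const int SEARCHTYPE_INDEXED = -4;

// Hypothetical stand-in for a searchable module; NOT the real SWModule.
// It only mimics the "pass a bool * to probe for support" idea.
class FakeModule {
public:
    explicit FakeModule(bool indexBuilt) : indexBuilt_(indexBuilt) {}

    // Returns a hit count. When justCheckIfSupported is non-null, no search
    // is run; the pointed-to bool is set to whether this search type can be
    // served right now, and 0 is returned.
    int search(const std::string &query, int searchType,
               bool *justCheckIfSupported = nullptr) const {
        if (justCheckIfSupported) {
            // Indexed searches need the index; other types always work here.
            *justCheckIfSupported =
                (searchType != SEARCHTYPE_INDEXED) || indexBuilt_;
            return 0;
        }
        (void)query;  // a real module would run the search; the stand-in finds nothing
        return 0;
    }

private:
    bool indexBuilt_;
};

// Caller side: probe first, then decide what to offer the user.
bool indexedSearchAvailable(const FakeModule &module, const std::string &query) {
    bool supported = false;
    module.search(query, SEARCHTYPE_INDEXED, &supported);
    return supported;
}
```

The awkward part is visible in the caller: one method does two unrelated jobs, depending on whether an out-parameter happens to be null.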
Anyway, it's all too complicated and impractical. Hopefully we will
change it to something much more straightforward, like: bool
hasIndex(int searchType), when we do the 2.0 refactoring soon.
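A rough sketch of what the simpler shape could look like. Only the name hasIndex(int searchType) comes from the proposal above; the class IndexedSearchable, the markIndexBuilt helper and the storage are hypothetical and not part of any SWORD release.

```cpp
#include <set>

// Hypothetical interface sketch built around the proposed
// bool hasIndex(int searchType). Nothing here is real SWORD API.
class IndexedSearchable {
public:
    // Record that an index for the given search type now exists
    // (invented helper; a real driver would check the index on disk).
    void markIndexBuilt(int searchType) { built_.insert(searchType); }

    // The proposed query: one question, one answer, no out-parameters.
    bool hasIndex(int searchType) const { return built_.count(searchType) > 0; }

private:
    std::set<int> built_;  // search types that currently have an index
};
```

A frontend would then ask hasIndex(...) once per search type it offers, instead of calling search with a probe pointer.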
The place to look for the current interface is
sword/include/swsearchable.h. Someone else, who didn't understand how
things worked, wrote the comments in there. I can't blame them, as I
hadn't written ANY comments, so they at least tried. I've updated the
comments slightly and committed just now.
Currently, the best way to 'make it work' is to use the search dialog
from BibleCS as an example. It shows the [Create Index] button to the
user if the indexes have not been created; if they have, it hides
the button and adds the "Optimized Search" option to the user choices.
Here's a direct link to the file in svn. In your browser, search for
all occurrences of: toggleIndex
That should get you into all the blocks of code you need to lift.
('target' is any SWModule *)
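The dialog behaviour described above boils down to roughly the following. This is a sketch of the decision only, not code lifted from BibleCS: SearchDialogState, layoutSearchDialog and the base option labels are hypothetical, and the two bool inputs stand for whatever the module reports about being able to build an index and having one.

```cpp
#include <string>
#include <vector>

// Hypothetical description of what the search dialog should display.
struct SearchDialogState {
    bool showCreateIndexButton;
    std::vector<std::string> searchOptions;
};

// canBuildIndex: the driver is compiled with code that CAN create an index.
// indexExists:   the index HAS been created for this module.
SearchDialogState layoutSearchDialog(bool canBuildIndex, bool indexExists) {
    SearchDialogState state;
    // Placeholder labels for the unindexed search types (assumed names).
    state.searchOptions = {"Multi-word", "Phrase", "Regular Expression"};
    // Offer index creation only when it is possible and not yet done.
    state.showCreateIndexButton = canBuildIndex && !indexExists;
    // Once the index is there, expose the indexed search to the user.
    if (indexExists) {
        state.searchOptions.push_back("Optimized Search");
    }
    return state;
}
```

Keeping the unindexed options in the list in every case matches the point that searching should still work without creating indexes.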
Hope this helps,
Manfred Bergmann wrote:
> that's great.
> I finally compiled sword with clucene support for the Mac.
> Unfortunately currently only for PPC platform because cross-compiling
> clucene for Intel didn't work. Maybe I need someone with an Intel Mac
> for this.
> However, there are some questions about using the sword clucene
> - where are the index files stored?
> - are there some API examples on how this works or is it
> straightforward with looking at the API docs?
> Am 18.05.2007 um 19:56 schrieb Troy A. Griffitts:
>> I believe Will's reason for not using CLucene in SWORD was because
>> he couldn't easily get CLucene compiled on the Mac. Using SWORD's
>> CLucene implementation has many advantages, and I'm not sure any real
>> world disadvantages. But, of course, I'm biased.
>> o You get to share indexes between frontends
>> o You get the implementation for free
>> o Your features continue to improve for free when others contribute
>> o You get to benefit others if you add features
>> Currently, to my knowledge, SWORD's implementation of CLucene supports
>> MORE features than any frontend exposes (with the possible exception of
>> DM's latest JSword work):
>> o Full SWORD VerseKey Range parsing support (e.g., Search only in
>> Paul's Epistles, "Rom-Phile", or "Jo;1jo-3jo;rev")
>> o Choose verse or chapter granularity for a hit (e.g., Find all
>> words within the same [verse | chapter])
>> o Search in any SWORD module type (Bibles, General Books,
>> Commentaries, Lexica, Devotionals, etc.)
>> o Advantage of using SWORD's filter facility to massage data before
>> - Ignore accents and diacritics in Greek and Hebrew
>> - Ignore critical markup in transcriptions.
>> o Currently supported doc fields:
>> - key: The SWORD Key (e.g., in a lexicon "Adam", in a Bible,
>> - content: The body of the entry
>> - lemma: Strong's numbers or other lemma data included in
>> the module
>> o Seamless integration with other SWORD search mechanisms:
>> - ability to search WITHOUT creating indexes. This is
>> frustrating for me with the newest version of Bibletime. There are
>> often times when I don't want to create a lucene index on a module. I
>> seldom search most modules, and an unindexed search averaging 5 seconds
>> is perfectly acceptable to me on these modules. I want neither the
>> disk overhead nor the initial index creation time.
>> - Regular Expression searching
>> - Searching in ANY EntryAttribute which existing filters, or
>> custom filters, might decide to add. Some of these currently include:
>> footnotes, headings, lemma, morph, AVPhrase (Greek lexicon, Authorized
>> Version translation choices for a Greek entry), src (interlinear data
>> which links a translation to original), refList (footnote
>> cross-reference verses), morpheme (WLC Hebrew morpheme breakdown).
>> (This seems a logical place to add the ability to create new CLucene doc
>> fields based on these modular filters.)
>> In conclusion, it seems to me that utilizing and extending the current
>> search support in SWORD benefits everyone and leverages an already
>> existing solid set of features.
>> Manfred Bergmann wrote:
>>> Since when is CLucene integrated in Sword and for what exactly is it
>>> Can it be used by client applications for searching?
>>> I'm not really satisfied with using Java Lucene in Objective-C in
>>> It is possible to use Java classes in Objective-C but it is not very
>>> straightforward and is difficult to debug.
>>> So I'm wondering if we could get rid of Lucene and use the Sword
>>> integrated CLucene.
>>> sword-devel mailing list: sword-devel at crosswire.org
>>> Instructions to unsubscribe/change your settings at above page