[bt-devel] Re: BibleTime

Lee Carpenter elc at carpie.net
Fri Dec 16 17:09:36 MST 2005


I saw that SWORD has a clucene option for searching.  Do you know which
CLucene API it expects (0.8.x or 0.9.x)?  The CLucene 0.9.x series claims
that it uses UCS-2 internally.  My inspection shows that it uses TCHAR,
which becomes wchar_t if UNICODE is defined during the build and a plain
char otherwise.  On Windows, wchar_t is 2 bytes and would essentially be
UCS-2.  On Linux, however, wchar_t is 4 bytes and would be UCS-4.  That
is why I used the conversion functions, which in theory handle either a
2-byte or a 4-byte wchar_t.
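
To illustrate what a width-independent conversion could look like, here
is a minimal sketch.  It is not the code used in BibleTime, SWORD or
CLucene; the name utf8_to_wide is hypothetical, and it assumes UTF-8
input with only basic validation (overlong forms are not rejected).
With a 2-byte wchar_t it emits surrogate pairs for code points above
U+FFFF; with a 4-byte wchar_t it stores the code point directly.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not a CLucene or SWORD function): decode UTF-8 into
// wchar_t units, whatever width wchar_t has on this platform.
std::wstring utf8_to_wide(const std::string& in) {
    std::wstring out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        int extra = 0;  // number of continuation bytes expected
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else                         { cp = 0xFFFD;   extra = 0; }  // bad lead byte
        ++i;
        for (int k = 0; k < extra; ++k) {
            // A missing or malformed continuation byte yields U+FFFD.
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xC0) != 0x80) {
                cp = 0xFFFD;
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
            ++i;
        }
        if (cp > 0x10FFFF) cp = 0xFFFD;  // outside the Unicode range

        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            // 2-byte wchar_t: split into a UTF-16 surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            // 4-byte wchar_t, or a code point that fits in 16 bits.
            out.push_back(static_cast<wchar_t>(cp));
        }
    }
    return out;
}
```

Characters in the Basic Multilingual Plane come out as one wchar_t on
either platform, so the two builds only differ for code points above
U+FFFF.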

CLucene is working for me currently, but my language doesn't use many
non-ASCII characters, so I can't say at this point that it works
correctly for wide characters.  It should work (using the conversion
routines) unless CLucene somewhere makes assumptions about the width of
wchar_t.  Based on the way wchar_t is defined (or not defined, as the
case may be), it should not.
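
One way to probe the width question without non-ASCII test data at hand
would be to check whether the text being indexed ever needs more than
16 bits per character.  The sketch below is only an illustration under
that assumption; fits_in_16_bits is a hypothetical name and nothing
like it is claimed to exist in CLucene or SWORD.

```cpp
#include <cstddef>
#include <string>

// Hypothetical check, not part of CLucene: returns true if every unit in
// the string is a single 16-bit code point (no surrogates, nothing above
// U+FFFF), i.e. text that code assuming a 16-bit wchar_t would not mangle.
bool fits_in_16_bits(const std::wstring& s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        // wchar_t may be signed; go through unsigned long for the compare.
        unsigned long v = static_cast<unsigned long>(s[i]);
        if (v > 0xFFFFUL) return false;                    // needs > 16 bits
        if (v >= 0xD800UL && v <= 0xDFFFUL) return false;  // surrogate unit
    }
    return true;
}
```

If this returns true for all the text in a module, a 2-byte and a
4-byte wchar_t build should carry the same values per character, and
any difference in search results would point elsewhere.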

If you like, I can take a look at the SWORD built-in clucene search as 
well...

Lee C.


Daniel Glassey wrote:
> On 16/12/05, Troy A. Griffitts <scribe at crosswire.org> wrote:
> 
>>Hey guys,
>>        Just a quick note.  Are you all aware that SWORD does expose clucene
>>searching in the API?  We have an interface to query if indexes have
>>been created, and also to ask them to be created (reporting status) if
>>they have not been.
>>
>>        Also, it is my impression that clucene does not yet work correctly with
>>wide characters (wchar_t is also a different size on different platforms,
>>as discussed previously below, and does not conform to any standard).
> 
> 
> Have you tried it out? My impression is that they are just putting 16
> bits of data into whatever wchar_t is but I haven't tested it yet so I
> don't know if it works.
> 
> Regards,
> Daniel
> 

