[bt-devel] Re: BibleTime

Martin Gruner mg.pub at gmx.net
Sun Dec 18 07:56:40 MST 2005


Hi Lee,

> As you've seen from the other posts to the list, on Linux wchar_t is 
> essentially UCS4.  The only reason I have to convert from UCS2 to UCS4 is to
> handle the input string from Qt, which comes natively in UCS2.  I could
> write the routine to directly stuff UCS2 chars into 4-byte variables,
> but since it was an incredibly small amount of data, I just used the
> convenience functions that were provided.
> 
> Since the SWORD modules are already UTF8, there is no "middle man" in 
> that conversion...

Alright, I agree, let's use the CLucene helper functions that you referred
to. Sorry for my confusion!
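
For reference, the "stuff UCS2 chars into 4-byte variables" approach mentioned
above could look roughly like the sketch below. This is only an illustration,
not the CLucene helper and not BibleTime code: the function name widenUcs2 is
made up, and it assumes the input really is UCS2 (one 16-bit unit per
character, no surrogate pairs to combine).

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of CLucene, Qt or BibleTime):
// widen each 16-bit UCS2 code unit into a 32-bit UCS4 code point.
// Assumes plain UCS2 input, so no surrogate pairs are combined.
std::u32string widenUcs2(const std::vector<std::uint16_t>& in) {
    std::u32string out;
    out.reserve(in.size());  // one output character per input unit
    for (std::uint16_t unit : in) {
        out.push_back(static_cast<char32_t>(unit));
    }
    return out;
}
```

Since every UCS2 value is already a valid UCS4 value, the conversion is a
plain zero-extension; the convenience functions are still the simpler choice
for such a small amount of data.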

Btw, do you already split up the searched text according to text type (e.g.
footnotes, Strong's numbers, headings etc.)? This is not really necessary now
for 1.6 imo, but it will be nice to have one day -- IIRC the SWORD CLucene
routines can do this (you can search for it with "strong(s?):xy").

> My current test index using the SimpleAnalyzer is with the KJV and it's 
> 42 MB.  I didn't time it, but it seemed to take 2 to 3 minutes on my 
> Athlon 2.13 GHz.

Hm. We have to keep an eye on index size. But the KJV is perhaps the fattest
Bible module that we have atm. =) This might also become interesting with
large commentaries or lexicons. But disk space should not be THE problem
these days, don't you think?
And I'm sure that some users will want system-wide indices, but I don't know
if it is possible / secure / good to implement that.

Looking forward to seeing your work in CVS!

Btw, do you use IRC? Perhaps we could meet in #sword on irc.freenode.net
sometime...

May God bless you and your family. Please say Hi to your wife and that I
want to thank her for allowing you to help out with BibleTime!

mg

