[sword-devel] FFe application for sword being considered, and patches split out

Matthew Talbert ransom1982 at gmail.com
Mon Aug 31 09:22:38 MST 2009


> Since clucene isn't aiming for either UTF-16 OR UTF-32, I don't
> believe you'll be able to. A better approach would be to get the size
> of "content" and set a value based on that.

FWIW, I just did this for both searching and index creation. I
over-allocated the length by 500, which is overkill. I don't really
think it needs to be over-allocated at all: content will be UTF-8, which
always uses at least one char (byte) per character, whereas we're
converting to UCS-2 or UCS-4, which both use exactly one unit per
character, so using the length exactly should always work, I think.
Anyway, the result is a 3 second decrease in indexing time for ESV,
which is fairly substantial I think. I'd definitely recommend doing
something like this.
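
A minimal sketch of the sizing argument above, not the actual SWORD or
CLucene code. The helper name utf8ToUcs4 is hypothetical, char32_t stands
in for a UCS-4 buffer, and the input is assumed to be valid UTF-8. The
output is sized from the byte length of the input, with no extra padding:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of SWORD or CLucene): decode UTF-8 into
// UCS-4 code points. Assumes the input is valid UTF-8.
std::vector<char32_t> utf8ToUcs4(const std::string &content) {
    // Each character occupies at least one byte in UTF-8, so the byte
    // length is an upper bound on the number of output units.
    std::vector<char32_t> out(content.size());
    std::size_t n = 0;  // number of code points written so far

    for (std::size_t i = 0; i < content.size();) {
        unsigned char lead = static_cast<unsigned char>(content[i]);
        // Sequence length from the lead byte (1 to 4 bytes).
        std::size_t len = lead < 0x80 ? 1 : lead < 0xE0 ? 2 : lead < 0xF0 ? 3 : 4;
        // Payload bits of the lead byte.
        char32_t cp = (len == 1) ? lead : (lead & (0xFF >> (len + 1)));
        // Append six payload bits from each continuation byte.
        for (std::size_t k = 1; k < len && i + k < content.size(); ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(content[i + k]) & 0x3F);

        assert(n < content.size());  // the bound from the byte length holds
        out[n++] = cp;
        i += len;
    }

    out.resize(n);  // trim to the number of characters actually decoded
    return out;
}
```

For pure ASCII text the buffer is filled exactly; for anything else it
ends up partly unused, which is why no extra padding should be needed.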

Matthew


