[sword-devel] Search up to 5.8 times faster now :)
Troy A. Griffitts
scribe at crosswire.org
Wed Jun 2 14:41:45 MST 2004
Great job! I haven't looked too closely at the code, but enough to get
the idea. Chris, I think Joachim added some logic for phrase search, as
well, though I didn't follow it when I read it in the patch briefly.
Excited to post 1.5.8 someday. Starting a new job has really been
Chris Little wrote:
> Does this only affect the multi-word search (not the phrase or regex
> searches)? It seems like we could achieve a similar gain in performance
> for the phrase searches by splitting phrases into individual words,
> applying your algorithm (search raw, then strip, then search again) to
> limit the pool to those verses that include all of the words (regardless
> of order), and then performing the current phrase search algorithm
> (strip filters, then search) on that pool. Just a thought. There might
> be some flawed logic that hasn't occurred to me.
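
A minimal sketch of the phrase-search idea described above, in Python rather than the SWORD C++ API. Everything here is hypothetical: the toy module (a dict of key to raw text), the `strip_text` stand-in for the real strip filters, and the function names. It also folds the "strip, then check the words again" step into the final phrase match, since a verse containing the phrase necessarily contains every word.

```python
import re

# Toy module: key -> raw text with markup (hypothetical data, not real module content).
MODULE = {
    "v1": "For <w>God</w> so <w>loved</w> the world",
    "v2": "God loved them so",
    "v3": "the world was made",
}

def strip_text(raw):
    """Hypothetical stand-in for the strip filters: remove anything in <...>."""
    return re.sub(r"<[^>]+>", "", raw)

def phrase_search(module, phrase):
    """Return (pool, hits) for a phrase search with a raw-text word prefilter."""
    words = phrase.split()
    # Step 1: cheap prefilter on the raw text, all words present in any order.
    pool = [key for key, raw in module.items() if all(w in raw for w in words)]
    # Step 2: strip only the pool, then run the exact phrase match on it.
    hits = [key for key in pool if phrase in strip_text(module[key])]
    return pool, hits

pool, hits = phrase_search(MODULE, "God so loved")
print(pool)  # ['v1', 'v2']
print(hits)  # ['v1']
```

Only the two verses in the pool are stripped; `v3` is rejected on the raw text alone. One caveat with this sketch: if markup splits a word in the raw text, the prefilter would reject a verse that the stripped text would have matched.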
> Joachim Ansorg wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>> the standard search function is now up to 5 times faster than before.
>> Let me explain.
>> A search in a module did the following:
>> 1. Get the text of a key by calling all the strip filters ()
>> 2. Search for the search words in the stripped down text
>> 3. If they were found, add the key to the result
>> We assume a module with about 30000 keys and 6 strip filters.
>> This means the expensive StripText() function got called
>> 30000*6=180000 times.
>> Now we check for the words in the raw text first, and only keys which
>> had a valid match in the raw text are checked again in the stripped down
>> text. If we assume a normal query returns 100 results, the StripText()
>> function gets called only 100*6=600 times, which saves a lot of time.
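
The old and new search flows described above can be sketched roughly as follows. This is an illustration in Python, not the actual SWORD code: the toy module, the `CountingStripper` class, and the function names are all assumptions made for the example. The sketch also assumes that any word found in the stripped text appears unbroken in the raw text, which is what makes the raw-text check a safe first pass.

```python
import re

# Toy module: key -> raw text with markup (hypothetical data).
MODULE = {
    "v1": "In the <w>beginning</w> God made it",
    "v2": "And it was <note>God is not named here</note> empty",
    "v3": "And God said so",
    "v4": "And the light was good",
}

class CountingStripper:
    """Hypothetical stand-in for the strip filters that counts its calls."""
    def __init__(self):
        self.calls = 0

    def strip(self, raw):
        self.calls += 1
        without_notes = re.sub(r"<note>.*?</note>", "", raw)  # drop notes entirely
        return re.sub(r"<[^>]+>", "", without_notes)          # drop remaining tags

def contains_all(text, words):
    return all(w in text for w in words)

def search_old(module, words, stripper):
    # Old flow: strip every key, then search the stripped text.
    return [k for k, raw in module.items()
            if contains_all(stripper.strip(raw), words)]

def search_new(module, words, stripper):
    # New flow: search the raw text first, strip only the raw matches.
    hits = []
    for key, raw in module.items():
        if not contains_all(raw, words):
            continue  # cheap rejection, no strip call
        if contains_all(stripper.strip(raw), words):
            hits.append(key)
    return hits

def count_strip_calls(search, module, words):
    stripper = CountingStripper()
    return search(module, words, stripper), stripper.calls

print(count_strip_calls(search_old, MODULE, ["God"]))  # (['v1', 'v3'], 4)
print(count_strip_calls(search_new, MODULE, ["God"]))  # (['v1', 'v3'], 3)
```

Both flows return the same keys. `v2` matches in the raw text only (the word sits inside a note), so it still costs one strip call before being rejected; `v4` is rejected without any strip call at all.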
>> Old/new comparison:
>> time ./old/examples/cmdline/search KJV Revelation
>> real 0m18.912s
>> user 0m18.090s
>> sys 0m0.780s
>> time ./new/examples/cmdline/search KJV Revelation
>> real 0m3.396s
>> user 0m2.540s
>> sys 0m0.830s
>> Which is an improvement factor of 5.6 :)
>> ./new/examples/cmdline/search WEB God
>> only takes 2.1 secs now.
>> Another example:
>> time ./old/examples/cmdline/search KJV God
>> real 0m20.371s
>> user 0m18.130s
>> sys 0m0.950s
>> time ./new/examples/cmdline/search KJV God
>> real 0m5.566s
>> user 0m4.730s
>> sys 0m0.810s
>> This is "only" 3.7 times faster, because searching in the raw text
>> gives more hits, which means more calls to StripText(). I tested it
>> with a search for " ", which matches all verses, and it's as slow as the
>> old one. But the usual search queries are a lot faster than before.
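
For reference, the two factors quoted above follow from the "real" times in the timing output. A small Python check (the `speedup` helper is just a name chosen for this example):

```python
def speedup(old_seconds, new_seconds):
    """Ratio of old wall-clock time to new wall-clock time."""
    return old_seconds / new_seconds

revelation = speedup(18.912, 3.396)  # search KJV Revelation
god = speedup(20.371, 5.566)         # search KJV God

print(f"Revelation: {revelation:.1f}x")  # Revelation: 5.6x
print(f"God: {god:.1f}x")                # God: 3.7x
```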
>> The fix is in CVS now.
>> - -- <>< Re: deemed!
>> -----BEGIN PGP SIGNATURE-----
>> Version: GnuPG v1.2.4 (GNU/Linux)
>> -----END PGP SIGNATURE-----
>> sword-devel mailing list
>> sword-devel at crosswire.org