[sword-devel] Search up to 5.8 times faster now :)
chrislit at crosswire.org
Wed Jun 2 18:38:32 MST 2004
Does this only affect the multi-word search (not the phrase or regex
searches)? It seems like we could achieve a similar gain in performance
for the phrase searches by splitting phrases into individual words,
applying your algorithm (search raw, then strip, then search again) to
limit the pool to those verses that include all of the words (regardless
of order), and then performing the current phrase search algorithm
(strip filters, then search) on that pool. Just a thought. There might
be some flawed logic that hasn't occurred to me.
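To make the idea concrete, here is a minimal sketch of the phrase prefilter, not the actual SWORD code. The `strip_text` helper, the tag regex and the sample module are all made up for illustration. The sketch also folds the "search again in the stripped text" step into the phrase match itself, so each surviving verse is stripped only once.

```python
import re

TAG = re.compile(r"<[^>]*>")

def strip_text(raw):
    """Hypothetical stand-in for the module's strip filters."""
    return TAG.sub("", raw)

def phrase_search(module, phrase):
    """Prefilter on the raw text by word, then phrase-match the stripped text."""
    needle = phrase.lower()
    words = needle.split()
    hits = []
    for key, raw in module.items():
        raw_lower = raw.lower()
        # Cheap prefilter: every word of the phrase must occur in the raw text,
        # in any order.
        if not all(word in raw_lower for word in words):
            continue
        # Expensive step, only for the surviving pool: strip, then phrase match.
        if needle in strip_text(raw).lower():
            hits.append(key)
    return hits

# Made-up sample data; the markup in "a" would break a phrase match on raw text.
module = {
    "a": "In the <b>beginning</b> God created",
    "b": "The beginning is in sight",
    "c": "And God said",
}
result = phrase_search(module, "in the beginning")
```

Entry "c" never reaches the strip step, "b" passes the word prefilter but fails the phrase match, and only "a" is returned. Like the multi-word case, this assumes markup never splits a word in the raw text.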
Joachim Ansorg wrote:
> the standard search function is now up to 5 times faster than before.
> Let me explain.
> A search in a module did the following:
> 1. Get the text of a key by calling all the strip filters (StripText())
> 2. Search for the search words in the stripped-down text
> 3. If they were found, add the key to the result
> We assume a module with about 30000 keys and 6 strip filters.
> This means the expensive StripText() function got called 30000*6=180000 times
> per search.
> Now we check for the words in the raw text first. Only the keys which had a
> valid match in the raw text are stripped and checked again in the
> stripped-down text.
> If we assume a normal query returns 100 results, the StripText() function
> gets called 100*6=600 times, which saves a lot of time.
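The change described above can be sketched roughly like this. It is an illustration only, not the code that went into CVS: `strip_text`, the tag regex and the four-entry module are hypothetical, and a counter stands in for the cost of StripText().

```python
import re

TAG = re.compile(r"<[^>]*>")
strip_calls = 0  # counts how often the expensive step runs

def strip_text(raw):
    """Hypothetical stand-in for running the strip filters on one entry."""
    global strip_calls
    strip_calls += 1
    return TAG.sub("", raw)

def contains_all(text, words):
    """True if every search word occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return all(word.lower() in lowered for word in words)

def search_old(module, words):
    """Old behaviour: strip every entry, then search the stripped text."""
    return [key for key, raw in module.items()
            if contains_all(strip_text(raw), words)]

def search_new(module, words):
    """New behaviour: search the raw text first, strip only the raw matches."""
    return [key for key, raw in module.items()
            if contains_all(raw, words)
            and contains_all(strip_text(raw), words)]

# Made-up sample data. "Gen 1:3" matches only inside the markup.
module = {
    "Gen 1:1": "In the beginning <w lemma='H430'>God</w> created the heaven",
    "Gen 1:2": "And the earth was without form",
    "Gen 1:3": "And <w gloss='God'>he</w> said, Let there be light",
    "John 1:1": "In the beginning was the Word",
}

strip_calls = 0
old_hits = search_old(module, ["God"])
old_calls = strip_calls  # one strip call per entry

strip_calls = 0
new_hits = search_new(module, ["God"])
new_calls = strip_calls  # one strip call per raw match
```

Both functions return the same keys, but the second one strips only the two entries that matched in the raw text. The second check on the stripped text is still needed, because a raw match can sit inside markup, as in "Gen 1:3". The prefilter is only safe if stripping never creates a match that the raw text lacks, for example when markup splits a word.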
> Old/new comparison:
> time ./old/examples/cmdline/search KJV Revelation
> real 0m18.912s
> user 0m18.090s
> sys 0m0.780s
> time ./new/examples/cmdline/search KJV Revelation
> real 0m3.396s
> user 0m2.540s
> sys 0m0.830s
> Which is an improvement factor of 5.6 :)
> ./new/examples/cmdline/search WEB God
> only takes 2.1 secs now.
> Another example:
> time ./old/examples/cmdline/search KJV God
> real 0m20.371s
> user 0m18.130s
> sys 0m0.950s
> time ./new/examples/cmdline/search KJV God
> real 0m5.566s
> user 0m4.730s
> sys 0m0.810s
> This is "only" 3.7 times faster, because searching in the raw text gives more
> hits, which means more calls to StripText(). I tested it with a search for " ",
> which matches all verses, and it's as slow as the old one. But the usual
> search queries are a lot faster than before.
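For reference, the two factors quoted above follow from the "real" times of the runs. This is just the arithmetic on the posted numbers; the dictionary layout is my own.

```python
# Wall-clock ("real") seconds from the timings above: (old, new).
timings = {
    "KJV Revelation": (18.912, 3.396),
    "KJV God": (20.371, 5.566),
}

# Speedup factor = old time divided by new time.
speedups = {query: old / new for query, (old, new) in timings.items()}

for query, factor in speedups.items():
    print(f"{query}: {factor:.1f} times faster")
```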
> The fix is in CVS now.
> - --
> <>< Re: deemed!