[sword-devel] Search up to 5.8 times faster now :)

Troy A. Griffitts scribe at crosswire.org
Thu Jun 3 17:40:55 MST 2004


I'm really excited about your breakthrough with the search algorithm.

Have you considered that rawtext:

<w>the Word</w> <w>was</w> <w>God</w>

will fail a phrase search for "Word was God"
and also fail a regex search for the same?
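
To make the concern concrete, here is a minimal Python sketch (not SWORD code). The strip_markup() helper is a hypothetical stand-in for a strip filter, and it assumes that stripping simply drops the tags and collapses whitespace:

```python
import re

def strip_markup(raw: str) -> str:
    """Hypothetical stand-in for a strip filter: drop tags, collapse spaces."""
    text = re.sub(r"<[^>]+>", "", raw)
    return re.sub(r"\s+", " ", text).strip()

raw = "<w>the Word</w> <w>was</w> <w>God</w>"
phrase = "Word was God"

stripped = strip_markup(raw)

# Phrase search: the tags sit between the words in the raw text,
# so the phrase only appears once the markup has been stripped.
found_in_raw = phrase in raw
found_in_stripped = phrase in stripped

# A regex built from the same literal text behaves the same way on raw text.
regex_in_raw = re.search(r"Word was God", raw) is not None

print(found_in_raw, found_in_stripped, regex_in_raw)
```

If the raw-text check is used as a prefilter, this entry would be discarded before the stripped text is ever looked at.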

Just want to be sure we still work with the speed improvements :)

	-Troy.



Joachim Ansorg wrote:
> Hi,
> the standard search function is now up to 5 times faster than before.
> 
> Let me explain.
> A search in a module did the following:
> 	1. Get the text of a key by calling all the strip filters
> 	2. Search for the search words in the stripped-down text
> 	3. If they were found, add the key to the result
> We assume a module with about 30000 keys and 6 strip filters.
> This means the expensive StripText() function got called 30000*6=180000 times.
> 
> Now we first check for the words in the raw text, and only keys which had a
> match there are then checked against the stripped-down text.
> If we assume a normal query returns 100 results, the StripText() function gets
> called 100*6=600 times, which saves a lot of time.
> 
> Old/new comparison:
> 	time ./old/examples/cmdline/search KJV Revelation
> 		real    0m18.912s
> 		user    0m18.090s
> 		sys     0m0.780s
> 
> 	time ./new/examples/cmdline/search KJV Revelation
> 		real    0m3.396s
> 		user    0m2.540s
> 		sys     0m0.830s
> Which is an improvement factor of 5.6 :)
> 
> 	./new/examples/cmdline/search WEB God
> only takes 2.1 secs now.
> 
> Another example:
> 	time ./old/examples/cmdline/search KJV God
> 		real    0m20.371s
> 		user    0m18.130s
> 		sys     0m0.950s
> 
> 	time ./new/examples/cmdline/search KJV God
> 		real    0m5.566s
> 		user    0m4.730s
> 		sys     0m0.810s
> This is "only" 3.7 times faster, because searching in the raw text gives more
> hits, which means more calls to StripText(). I tested it with a search for " ",
> which matches all verses, and it's as slow as the old one. But usual
> search queries are a lot faster than before.
> 
> The fix is in CVS now.
> 
> Joachim
> --
> <>< Re: deemed!
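
The two-stage search described in the quoted message can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the code that went into CVS: strip_text() is a hypothetical stand-in for the strip filters (modelled as one call per key rather than six), and the three-entry module with its markup is made up for the example.

```python
import re

strip_calls = 0  # counts how often the expensive step runs

def strip_text(raw):
    """Hypothetical stand-in for the strip filters: drop tags, collapse spaces."""
    global strip_calls
    strip_calls += 1
    return re.sub(r"\s+", " ", re.sub(r"<[^>]+>", "", raw)).strip()

def search_old(module, word):
    """Old behaviour: strip every key, then search the stripped text."""
    return [key for key, raw in module.items() if word in strip_text(raw)]

def search_new(module, word):
    """New behaviour: cheap check on the raw text first, strip only on a hit."""
    hits = []
    for key, raw in module.items():
        if word not in raw:
            continue  # no raw match, so skip the expensive stripping
        if word in strip_text(raw):
            hits.append(key)  # confirmed in the stripped text
    return hits

# Toy module; the markup and attribute names are illustrative only.
module = {
    "John 1:1": "<w>In the beginning</w> <w>was</w> <w>the Word</w>",
    "John 1:2": "<w>The same</w> <w>was</w> <w>in the beginning</w>",
    "John 1:3": '<w lemma="Word">All things</w> <w>were made</w>',
}

old_hits = search_old(module, "Word")
old_calls = strip_calls

strip_calls = 0
new_hits = search_new(module, "Word")
new_calls = strip_calls

print(old_hits, old_calls)
print(new_hits, new_calls)
```

The third entry matches only inside an attribute, so the raw check lets it through and the stripped check rejects it. The prefilter can only save work this way when every match in the stripped text also shows up in the raw text, which is the case the reply above asks about.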
_______________________________________________
sword-devel mailing list
sword-devel at crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel


