[sword-devel] Question for Optimization / speed experts

David White sword-devel@crosswire.org
Tue, 7 Jan 2003 07:51:33 +1100

C++ has a standard algorithm called set_intersection - found in the header
<algorithm>. Given two sorted sequences, it will output the items that are
found in both sequences. This function is very efficient, and with use of
it, I think your problem should become trivial. Documentation for this
algorithm can be found at http://www.sgi.com/tech/stl/set_intersection.html

Of course, to use it, your sequences will need to have standards-conforming
iterators. If you need any help using the function, let me know.


----- Original Message -----
From: "Joachim Ansorg" <joachim@ansorgs.de>
To: <sword-devel@crosswire.org>
Sent: Tuesday, January 07, 2003 12:46 AM
Subject: [sword-devel] Question for Optimization / speed experts

> Hi!
> Some questiosn for optimizations experts :)
> I'm working on a piece of code in BibleTime which gives me headaches.
> BibleTime supports displaying different lexicons side by side, it offers
> the keys which are available in all of the chosen lexicons.
> The function which creates the list of entries which  are available in all
> modules is ways too slow and too CPU consuming.
> I implemented it the following way:
>  1. Sort the modules by number of entries (from small to large)
>  2. Use the modules with the fewest entries and go through all of it's
> entries:
> 3. Is the entry in all other modules? Add the entry to a list of good
> entries it it is, otherwise continue with the next entry of the smallest
> and compare it with all other chosen modules.
> Assume the user selected BDB together with StrongsHebrew. Each of the
> has around 10000 entries. So the outer loop would have 10000 iterations,
> inner loop would have another 10000 iterations. So we get
> 10.000*10.000=100.000.000 = one hunred million iterations. This is ways
> slow and ways too much.
> This needs on my 1.2GHz Atlon with 512 MB Ram more then five seconds.
> If there are three, four or even five modules side by side we get much
> loops.
> Are the other ways to do such a task? Since the entry lists are sorted by
> alaphabet this would be a good point to start. But I don't know how to do
> best.
> I'd be really glad for help with this case!
> Thank you!
> Joachim
> --
> Joachim Ansorg
> www.bibletime.info
> www.ansorgs.de