[sword-devel] Question for Optimization / speed experts

David White sword-devel@crosswire.org
07 Jan 2003 20:49:49 +1100


Joachim,

Glad I could be of help.

Glen's algorithm is potentially slightly faster than using
set_intersection. It is essentially a generalization of the
set_intersection algorithm to work on n sequences. However, I think
using set_intersection will be just fine as far as speed goes.

Also note that if you wanted to show all the words that appear in any
Lexicon, as per Christian's suggestion, you can just use set_union
instead of set_intersection.
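
For what it's worth, here is a minimal sketch of that idea, assuming the
keys of each module have already been copied into sorted
std::vector<std::string> lists (commonKeys and keyLists are just
illustrative names, nothing from Sword or BibleTime):

    #include <algorithm>
    #include <iterator>
    #include <string>
    #include <vector>

    // true if a has fewer entries than b
    static bool shorterList(const std::vector<std::string>& a,
                            const std::vector<std::string>& b)
    {
        return a.size() < b.size();
    }

    // Intersect n sorted key lists by folding set_intersection over them.
    // Each pass is linear in its two inputs, so for lexicon-sized lists
    // this stays far cheaper than the nested-loop approach.
    std::vector<std::string>
    commonKeys(std::vector<std::vector<std::string> > keyLists)
    {
        std::vector<std::string> result;
        if (keyLists.empty())
            return result;

        // Start with the smallest list so intermediate results stay small.
        std::sort(keyLists.begin(), keyLists.end(), shorterList);

        result = keyLists.front();
        for (std::size_t i = 1; i < keyLists.size(); ++i) {
            std::vector<std::string> tmp;
            std::set_intersection(result.begin(), result.end(),
                                  keyLists[i].begin(), keyLists[i].end(),
                                  std::back_inserter(tmp));
            result.swap(tmp);
        }
        return result;
    }

Swapping std::set_intersection for std::set_union in the same loop gives
the "appears in any lexicon" list instead (the sort-by-size step is then
merely unnecessary).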

God Bless,

-David.

On Tue, 2003-01-07 at 13:39, Joachim Ansorg wrote:
> David,
> thank you for your help!
> 
> It works great and is really fast! Because programmers are lazy, using the
> STL is preferred :)
> 
> Glen, I think the STL function works almost the same way as your suggestion.
> Good work!
> Christian, thank you for your suggestions. If it's still too slow for the
> users (although I don't think so) I'll use some of them.
> 
> It's great to get so much help in a very short time!
> Joachim
> 
> > C++ has a standard algorithm called set_intersection - found in the header
> > <algorithm>. Given two sorted sequences, it will output the items that are
> > found in both sequences. This function is very efficient, and with use of
> > it, I think your problem should become trivial. Documentation for this
> > algorithm can be found at http://www.sgi.com/tech/stl/set_intersection.html
> >
> > Of course, to use it, your sequences will need to have standards-conforming
> > iterators. If you need any help using the function, let me know.
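> >
> > Something along these lines, assuming the keys of the two lexicons have
> > been copied into sorted std::vector<std::string> containers (commonEntries
> > is just an illustrative name, not part of Sword):
> >
> >   #include <algorithm>
> >   #include <iterator>
> >   #include <string>
> >   #include <vector>
> >
> >   // Both inputs must already be sorted; the output receives every key
> >   // that occurs in both lexicons, in sorted order.
> >   std::vector<std::string>
> >   commonEntries(const std::vector<std::string>& keysA,
> >                 const std::vector<std::string>& keysB)
> >   {
> >       std::vector<std::string> common;
> >       std::set_intersection(keysA.begin(), keysA.end(),
> >                             keysB.begin(), keysB.end(),
> >                             std::back_inserter(common));
> >       return common;
> >   }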
> >
> > -David.
> >
> > ----- Original Message -----
> > From: "Joachim Ansorg" <joachim@ansorgs.de>
> > To: <sword-devel@crosswire.org>
> > Sent: Tuesday, January 07, 2003 12:46 AM
> > Subject: [sword-devel] Question for Optimization / speed experts
> >
> > > Hi!
> > > Some questions for optimization experts :)
> > >
> > > I'm working on a piece of code in BibleTime which gives me headaches.
> > > BibleTime supports displaying different lexicons side by side; it offers
> > > only the keys which are available in all of the chosen lexicons.
> > >
> > > The function which creates the list of entries that are available in
> > > all modules is way too slow and consumes too much CPU.
> > > I implemented it the following way:
> > >
> > >  1. Sort the modules by number of entries (from small to large)
> > >  2. Use the module with the fewest entries and go through all of its
> > > entries:
> > >  3. Is the entry in all other modules? Add the entry to a list of good
> > > entries if it is; otherwise continue with the next entry of the
> > > smallest list and compare it with all other chosen modules.
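> > >
> > > In rough outline the loop structure looks like this (simplified to plain
> > > key lists; the real code walks Sword modules, and slowCommonEntries is
> > > just an illustrative name):
> > >
> > >   #include <algorithm>
> > >   #include <string>
> > >   #include <vector>
> > >
> > >   std::vector<std::string>
> > >   slowCommonEntries(const std::vector<std::string>& smallest,
> > >                     const std::vector<std::vector<std::string> >& others)
> > >   {
> > >       std::vector<std::string> good;
> > >       for (std::size_t i = 0; i < smallest.size(); ++i) {  // ~10000 keys
> > >           bool inAll = true;
> > >           for (std::size_t m = 0; m < others.size(); ++m) {
> > >               // linear scan of each other module: up to ~10000 more steps
> > >               if (std::find(others[m].begin(), others[m].end(),
> > >                             smallest[i]) == others[m].end()) {
> > >                   inAll = false;
> > >                   break;
> > >               }
> > >           }
> > >           if (inAll)
> > >               good.push_back(smallest[i]);
> > >       }
> > >       return good;
> > >   }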
> > >
> > > Assume the user selected BDB together with StrongsHebrew. Each of the
> > > modules has around 10000 entries. So the outer loop would have 10000
> > > iterations, and the inner loop would have another 10000 iterations. So
> > > we get 10,000 * 10,000 = 100,000,000 = one hundred million iterations.
> > > This is way too slow and way too much work.
> > > On my 1.2 GHz Athlon with 512 MB of RAM this takes more than five
> > > seconds.
> > >
> > > If there are three, four or even five modules side by side we get many
> > > more iterations.
> > >
> > > Are there other ways to do such a task? Since the entry lists are
> > > sorted alphabetically, that would be a good point to start from. But I
> > > don't know how to do it best.
> > >
> > > I'd be really glad for help with this case!
> > >
> > > Thank you!
> > > Joachim
> > > --
> > > Joachim Ansorg
> > > www.bibletime.info
> > > www.ansorgs.de
-- 
David White <dave@whitevine.com>