Sun, 26 Dec 1999 17:04:40 -0800
> Here we have to disagree. From my experience of searching text databases a
> linear search is unacceptably slow. This is data structures 101 stuff,
> linear search average time N/2, binary chop log N, hash table 1, where N
> is the number of terms to search through. Okay so pre-computing the index
> files is an overhead for anything above naive linear search. However, it
> would be possible to provide index files precomputed along with the raw
> files. That means only 1 system ever need take the setup hit. A linear
> search requires that every user system take a performance hit. And every
> user take a performance hit too.
I think it would be very interesting to implement a pre-computed index
system that is optional for the user. That is, if the user wants to save disk
space, he grabs the basic stuff we have right now (or preferably a
compressed version), but if he wants to save time on searches, he grabs the
pre-computed search index as well. If the search index file is present in
the module directory, the search uses the index; otherwise it falls back on
a linear search.
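
Here is a rough sketch of that fallback, in Python for brevity rather than
anything resembling the library's actual code. The file names search.idx and
text.txt, the JSON index format, and the functions build_index and search are
all made up for illustration; a real index would likely be a compact binary
format. Matching is on whitespace-separated words only.

```python
import json
import os

INDEX_NAME = "search.idx"  # hypothetical name for the optional index file
TEXT_NAME = "text.txt"     # hypothetical raw text, one entry per line


def build_index(module_dir):
    """Precompute word -> line numbers once, so the result can ship with the module."""
    index = {}
    with open(os.path.join(module_dir, TEXT_NAME), encoding="utf-8") as f:
        for lineno, line in enumerate(f):
            # set() so a word repeated on one line is recorded only once
            for word in set(line.lower().split()):
                index.setdefault(word, []).append(lineno)
    with open(os.path.join(module_dir, INDEX_NAME), "w", encoding="utf-8") as f:
        json.dump(index, f)


def search(module_dir, word):
    """Return the line numbers containing word, using the index if it is present."""
    word = word.lower()
    index_path = os.path.join(module_dir, INDEX_NAME)
    if os.path.exists(index_path):
        # Indexed path: one dictionary lookup instead of scanning the text
        with open(index_path, encoding="utf-8") as f:
            return json.load(f).get(word, [])
    # Fallback path: linear scan over every line of the raw text
    hits = []
    with open(os.path.join(module_dir, TEXT_NAME), encoding="utf-8") as f:
        for lineno, line in enumerate(f):
            if word in line.lower().split():
                hits.append(lineno)
    return hits
```

The caller never needs to know which path was taken: both return the same
list, so a user who skips the index download only pays in search time.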
> The only problem with this system was the port to HP workstations. A
> junior programmer was given the task and he neglected to consider big- and
> little-endian issues. For all other platforms we could take the same text
> and index files and use them "out-of-the-box" with no regard for the
> platform they were originally prepared on. I'd love to see the same
> compatibility within SWORD.
We maintain cross-endian compatibility by storing everything in little-endian
and converting to big-endian where necessary (just Solaris so far). It's done in
the library, so the modules are the same for both schemes and the front ends
don't need to worry about this either, though I'm not actually sure whether
anyone has compiled a front end on Solaris. That's Solaris on Sun CPUs, of
course, not Intel.
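
To illustrate the idea (again in Python as a sketch, not the library's own
code): if every reader and writer names the on-disk byte order explicitly,
the same file decodes identically on any host. The 32-bit unsigned width and
the function names write_offsets and read_offsets are assumptions made for
the example.

```python
import struct


def write_offsets(offsets):
    """Encode offsets as 32-bit unsigned little-endian, whatever the host CPU is."""
    # "<" forces little-endian; without it struct would use the host's byte order
    return struct.pack("<%dI" % len(offsets), *offsets)


def read_offsets(data):
    """Decode a buffer written by write_offsets on any platform."""
    count = len(data) // 4  # four bytes per 32-bit offset
    return list(struct.unpack("<%dI" % count, data))
```

A big-endian machine reading those bytes in its native order would see 1 as
16777216, which is the sort of bug the HP port ran into.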