[sword-devel] search idea

Trevor Jenkins sword-devel@crosswire.org
Mon, 03 Jan 2000 13:29:00 +0000


On Saturday, 1 January, 2000 21:07:21, Matthias Ansorg <aNsis@gmx.de> wrote:

> Trevor et al.,
>
> Some ideas that might perhaps be useful to integrate when planning advanced
> search features:
>
> It would be useful for advanced searching to be more able to distinguish the
> semantic meaning of text when searching. Examples:
>
> 1. Searching for a number like, say, 33, produces at present in some
> translations like "1952 Schlachter Bibel" hits like Psalms 18:32. The "33" is
> here contained in a string that shows that this verse was originally verse 33.
> It would be useful to do a search, say, FIND numbers(33) that finds only real
> numbers contained in the bible text.

An interesting search. I tried the same thing in Online Bible for Macintosh
only to discover that it was searching on Strong's numbers. Bizarre, as the
translation I searched in (RSV) does not have these included.

My view is that text is text and annotation is annotation. The default would
be to search only the text. The search syntax I'm proposing includes "field
names" so that items like annotations could be dealt with sensibly.

The introduction of these field names is a function of the schema that
accompanies each module. The schema might be implicit for existing modules or
could be made explicit. For translations some obvious field names are BOOK,
CHAPTER, VERSE. (These names would have to be internationalised according to
the user's preference.)
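
A minimal sketch of how such field names might behave, assuming a toy
in-memory module. The Verse class, the find() function, the "annotation"
field name and the sample entries are all invented for illustration; none
of this is SWORD's actual API. The default searches only the text, and an
annotation is searched only when its field is named:

```python
import re
from dataclasses import dataclass, field


@dataclass
class Verse:
    book: str
    chapter: int
    verse: int
    text: str
    # Extra fields (annotations etc.); which ones exist depends on the module.
    fields: dict = field(default_factory=dict)


def find(verses, term, field_name="text"):
    """Return (book, chapter, verse) for entries whose chosen field contains term as a whole word."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    hits = []
    for v in verses:
        # Default: search the text only; other fields must be named explicitly.
        value = v.text if field_name == "text" else v.fields.get(field_name, "")
        if pattern.search(value):
            hits.append((v.book, v.chapter, v.verse))
    return hits


# Placeholder data, not real Bible text.
verses = [
    Verse("Psalms", 18, 32, "placeholder verse text",
          {"annotation": "originally verse 33"}),
    Verse("Sample", 1, 1, "placeholder text mentioning 33 items"),
]

text_hits = find(verses, "33")                           # text only
note_hits = find(verses, "33", field_name="annotation")  # annotations only
```

With this data the plain search returns only the "Sample" entry, while the
annotation search returns only Psalms 18:32.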

> 2. Strong's numbers: It would be useful to do a search that finds only hits
> with a given Strong number and not additional verses that contain this number
> in its text. Perhaps FIND strongs(0929). It's interesting monitoring the use
> of a specific Hebrew word through the bible using BibleTime's graphical
> analysis feature!

The presence of Strong's numbers in a translation would (implicitly)
introduce a suitable field into the schema.

> 3. Names: Imagine the situation you have forgotten the name of a single person
> mentioned in the Bible or a commentary or book except for one or two letters.
> You could avoid unnecessary hits by restricting the search to only names,
> perhaps FIND names(Ben*). The markup of names is provided by ThML-markup.

Again the presence of such information in a translation would (implicitly)
extend the schema with a suitable field.

The GBF markup scheme does not appear to have the same feature of
distinguishing names.

(By the way, where is a specification of ThML?)

> 4. Annotations: show only hits that occur in annotations or that occur not in
> annotations to reduce the amount of unnecessary hits to view through.

Again, the presence of annotations extends the schema.

> 5. Meta information: Find (perhaps in some modules at one time) information
> that is stored in meta fields, such as the publication date or author of
> commentaries or (of course not yet implemented) general books, an appropriate
> markup like ThML provided. Such as: give me all books written by Darby would
> be a FIND meta.author(Darby) IN modules.books

I'm beginning to sound like a parrot. :-)

However, I do not see a need to distinguish this "meta" data from other
associated data provided with a text. To me "meta data" implies the
structure of the database itself, i.e. the schema. What might be useful is a
search like FIND modules.meta=(annotation and names and ThML).
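
A rough sketch of that kind of query, assuming each module's schema is
available as a plain set of field names. The module names, the field names
and the modules_with() helper are hypothetical:

```python
def modules_with(schemas, required):
    """Return the names of modules whose schema includes every required field."""
    needed = set(required)
    return sorted(name for name, fields in schemas.items()
                  if needed <= set(fields))


# Hypothetical installed modules and their (implicit or explicit) schemas.
schemas = {
    "ModuleA": {"BOOK", "CHAPTER", "VERSE", "annotation", "names"},
    "ModuleB": {"BOOK", "CHAPTER", "VERSE", "strongs"},
    "ModuleC": {"BOOK", "CHAPTER", "VERSE", "annotation"},
}

selected = modules_with(schemas, ["annotation", "names"])
```

Here only "ModuleA" is selected, because it is the only schema that
contains both requested fields.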


> 6. Scripture references: an appropriate markup of scripture references provided
> like in ThML, one could search for each reference to a given verse or verse
> range in every commentary and (perhaps later) even every book you have. That
> way, one could find nearly everything written about a specific verse and not
> only that which is included in the appropriate portions of your commentaries.

Parrot::mode=on;

> 7. texts with other semantic markups like date or anything else that might be
> of some use. Searching on verses that are written in a certain mood and are
> about a certain topic (proposed by Jerry Hastings earlier in this thread) is
> related to this but perhaps easier to handle when coding: this meta
> information is the same for every bible translation and is therefore not
> needed to be marked up in the bible text itself.

I'll defer on this one given the followup from Mads Kiilerich
<Mads@Kiilerich.com>.

> IMHO, the idea of using index files for searching is great, for it provides
> possibilities like creating a (perhaps semi-hand-written) index file with a
> list-of-contents for an mp3-module like an audio sermon if this becomes once
> a module for SWORD. (and is not done over href to file like at the moment in
> BibleTime).

I wonder about this "hand-written" index file. Would it not be better to have
an installation-specific contents file, which perhaps should be a module in
its own right?

> Please discard that portions of this message that are only "technical toys"
> and not useful in a bible study tool directed to further HIS kingdom.

Personally I don't see your requests as technical toys. Because of my
professional involvement in full-text retrieval systems I want these same
features for the more important job of understanding the scriptures.
Yesterday I had lunch at my mother-in-law's house. She has a radio in the
kitchen tuned to BBC Radio 4. The audio quality is poor because she has it
on the long wave frequency (AM) rather than the better quality of the VHF
service (FM). The transformation is incredible when you switch her receiver
from AM to FM. For me the use of full-text searches is equivalent to that
same switch over from low quality to high quality.

One final comment concerning these extensions to the underlying schema. If
the corresponding data is not in the module then it will not be possible to
search with these additional field names. They'd always come up with no
hits.
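
As an illustration only, assuming a module is represented as a dictionary
with a "schema" set and per-reference "entries" (both invented here, as are
search_field() and the sample references), a search on a field the module
lacks simply returns nothing:

```python
def search_field(module, field_name, term):
    """Return references whose named field contains term; empty if the field is not in the schema."""
    if field_name not in module["schema"]:
        return []  # the module carries no such data, so there can be no hits
    return [ref for ref, data in module["entries"].items()
            if term in data.get(field_name, [])]


# Two hypothetical modules: one without Strong's numbers, one with.
plain = {
    "schema": {"BOOK", "CHAPTER", "VERSE"},
    "entries": {"ref-1": {}, "ref-2": {}},
}
tagged = {
    "schema": {"BOOK", "CHAPTER", "VERSE", "strongs"},
    "entries": {"ref-1": {"strongs": ["0929"]},
                "ref-2": {"strongs": ["0001"]}},
}

plain_hits = search_field(plain, "strongs", "0929")
tagged_hits = search_field(tagged, "strongs", "0929")
```

A front end could use the same schema check to warn the user that the field
is unavailable rather than silently reporting no hits.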

Regards, Trevor

British Sign Language is not inarticulate handwaving; it's a living
language. So recognise it now.

--

<>< Re: deemed!