[sword-devel] Proposed parent/contained class for VerseKey: CanonVersifier?

Adrian Korten sword-devel@crosswire.org
Wed, 03 Dec 2003 10:16:20 +0700


Good day,

One example of versification that you might look at is the one used by
Paratext. Please check with Nathan Miles for the correct details. I am not
a Biblical scholar, so I may have misunderstood the explanation. As I
understand it, the program uses the versification of the original-language
texts as its reference: Hebrew for the OT, the Septuagint for the
deuterocanonical books, and Greek for the NT. When the KJV module wants to
look up a specified text, it finds where that text is in the original-text
index and is then told where that is in the KJV.

It adds an extra layer or two of index lookups, but it does allow
multiple versifications and access to the deuterocanonicals. Perhaps
someone can explain it better, but this may give you an idea of the concept.
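
To make the idea concrete, here is a rough C++ sketch of that two-step
lookup as I understand it. It is not Paratext or Sword code: Ref,
PivotTable and convert are names I made up, and the Psalm 51 entries in
the usage note are only there to illustrate the shape of the data.

```cpp
#include <map>
#include <string>
#include <tuple>

// Hypothetical reference type: book abbreviation, chapter, verse.
struct Ref {
    std::string book;
    int chapter;
    int verse;
};

inline bool operator<(const Ref& a, const Ref& b) {
    return std::tie(a.book, a.chapter, a.verse) <
           std::tie(b.book, b.chapter, b.verse);
}

inline bool operator==(const Ref& a, const Ref& b) {
    return std::tie(a.book, a.chapter, a.verse) ==
           std::tie(b.book, b.chapter, b.verse);
}

// One table per translation: how its numbering differs from the
// original-text ("pivot") numbering. Verses not listed are assumed
// to be numbered identically in both.
class PivotTable {
public:
    void add(const Ref& local, const Ref& original) {
        toOriginal_[local] = original;
        toLocal_[original] = local;
    }
    Ref toOriginal(const Ref& local) const {
        std::map<Ref, Ref>::const_iterator it = toOriginal_.find(local);
        return it == toOriginal_.end() ? local : it->second;
    }
    Ref toLocal(const Ref& original) const {
        std::map<Ref, Ref>::const_iterator it = toLocal_.find(original);
        return it == toLocal_.end() ? original : it->second;
    }
private:
    std::map<Ref, Ref> toOriginal_;
    std::map<Ref, Ref> toLocal_;
};

// Convert between two translations by going through the pivot numbering.
inline Ref convert(const PivotTable& from, const PivotTable& to, const Ref& r) {
    return to.toLocal(from.toOriginal(r));
}
```

For example, if the KJV table maps Ps 51:1 to original Ps 51:3 and a
Vulgate-style table maps Ps 50:3 to the same original verse, then
convert(kjv, vulgate, {"Ps", 51, 1}) gives Ps 50:3. Each translation only
needs one table against the pivot, rather than one per pair of
translations.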

ak


Chris Little wrote:
> Lynn,
> 
> I don't know if this is something you would be interested in working on 
> or not, but if you're interested in contributing to development of the 
> API, perhaps you would be.
> 
> We've had plans to, in a sense, fix our versification mechanism for a 
> very long time.  The current system is extremely rigid.  It only permits 
> the versification of the KJV, and only the OT & NT books from that. This 
> needs to be changed to allow 1) support for Apocryphal books and 2) 
> support for alternate versifications.
> 
> The most rudimentary start of task 1) has been completed, in the form of 
> a bunch of b/c/v offsets akin to what you find in canon.h.  (All this 
> work is in apocrypha.h.)  I think we have in mind to target those books 
> listed in apocrypha.h initially, but leave the door open for expansion 
> to, e.g. the Ethiopic Orthodox canon.  Part of this task would be 
> exposing methods to list which books actually exist in a module.  We 
> don't want people who install the NASB to be presented with apocryphal 
> books in menus since they don't exist for that translation.  This would 
> (of course) be very helpful beyond adding Apocrypha support since there 
> are many books currently on our website that only contain the NT or the 
> OT or some other smaller portion of the Bible.
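
A small sketch of what "list which books actually exist in a module"
could look like from a front end's point of view. BookInventory and its
methods are hypothetical names, not existing Sword API, and the 1..66
book numbering (Matthew = 40) is just an assumption for the example; a
real version would need identifiers that also cover the Apocrypha.

```cpp
#include <set>
#include <vector>

// Hypothetical per-module record of which books actually have text.
class BookInventory {
public:
    // Called while indexing a module, once per book that has content.
    void markPresent(int book) { present_.insert(book); }

    // True if the module contains this book.
    bool hasBook(int book) const { return present_.count(book) != 0; }

    // Books to show in a menu, in ascending book-number order.
    std::vector<int> books() const {
        return std::vector<int>(present_.begin(), present_.end());
    }

private:
    std::set<int> present_;
};
```

A front end would then build its book menu from books() instead of from
the fixed canon, so an NT-only module shows 27 entries and nothing else.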
> 
> Task 2) might seem very difficult, but in a way, most of the machinery 
> already exists in Sword.  After all, Sword formerly didn't have a set 
> canon.  That was only added when Troy found that all of the books he was 
> encoding were using the same versification scheme.
> 
> If you get these tasks done, there's plenty more work to be done.  I 
> think the verse parser really could be optimized significantly.  And a 
> scanning verse parser utility, that takes a whole plaintext file as 
> input and marks all of the verse references it finds with an OSIS tag, 
> would be very useful.  We'll also need to create a method for mapping 
> between different versification schemes once we can actually support 
> them.  For example, verse 1 of most Psalms in Vulgate-descended Bibles 
> corresponds to the title in Hebrew-descended Bibles.
> 
> These tasks would probably require working closely with Troy, since I 
> think he's already put some thought into the problems and the 
> architecture of their solutions.
> 
> --Chris
> 
> 
> Lynn Allan wrote:
> 
>> I believe there would be advantages in having VerseKey derive from a base
>> class that basically is responsible only for "versification" using a
>> significantly modified canon.h. (or 'contain' ... see below???)
>>
>> The proposed class, CanonVersifier, would *not* know anything about parsing
>> or listkey. The current problem with VerseKey, IMHO, is that its elaborate
>> parsing capability becomes an "octopus with tentacles" into a large amount
>> of code. It causes a lot of unfortunate dependencies so that a newbie
>> program containing:
>>
>> printf("Hello newbie! Sword has now been demonstrated to be 'knitted "
>>        "together' on your computer! \nEnjoy Col 3:23 = : %s\n so as to "
>>        "eventually hear (rather than read) Matt 25:23 = %s\n",
>>        buf_Col323, buf_Matt2523);
>>
>> becomes about 200kb instead of 24kb.
>>
>> (An aside: the program that demonstrates the function:
>> void RemoveMostTagsAndExtraWhiteSpace(mutable char *buf);
>> uses a greatly simplified VerseKey that conceptually illustrates the
>> "division of labor" between the proposed CanonVersifier and the existing
>> VerseKey)
>>
>> Some specifics for CanonVersifier.h (using doxygen notation)
>>
>> /** @par constructor
>>  * From VerseKey, continue using the odd choice of char for testmt and bk.
>>  * (Every bit seems to count. Note that testmt would probably remain
>>  * "stateful".)
>>  * To actually save bits, use unsigned shorts for chap and verse, and
>>  * actual versification, rather than ints.
>>  */
>> CanonVersifier(char testmt, char book, unsigned short chap,
>>                unsigned short verse);
>>
>> /** @remark
>>  * Possible implementation available @
>>  * http://www.crosswire.org/forums/mvnforum/viewthread?thread=14
>>  */
>> unsigned short getMaxChapInBook(char bk);
>>
>> /** @remark
>>  * Possible implementation available @
>>  * http://www.crosswire.org/forums/mvnforum/viewthread?thread=13
>>  */
>> unsigned short getMaxVerseInChap(char bk, unsigned short chap);
>>
>> // increment function: should it be within CanonVersifier?
>> // decrement function: should it be within CanonVersifier?
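
For discussion, a minimal sketch of how the declarations above might hang
together, with increment placed inside the class. This is not the
implementation from the forum threads and not canon.h data: the table is
a toy with two books (Ruth and Jude, KJV verse counts), the book numbers 1
and 2 are arbitrary, testmt is left out for brevity, and the accessors
are my own additions.

```cpp
namespace {
// Toy data only. In a real split this would live privately in
// CanonVersifier.cpp so that no other file sees the tables.
const unsigned short kRuth[] = {22, 23, 18, 22};  // verses per chapter
const unsigned short kJude[] = {25};

struct BookInfo {
    const unsigned short* versesPerChap;
    unsigned short chapters;
};

const BookInfo kBooks[] = {
    {kRuth, 4},  // toy book 1
    {kJude, 1},  // toy book 2
};
const char kBookCount = 2;
}  // namespace

class CanonVersifier {
public:
    CanonVersifier(char book, unsigned short chap, unsigned short verse)
        : book_(book), chap_(chap), verse_(verse) {}

    // Returns 0 for an unknown book.
    static unsigned short getMaxChapInBook(char bk) {
        return validBook(bk) ? kBooks[bk - 1].chapters : 0;
    }

    // Returns 0 for an unknown book or chapter.
    static unsigned short getMaxVerseInChap(char bk, unsigned short chap) {
        if (chap < 1 || chap > getMaxChapInBook(bk)) return 0;
        return kBooks[bk - 1].versesPerChap[chap - 1];
    }

    // Advance one verse, rolling over chapter and book boundaries.
    // Returns false (and leaves the position alone) at the end of the table.
    bool increment() {
        if (verse_ < getMaxVerseInChap(book_, chap_)) { ++verse_; return true; }
        if (chap_ < getMaxChapInBook(book_)) { ++chap_; verse_ = 1; return true; }
        if (book_ < kBookCount) { ++book_; chap_ = 1; verse_ = 1; return true; }
        return false;
    }

    char book() const { return book_; }
    unsigned short chap() const { return chap_; }
    unsigned short verse() const { return verse_; }

private:
    static bool validBook(char bk) { return bk >= 1 && bk <= kBookCount; }

    char book_;
    unsigned short chap_;
    unsigned short verse_;
};
```

Nothing here touches parsing or ListKey, which is the point: callers see
only the class interface, and the tables can change without forcing a
rebuild of everything that includes the header.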
>>
>> Also, it would seem advantageous to have the hard-coded constants now within
>> canon.h be contained privately within CanonVersifier.cpp. Current practice
>> makes all these "magic offsets" visible to the entire world, which seems
>> contrary to data abstraction and data hiding. (When I put on my rarely used
>> "purist" hat, I am hard pressed to think of anything that warrants data
>> hiding more than these offsets.) Also, these hard-coded array constants
>> cause problems for generating precompiled headers, and might be part of
>> dramatically improving rebuild times (along with removing array data from
>> one other .h file?).
>>
>> Another proposed change: All of the offsets that are now within canon.h
>> could be unsigned shorts. This would actually save some bits as none are
>> greater than 25,000. There are about 6 to 10 easily found changes within the
>> existing VerseKey that would result from this.
>>
>> To conclude, the existing VerseKey would be refactored to derive from
>> CanonVersifier. VerseKey would be limited to providing parsing capabilities.
>> All existing apps could continue to instantiate VerseKey objects. Future
>> apps that wanted to have minimal "footprint" (such as sword 'plug-ins'?)
>> could instantiate CanonVersifier objects. Unauthorized apps that tried to
>> use canon.h (and avoid sword-api for module access) would break ...  we'd
>> know who you are :-(
>>
>> p.s. Perhaps VerseKey should "contain" an object of CanonVersifier, rather
>> than derive from it? It would seem to be "has-a" rather than "is-a"???
>> That's a question for architects with far greater experience and talent
>> than this scribbler.
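
To illustrate the "has-a" option, a minimal sketch in which the parsing
class owns a versifier and delegates bounds checks to it. ParsingKey is a
hypothetical name (not the real VerseKey), the CanonVersifier here is a
one-book stand-in (Jude: 1 chapter, 25 verses), and the "chap:verse"
input format is only an assumption for the example.

```cpp
#include <cstdio>

// Stand-in for the proposed class: one toy book, 1 chapter, 25 verses.
class CanonVersifier {
public:
    unsigned short getMaxChapInBook(char) const { return 1; }
    unsigned short getMaxVerseInChap(char, unsigned short) const { return 25; }
};

// "has-a": parsing lives here, versification knowledge is delegated.
class ParsingKey {
public:
    explicit ParsingKey(char book) : book_(book), chap_(1), verse_(1) {}

    // Accepts "chap:verse". Returns false and leaves the key unchanged
    // if the text does not parse or falls outside the versification.
    bool parse(const char* text) {
        unsigned int c = 0, v = 0;
        if (std::sscanf(text, "%u:%u", &c, &v) != 2) return false;
        if (c < 1 || c > canon_.getMaxChapInBook(book_)) return false;
        const unsigned short chap = static_cast<unsigned short>(c);
        if (v < 1 || v > canon_.getMaxVerseInChap(book_, chap)) return false;
        chap_ = chap;
        verse_ = static_cast<unsigned short>(v);
        return true;
    }

    unsigned short chap() const { return chap_; }
    unsigned short verse() const { return verse_; }

private:
    CanonVersifier canon_;  // contained, not inherited
    char book_;
    unsigned short chap_;
    unsigned short verse_;
};
```

With containment, an app that only needs versification can use
CanonVersifier alone, and the versifier could later be swapped for one
with a different scheme without changing the parser's base class.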
>>
>>
>> _______________________________________________
>> sword-devel mailing list
>> sword-devel@crosswire.org
>> http://www.crosswire.org/mailman/listinfo/sword-devel