[sword-devel] osis2mod linking bug

DM Smith dmsmith555 at yahoo.com
Fri Sep 5 07:57:37 MST 2008

Troy A. Griffitts wrote:
> Right, it seems like we might need a bool SWModule::isLinked(const SWKey 
> &) which tells you if the current location is linked to the provided 
> location.  Something which compares 'inodes' in versetext and @link 
> values in others, etc...
> That's doable from the engine point of view, but I'm not sure it will 
> solve your problem.
I think the above would be a good addition, but your guess is right: it 
will not solve my problem.

Given how I am going about it now, a performance enhancer would be a method:
 * Returns true if the current key has content.
bool SWModule::hasContent()
That would check to see if the "inode" pointed to anything.
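
To make the idea concrete, here is a minimal sketch of checks that look only
at index records and never read the text. IndexEntry, its start/length
fields, and the three helper functions are hypothetical names used for
illustration; they are not part of the SWORD API, and the real index layout
may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one index record: where the entry's text
// begins in the data file and how many bytes it occupies.
struct IndexEntry {
    std::uint32_t start;
    std::uint32_t length;
};

// Hypothetical helper: an entry has content when its length is non-zero.
bool hasContent(const IndexEntry &e) {
    return e.length != 0;
}

// Hypothetical helper: two entries are linked when both have content and
// both point at the same span of the data file.
bool isLinked(const IndexEntry &a, const IndexEntry &b) {
    return hasContent(a) && hasContent(b) &&
           a.start == b.start && a.length == b.length;
}

// Walk backwards through a chapter's entries and return the position of
// the last one with content, or -1 if none has any.
int lastEntryWithContent(const std::vector<IndexEntry> &chapter) {
    for (int i = static_cast<int>(chapter.size()) - 1; i >= 0; --i) {
        if (hasContent(chapter[i])) return i;
    }
    return -1;
}

// Usage: two linked entries followed by an empty one.
void exampleIndexChecks() {
    std::vector<IndexEntry> chapter = {{100, 40}, {100, 40}, {0, 0}};
    assert(isLinked(chapter[0], chapter[1]));
    assert(!hasContent(chapter[2]));
    assert(lastEntryWithContent(chapter) == 1);
}
```

The point of the sketch is only that both questions can be answered by
comparing a few integers per entry.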

Does such a thing exist?

I tried the following:
key = module++;
and I got surprising results. It did the post increment before the 
assignment! Semantically, this is wrong!
Looking into it, the post-increment operator is not defined, and because 
of that the pre-increment operator is used.
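Here is a post-increment operator:
// Returns the saved copy by value; returning SWKey& would dangle.
SWKey operator ++(int /*unused*/) { SWKey temp = *this; increment(1); return temp; }
Likewise for post-decrement:
// Returns the saved copy by value; returning SWKey& would dangle.
SWKey operator --(int /*unused*/) { SWKey temp = *this; decrement(1); return temp; }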
IIRC, post-increment was added to C++ 2.0.
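
For reference, a self-contained illustration of the pre/post distinction. It
uses a hypothetical Position class rather than SWKey or SWModule, so it shows
only the general C++ idiom, not engine code. The postfix form returns the
saved copy by value, because a reference to the local copy would not survive
the return.

```cpp
#include <cassert>

// Hypothetical class standing in for "something with a position".
class Position {
public:
    explicit Position(int index) : index_(index) {}
    int index() const { return index_; }

    // Pre-increment: advance first, then return this object.
    Position &operator++() {
        ++index_;
        return *this;
    }

    // Post-increment: save the old state, advance, return the saved copy.
    Position operator++(int /*unused*/) {
        Position old = *this;
        ++index_;
        return old;
    }

private:
    int index_;
};

// Usage: the value of p++ is the state before the increment.
void examplePostIncrement() {
    Position p(5);
    Position before = p++;
    assert(before.index() == 5);
    assert(p.index() == 6);
    Position &same = ++p;
    assert(&same == &p);
    assert(p.index() == 7);
}
```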

I also tried the following in going through a ListKey:
But I got a compile error. Seems that the signature for ListKey did not 
repeat the step = 1 from the base class.
Base class signature:
swkey.h:    virtual void increment(int steps = 1);
Derived classes with this problem:
listkey.h:    virtual void increment(int step);
rawcom4.h:    virtual void increment(int steps);
rawcom.h:    virtual void increment(int steps);
versekey.h:    virtual void increment(int steps);
My guess is that decrement has a similar problem.
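
The underlying C++ rule is that a default argument belongs to the declaration
seen through the static type of the object; it is not inherited by an
override. A small sketch with hypothetical BaseKey and DerivedKey classes
(they mirror only the shape of the declarations listed above, not the SWORD
classes themselves):

```cpp
#include <cassert>

// Hypothetical base class: declares the default argument.
struct BaseKey {
    virtual ~BaseKey() {}
    virtual void increment(int steps = 1) { position += steps; }
    int position = 0;
};

// Hypothetical derived class: overrides without repeating the default.
// It moves ten times as far so the dispatch is visible in the result.
struct DerivedKey : BaseKey {
    virtual void increment(int steps) { position += 10 * steps; }
};

int exampleDefaultArgument() {
    DerivedKey d;
    // d.increment();  // would not compile: no default in DerivedKey's declaration
    d.increment(1);    // explicit argument; position is now 10
    BaseKey &b = d;
    b.increment();     // default 1 comes from BaseKey, DerivedKey's body runs
    assert(d.position == 20);
    return d.position;
}
```

Repeating the default in each derived declaration makes the no-argument call
compile for both static types.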

> Like you mention, you're looking for a way to get the entire set of 
> entries that link to a data slot.  Something like:
> find / -samefile xyz -print
> I guess you could use the newly proposed method above to iterate all 
> entries and check if any are the 'samefile' as one you are interested 
> in.  But like the filesystem, we don't store a set of all directory 
> entries that happen to point to the same inode.  And if you're looking 
> for backward compatibility, we can't change the format (not that I would 
> want to change the format for this anyway).
I'm not looking for a format change. However, there is a way to provide 
both a format change and backward compatibility: Additional files. 
Separate files would only be seen by new code.
> With the logic osis2mod is using (for any verses outside of KJV, 
> read text from closest previous valid KJV entry, append new verse text 
> and rewrite the entry) it sounds like these exception KJV-reversifications 
> should probably do the ugly: `find / -samefile xyz -print` logic to 
> check if the entry they are about to rewrite is linked to by anyone 
> else.  It sucks, but it is all I can think of that would solve this 
> problem.  And pragmatically, you probably only have to: do { module--; } 
> while (!module.Error() && module.getRawEntryBuf() == 
> aboutToBeRewrittenEntryBuf);
Works for me, with a slight variation. I'm caching the osisIDs that 
refer to more than one verse and doing the linking at the end. As each 
<verse> is seen, the content is written to the first (or only) value in 
the osisID, e.g. osisID="Gen.1.29 Gen.1.30 Gen.1.31" would store the 
content at Gen.1.29 and leave linking Gen.1.30 and Gen.1.31 to the end.

This will allow me to look for the last verse in a chapter with content. So 
if <verse osisID="Gen.1.32"> is seen, its content is appended to the 
last verse in Genesis 1 having content, which is Gen.1.29.
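
A rough sketch of that flow, using an in-memory model of a single chapter.
The Chapter type and its members are hypothetical and are not osis2mod's
actual data structures. Joining appended text with a single space, and
silently dropping out-of-range content when the chapter has no content yet,
are assumptions made only to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of one chapter with verses numbered from 1.
struct Chapter {
    explicit Chapter(int maxVerse)
        : text(maxVerse + 1), linkTarget(maxVerse + 1, 0) {}

    std::vector<std::string> text;  // text[v]: content stored at verse v
    std::vector<int> linkTarget;    // linkTarget[v] != 0: v shares that verse's entry
    std::vector<std::pair<int, int>> pendingLinks;  // (verse, target), resolved at the end

    int maxVerse() const { return static_cast<int>(text.size()) - 1; }

    // Last verse holding content of its own, or 0 if there is none.
    int lastWithContent() const {
        for (int v = maxVerse(); v >= 1; --v)
            if (!text[v].empty()) return v;
        return 0;
    }

    // Handle one <verse> element whose osisID lists the given verse numbers.
    void addVerse(const std::vector<int> &verses, const std::string &content) {
        if (verses.empty()) return;
        int first = verses.front();
        if (first > maxVerse()) {
            // Outside the versification: append to the last verse with content.
            int target = lastWithContent();
            if (target != 0) text[target] += " " + content;
            return;  // out-of-range verses are never linked
        }
        text[first] = content;
        // Cache the links for in-range verses instead of writing them now.
        for (std::size_t i = 1; i < verses.size(); ++i)
            if (verses[i] <= maxVerse())
                pendingLinks.push_back(std::make_pair(verses[i], first));
    }

    // Resolve the cached links once all content is final.
    void finishLinks() {
        for (const auto &link : pendingLinks) linkTarget[link.first] = link.second;
    }
};

// Usage: the scenario from this thread, with a 31-verse chapter.
void exampleDeferredLinking() {
    Chapter ch(31);
    ch.addVerse({29, 30, 31}, "KJV text");
    ch.addVerse({32, 33}, "extra text");
    ch.finishLinks();
    assert(ch.text[29] == "KJV text extra text");
    assert(ch.linkTarget[30] == 29);
    assert(ch.linkTarget[31] == 29);
}
```

Because the links are resolved last, 1.30 and 1.31 end up pointing at the
final, augmented entry for 1.29.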

I think it is a fair assumption for osis2mod input that a chapter will 
be defined with verses in numerical order.
> Hope I haven't passed the buck for no good reason.
> 	-Troy.
> DM Smith wrote:
>> Greg Hellings wrote:
>>> DM,
>>> On Thu, Sep 4, 2008 at 1:05 PM, DM Smith <dmsmith555 at yahoo.com> wrote:
>>>> I'm trying to solve an osis2mod linking bug that was exposed by several
>>>> beta modules.
>>>> Here is the scenario:
>>>> <verse osisID="XXX.1.29 XXX.1.30 XXX.1.31">Text for the last three
>>>> verses of XXX, chapter 1 in the KJV versification</verse>
>>>> <verse osisID="XXX.1.32 XXX.1.33">Text for additional verses in the XXX,
>>>> chapter 1 in the non-KJV versification</verse>
>>>> 1) osis2mod links 1.30 and 1.31 by writing out the start and offset of
>>>> the data for 1.29. That means that 1.29 has to be written first. In the
>>>> index file the result is that all three entries have the same start and
>>>> offset. So far so good, if there weren't a bug that writes the links
>>>> first and then the data, resulting in start/offset of 0/0 for the links.
>>>> But I've already figured out how to fix that bug.
>>>> 2) Now a non-KJV verse is found and osis2mod appends it to the last
>>>> verse of the chapter according to the KJV versification. So the entry
>>>> for XXX.1.31 is found, the raw text is gotten and it is appended and
>>>> this augmented text is written to the data file, at the end of the file.
>>>> Finally the index entry for XXX.1.31 is updated.
>>>> The problem is that 1.29 and 1.30 are not updated, they still point to
>>>> the "1.29-1.31" text and now 1.31 points to the "1.29-1.33" text.
>>>> 3) The other problem is with 1.33, it is noticed that it is out of
>>>> bounds and is changed to 1.31 and then it is linked to 1.32, which does
>>>> not exist and thus 1.31 is re-written to 0/0.
>>>> I'm looking for a way to solve these problems. Here is what I am
>>>> thinking and I'd like feedback or a better way.
>>>> For 1) I have postponed the writing of the links until the verse is
>>>> written. I could either wait until the next verse is ready to be
>>>> written, or later. I've decided to wait until the very end and do the
>>>> linking then. This might help solve 2 and 3.
>>>> For 2) I'm thinking that the verse to append is the last verse in the
>>>> chapter with content. By postponing linking until later, it will append
>>>> to 1.29. Then the linking will propagate the final start/offset values
>>>> for 1.29. The problem I have with this is that this is somewhat disk
>>>> intensive. I start with the last verse in the chapter and get the raw
>>>> text for it. If there is none, I then decrement and refetch until I find
>>>> it. I looked for a way to know if a verse were part of a linked set and
>>>> what the members of that set were, but I didn't see any in the SWORD
>>>> engine. Am I missing it?
>>> When I ran into this bug in dabbling with a SWORD front-end, I was
>>> told that the only way to test for whether it is part of a set is to
>>> compare the text of verse x with verse x-1.  I don't know that the
>>> feature we both sought has been added.  If it has, I have not heard
>>> word of it.
>> Ouch, that is expensive. It should not be necessary to dig down to the 
>> text and then do a string comparison. Comparing the start/offset 
>> (block/start/offset for compressed) is sufficient. I don't see how to 
>> get this info. It would be nice to have the ability, in the engine, to 
>> compare two keys to see if they refer to the same text.
>>> Alternatively -- aren't we supposed to be moving to the VerseTreeKey?
>>> Shouldn't a new module in the Beta stage be using that?  It seems like
>>> that would completely do away with the problem.  Updating osis2mod to
>>> use that, either as the default or as an option for a module like this
>>> might eliminate the issue?
>> I think ultimately that osis2mod will need to be updated to handle other 
>> versifications. The VerseKey module type is significantly faster than a 
>> VerseTreeKey module will ever be.
>> My goal in modifying osis2mod is that it will create 1.5.9 compatible 
>> modules.
>> Later, I intend to make modifications that will require 1.5.12. 
>> (Specifically, an extension of preverse content that we've discussed 
>> here that will have a preverse div and not just a preverse title.) At 
>> that time, it would also be appropriate to handle VerseTreeKey as an option.
>>>> For 3) this is fairly simple, one should not link verses that are not in
>>>> the KJV versification.
>>> That seems like a very stringent restriction.  If linking is supported
>>> for the KJV system, why should it not be allowed for other systems?
>>> It seems like linking should be fixed in a way to allow everyone to do
>>> it.
>> It's not a stringent restriction for the current osis2mod which only 
>> works for KJV versifications. Later, when osis2mod is upgraded then that 
>> restriction will need to be removed.
