[osis-users] OSIS cross-reference questions

Markku Pihlaja markku.pihlaja at sempre.fi
Wed Nov 28 05:53:58 MST 2012


DM wrote about one special case, but this applies more generally as
well:

Yes, it is an external semantic, not an OSIS one. It should be documented
for the reader. I doubt that the reader will ever see such documentation. A
reader of a book may know to go to an introduction to figure out what the
+, |, ;, .... mean. A user of software might not know where to go or that
it is documented anywhere.

Exactly. And for this reason there shouldn't, if possible, be anything at
all in the markup that is left open for the user's interpretation. If we
are obliged to use the concept of an indirect or a compound reference (as
we are with our Finnish translation), the software alone should be able to
handle the special cases properly and not rely on the user to understand or
even notice the special notation and act accordingly.

Thus, this suggestion of yours, for example, is not an option - at least
not for us:
Then all you can do is have a reference to the verse that has the real
note. The text of the reference would have to be understood by the reader
as meaning an indirect reference. This is no different from what you see in
the text.

That would mean markup like this:
<reference osisRef="Gen.32.31">Gen. 32:31+</reference>

In a printed Bible, users would probably pay more attention to the plus in
the notation, since they actually need to read the reference carefully in
order to look up the verse it refers to. But in software they'd be
presented with a ready link to click. Upon seeing a link, quite a few users
would simply click it without paying much attention to what the link
actually says, and thus might not notice the plus at all. As a result they
would take the wrong verses as the reference.

Also, this markup would not enable future software to
implement tailored proper handling of indirect references, since the markup
wouldn't contain any indication whatsoever about the indirectness. And
software should determine structures from markup only, not text between the
markup. Of course, this part of the problem might be fixed by adding some
attribute that implies that this reference is indirect, but even then,
software that ignores that attribute would still produce erroneous results.
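
To make the attribute idea concrete, here is a minimal sketch. It assumes a
custom type value, "x-indirect", which is purely illustrative: it is not a
convention defined by OSIS, and as far as I know neither SWORD nor JSword
recognizes it. The function name describe_link is likewise hypothetical. The
point is only that the same markup yields two different links depending on
whether the software honors the attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: "x-indirect" is an illustrative value only.
SNIPPET = (
    '<reference type="x-indirect" osisRef="Gen.32.31">'
    "Gen. 32:31+</reference>"
)

def describe_link(xml_text, honor_indirect=True):
    """Return (kind, target) for a single <reference> element."""
    ref = ET.fromstring(xml_text)
    target = ref.get("osisRef")
    if honor_indirect and ref.get("type") == "x-indirect":
        # Aware software: offer the cross-references attached to the target.
        return ("references-of", target)
    # Unaware software: link straight to the verse itself.
    return ("verse", target)

aware = describe_link(SNIPPET)
unaware = describe_link(SNIPPET, honor_indirect=False)
print(aware)
print(unaware)
```

Software that ignores the attribute silently falls into the second branch,
which is exactly the failure mode described above.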

And this is why I still consider a direct reference to a reference as my
first option to code indirect references (unless someone still comes up
with a better and more compatible solution).
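
As a rough illustration of how software could follow a reference to a
reference, here is a sketch under several assumptions: the "!xref" extension,
the NOTES table and the resolve function are all hypothetical names invented
for this example, and the "Xxx" verses are placeholders, not data from our
translation.

```python
# Placeholder data: cross-reference lists keyed by the verse carrying them.
NOTES = {
    "Xxx.2.14": ["Xxx.5.1", "Xxx.7.3-Xxx.7.5"],
    "Xxx.9.2": ["Xxx.2.14!xref"],  # indirect: "see the references of Xxx 2:14"
}

def resolve(osis_ref, notes, seen=None):
    """Expand a reference into the direct references it finally stands for."""
    seen = set() if seen is None else seen
    base, _, ext = osis_ref.partition("!")
    if ext != "xref":
        return [osis_ref]  # ordinary direct reference, keep as-is
    if base in seen:
        return []  # guard against references that loop back on themselves
    seen.add(base)
    resolved = []
    for target in notes.get(base, []):
        resolved.extend(resolve(target, notes, seen))
    return resolved

direct = resolve("Xxx.5.1", NOTES)
indirect = resolve("Xxx.2.14!xref", NOTES)
print(direct)
print(indirect)
```

An application could run something like this when the user activates the
link, and then show the resulting list instead of jumping to the verse.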

Similar problems also apply to compound references, and I believe I'll have
to go with the option of stretching the boundaries of OSIS there as well.

There are 1625 occurrences of indirect references in our translation, and
about as many compound references, so they are not minor exceptions but a
very common practice.
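
For the compound case, the following sketch shows how a tool could split an
osisRef that lists several verses and ranges, following the rules DM
described in the quoted discussion below: parts are separated by spaces, a
range is two full osisIDs joined by a dash, and the separator inside an
osisID is a period, never a colon. The pattern is deliberately simplified
(Book.Chapter.Verse only, no work prefix and no "!" extension), and the name
parse_osis_ref is my own, not part of any OSIS library.

```python
import re

# Simplified pattern: Book.Chapter.Verse with period separators only.
OSIS_ID = re.compile(r"^[1-9]?[A-Za-z]+\.\d+\.\d+$")

def parse_osis_ref(value):
    """Split an osisRef into (start, end) pairs; single verses have start == end."""
    parts = []
    for item in value.split():
        start, dash, end = item.partition("-")
        ids = [start, end] if dash else [start]
        for osis_id in ids:
            if not OSIS_ID.match(osis_id):
                raise ValueError(f"not a Book.Chapter.Verse osisID: {osis_id!r}")
        parts.append((start, end if dash else start))
    return parts

compound = parse_osis_ref("Gen.38.7 Gen.38.10-Gen.38.12")
print(compound)
print(len(compound) > 1)  # more than one part means a compound reference
```

A "human" reference such as "Num. 26:19-21" is rejected with a ValueError,
because it contains a space and a colon instead of two dash-joined osisIDs.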


So, much as I'd love to be able to give the users of our source file
pointers to tools ready for use with the file, I'm afraid that might turn
out to be impossible, at least with the current versions of SWORD and
JSword. I'm really sorry about that but it seems inevitable right now. Of
course, I'll be happy to endorse those tools if they one day will be
updated to support these features - but naturally I also can't expect them
to take into account special needs that seem to be related to just one
small translation.


The target of this project was to produce a file that will not need too
much manipulation to be used for a variety of purposes, and this target is
not compromised even if I use OSIS in unorthodox ways. Of course I'll
document any unorthodox usage and also consider whether I should even call
the markup something like "OSIS with some modifications". On the other
hand, even with these modifications the code would still seem to be valid
OSIS.

Once again, our discussions here and your help have definitely not been
redundant. You've helped me a lot to pin down some major issues about these
problems. Lots of thanks for that!

Blessings,

Markku






2012/11/26 DM Smith <dmsmith at crosswire.org>

>
> On Nov 26, 2012, at 11:22 AM, Markku Pihlaja <markku.pihlaja at sempre.fi>
> wrote:
>
>
>
> 2012/11/24 DM Smith <dmsmith at crosswire.org>
>
> Haven't been able to reply earlier.
>>
>
> No problem, great to hear from you even now! Or well, Saturday.
>
> <note type="crossreference">
>>         <reference osisRef="Gen.38.7">Gen. 38:7</reference>,
>>         <reference osisRef="Gen.38.10">10</reference>;
>>         <reference osisRef="Num. 26:19-21">Num. 26:19-21</reference>;
>>         <reference osisRef="1Chr.4.1">1. Chr. 4:1</reference>
>> </note>
>>
>> Close. The osisRef range has to have 2 osisIDs separated by a dash. So,
>> Num.26.19-Num.26.21. Also, the separator for an osisID or an osisRef is
>> never a colon, but only a period.
>>
>
> Oops... My mistake, didn't convert that "human reference" to an osis
> reference. I wrote that after a long day at work :).
>
> If, on the other hand, I list that as three subsequent notes, the
>> semicolons wouldn't be embedded in any tags and thus would be rendered even
>> when reference notes should be hidden.
>>
>> <note type="crossreference">
>>         <reference osisRef="Gen.38.7">Gen. 38:7</reference>,
>>         <reference osisRef="Gen.38.10">10</reference>
>> </note>
>> ;
>>  <note type="crossreference">
>>         <reference osisRef="Num. 26:19-21">Num. 26:19-21</reference>
>> </note>
>> ;
>>  <note type="crossreference">
>>         <reference osisRef="1Chr.4.1">1. Chr. 4:1</reference>
>> </note>
>>
>> I guess it is also true what you wrote about note tags: they represent
>> the marker(s) in the text (even though most of our printed Finnish Bibles
>> don't include markers within the text; the notes are listed after certain
>> passages with references to the position of the note instead). Also this
>> would imply that I shouldn't use the later example with three subsequent
>> notes.
>>
>>
>> This won't work as the ; are now part of the main text. It appears here
>> that you are trying to get three foot note markers separated by semi-colon.
>>
>
> Yes, just as I assumed in the text before the example. I wasn't trying to
> suggest a correct way here but rather demonstrate the problem in the
> "obvious" solution to my original problem.
>
>
>
>> The <note> element specifies the placement of a footnote marker and its
>> content is the content of the footnote. It is really as simple as that.
>>
>
> Yes, exactly!
>
> ...listing all parts of the compound reference in one osisRef. That would
>> seem to work somehow:
>>
>>  <note type="crossreference">
>>         <reference osisRef="Gen.38.7 Gen.38.10">Gen. 38:7,10</reference>;
>>         <reference osisRef="Num. 26:19-21">Num. 26:19-21</reference>;
>>         <reference osisRef="1Chr.4.1">1. Chr. 4:1</reference>
>> </note>
>>
>> ...So is this certainly valid markup?
>>
>> It is valid, but not a good idea. There is a wide variety of software
>> that handles OSIS, e.g. SWORD and JSword. The former is focused on chapter
>> at a time presentation, so expects each reference to be a contiguous range,
>> presenting the chapter and perhaps highlighting the first contiguous range.
>> JSword takes each reference as a verse list and presents the contents of
>> each of the verses.
>>
>
> Did I understand right? JSword can handle even such osisRefs as
> <reference osisRef="Gen.38.7 Gen.38.10 Num.26.19-Num.26.21">Gen.
> 38:7,10, Num. 26:19-21</reference>
>
>
> Yes. JSword can handle this just fine. It will create a single clickable
> link, which will navigate to a page with those references' text.
>
>
>
>
>> At this time, I'm not aware of other open source OSIS software.
>>
>
> Ok, this was valuable information. I haven't really found any extensive
> lists of commercial or open source OSIS software, so this was good to know.
>
> Also, assuming "Gen.38.7 Gen.38.10" would be a valid osisRef, would also
>> for example "Gen.38.7 Gen.38.10-Gen.38.12" be? We also have a few
>> compound references consisting of separate verses AND one or more ranges.
>>
>> Yes this is valid. Any number of verses and ranges are allowed in
>> osisRefs.
>>
>
> But not the best possible idea, as you mentioned, since SWORD can't handle
> it properly (as JSword can), right? Or were you talking here about a
> different case from the one we just talked about?
>
>
> Right. It's not recommended.
>
>
> For example, does the extension part of an osisRef always need to have a
>> corresponding osisID somewhere? Or could we have a verse like this:
>>   <verse osisID="Xxx.2.14" sID=.... />
>>   Some text here. Some more text here. Even some more text here. And more
>> and more text.
>>   <verse eID="... />
>>
>> and then have a reference like this:
>>   <reference osisRef="Xxx.2.14!c">Xxx 2:14</reference>
>> with just the osisID "Xxx.2.14" declared but not "Xxx.2.14!c"?
>>
>> Yes. This is allowed. Any work can have references to another work using the
>> reference system of that other work. As a result, there is no required
>> referential integrity between an osisRef and an osisID.
>>
>
> Will the reference to Xxx.2.14!c render as a reference to Xxx.2.14 or be
> ignored as referring to an unknown point? Well, that probably depends on
> the software, but how about SWORD and JSword?
>
>
> Both SWORD and JSword ignore the ! and what comes after it when
> interpreting a verse cross-reference.
>
>
>
>
> I'd suggest that you'd determine why the master text has it that way and
>> to what you are targeting the OSIS and how to best represent it in OSIS.
>> I'd suggest that you fully encode the note in both spots and somehow
>> indicate either in markup or text that the list is the same as in another
>> location.
>>
>
> I'm afraid I might not have the luxury of being allowed to do that. Either
> our translation committee in the late 1980s or someone even earlier created
> the different conventions of marking cross-references, including this
> indirect type of reference.
>
> These conventions with our official translation are quite strict, and
> making changes to them would unfortunately require a decision from our
> General Synod which assembles twice a year...
>
> When I first saw these different reference types when starting this work,
> I immediately asked if the indirect references could be made direct by
> duplicating the target reference. But the immediate answer was no.
>
>
> Then all you can do is have a reference to the verse that has the real
> note. The text of the reference would have to be understood by the reader
> as meaning an indirect reference. This is no different from what you see in
> the text.
>
> An introduction for the work probably should explain this and other
> features of the document.
>
>
>
>
>> Final suggestion, think of markup as a language. You are translating from
>> one language (the master text) into another language (OSIS). The
>> translation, as you are finding out, is not one-to-one, but rather
>> thought-for-thought. It sounds like your master text is structured for
>> print-only. OSIS is meant to be neutral to the target, not presuming paper,
>> phone, tablet, computer, ....
>>
>
> This is exactly how I'm trying to think of this project. I'm not a Bible
> expert, nor a printed book expert but mainly a web expert and thus think
> exactly in terms of flexibility and even future unknown application types -
> as much as possible.
>
> Yes, our master text, published in 1992, was (obviously) structured for
> print. It means that we can only dream of marking up quotes, for example,
> since there are no consistent start and end markers in cases of multi-level
> nested quotes.
>
> But on the other hand, of the markup that does exist in the source, there
> isn't much such really print-specific markup or semantics that couldn't be
> reproduced digitally - most of it can actually be better
> implemented digitally than in print. The "vague" reference being probably
> the only one that doesn't fit well into the digital world, all
> other cross-reference-related issues discussed here are very well suited
> for electronic publishing and hyperlinks, even though OSIS - or at least
> some OSIS implementations - have a hard time handling them.
>
> About that "vague" reference: I'm currently considering dropping that - or
> actually not dropping but just implementing the "vagueness" pretty much the
> same way as in print: using the "|" separators between these references
> instead of semi-colons, but dropping the vagueness from the actual
> osisRef. That's loyal to our source and good OSIS, too.
>
> But I think I'll have to stretch the boundaries of OSIS (or at least
> current applications) a little with the indirect references. As you
> mentioned, all (or even any) ready OSIS software might not be able to
> handle that.
>
> Our main goal is not to produce a perfect OSIS source file but find a
> format that is exact and can contain all the structural information our
> translation contains. Being able to strictly conform to some format that
> already has ready-made tools to handle everything would be a bonus, but
> that comes only second to preserving all structure (including indirect or
> compound references, for example).
>
> This is indeed preparing for the future instead of just ancient print:
> using markup that might not be handled by any current software, to mark
> structures that can still have user-friendly implementations in the future.
> In some Bible-reading software, the indirect reference might be for
> example "See references of Matt. 8:1" and only after that provide a list or
> popup with those references. But in our OSIS file we'll have to stick to
> providing just the indirect reference, and the rest is up to the
> application. And I believe we are stretching the OSIS boundaries but not
> crossing them - you did say that I'm "free to encode it as you like.
> However, software that I'm familiar with won't handle exotic uses of OSIS."
> I'll certainly document well all exotic uses.
>
>
> IIRC, there is some markup for what is useful for print but not software.
> I think <milestone> is one of those. Maybe the only one?
>
>
> *I.* How do I markup a single but compound cross-reference that refers to
>> non-adjacent verses or ranges, so that it (structurally) differs from a
>> (more typical) note containing separate references to the same
>> verses/ranges?
>>
>> There is no such thing as a single but compound cross-reference.
>>
>
> I guess that's a matter of definitions. In our notation, these two
> examples mean quite different things:
>
> Lev. 3:17; Lev. 7:26-27
> and
> Lev. 3:17, 7:26-27
>
> The former one is a list of two separate cross-references (indicated by
> the semi-colon and space separating the two verses/ranges). The latter one
> (indicated by the comma separator) is a single reference consisting
> of cross-references to Lev. 3:17 and Lev. 7:26-27. Of course, some
> definition of "cross-reference" might not accept the "recursion": a
> cross-reference consisting of several cross-references, but that's just
> terminology. For us, a single compound cross-reference is something real.
> And that's how the software needs to understand it, too. Once again, it
> might not be readily understood by some (or any) current OSIS
> implementation, but that's a price we have to pay in our case: limiting the
> selection of ready-made software that can be used.
>
>
> Yes, it is an external semantic, not an OSIS one. It should be documented
> for the reader. I doubt that the reader will ever see such documentation. A
> reader of a book may know to go to an introduction to figure out what the
> +, |, ;, .... mean. A user of software might not know where to go or that
> it is documented anywhere. For a SWORD module, the conf is a great place to
> document such conventions.
>
> Since this is the osis-users list, I've tried to avoid referencing SWORD
> and JSword. But it appears you may be targeting them or at least are
> interested in how they interact with an OSIS document.
>
>
>
> *II.* How do I markup a reference to a note whose source is more complex
>> than just one verse or a contiguous range?
>>
>> As separate multiple ones.
>>
>
> I hope my example above, with Lev. 3:17 and 7:26-27, demonstrated why this
> is not possible in our case. That compound reference might in turn be one
> element on a list of separate references, and it simply must not break
> apart into two independent elements of that list.
>
>
> -------------------
> To sum up:
>
> We probably won't be able to produce OSIS code that's 100% compatible with
> all current implementations. For us OSIS is not a value in
> itself (sorry!) - no one has requested that the digital Bible source file
> be exactly OSIS. However, OSIS is one of the rather few ready formats we
> could consider to supply the necessary structural information with the
> Bible text, and flexible enough to be rather easily converted to any
> further format that users of this source file (print or digital publishers
> etc.) might need.
>
> Even though we might need to take some small steps off the perfect OSIS
> path, talking to you about these things has been very valuable in finding
> ways that will keep us as close to that path as possible and enable us to
> produce OSIS that is as ready to use as possible. And we are very thankful
> to all of you for that.
>
> Blessings to all,
>
> Markku
>
>
>
> PS. I still might get back to you on some issue, but hopefully not too
> many times now.
>
>
> Blessings to you too!
>
> In Him,
> DM
>
> _______________________________________________
> osis-users mailing list
> osis-users at crosswire.org
> http://www.crosswire.org/mailman/listinfo/osis-users
>
>