[osis-core] User's Manual - .09! comments

Patrick Durusau osis-core@bibletechnologieswg.org
Wed, 19 Nov 2003 18:54:55 -0500


Steve,

Steven J. DeRose wrote:
> Nice caveats at the beginning, but lose the "!"s.
> 
> Let's call it "Draft" rather than alpha or beta.

Done.

> 
> "This manual is meant to be a guide for all user's of the OSIS" lose the 
> apostrophe.
> 
Done.

> Do we have a CSS we can refer from a stylesheet PI, so the doc can be 
> loaded as-is and at least somewhat display in browsers?
> 
No, but time permitting I will try to find or create one.

> Should we have comments directed to the osis user list as well/instead 
> of us? Should we list you as (at least) primary editor)? And the inner 
> core me, troy, todd, chris?) as co-editors?

Sure. Will add in .11.

Note Troy is setting up the following address for comments:

osis-editors@bibletechnologieswg.org

That way we can redirect as appropriate.

> 
> Can we get vspace between LIs?
> 

Do you mean in the HTML presentation or with a stylesheet for the XML 
source?

See attached, osis_list.html

> "recognized by form-aware processors" -- probably need a note pointing 
> to some info on what an architectural form is -- or, we could add a 
> glossary at some point.

May not reach it in this draft but noted.

> 
> "See Appendix ***, Validating Your OSIS Document" -- resolve stars, and 
> set title italic
> 
> (BTW, I have a real nice Goudy cursive 18pt you could use for that -- is 
> lead truetype-compatible?)
> 

Done.

> "(for example, you can't insert a Bible book within a footnote)" -- 
> isn't that untrue, because books are a type of div, and div is probably 
> (?) allowed in footnote? not sure.
> 

No, div is not allowed in footnote, but it was a dumb example. Struck!


> " can be downloaded as a package from the OSIS website." let's at least 
> put in the link and a dummy page.
> 

How about: http://www.bibletechnologies.net/ (forthcoming at this site)?

> "Most such programs also read an XML schema" -- should we note in 
> passing that there are three schema languages, and that OSIS is at this 
> time only provided in XML-Schema form? And maybe a pointer to trang?
> 

Yes, will work on language.

BTW, Lou was chiding me about two TEI people working on a wholly 
separate schema. ;-) Maybe for 3.0, assuming that Sebastian has the 
RELAX-NG version of TEI ready by that point, we should write OSIS as a 
TEI application? Don't know that we would gain that much, but it would 
be an interesting exercise. (Like we need any exercise!)

> Nice that you've bolded tag names in the text.
> 
> "The value will generally be the short name of what is being encoded, in 
> this case the Contemporary English Version, or CEV." -- Mention ":The 
> short name is defined in the *work* declaration for the work, described 
> later."
> 

Done.

> "that it number the proscriptions" -- "s"

Probably need to rewrite anyway.

Suggest:

Hebrew tradition varies in several respects, the best known being
that it numbers what is given as a title for Psalms in most English
translations as verse 1, and the beginning of the psalm in such a
translation as verse 2.

> 
> " It is called "canonical", and always have a value" -- have/has
> 

has. Done.

> "Note that the match between osisIDWork="CEV" in osisText and 
> osisWork="CEV" in the work element links this osisText to this 
> particular work element. "  -- add after: This *work* element should 
> (must?) be the first work element."
> 
Must, best practice.


> "See Appendix G: USMARC Relator Codes for the complete list of role 
> codes provided by the USMARC organization." This list covers an enormous 
> range, and it should seldom if ever be necessary to use a code not from 
> this list.
> 
Done.

> "Publisher element in the work element" -- Prepend "The"
> 

Done.

> "The type attribute must be set to "ISO-639," "ISO-639-2," or "SIL,"" -- 
> bold theattribute  values (and throughout)
> 

Done.

> Under "type" attribute:  Add "Note that the Dublin Core type element is 
> distinct from the OSIS type attribute (the latter can occcur on any OSIS 
> element, to distinguish relevant subdivisions of the type).
> 

Done.

> Under Identifier: All the abbreviations are all-caps except the *first* 
> one, "Dewey". Suggestions: Change to "DEWEY"; put some delimiter between 
> the abbrevvs and their descriptions; note that as with other names and 
> values, these are case-sensitive
> 

Done, made the values bold. Did not use a delimiter in the appendixes, 
so that no one types exactly what they see, such as DEWEY:, and gets 
an error.

> "Note that without the proper type attribute" -- bold type
> 

Done.

> "HOwever,"
> 

Done.

> "(particular older works)," --> "ly"
> 

Done.

> NB: We may want to write an LC-scraper that takes a pointer to a 
> directory full of OSIS texts, grabs out the author names, and looks them 
> up in LC to find their authority entry, and copy that entry for us. We 
> could also introduce a standard authority-list format (say, topic 
> maps/PSIs in a particular form? And a tool to convert LC authority 
> entries into that form? nice shot in the arm for us, topic maps, and for 
> stdzn of auth lists, which LC has not yet managed....
> 

Sure, only real problem is that the Published Subjects TC has not yet 
agreed on a format. Maybe we should just start a practice?

> 7.5. Date formats" -- this appears to be the third distinct date format 
> we introduce -- Can we collapse the prior syntaxes with this one? 
> (mostly an issue of colons vs. dots, and what truncations are allowed, I 
> think)?
> 

OK, but not for this version? ;-)

> "In such works, use the osisID attribute to identify the retrievable 
> portions" -- bold attr name.
> 

Done.

> "as found in standard work easier" -- s
> 

Done.

> "so long as verses and chapter " -- s
> 

Done.

> "The paragraph need not give an osisID for the set of verse " -- s
> 
Done.


> "there are exceptions to this, " append "elsewhere in the Bible"
> 

Done.

> "Sometimes a verse or chapter starts or end " -- s
> 

Can't find, must have been changed.

> "Elements that are "milestoneable" in the OSIS schema are" -- Prepend " 
> Empty elements are indicated in XML by a tag with "/" preceding the 
> final ">": thus "<verse/>" rather than <verse> or </verse>. Elements 
> used in this way are commonly called "milestones", and those particular 
> elements in OSIS that permit this alternate encoding are thus called 
> "milestoneable".
> 
Done.
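
For anyone checking files by script, the container/milestone distinction
can be sketched in Python with the standard xml.etree.ElementTree module.
This is only an illustration: the osisID value and the verse text are
placeholders, and is_empty is a hypothetical helper, not part of any OSIS
tool.

```python
import xml.etree.ElementTree as ET

# Placeholder fragments: the same verse as a container and as an empty element.
container = ET.fromstring('<verse osisID="Acts.7.1">text of the verse</verse>')
milestone = ET.fromstring('<verse osisID="Acts.7.1"/>')

def is_empty(elem):
    """True when the element has no child elements and no text content.

    Note that XML treats <verse/> and <verse></verse> identically, so both
    count as empty here.
    """
    return len(elem) == 0 and not (elem.text or "").strip()

print(is_empty(container), is_empty(milestone))
```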

> "A way to declare the list of characters, or castList;" bold element name
> 

Done.

> "to be listed separatelyl; "

Done.

> 
> "of an individualmaking an"
> 
Done.


> "Note that in this example the high priest's short speech in verse 1 is 
> marked up as a normal container element with normal start- and end-tags, 
> because it fits within the bounds of the verse. However, Stephen's 
> speech starts in the middle of verse 2 and continues to the end of verse 
> 53. This necessitates marking up verse 2 using a milestone pair, as 
> shown. The other verses are entirely enclosed within the speech, and so 
> need not be marked up using milstone pairs. When a conflict arises 
> between the scope of chapter/verse units and other units, the 
> chapter/verse units give way by being represented as milestones. If a 
> conflict arises between two other units (say, a quote that encompasses 
> part but not all of each of two paragraphs), it is left to the encoder's 
> discretion which or them is represented via milestones." -- Did we 
> prohibit this a little earlier where we said you can't mixed milestoned 
> and non-milestoned elements? or was that just for verses and chapters? 
> Have to clarify this or it looks like a contradiction.
> 
Think the best course is to never mix the milestone form with the 
regular one in a single text. Need to decide on a practice and stick to 
it.

Better?:

Note that in this example the high priest's short speech in verse 1
is marked up as a normal container element with normal start- and
end-tags, as is Stephen's reply. But note that all the verse
boundaries have been represented with milestoneable verse elements.
The reason is simple: if the encoding uses containers for verses and
only occasionally switches to milestones (for instance, because
Stephen's speech starts inside a verse), the file becomes very
difficult to process reliably. When a conflict arises between the
scope of chapter/verse units and other units, the chapter/verse
units give way by being represented as milestones. If a conflict
arises between two other units (say, a quote that encompasses part
but not all of each of two paragraphs), it is left to the encoder's
discretion which of them is represented via milestones.
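
To illustrate why the consistent milestone form is easy to process, here
is a minimal Python sketch. It assumes start and end milestones are
marked with sID and eID attributes (an assumption on my part, check the
schema), and the sample text and the verse_texts helper are made up for
the illustration.

```python
import xml.etree.ElementTree as ET

# Placeholder markup: speeches are containers, every verse is a milestone pair.
SAMPLE = """<p>
<verse sID="Acts.7.1"/><speech>high priest speaks</speech><verse eID="Acts.7.1"/>
<verse sID="Acts.7.2"/>narration <speech>Stephen begins
<verse eID="Acts.7.2"/><verse sID="Acts.7.3"/>Stephen continues</speech><verse eID="Acts.7.3"/>
</p>"""

def verse_texts(root):
    """Map each verse ID to the text between its start and end milestones."""
    verses = {}
    current = None  # ID of the verse currently open, if any

    def add(text):
        if current is not None and text:
            verses[current] = verses.get(current, "") + text

    def walk(elem):
        nonlocal current
        if elem.tag == "verse":
            if "sID" in elem.attrib:      # start milestone opens a verse
                current = elem.get("sID")
                verses.setdefault(current, "")
            elif "eID" in elem.attrib:    # end milestone closes it
                current = None
            return
        add(elem.text)
        for child in elem:
            walk(child)
            add(child.tail)               # text following the child element

    walk(root)
    # Collapse runs of whitespace so line breaks in the source do not matter.
    return {vid: " ".join(text.split()) for vid, text in verses.items()}

print(verse_texts(ET.fromstring(SAMPLE)))
```

Because every verse boundary is a milestone, one pass over the tree is
enough; no special case is needed for a verse that starts outside a
speech and ends inside it.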



> "Thus, like TEI, markup of poetry refers to lines and line groups. " -- 
> quote or embolden
> 



> "The lg or "line group" element " -- bold
> 
Done.

> "Thus it covers for units like couple" -- append "t"
> 

Done.

> "The l element " -- bold
> 

Done.

> NB: Can we reduce the vertical space above and below examples?
> 

Will see what I can do with the stylesheet.

> Add example for <lb/>. Also add bold in description where appropriate.
> 

Noted, will reach tonight/tomorrow.

> "12.6.4. table" -- we should at least say whether nested tables are 
> permitted.
> 

Permitted by the schema, child of cell.

> "rather then just within either" s/then/than/
> 

Done.

> " For example, <milestone type="page> n="3"/>" -- turn gt into quote to 
> fix syntax.
> 

Done.
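
A quick well-formedness check catches this kind of slip before it reaches
the manual. A minimal sketch using Python's standard
xml.etree.ElementTree; is_well_formed is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment):
    """Return True if the fragment parses as a single XML element."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

broken = '<milestone type="page> n="3"/>'   # stray ">" where a quote belongs
fixed = '<milestone type="page" n="3"/>'

print(is_well_formed(broken), is_well_formed(fixed))  # False True
```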

> Lay out predefined types the same everywhere -- I like the way the 
> milestone types are done, and would match the others to it. Except, I'd 
> also bold the labels.
> 

Will add more space with the stylesheet and delete the ":" character. 
Will say that what is in bold is what you type.

> "The start of the first column need not be marked " -- Prepend "Assuming 
> page boundaries are also marked,"
> 

Done.

> Reference: add example?
> 

Will do.

> "<verse osisID="NRSV:Mark.5.41">He took her by the hand and said to her: 
> <q><foreign xml:lang="arc">Talitha cum</foreign></q>, which means, 
> <q>Little girl, get up!</q></verse>" -- preformatted line too line -- 
> break it up so it stays visible.
> 

Done.

> "Provides simple text highlighting capability; types can " -- s/types 
> can/types that can
> 

Done.

> " the <hi> element" -- lose brackets and gain bold (likewise anywhere 
> else this happens)
> 
Done.


> "If it is known why a word or phrase "  -- s/known/known with reasonable 
> certainty"
> 
Done.


> At end of "hi" section append: If needed, additional types may be added, 
> but must begin with "x-".
> 

Done.

> "Your are Simon sone of John" -- s/sone/son/

Done.

> 
> "Remember that a computer cannot distinguish Job, as in the man from Ur, 
> from job, as in 'I have a job for you...' without your assistance" -- 
> append (at least at the beginning of a sentence)
> 

Done.

> "Any use of any form of the name of the Deity is marked with divineName. 
> " -- In that case, how do we formally distinguish the tetragrammaton? 
> type on divinename? should we enumerate the extant Biblical phrases as 
> types, such as el, el shaddai, el elyon, ado---, y---, etc.?
> 

Hmmm, don't know. Best we can do now is suggest type.

> "boudnaries"

Done.

> 
> " At this time it is intended for use within notehi> elements." -- fix
> 
Done.


> "This example illustrates (or reinforces several points):" -- move ")" 
> left 2 words
> 

Done.

> "A note appear" -- s
> 
Done.


> Under "w": at least list permitted attributes and their meanings, and 
> note that we're working on a linguistic annotation module..
> 

Will do.

> "To identify a reference ito</i> a" -- fix
> 

Done.

> "For example, pThe correctness" -- fix
> 

Done.

> " The scope element " -- bold

Done.

> 
> "(without the colon, it would be interpreted as a top-level identifier 
> within the work)." -- s/the work/the default work/
> 

Done.

> "(introded by Whittingham about ????)," -- typo, finish & add ref?
> 

Done.

> "The parts of an osisID may contain any mixture of numbers, letters, 
> hyphens, and underscore. However, to avoid conflict with the other 
> punctuations used (such as ":" to separate the work from the in-work 
> location, "@" to separate fine-grained references in osisRefs, and "!" 
> to separate work-specifiec extensions to a versification scheme), no 
> other characters are allowed. " -- is it true hyphens are allowed? 
> Doesn't that croak range syntax?
> 

Hyphen in that description is an error. Removed.
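
With the hyphen gone, the rule for a single part is easy to check
mechanically. A rough Python sketch, under several assumptions of mine:
"letters" is read as ASCII letters only, parts are separated by "." as in
the NRSV:Mark.5.41 example, and the "!" and "@" extensions are not
handled. The function names are hypothetical.

```python
import re

# Assumption: letters means ASCII letters; digits and underscore are allowed.
PART = re.compile(r"[A-Za-z0-9_]+")

def is_valid_part(part):
    """True if the part uses only letters, digits, and underscore."""
    return PART.fullmatch(part) is not None

def split_osis_id(osis_id):
    """Split 'work:location' into (work or None, list of location parts)."""
    work, sep, location = osis_id.partition(":")
    if not sep:                 # no colon: the whole string is the location
        work, location = None, osis_id
    parts = location.split(".")
    bad = [p for p in parts if not is_valid_part(p)]
    if bad:
        raise ValueError(f"invalid osisID part(s): {bad}")
    return work, parts

print(split_osis_id("NRSV:Mark.5.41"))
```

A part such as "5-6" is rejected, which keeps the hyphen free for range
syntax.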

> ""!" as the terminator (after which encoders may append names and/or 
> numbers to provide finer-grained reference points)." -- Explain what 
> this is doing here -- too abrupt transition.
> 

Also does not seem to fit where it occurs. Marked to explain and 
possibly move.

> NB: We should state that there is no need for an explicit work element 
> to declare the reference systems whose names we have predefined. 
> Probably say that here, and earlier where they are listed.
> 

Will do.

> "or a section heading in the Mark " -- lose "the"
> "within a given canonically-reference unit." -- d
> 

Done.

> "To refer to specific locations within a named canonical reference 
> element" -- prepend subhead "Fine-grained references".
> 

Done.


> "No markup included within the element specifies" -- d

Now reads: Markup does not imply a space for purposes of counting even 
if it may for purposes of layout, printing or indexing.

> 
> NB: Under this section, what shall we say about ignorable whitespace?
> 
Hmmm, at first blush I would say that "ignorable whitespace" is just 
that, ignorable. ;-) Did not check to be sure but isn't whitespace 
within content always significant? Seems like that is the only 
whitespace that either users or applications should be counting for cp 
references.

> "purposes of layout or printing" -- add ", indexing, "
> 
Done.

> "Thus, the intuitive count will not be changed by the insertion of 
> notes, references, critical apparatus, and the like)." -- balance ()
> 
Done.


> "Grains: s finds " -- replace with "s (short for "string") finds"
> 

Done.

> "BTG intends to develop an XML schema for declaration files that can 
> express such systems, and their mapping to other systems. This work has 
> not been completed. However, we reserve the following names for 
> versification schemes we already know to be relevant:
> 
> 
>     Hebrew
>     NA27
>     SamPent
>     LXX" -- synchronize this list with the earlier one (perhaps just by 
> cross-referencing and deleting it here).
> 

Put in reference to earlier definition.

> "The header must include work declarations for the document itself, and 
> for the versification system it uses.
> " -- perhaps append "except that the predefined versification systems 
> need not have work declarations."
> 
What about: Each work must identify which versification scheme(s) it 
uses; this is done by a reference to the versification scheme declared 
by a <hi rend="bold">work</hi> declaration in the header, except that 
the predefined versification systems need not have work declarations.
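
The proposed rule could be checked along these lines. This Python sketch
is only illustrative: the set of predefined names is the list quoted in
this thread (the earlier list in the manual may differ), names are
compared case-sensitively, and missing_work_declarations is a
hypothetical helper.

```python
# Names quoted in this thread as reserved; the manual's earlier list may differ.
PREDEFINED_SCHEMES = {"Hebrew", "NA27", "SamPent", "LXX"}

def missing_work_declarations(schemes_used, declared_works):
    """Return the schemes that still need a work declaration in the header."""
    return sorted(
        scheme
        for scheme in schemes_used
        if scheme not in PREDEFINED_SCHEMES and scheme not in declared_works
    )

# "MyScheme" is a made-up scheme name used only for this example.
print(missing_work_declarations(["NA27", "MyScheme"], {"CEV"}))
```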


> "actualloy"
> 
Done.


> "unproofead"
Done.

> 
> "Howeve,"
> 
Done.

Not as many comments in the text to remind me of things as I feared! 
Still, a good many things to polish off but it should be a respectable 
draft.

Will try to have a new version off to you with additional content by 
lunch time.

Hope you are having a great evening!

Patrick


> That's all I got. It's looking pretty good; very few bits missing, good 
> examples, etc.
> 
> Hurrah!
> 
> S
> 
> 
> -- 
> 
> 
> Steve DeRose -- http://www.derose.net
> Chair, Bible Technologies Group -- http://www.bibletechnologies.net
> Email: sderose@acm.org  or  steve@derose.net


-- 
Patrick Durusau
Director of Research and Development
Society of Biblical Literature
Patrick.Durusau@sbl-site.org
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model

Topic Maps: Human, not artificial, intelligence at work!