[osis-core] Re: [osis-user] class vs type

Patrick Durusau patrick at durusau.net
Sat Mar 11 15:40:03 MST 2006


DM,

Interested in your suggestion of quarterly releases. Mostly my fault but 
we have never really followed a traditional development sort of cycle, 
public roadmap, etc.

Perhaps it is time for us to consider being a little less ad hoc and a 
bit more formal.

I can't speak for everyone in the core group but the spirit is willing. ;-)

It would certainly put us in a better position to communicate with the 
user community and keep us moving forward. (There are some really cool 
things I would like to see in 4.0 and later. ;-) Topic-map-like stuff 
and that sort of thing.)

I will make the rounds by phone this next week and see if there is 
general agreement on adopting a more formalized development process with 
all the transparency of an open source project.

My situation may be about to change to the point where I could allocate 
time for a more regular process.

See what you can provoke by posting to the OSIS list!

Hope you are having a great weekend!

Patrick

DM Smith wrote:

> Patrick, Troy,
>    To respond to both.
>
> Patrick Durusau wrote:
>
>> Troy,
>>
>> Troy A. Griffitts wrote:
>>
>>> Patrick and DM,
>>>     Very good point that it forces discussion about unresolved 
>>> problems. But I'm not convinced we're not suggesting addition of the 
>>> same thing: multiple x- attribute values on the type attribute to 
>>> solve the problem.
>>>
>>>     It seems the same to me:
>>> <tag type="x-multivalue:a:b:c">
>>> and
>>> <tag private="rend:a rend:b rend:c">
>>>
>>> I think I'd prefer the syntax of the latter, because it preserves 
>>> 'type' to allow something OSIS-valid for tag.
>>
> I am suggesting that OSIS be changed to allow the following for type: 
> (osisValue|x-attributeExtension)(\s+(osisValue|x-attributeExtension))*
> So this would result in
>    <tag type="x-a x-b x-c">
> If private were defined at this point in time I would use it as Troy 
> suggested
>    <tag private="rend:a rend:b rend:c">
> But since private is not defined, I am going to use:
>    <tag type="x-multivalue:a:b:c">
> Until appropriate support can be added.
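A minimal sketch of how the proposed pattern for type would behave, written in Python. The definitions of OSIS_VALUE and ATTRIBUTE_EXTENSION below are hypothetical stand-ins (a three-value sample list and "x-" followed by non-whitespace); the real OSIS schema definitions are not reproduced here and may differ.

```python
import re

# Hypothetical stand-ins; the actual OSIS schema defines these separately.
OSIS_VALUE = r"(?:bold|italic|emphasis)"   # assumed sample of predefined values
ATTRIBUTE_EXTENSION = r"x-\S+"             # assumed shape of an x- extension

# (osisValue|x-attributeExtension)(\s+(osisValue|x-attributeExtension))*
TOKEN = rf"(?:{OSIS_VALUE}|{ATTRIBUTE_EXTENSION})"
TYPE_PATTERN = re.compile(rf"{TOKEN}(?:\s+{TOKEN})*")


def is_valid_type(value: str) -> bool:
    """Return True if the whole attribute value matches the proposed pattern."""
    return TYPE_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("x-a x-b x-c", "bold italic", "x-multivalue:a:b:c", "foo"):
        print(candidate, is_valid_type(candidate))
```

Under these assumptions both the space-separated form and the interim x-multivalue form are accepted, so a document using the interim form would stay valid if the pattern were adopted.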
>
>>>
>>> The more fundamental question arises... do we have any tags where it 
>>> would be logical to allow multiple OSIS-valid types?  or "Can a tag 
>>> be of multiple types simultaneously?"  And if so, what does subType 
>>> mean in that context?
>>
>
> I'll give one: In the DTD I am converting there is a <bi> tag which is 
> bold, italic. I need to convert this to:
>    <hi type="italic"><hi type="bold">...</hi></hi>
> This would be more natural as
>    <hi type="bold italic">...</hi>
> Allowing this condensation results in simpler XML that is less 
> error-prone.
>
>>>
>>> I still agree that DM's point is good.  Currently, it's hard to 
>>> store private data in an osis doc, and that might be a good thing.
>>>
>> Err, we will have to wait for DM to respond but I thought his point 
>> was that it is better to *not* store private data in an osis doc.
>>
>> In other words, he wants a way to transform private data into a 
>> public format, that is to keep the information but in an OSIS form.
>>
>> Is that close, DM?
>
>
> Actually, I was arguing both. Having a private attribute allows me to 
> work ahead of the current OSIS standard, which can be good, if I am 
> working with the OSIS committee, following their guidance. But it also 
> allows for decreased portability/increased proprietary documents. 
> These have to be balanced. And for this reason, I urge caution.
>
> If the OSIS spec were to be updated quarterly on the basis of 
> demonstrated existing need and also upon committee decision, then 
> there would not be a big need to have private as a work ahead.
>
> However, with the KJV project that Troy did and which I am updating, 
> there is a need to bury programmatic authoring decisions in the 
> document so that revisions can be readily done. IMHO, if private is 
> used, it should be "for internal use only" and should not be anything 
> that an external processor must/should look at when rendering the 
> document. I'm sure Troy sees more need for it than this.
>
> I don't think that two attributes are needed but it is as if there are 
> two distinctly different purposes: osisFuture="..." and private="..." 
> where osisFuture="..." represents a possible future for OSIS that has 
> been discussed but not finalized and private="..." is truly "for 
> internal use only".
>
>>
>> This may not be a type question although it started off talking about 
>> the type attribute.
>
> No it is not a type question.
>
> I framed it as a type question only because in the set of global 
> attributes, that seemed the only place such a behavior could be defined.
> I would suggest a different global attribute, say rend, style, class, 
> role, or dohickey. (By the way the OSIS manual has rend as an 
> attribute on the hi element! Clearly an error.)
>
>>
>> It seems that what DM needs to record is display behavior. I am not 
>> sure I want to attempt to define even a fairly extensive range of 
>> display behavior but that is without really looking to see what that 
>> would take. It might be a fairly good sized list but any one 
>> application would only need a few of those.
>
>
> Don't do it! Just open Word or OpenOffice Writer, go to the page to 
> format a paragraph or to define a style and take a look at everything 
> that can be blended. The vocabulary is finite but very large! Just 
> provide a place to hang a symbolic constant that an external system 
> can define (e.g. class values in HTML are interpreted by CSS).
>
>>
>> Sorta like type on milestone. There are any number of uses for 
>> milestones but with very few values we caught the vast number of 
>> usual cases where it would be used.
>
>
> One that is missing is paragraph break to record where in the 
> "original" the paragraph broke. My point is not to add or argue this 
> one, but that there is always "one more" that could be argued.
>
>>
>> Since I haven't looked I have no feel for whether such a list could 
>> be created for display-behaviors, but suspect there must be some set 
>> of usual and customary display behaviors, due to the limitations of 
>> browsers if nothing else. All sorts of things are "possible" but 
>> increasingly unlikely towards the margins.
>
>
> The class attribute in HTML is the attachment point for display 
> behavior. It allows for CDATA but in practice is limited to a 
> whitespace separated list of keys that can be used in a CSS 
> stylesheet. The range of values is infinite, and in practice rather 
> large and cannot be broken down into a well defined vocabulary, even 
> for a problem domain, e.g. Bibles.
>
> And I don't think it is a problem domain in which OSIS should be 
> wanting to define the range of display behaviors. Just provide a 
> mechanism for that to happen and let people be creative in its use. 
> The only caveat is that a document should be "accessible" when such 
> styling is not applied. The providers of the document should also 
> provide a CSS stylesheet or some kind of documentation as to the 
> meaning of all "class" values. At least all open source documents.
>
>
>
>>
>> Hope you are having a great day!
>>
>> Patrick
>>
>>
>>>     -Troy.
>>>
>>>
>>>
>>> Patrick Durusau wrote:
>>>
>>>> DM,
>>>>
>>>> I don't think that Troy was implying that he wasn't taking the 
>>>> problem seriously.
>>>>
>>>> I do think you have a good point about avoiding, to the extent 
>>>> possible,  proprietary extensions that would decrease portability.
>>>>
>>>> Ultimately that is to no small degree a question of judging the 
>>>> tradeoffs.
>>>>
>>>> Troy: What do you think about DM's comment in terms of embedding 
>>>> arbitrary data that may not be documented or standardized?
>>>>
>>>> Hope you are having a great day!
>>>>
>>>> Patrick
>>>>
>>>> DM Smith wrote:
>>>>
>>>>> I think private would be good as a convenient place for a 
>>>>> work-ahead, but I'd be concerned that it be used for a workaround 
>>>>> without meaningful discussion here. And I'd be concerned if it 
>>>>> became an easy way out. As it stands, the problems I face and have 
>>>>> posted here are real and have been taken seriously. I think in 
>>>>> part because there is no (good) way to do it in OSIS. I've really 
>>>>> appreciated the progress each version of OSIS has made. I'd like 
>>>>> to see that continue.
>>>>>
>>>>> I have worked with a few DTDs now and I am impressed with OSIS. 
>>>>> Most of the other DTDs allow for arbitrary markup that in essence 
>>>>> makes a document proprietary as processing it would require custom 
>>>>> routines. The way OSIS is written right now, only the 
>>>>> attributeExtensions are proprietary.
>>>>>
>>>>> Troy A. Griffitts wrote:
>>>>>
>>>>>> Patrick,
>>>>>>     A while back, we had briefly discussed adding a global 
>>>>>> 'private' attribute to the schema.  Basically, a place for 
>>>>>> organizations to place private use information on any tag.  Not 
>>>>>> sure where people fell on the sides of that issue, but I would be 
>>>>>> in favor of such an attribute.
>>>>>>
>>>>>> o    It would allow me to have a basic valid OSIS document while 
>>>>>> we debate how to move all the private data into best practice OSIS.
>>>>>> o    It would allow helpful runtime information to be stored by 
>>>>>> our engine and still allow schema validation against the document.
>>>>>> o    It would allow any data to be stored (like DM's example) 
>>>>>> which doesn't directly map to OSIS.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Patrick Durusau wrote:
>>>>>>
>>>>>>> DM,
>>>>>>>
>>>>>>> I take it that your requirement is to be able to have NMTOKENS 
>>>>>>> as the data type for the type attribute?
>>>>>>>
>>>>>>> I don't have any strong objections to space delimited data types 
>>>>>>> so I will pass it along to the core group and see if we can get 
>>>>>>> a consensus on that.
>>>>>>>
>>>>>>> Our next release will be OSIS 2.5 next Fall. I am hopeful we 
>>>>>>> will see some more tools and stylesheets posted in the meantime.
>>>>>>>
>>>>>>> Hope you are having a great day!
>>>>>>>
>>>>>>> Patrick
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> DM Smith wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Patrick Durusau wrote:
>>>>>>>>
>>>>>>>>> DM,
>>>>>>>>>
>>>>>>>>> Are you saying that the DTD has a rend attribute that can have 
>>>>>>>>> several possible values at the same time?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Yes. Rend is defined as NMTOKENS. Which allows a space 
>>>>>>>> separated list of NMTOKEN.
>>>>>>>> The global attributes on elements in this system are:
>>>>>>>>    id      ID          #IMPLIED
>>>>>>>>    lang    IDREF       #IMPLIED
>>>>>>>>    n       CDATA       #IMPLIED
>>>>>>>>    rend    NMTOKENS    #REQUIRED
>>>>>>>>    type    NMTOKEN     #IMPLIED
>>>>>>>>
>>>>>>>> In some cases rend is not required.
>>>>>>>>
>>>>>>>> I have mapped
>>>>>>>>    this DTD's    OSIS's
>>>>>>>>    id            id
>>>>>>>>    lang          xml:lang
>>>>>>>>    n             n
>>>>>>>>    rend          subType
>>>>>>>>    type          type
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> While I don't doubt that is possible, I am not sure why anyone 
>>>>>>>>> would want to do it.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> It does not quite matter why. I am working with a legacy 
>>>>>>>> document that has it and I need to preserve it.
>>>>>>>> It is needed to express that an element belongs to different 
>>>>>>>> classes of presentation simultaneously. See below for more.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Sounds like a hack to allow poorly written XSLT stylesheets.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> It has nothing to do with XSLT stylesheets. It's for CSS 
>>>>>>>> stylesheets. XSL was originally intended to allow the 
>>>>>>>> transformation and the styling of a document. XSLT only 
>>>>>>>> implemented the transformation aspect. IIRC, it was felt that 
>>>>>>>> CSS would do the job of presentation. I haven't looked at it 
>>>>>>>> yet but it looks like XSL-FO is intended to style a document.
>>>>>>>>
>>>>>>>> The value of the HTML class attribute and this DTD's rend 
>>>>>>>> attribute is that it allows for the separation of presentation 
>>>>>>>> and content. Prior to it, one embedded the presentation 
>>>>>>>> directly into the document.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> For example, if I have <hi type="emphasis">, even though I 
>>>>>>>>> only have one value for type, nothing prevents me from having 
>>>>>>>>> different renderings of the contents of <hi> based upon its 
>>>>>>>>> position in the markup tree, for example <hi type="emphasis"> 
>>>>>>>>> being rendered differently when it is a child of <title> from 
>>>>>>>>> when it is a child of <p> versus when it is a child of <q>.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> This is absolutely true and has no bearing on whether rend has 
>>>>>>>> one value or multiple ones. Or whether having multiple values is 
>>>>>>>> of any value (pun intended).
>>>>>>>>
>>>>>>>> Allowing multiple values allows the expression of different 
>>>>>>>> kinds of roles/dimensions to be used at the same time.
>>>>>>>>
>>>>>>>> For example, we may want an ordered list or an unordered list. 
>>>>>>>> And because nesting is allowed, we may have lists of lists. And 
>>>>>>>> any list in the tree can be either. But what if we want to have 
>>>>>>>> different kinds of ordered lists and unordered lists.
>>>>>>>> Say
>>>>>>>>    revealed - all the children of a node are shown with the parent
>>>>>>>>    initially-hidden - all the children of a node are initially 
>>>>>>>> hidden but stay shown until hidden again
>>>>>>>>    popup - shown for a time when a user expresses interest in 
>>>>>>>> them.
>>>>>>>> And, as a processing optimization, it needs to be known 
>>>>>>>> whether the list of children is to not wrap, wrap in a narrow 
>>>>>>>> presentation or a wide presentation.
>>>>>>>>
>>>>>>>> And in this example, it is both possible and reasonable to have 
>>>>>>>> a list of children that have different behaviors.
>>>>>>>> (This is a simplification of a real-world example of a system I 
>>>>>>>> wrote using CSS. There were other dimensions as well, such as 
>>>>>>>> data source: synthetic, program generated, user input. The HTML 
>>>>>>>> documents were simple lists with <ul> and <ol> having multiple 
>>>>>>>> class values, with each class value representing a different 
>>>>>>>> concept.)
>>>>>>>>
>>>>>>>> To do this with a single value, I would need one for each 
>>>>>>>> possible combination; in this case a set of 2x3x3=18 different 
>>>>>>>> values. (Well actually 9, because HTML has ul and ol)
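The arithmetic above can be checked with a short Python sketch. The list kinds and the three visibility names come from the example; the wrapping names ("nowrap", "wrap-narrow", "wrap-wide") are hypothetical labels for the three wrapping behaviors, chosen only for illustration.

```python
from itertools import product

list_kinds = ["ordered", "unordered"]                 # <ol> vs <ul> in HTML
visibility = ["revealed", "initially-hidden", "popup"]
wrapping = ["nowrap", "wrap-narrow", "wrap-wide"]     # hypothetical names

# Single-valued attribute: one distinct value per combination.
single_values = ["-".join(combo)
                 for combo in product(list_kinds, visibility, wrapping)]

# HTML already distinguishes ul/ol by element name, so only the
# remaining two dimensions need to be combined per element.
per_element_values = ["-".join(combo)
                      for combo in product(visibility, wrapping)]

# Multi-valued attribute: one token per behavior, combined freely.
multi_value_tokens = visibility + wrapping

if __name__ == "__main__":
    print(len(single_values), len(per_element_values), len(multi_value_tokens))
```

With a space-separated attribute the vocabulary grows by addition (3 + 3 tokens) rather than by multiplication (3 x 3 values per element), which is the point of allowing multiple values.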
>>>>>>>>
>>>>>>>> In the case at hand, the element is a paragraph tag. It is not 
>>>>>>>> clear what the different values are, but let me suppose that 
>>>>>>>> they deal with justification, first-line indentation, 
>>>>>>>> subsequent-line indentation, line spacing, handling of first 
>>>>>>>> letter, etc. The application of these behaviors is entirely 
>>>>>>>> unpredictable in the document, so creating a general purpose 
>>>>>>>> stylesheet is out of the question and a specific one would end 
>>>>>>>> up as a complex program that has too great a knowledge of the 
>>>>>>>> document.
>>>>>>>>
>>>>>>>> Since I am dealing with transforming a legacy document into 
>>>>>>>> OSIS, I need a way to preserve the values. I am wondering how. 
>>>>>>>> (As a programmer, I can figure out many workarounds, such as 
>>>>>>>> littering the document with hi elements or replacing spaces 
>>>>>>>> with a character that is not allowed in NMTOKEN, like ':' 
>>>>>>>> type="x-multivalue:a:b:c") And if this ability will be added to 
>>>>>>>> a later version of OSIS, I would like to pick a hack that would 
>>>>>>>> allow a good path to it tomorrow.
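A rough Python sketch of the colon-joined workaround and of a migration path away from it. The function names are hypothetical, and the target form type="x-a x-b x-c" is only the space-separated syntax suggested elsewhere in this thread, not anything OSIS defines today.

```python
PREFIX = "x-multivalue"


def encode_rend(rend: str) -> str:
    """Pack a space-separated rend value into one colon-joined type value."""
    tokens = rend.split()
    if not tokens:
        raise ValueError("rend has no tokens")
    if any(":" in token for token in tokens):
        raise ValueError("a token already contains the ':' separator")
    return ":".join([PREFIX, *tokens])


def decode_type(value: str) -> list[str]:
    """Recover the original rend tokens from the packed type value."""
    head, *tokens = value.split(":")
    if head != PREFIX or not tokens:
        raise ValueError(f"not an {PREFIX} value: {value!r}")
    return tokens


def to_space_separated(value: str) -> str:
    """Rewrite the packed value as space-separated x- extension tokens."""
    return " ".join(f"x-{token}" for token in decode_type(value))


if __name__ == "__main__":
    packed = encode_rend("a b c")
    print(packed)
    print(decode_type(packed))
    print(to_space_separated(packed))
```

Because the packing is reversible, the original rend tokens survive the conversion and can be re-emitted in whatever form a later schema version allows.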
>>>>>>>>
>>>>>>>> In order to separate presentation from structured content, 
>>>>>>>> there needs to be a semantic for deterministically attaching 
>>>>>>>> presentation to content. This is what the class and rend 
>>>>>>>> attributes provide.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Does that catch the gist of the problem or have I 
>>>>>>>>> misunderstood the issue? (It is 5 AM local time so the latter 
>>>>>>>>> is entirely possible.)
>>>>>>>>>
>>>>>>>>> Hope you are having a great day!
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> And I hope you are having a good and full night of sleep.
>>>>>>>>
>>>>>>>> --DM
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Patrick
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> DM Smith wrote:
>>>>>>>>>
>>>>>>>>>> In HTML, elements define a class attribute which indicates 
>>>>>>>>>> that a particular element in its context belongs to a class 
>>>>>>>>>> of that element. The primary use of this is to indicate where 
>>>>>>>>>> styles can be attached. It appears that type can be used for 
>>>>>>>>>> the same purpose, but not quite. In HTML, class is defined as 
>>>>>>>>>> "class    space separated list of classes". Its type is 
>>>>>>>>>> CDATA, but in spirit is NMTOKENS. What this allows is for an 
>>>>>>>>>> element to be cross-classified. That is, more than one class 
>>>>>>>>>> can apply.
>>>>>>>>>>
>>>>>>>>>> However, in OSIS only one type "word" can be used.
>>>>>>>>>>
>>>>>>>>>> I am working to convert an xml document to OSIS. This 
>>>>>>>>>> document's DTD defines an attribute, rend, in much the same 
>>>>>>>>>> way as class, in that it is a "role" to which style should be 
>>>>>>>>>> applied, with the possibility of several "roles".
>>>>>>>>>>
>>>>>>>>>> What is the proper way to do this?
>>>>>>>>>>
>>>>>>>>>> I can figure out several ways to do this but none seem quite 
>>>>>>>>>> right. For example, artificially, I can nest <hi> elements to 
>>>>>>>>>> achieve a similar, but not quite the same result.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>> _______________________________________________
>>> osis-core mailing list
>>> osis-core at bibletechnologieswg.org
>>> http://www.bibletechnologieswg.org/mailman/listinfo/osis-core
>>>
>>>
>>>
>>
>
>
>

-- 
Patrick Durusau
Patrick at Durusau.net
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model
Member, Text Encoding Initiative Board of Directors, 2003-2005

Topic Maps: Human, not artificial, intelligence at work! 