[osis-core] whitespace

Patrick Durusau osis-core@bibletechnologieswg.org
Fri, 08 Aug 2003 04:41:12 -0400


Troy A. Griffitts wrote:
> Hey guys,
> We've run into an interesting dilemma lately that I would like your 
> input on.
> If the body of an OSIS document contained an excerpt like this:
> <seg osisID="entry">
>     This is an entry.  I was just going to make 2 points:
>         o  this is point 1
>         o  and this is point 2
> </seg>
> What would be acceptable handling / rendering?

Interesting question!

Easy answer: for XML parsers, the canonicalization algorithm says:

10. Retain all white space in character content.
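
To make that concrete, here is a rough sketch (my choice of Python and its 
standard xml.etree.ElementTree module is only an assumption for 
illustration, not anything OSIS prescribes) showing that a parser hands the 
application every tab, space and newline inside the <seg> untouched:

```python
import xml.etree.ElementTree as ET

# Troy's example, reproduced with its indentation and line breaks intact.
SAMPLE = (
    '<seg osisID="entry">\n'
    '    This is an entry.  I was just going to make 2 points:\n'
    '        o  this is point 1\n'
    '        o  and this is point 2\n'
    '</seg>'
)

seg = ET.fromstring(SAMPLE)

# The parser does not touch white space in character content;
# repr() makes the retained newlines and runs of spaces visible.
raw_text = seg.text
print(repr(raw_text))
```

Whatever happens to that white space afterwards is the application's 
decision, not the parser's.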

> Problems:
> 1) the initial indent.  Is it acceptable to remove tabs?
> 2) the return after the first line.  Is it acceptable to remove \n?
> 3) the double space after the first sentence.  Can it be removed?
> 4) the manual indentation and \n of point 1 and 2.  ?

Realizing that I probably hold the minority position ;-), I would 
recommend that the application (note: not the XML parser) normalize 
all the white space in your example to single spaces.
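
A minimal sketch of what I mean, assuming a Python application (the 
function name normalize_whitespace is made up for this example). It 
collapses every run of XML white space (space, tab, CR, LF) to one space. 
Trimming the leading and trailing space as well is an extra assumption on 
my part, in the manner of XPath's normalize-space(); drop the strip() call 
if you want to keep them as single spaces.

```python
import re
import xml.etree.ElementTree as ET

# XML defines white space as space, tab, carriage return and line feed.
XML_WHITESPACE_RUN = re.compile(r"[ \t\r\n]+")


def normalize_whitespace(text):
    """Collapse each run of XML white space to one space, then trim the ends."""
    return XML_WHITESPACE_RUN.sub(" ", text).strip(" ")


SAMPLE = (
    '<seg osisID="entry">\n'
    '    This is an entry.  I was just going to make 2 points:\n'
    '        o  this is point 1\n'
    '        o  and this is point 2\n'
    '</seg>'
)

# The parser retains the white space; the application normalizes it.
seg = ET.fromstring(SAMPLE)
normalized = normalize_whitespace(seg.text)
print(normalized)
```

Note that the two "points" run together into the sentence once normalized, 
which is exactly why the list structure belongs in markup rather than in 
indentation.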

Reasoning is that users should be using markup and stylesheets on markup 
documents to achieve meaningful presentation. That is not to say that I 
don't do the same thing in text files while I am taking notes at 
meetings, but I don't use it in markup documents I produce for meeting 


My, my, has someone been offering unsolicited markup advice? :-)

Hope you are having a great day!


> I'm looking for input on how to handle whitespace.
>     Thanks for your time.
>         -Troy.

Patrick Durusau
Director of Research and Development
Society of Biblical Literature
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model

Topic Maps: Human, not artificial, intelligence at work!