Fri, 08 Aug 2003 04:41:12 -0400
Troy A. Griffitts wrote:
> Hey guys,
> We've run into an interesting dilemma lately that I would like your
> input on.
> If the body of an OSIS document contained an excerpt like this:
> <seg osisID="entry">
> This is an entry. I was just going to make 2 points:
> o this is point 1
> o and this is point 2
> What would be acceptable handling / rendering?
Easy answer: for XML parsers, the canonicalization algorithm says:
10. Retain all white space in character content.
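To illustrate what that means in practice, here is a minimal Python sketch using only the standard library. The sample string and the `raw_text` helper are hypothetical, and the tabs and line breaks in the sample are a guess at the layout of the original excerpt, not a copy of it. It only shows that a conforming parser hands the character content to the application with its white space intact.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the tabs and newlines are assumed, for illustration only.
SAMPLE = (
    '<seg osisID="entry">\n'
    '\tThis is an entry.  I was just going to make 2 points:\n'
    '\t\to this is point 1\n'
    '\t\to and this is point 2\n'
    '</seg>'
)


def raw_text(xml_source: str) -> str:
    """Return the root element's character content exactly as the parser reports it."""
    return ET.fromstring(xml_source).text or ""


if __name__ == "__main__":
    # repr() makes the retained tabs, newlines, and double space visible.
    print(repr(raw_text(SAMPLE)))
```

Any change to the tabs, line breaks, or double space therefore has to be a decision made by the application after parsing.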
> 1) the initial indent. Is it acceptable to remove tabs?
> 2) the return after the first line. Is it acceptable to remove \n?
> 3) the double space after the first sentence. Can it be removed?
> 4) the manual indentation and \n of point 1 and 2. ?
Realizing that I probably hold the minority position ;-), I would
recommend normalizing all the white space in your example to single
spaces, as part of the application (note: not the XML parser).
The reasoning is that users should be using markup and stylesheets on markup
documents to achieve meaningful presentation. That is not to say that I
don't do the same thing in text files while I am taking notes at
meetings, but I don't use it in markup documents I produce for meeting
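A rough sketch of the application-side normalization recommended above, in Python. The `normalize_space` name is hypothetical, and treating only space, tab, carriage return, and line feed as white space (the four XML white space characters) is an assumption. Whether to also trim the leading and trailing space is a separate choice that this sketch leaves alone.

```python
import re

# The four XML white space characters: space, tab, carriage return, line feed.
_WS_RUN = re.compile(r"[ \t\r\n]+")


def normalize_space(text: str) -> str:
    """Collapse every run of XML white space in text to a single space."""
    return _WS_RUN.sub(" ", text)


if __name__ == "__main__":
    parsed = "\n\tThis is an entry.  I was just going to make 2 points:\n\t\to this is point 1\n"
    print(repr(normalize_space(parsed)))
```

This runs on the text the parser returns, so the source document and the parser's output are left untouched.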
> (I DON'T WANT RECOMMENDATION ON HOW THIS _SHOULD_ BE MARKED UP)
My, my, has someone been offering unsolicited markup advice? :-)
Hope you are having a great day!
> I'm looking for input on how to handle whitespace.
> Thanks for your time.
> osis-core mailing list
Director of Research and Development
Society of Biblical Literature
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model
Topic Maps: Human, not artificial, intelligence at work!