[osis-core] whitespace MY POSITION SUMMARY, PLEASE READ

Troy A. Griffitts osis-core@bibletechnologieswg.org
Fri, 08 Aug 2003 19:22:38 -0700


Ok, one last statement from me to sum up my position and I will TRY not 
to press it any further.

I think there is a difference between whitespace and space/tab/return.

Whitespace, __by definition__, should be allowed to be reduced to 
whatever other subset of whitespace is desired.

The English language MAKES GRAMMATICAL USE OF space/tab/return and other 
languages may use other similar things.  IN THE MOST RECENT CHINESE 
BIBLE WE JUST RELEASED, THEY PLACE 2 SPACES BEFORE THE NAME 'GOD' OUT OF 
RESPECT (sort of like we capitalize 'God').  Same grammatical mechanism 
with 2 implementations-- one just happens to use a character that occurs 
in a set that we call 'whitespace'.

There are many grammatical uses (NOT FORMATTING) for these characters.

We need a way to differentiate characters someone added as 'whitespace' 
from text containing these actual characters.
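
To make the loss concrete, here is a minimal sketch in Python of the 
"reduce every run to one space" rule.  It is only an illustration, not 
anything OSIS prescribes; the helper name collapse_whitespace is 
hypothetical, and the pattern assumes XML's definition of whitespace 
(space, tab, carriage return, newline).

```python
import re

# Assumption: "whitespace" means the four XML 1.0 whitespace characters only.
XML_WHITESPACE_RUN = re.compile(r"[ \t\r\n]+")


def collapse_whitespace(text: str) -> str:
    """Replace every run of XML whitespace with a single space (hypothetical helper)."""
    return XML_WHITESPACE_RUN.sub(" ", text)


address_line = "Tempe, AZ  85280-2528"        # two spaces before the ZIP code
indented = "\tThe standard paragraph start."  # leading tab

# After collapsing, the double space and the tab are indistinguishable
# from pretty-printing whitespace an editor might have added.
print(repr(collapse_whitespace(address_line)))
print(repr(collapse_whitespace(indented)))
```

Once the text has been collapsed there is nothing left to tell an 
author's deliberate two spaces from an editor's indentation, which is 
the differentiation problem described above.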

OSIS is not a DISPLAY FORMATTING markup.  Forcing markup for spaces, 
newlines, and possibly tabs mandates markup (some of which was 
created for a much different meaning) for common English entities that 
are much more easily typed as a single character.

I can guess the problems that objectors have with this;  I'm facing them 
right now for this next release of our software.

I still say allow (even DEFAULT) xml:space="preserve" in OSIS, or at 
least in some set of elements including: <verse>.
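
As a rough sketch of what honoring this could look like on the 
application side, assuming Python's standard xml.etree.ElementTree: the 
parser hands over all character data untouched, and the application 
collapses runs of whitespace unless xml:space="preserve" is in scope.  
The function name element_text and the wrapper element in the usage 
lines are hypothetical, not part of OSIS.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree reports xml:space under the reserved XML namespace.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
XML_WHITESPACE_RUN = re.compile(r"[ \t\r\n]+")


def element_text(elem, inherited="default"):
    """Return the text under elem, collapsing whitespace runs unless
    xml:space="preserve" is in effect (hypothetical helper)."""
    # xml:space applies to the element and is inherited by its descendants.
    mode = elem.get(XML_SPACE, inherited)

    def fix(chunk):
        if not chunk:
            return ""
        return chunk if mode == "preserve" else XML_WHITESPACE_RUN.sub(" ", chunk)

    parts = [fix(elem.text)]
    for child in elem:
        parts.append(element_text(child, mode))
        parts.append(fix(child.tail))  # tail text sits in this element's scope
    return "".join(parts)


# Usage: the same content with and without the attribute.
doc = ET.fromstring(
    '<wrapper>'
    '<verse xml:space="preserve">In the beginning  God</verse>'
    '<verse>In the beginning  God</verse>'
    '</wrapper>'
)
preserved, collapsed = list(doc)
print(repr(element_text(preserved)))
print(repr(element_text(collapsed)))
```

Under these assumptions the first verse keeps its two spaces and the 
second is reduced to one, so a document could opt in per element or 
once at the top.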

Although I still feel that ALLOWING (even if we don't default) 
xml:space="preserve" at the top of a doc will let AN ABUNDANCE OF 
DOCUMENTS that don't warrant hundreds of hours of detailed markup make 
it into OSIS, I'll try to think of other English examples that NEED 
these characters (can you find all the spaces?)


Here is my address:
	CrossWire Bible Society
	c/o Troy A. Griffitts
	P. O. Box 2528
	Tempe, AZ  85280-2528


	The standard paragraph start uses a tab at the beginning and sentences 
are divided by 2 spaces.  I think we should promote retention of good 
English.


Here is a code example I worked on last night:

	for (Node **node = &head; *node; node = &((*node)->next))
		printf("\t%s\n", (*node)->iLoveTabs());


My contact information is as follows:
	Office:	(602) 628-7771
	Fax:	(602) 628-7771


John Doe				8-Aug-2003
20 E. Jones
Phx., AZ  85001

Dear Mr. Doe:

It has come to our attention that the space, tab, and return may be 
taken out of the English language when using the OSIS markup.  I find 
this a travesty.

	Sincerely,




		-Troy A. Griffitts
		Director
		CrossWire Bible Society













Chris Little wrote:

> 
> 
> Troy A. Griffitts wrote:
> 
>>> Realizing that I probably hold the minority position, ;-), I would 
>>> recommend normalizing as part of the application (note not the XML 
>>> parser), all the white space in your example to single spaces.
>>
>>
>> no, no; I know of at least one other that might agree with you.
> 
> 
> That would be me.  Contiguous whitespace should be equivalent to a 
> single instance of any type of whitespace.
> 
> My best reason for saying that is that encoders will treat the situation 
> as such if they have knowledge of HTML.  Editors like XMLSpy also 
> happily insert whitespace for pretty formatting (though they might quit 
> doing that if xml:space="preserve" were assigned).
> 
> I think Troy's example should reduce to:
> <seg osisID="entry">This is an entry. I was just going to make 2 points:
> o this is point 1 o and this is point 2</seg>
> and that the person who encoded this should be chided, harshly.  Adding 
> linebreak elements is simple and retains most of the important 
> formatting of this.
> 
> All that said, I also foresee that there ARE a very few instances where 
> contiguous whitespace itself needs to be encoded.  Stylistics like 
> double space between sentences are one.  Another might be encoding some 
> kind of manuscript or document facsimile where multiple spaces are 
> interpreted to exist within the original.
> 
> Adding an &nbsp; entity seems like a pretty painless and helpful 
> shortcut to add (since people could already use 0xA0), but might send 
> the wrong message by encouraging presentation formatting.  Adding an 
> element like HTML's <pre> would be another (extremely unpleasant, in my 
> opinion) possibility.
> 
> --Chris