[osis-core] Re: ews, non-Unicode characters, and SIL Galatia. Anyone have an explanation?

osis-core@bibletechnologieswg.org osis-core@bibletechnologieswg.org
Wed, 10 Jul 2002 09:01:25 -0500


Dear Todd,

Greetings! How is the move going? I just got back from vacation, so I'm in
major catch-up mode right now. I'm copying Peter Constable on this to see
if he has anything to add. I agree with you that we should not create any
standards that deviate from the mainstream, even if it might be more
convenient. SIL is committed to using standard Unicode encoding, and if the
characters we need are not there, we either propose them to the Unicode
Consortium or put them in the Private Use Area (PUA).

To tell you the truth, I don't really know what an "EWS" is, but if it isn't
standard Unicode and it isn't in the PUA, then the intended solution is a
bit more creative than our tastes will allow. Peter, do you have anything
to add?

Dennis



"Todd Tillinghast" <todd@contentframeworks.com>
07/08/02 06:15 PM

To:      <osis-core@bibletechnologieswg.org>, "Dennis Drescher" <dennis_drescher@sil.org>
cc:
Subject: ews, non-Unicode characters, and SIL Galatia. Anyone have an explanation?



It seems like we are trying to reverse the basic XML tenet that all
characters are Unicode characters.  If the character set currently
available electronically for a translation is not a Unicode character set,
then it should be converted to a Unicode character set.  If no Unicode
equivalent exists, there is the user-defined space in Unicode (the
Private Use Area).
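
As a rough sketch of the conversion described above, assuming Python and an
invented mapping table: the code pairings below are hypothetical and are NOT
the real SIL Galatia layout, and the helper name legacy_to_unicode is made up
for illustration. Codes with a known Unicode equivalent are mapped to it;
anything else is parked in the Private Use Area so nothing is lost.

```python
# Hypothetical mapping from legacy 8-bit font codes to Unicode characters.
# These pairings are invented for illustration; they are NOT the real
# SIL Galatia table.
LEGACY_TO_UNICODE = {
    0x61: "\u03b1",  # GREEK SMALL LETTER ALPHA
    0x62: "\u03b2",  # GREEK SMALL LETTER BETA
    0x67: "\u03b3",  # GREEK SMALL LETTER GAMMA
}

PUA_BASE = 0xE000  # start of the BMP Private Use Area (U+E000..U+F8FF)


def legacy_to_unicode(data: bytes) -> str:
    """Convert legacy font-encoded bytes to a Unicode string.

    Codes with a known mapping become standard characters; every other
    code is placed in the Private Use Area at PUA_BASE + code.
    """
    out = []
    for code in data:
        mapped = LEGACY_TO_UNICODE.get(code)
        out.append(mapped if mapped is not None else chr(PUA_BASE + code))
    return "".join(out)


# Two mapped codes followed by one unmapped code (0xE1).
text = legacy_to_unicode(bytes([0x61, 0x62, 0xE1]))

# The result is ordinary Unicode text, so it encodes to valid UTF-8.
encoded = text.encode("utf-8")
```

Private Use Area assignments only mean something to parties who agree on
them, so a real conversion would need to publish its PUA table alongside
the text.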

I know that the SIL folks are well aware of the issues with Unicode.  It
does not seem appropriate to add an attribute that allows the use of
non-Unicode characters.  If someone wants to encode something using what
appear to be Unicode characters but really are not, then they can, BUT we
should not create a standard that encourages it.

I may be misunderstanding the whole issue, the purpose of ews, and
how all of this relates to Unicode.  If so, please correct and enlighten
me.

I am also sending this to Dennis Drescher at SIL in hopes of getting his
feedback.  If he replies, I will post his response.

Todd

> -----Original Message-----
> From: owner-osis-core@bibletechnologieswg.org
> [mailto:owner-osis-core@bibletechnologieswg.org] On Behalf Of Harry Plantinga
> Sent: Sunday, July 07, 2002 7:00 AM
> To: osis-core@bibletechnologieswg.org
> Subject: Re: [osis-core] Is this ready for OSIS 1.1?
>
> Todd,
>
> I'm not totally up on the intended usage of "ews", but I think I recall
> that it would be used to say that the following is Greek
> Beta Code, some SIL encoding, etc.
>
> I'm not sure if this is what you are getting at, but there are indeed
> problems with representing Greek in a font like SIL Galatia in an
> otherwise Unicode document. The problem is that SIL Galatia
> uses 8-bit character codes that, written as raw bytes, are generally
> not legal byte sequences in a UTF-8 document. So you get XML parse errors.
>
> Representing Greek using a Greek font like that, which uses characters in
> the 128..255 range, in an otherwise Unicode document is a bit thorny...
>
> -Harry
>
> ----- Original Message -----
> From: "Todd Tillinghast" <todd@contentframeworks.com>
> To: <osis-core@bibletechnologieswg.org>
> Sent: Sunday, July 07, 2002 1:20 AM
> Subject: RE: [osis-core] Is this ready for OSIS 1.1?
>
>
> > >
> > > ews is "electronic writing system" -- e.g. a <seg> may use a
> > > non-standard encoding such as SIL Galatia for Greek.
> >
> > Do you mean that ews would be used to indicate that non-Unicode
> > characters are being used?
> >
> > Todd
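
Harry's point about bytes in the 128..255 range can be checked with a short
sketch, assuming Python and its standard-library XML parser. The bytes 0xE1
and 0xF2 are arbitrary examples, not taken from any real SIL Galatia text,
and the helper names are made up for illustration. Note that some high-byte
sequences happen to form valid UTF-8 by accident; in that case the parser
would accept them but read the wrong characters.

```python
import xml.etree.ElementTree as ET

# A <seg> whose content is raw 8-bit font codes written straight into the
# file. 0xE1 and 0xF2 are arbitrary example bytes (an assumption, not real
# SIL Galatia data).
raw = b"<seg>\xe1\xf2</seg>"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def parses_as_xml(data: bytes) -> bool:
    """Return True if the bytes are accepted as an XML document."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


raw_is_utf8 = is_valid_utf8(raw)    # the lone high bytes are not valid UTF-8
raw_is_xml = parses_as_xml(raw)     # so the parser rejects the document

# The same element with a real Unicode character, encoded as UTF-8.
clean = "<seg>\u03b1</seg>".encode("utf-8")
clean_is_xml = parses_as_xml(clean)
```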