[sword-devel] conf utf-8

DM Smith dmsmith555 at yahoo.com
Mon Feb 14 05:22:32 MST 2005


UTF-16 has big-endian and little-endian byte orderings; UTF-8 is a
plain byte sequence and has no byte order.
If there is no byte-order mark (BOM), the reader has to assume a
particular byte ordering (either little-endian or big-endian).
If there is a BOM, then it can be interrogated and the text can be
interpreted in either fashion.
Even so, I think that it would be best to settle upon a particular byte 
ordering.
Windows defaults to little-endian, which is backward from the
big-endian order the rest of the world tends to assume.
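
To make "interrogating the BOM" concrete, here is a minimal sketch in
Python. It is not SWORD code; detect_bom is a hypothetical helper, and
the only library pieces it relies on are the BOM constants in the
standard codecs module. It returns the encoding the BOM implies and how
many bytes to skip, or (None, 0) when there is no BOM and the caller
has to fall back on an agreed default.

```python
import codecs

# Longest signatures first: the UTF-32 LE BOM (FF FE 00 00) begins with
# the UTF-16 LE BOM (FF FE), so it has to be tested before it.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),        # EF BB BF: a signature only, no byte order
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
]


def detect_bom(data: bytes):
    """Hypothetical helper: return (encoding, bom_length) for a leading BOM.

    Returns (None, 0) when the data does not start with a known BOM.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


# Usage: skip the BOM and decode the rest with the detected encoding.
raw = codecs.BOM_UTF16_LE + "[Section Name]".encode("utf-16-le")
encoding, skip = detect_bom(raw)
text = raw[skip:].decode(encoding or "utf-8")
```

The fallback to "utf-8" when no BOM is found is an assumption made for
the example; which default to settle on is exactly the open question.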

Chris Little wrote:

>
>
> Troy A. Griffitts wrote:
>
>>     My guess about the characters which keep the .conf file from 
>> being recognized... try adding a few newlines to the beginning of the 
>> file.  I would guess that XXX[Section Name] at the beginning is just 
>> causing our .conf reader to not recognize the "Section Name".
>
>
> The three characters are the Unicode byte-order mark (BOM). See 
> http://www.unicode.org/faq/utf_bom.html#BOM for full details. But, 
> basically, it's the codepoint U+FEFF, encoded at the beginning of a 
> file. From this character, you can tell whether you have UTF-16 
> big-endian, UTF-16 little-endian, or UTF-8.
>
> I would recommend we go ahead and support it (to the extent that we 
> check for it and throw it away) since Notepad is not the only program 
> that adds it to files. (No need to fix before the trip, though, I think.)
>
> --Chris
>
> _______________________________________________
> sword-devel mailing list
> sword-devel at crosswire.org
> http://www.crosswire.org/mailman/listinfo/sword-devel
>
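
The "check for it and throw it away" approach Chris describes could
look something like the sketch below. This is Python for illustration
only, not the actual SWORD .conf reader; first_section_name and
SECTION_RE are hypothetical names, and the sketch assumes the file is
UTF-8. Python's "utf-8-sig" codec drops a leading UTF-8 BOM if one is
present and otherwise behaves like plain UTF-8.

```python
import re

# Matches a line that consists only of "[Section Name]".
SECTION_RE = re.compile(r"^\[(?P<name>[^\]]+)\]\s*$")


def first_section_name(raw: bytes):
    """Hypothetical helper: return the first [Section Name] in a .conf file.

    Decoding with "utf-8-sig" discards a leading UTF-8 BOM, so the first
    line starts with "[" whether or not an editor wrote a BOM.
    """
    text = raw.decode("utf-8-sig")
    for line in text.splitlines():
        match = SECTION_RE.match(line.strip())
        if match:
            return match.group("name")
    return None


with_bom = b"\xef\xbb\xbf[Section Name]\nKey=Value\n"
without_bom = b"[Section Name]\nKey=Value\n"

# Decoded as plain UTF-8, the BOM survives as U+FEFF in front of "[",
# which is why a reader that expects "[" first misses the section.
naive_first_line = with_bom.decode("utf-8").splitlines()[0]
```

With this in place, first_section_name gives the same answer for
with_bom and without_bom, while naive_first_line still carries the
stray U+FEFF character.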

