[sword-devel] Why is OSIS preferred? Was Re: usfm2osis.pl
chrislit at crosswire.org
Tue Jul 1 11:27:16 MST 2008
Karl Kleinpaste wrote:
> David Haslam <d.haslam at ukonline.co.uk> writes:
>> There are needed some improvements in punctuation -
>> eg. sometimes there's no space after a full-stop in the jvn module.
> That might be legitimate in those languages; I wouldn't know. But the
> modules' *.conf mention that one ought to use a certain font (Charis
> SIL) because of language-specific weirdnesses.
These look like upstream conversion errors. And while it would probably
be easy for us to simply fix them and move on, I think we want to hold
to the principle of using the data we receive as it is and sending
potential bug reports back to the provider.
The need for Charis SIL (and, indeed, the most recent version of Charis
SIL) stems from the use of PUA codepoints in SIL content. They have an
institutional PUA policy in place. At least some of this content
required use of the PUA at the time of encoding. As of Unicode version
5.1, many codepoints that SIL employed in the PUA have received official
(non-PUA) codepoints in the Unicode Standard. All of the WBTI content
was run through TECkit to convert SIL PUA codepoints to Unicode 5.1,
where possible. So there is less PUA usage than there was in the data
delivered to us. However, some of that data still requires a font with
all of the formerly PUA codepoints in their new Unicode 5.1 positions.
That essentially means we need a Unicode 5.1 font. Yet, there will
remain some PUA codepoints in the data, meaning we also need an SIL
Unicode font (since they generally include the PUA codepoints if it's
appropriate to the script repertoire of a given font). Thus we need an
SIL font that supports Unicode 5.1, of which there are two that I'm
aware of: Charis SIL and Doulos SIL.
Charis SIL is better than Doulos SIL for the simple reason that it
includes bold, italic, and bold italic fonts, whereas Doulos SIL does
not. Older versions of Charis SIL and indeed other fonts, such as Times
New Roman, are likely to work perfectly well for some of the Bibles, but
Charis SIL was the targeted font during conversion.
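
For anyone who wants to check whether a given text still contains PUA
codepoints after conversion, here is a minimal sketch in Python. It is
not part of the conversion tooling described above (that was TECkit);
the function names are hypothetical, and it only tests codepoints
against the three Private Use ranges defined by the Unicode Standard.
It cannot tell which of those codepoints are SIL assignments.

```python
# Private Use ranges defined by the Unicode Standard:
# the BMP Private Use Area plus supplementary planes 15 and 16.
PUA_RANGES = (
    (0xE000, 0xF8FF),
    (0xF0000, 0xFFFFD),
    (0x100000, 0x10FFFD),
)


def is_pua(ch):
    """Return True if the single character ch is a private-use codepoint."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


def find_pua(text):
    """Return the distinct PUA codepoints in text, in codepoint order,
    formatted as 'U+XXXX' strings."""
    found = sorted({ord(ch) for ch in text if is_pua(ch)})
    return [f"U+{cp:04X}" for cp in found]


if __name__ == "__main__":
    # Hypothetical sample: two distinct PUA characters among ordinary text.
    sample = "a\uE000b\uF234\uE000"
    print(find_pua(sample))
```

An empty result would suggest the text can be displayed with any font
that covers the relevant Unicode 5.1 characters; a non-empty result
means a font that assigns glyphs to those PUA positions is still needed.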