[sword-devel] Bible translations for languages in which a punctuation mark is used as a letter of the alphabet?

Sun Jun 2 14:38:25 MST 2013

On 6/2/2013 7:44 AM, David Haslam wrote:
> Some alphabets make use of a character that in other languages is normally
> classed as a punctuation mark.
>
> Examples are many, but here's a verse in *Tongan*, a language where the *ʻ
> (fakauʻa)* occurs very frequently as the character for a glottal stop.
> This should be written with the modifier letter turned comma (Unicode
> U+02BB), and not some other character, even though it looks a bit like
> the inverted curly apostrophe.
>
> NAʻE fakatupu ʻe he ʻOtua ʻi he kamataʻanga ʻa e langi mo māmani.
>
> How should the search feature in SWORD be tailored to support modules for
> such languages?
> Is this even possible, or would it require an enhancement to one of our
> library components?

Why don't you investigate and find out? You're essentially posting a 
request that someone else go look for a bug that you think could 
potentially exist. If it matters to you, go look for it and report.

> Has this topic ever been discussed before?
>
> NB. In this example for Tongan, it's conceivable that, provided the right
> codepoint is used, SWORD may already handle it correctly,
> but there are other languages in which the ordinary apostrophe is used for
> the same sound.

We don't need to support incorrectly encoded content. If someone encoded
letters with codepoints that have punctuation properties, the error is
in the encoding and nothing should be done to SWORD. (And, yes, we have
discussed this in the past.)
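
To make the letter/punctuation distinction concrete, here is a minimal
sketch using only Python's standard unicodedata and re modules. It is not
SWORD code and says nothing about how SWORD's search actually tokenizes
text; the helper names (is_letter, words) are hypothetical and exist only
for this illustration. It shows that U+02BB carries a letter property
while the apostrophe and curly quotes carry punctuation properties, and
how a property-driven word matcher treats each.

```python
import re
import unicodedata

# Characters that look alike but carry different Unicode properties.
FAKAUA = "\u02bb"       # MODIFIER LETTER TURNED COMMA
APOSTROPHE = "\u0027"   # APOSTROPHE
LEFT_QUOTE = "\u2018"   # LEFT SINGLE QUOTATION MARK


def is_letter(ch: str) -> bool:
    """True if the character's general category is a letter (L*)."""
    return unicodedata.category(ch).startswith("L")


def words(text: str) -> list[str]:
    """Split text into words using Python's Unicode-aware \\w class.

    This stands in for any tokenizer driven by character properties;
    it is an assumption for illustration, not SWORD's tokenizer.
    """
    return re.findall(r"\w+", text)


if __name__ == "__main__":
    for ch in (FAKAUA, APOSTROPHE, LEFT_QUOTE):
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
              f"category {unicodedata.category(ch)}, letter={is_letter(ch)}")

    # Correctly encoded: the glottal stop stays inside the word.
    print(words("\u02bbe he \u02bbOtua"))
    # Encoded with an apostrophe: the glottal stop is dropped as punctuation.
    print(words("'e he 'Otua"))
```

With the correct codepoint the glottal stop is kept as part of each word;
with an apostrophe it is discarded, which is the encoding error described
above rather than something a search library can reasonably guess at.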

--Chris