[sword-devel] Musings about the Cherokee NT module
chrislit at crosswire.org
Sun Jul 1 13:10:51 MST 2012
On Jul 1, 2012, at 10:28 AM, David Haslam <dfhmch at googlemail.com> wrote:
> Hi Chris,
> I just tried "diatheke to transliterate as it outputs".
> I advise that this suffers from the weakness uncovered by my recent
> Back conversion from this Latin transliteration would contain inaccuracies
> in the Cherokee.
> This should prompt us to consider how we might be able to improve the method
> used for Cherokee in icu-sword.
True enough. It doesn't appear I even attempted to maintain reversibility. You can see my transliteration rules here: http://www.crosswire.org/svn/icusword/trunk/source/data/translit/crosswire/Cherokee_Latin.txt
I'll do some reordering of the rules to improve accuracy of Latin-Cherokee.
> IMHO, This is higher priority than anything we might wish to do to assist
> the Cherokee NT project.
Since transliteration from Latin is generally undefined (by the authorities who define these sorts of things), I include the from-Latin rules in transliterators chiefly as novelties. It's improbable that anyone would actually want to transliterate to Cherokee. I certainly wouldn't include spaces in the output of Cherokee-Latin, since they're only useful for this niche task of round-tripping and would interfere with many other tasks (e.g. a naive implementation of Levenshtein edit distance that didn't first throw out the WJs).
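To illustrate the edit-distance concern, here is a minimal Python sketch. It assumes "WJ" means U+2060 WORD JOINER used as an invisible syllable separator; the function names are hypothetical and are not part of icu-sword or SWORD.

```python
# Assumption: "WJ" is U+2060 WORD JOINER, inserted between syllables.
WORD_JOINER = "\u2060"


def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein edit distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def levenshtein_ignoring_joiners(a: str, b: str) -> int:
    """Strip word joiners from both strings before comparing them."""
    return levenshtein(a.replace(WORD_JOINER, ""), b.replace(WORD_JOINER, ""))


plain = "tsalagi"
joined = "tsa\u2060la\u2060gi"  # same word with invisible separators
naive = levenshtein(joined, plain)
cleaned = levenshtein_ignoring_joiners(joined, plain)
```

The naive call counts each invisible separator as an edit, so two strings that look identical on screen come out as different; the second call treats them as equal.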
It might make sense for me to add support for syllable-dividing hyphens in Latin-Cherokee, like you see on the Cherokee NT website, since that would facilitate its use as an input method. Then a user could type the hyphenated text and precisely specify the desired syllable signs.
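As a rough sketch of how hyphenated input could select exact syllable signs, the Python below looks up each hyphen-delimited syllable in a table. This is only an illustration, not the ICU rule syntax or the actual Cherokee_Latin.txt rules; the table is a deliberately tiny subset of the syllabary and the function name is hypothetical.

```python
# Illustrative subset only; a real table would cover the whole syllabary.
SYLLABLES = {
    "a": "\u13a0",    # CHEROKEE LETTER A
    "gi": "\u13a9",   # CHEROKEE LETTER GI
    "la": "\u13b3",   # CHEROKEE LETTER LA
    "sa": "\u13cc",   # CHEROKEE LETTER SA
    "s": "\u13cd",    # CHEROKEE LETTER S
    "tsa": "\u13e3",  # CHEROKEE LETTER TSA
}


def hyphenated_to_cherokee(text: str) -> str:
    """Convert hyphen-divided Latin syllables to Cherokee signs.

    Words are separated by single spaces and syllables by hyphens.
    An unknown syllable raises ValueError instead of being guessed.
    """
    words = []
    for word in text.split(" "):
        signs = []
        for syllable in word.split("-"):
            if not syllable:
                continue  # tolerate empty pieces such as a trailing hyphen
            key = syllable.lower()
            if key not in SYLLABLES:
                raise ValueError(f"unknown syllable: {syllable!r}")
            signs.append(SYLLABLES[key])
        words.append("".join(signs))
    return " ".join(words)


word = hyphenated_to_cherokee("tsa-la-gi")
split_s_a = hyphenated_to_cherokee("s-a")  # two signs: S followed by A
single_sa = hyphenated_to_cherokee("sa")   # one sign: SA
```

The last two calls show the point of the hyphens: "s-a" and "sa" are the same letters in Latin, but the hyphen lets the user choose between two signs and one.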