[sword-devel] Musings about the Cherokee NT module
greg.hellings at gmail.com
Mon Jul 2 05:47:26 MST 2012
On Mon, Jul 2, 2012 at 7:41 AM, David Haslam <dfhmch at googlemail.com> wrote:
> Hi DM,
> We should ignore pronunciation methods for processing Cherokee transcribed
> to Latin.
> The Sequoyah transliteration system is explicitly described as not being
> based on phonetics!
> Please refer to the Wikipedia page.
> The edit distance method may be more fruitful, yet there are also hidden
> assumptions and potential pitfalls.
> (a) The Cherokee NT is not 100% proofread. Judging by comparisons with the
> PDF file from Google Books, it is sometimes quite difficult to judge where
> the word-boundary spaces are. Moreover, several pairs of Cherokee symbols are
> visually very alike, which, coupled with the differences between the fonts,
> makes character recognition quite a challenge. So I wouldn't be
> surprised if the accuracy of the 2009 CNT text download is as low as 85% (a
> mere subjective guess).
> (b) Although many words will yield good edit distance scores (dewi = David,
> equahami = Abraham), there will be several proper names or titles in which
> the Cherokee is closer to a translation of the meaning of the original Greek
> word. (tsisa = Jesus, galonedv = Christ).
Is there an available (and proper-name-tagged!) version of the Bible
in a sister tongue to Cherokee that we could use as the basis for
comparisons? "David" -> "dewi" seems a pretty distant comparison that
is far more likely to yield issues than if we have a sister tongue
where "dawi" or what have you is already marked as a proper name.
Having such a related language would greatly enhance the accuracy of
this portion of the work.
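
To make the edit distance idea concrete, here is a minimal sketch in
Python. It assumes plain Levenshtein distance with unit costs, applied to
lowercased Latin transliterations. The function names and the
normalisation by the longer word's length are only my illustration; they
are not anything that exists in SWORD.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions cost 1."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical score in [0, 1]; 1.0 means identical ignoring case."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


if __name__ == "__main__":
    # Word pairs taken from the quoted message above.
    for cherokee, english in [("dewi", "David"), ("equahami", "Abraham"),
                              ("tsisa", "Jesus"), ("galonedv", "Christ")]:
        print(cherokee, english, round(similarity(cherokee, english), 2))
```

Under this sketch "dewi" against "David" comes out at a distance of 3, so
a similarity of 0.4, which is why I call it a distant comparison. Any
cut-off for accepting a match would have to be tuned by hand.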
> An example of a missing space is in Mark 1:1 which reads,
> adalenisgv yisdv kanohedv, tsisagalonedv unelanvhi uwetsi utseliga.
> There should be a space between tsisa & galonedv.
> The word capitalization task is therefore a huge challenge.
> It may not be worth even starting it until the proofreading accuracy is much
> closer to 100%.
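
One cheap check for run-together words like the one in Mark 1:1, sketched
in Python: flag any token that is not in a word list but splits cleanly
into two words that are. The word list here is a hypothetical stand-in
(it might be built from a dictionary or from the rest of the text), and
the function names are made up for illustration.

```python
def split_candidates(token: str, vocabulary: set[str]) -> list[tuple[str, str]]:
    """Return every way to split token into two words found in vocabulary."""
    return [(token[:i], token[i:])
            for i in range(1, len(token))
            if token[:i] in vocabulary and token[i:] in vocabulary]


def find_run_together(verse: str, vocabulary: set[str]) -> dict[str, list[tuple[str, str]]]:
    """Map each unknown token in verse to its possible two-word splits."""
    suspects = {}
    for raw in verse.split():
        token = raw.strip(",.;:").lower()  # drop surrounding punctuation
        if token and token not in vocabulary:
            splits = split_candidates(token, vocabulary)
            if splits:
                suspects[token] = splits
    return suspects


if __name__ == "__main__":
    # Hypothetical word list, just large enough to cover Mark 1:1.
    vocabulary = {"adalenisgv", "yisdv", "kanohedv", "tsisa", "galonedv",
                  "unelanvhi", "uwetsi", "utseliga"}
    verse = ("adalenisgv yisdv kanohedv, tsisagalonedv unelanvhi "
             "uwetsi utseliga.")
    print(find_run_together(verse, vocabulary))
```

This only proposes candidates for a human proofreader. It cannot tell a
genuine compound from a missing space, and it is only as good as the word
list behind it.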
> It will help, however, to know that there is a Cherokee-English dictionary
> available online.
> See http://wehali.com/tsalagi/
> And there are several other websites with useful resources for the Cherokee
> See http://www.native-languages.org/cherokee.htm
> But now I am running far ahead of my original musings, as to go down this
> route would require someone with real competence in speaking and writing the
> Cherokee language.