[sword-devel] New Accented Greek NT with Morph

Troy A. Griffitts scribe at crosswire.org
Wed Apr 20 13:04:07 MST 2005


Hey guys,
         I've spent some time cleaning up a module submitted by David 
(dnr at crosswire dot org) which uses the base Westcott-Hort Accented 
GNT from CCEL and merges in the morphology tags from Maurice Robinson's 
WHNU text (our WHNU module).  The result is an OSIS module that is fully 
UTF-8 Accented Greek NT with Morphology.  I'm really excited about this 
and it has taken me way too long to process this work (sorry guys).  The 
only thing keeping this module from being the ULTIMATE replacement for 
our WHNU module is the lack of Nestle-Aland/UBS variants against the WH 
(the 'NU' part of our current WHNU module).  Without these variants, we 
still cannot produce the Greek text which is the predominant base text 
used for all modern Bible translation work.

But it's still really cool! :)

Now, having said all this, we still have problems with the current module.

     o Oddly, Unicode Greek encoding is not very standard.  With Hebrew, 
everyone expected the extra work of composing consonants, vowels, 
accents, etc., and that work has already been done (well, mostly).  With 
Greek, there is a whole "Greek Extended" Unicode range defined 
containing precomposed characters.  Some renderers want characters 
precomposed; others like to do the composing themselves.
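
     To make the precomposed/decomposed distinction concrete, here is a 
minimal sketch using Python's standard unicodedata module (not SWORD or 
ICU code; the describe() helper is just a hypothetical name for 
illustration).  It takes one character from the "Greek Extended" range 
and shows the same letter as a base character plus a combining mark.

```python
import unicodedata

# U+1F00 GREEK SMALL LETTER ALPHA WITH PSILI, from the "Greek Extended" block
precomposed = "\u1f00"

# NFD splits it into a base letter followed by a combining mark
decomposed = unicodedata.normalize("NFD", precomposed)


def describe(text):
    """List each code point in text with its Unicode name (illustrative helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


for label, form in (("precomposed", precomposed), ("decomposed", decomposed)):
    print(label, describe(form))

# Both forms should render identically, yet they are different strings
print(precomposed == decomposed)
```

     A renderer that expects one form and receives the other may draw the 
accent in the wrong place or not at all, even though the two strings are 
canonically equivalent.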

     This issue makes things a little problematic.  Most resources 
(including the ICU Unicode library) claim that the canonical normal form 
for Greek is precomposed (NFC), and my Firefox browser under Linux looks 
great showing precomposed characters.  IE running on _stock_ XP looks 
horrible.  If a webpage has precomposed Greek characters and someone 
enters a search string in decomposed characters, the two obviously will 
not match unless someone behind the curtain is being smart about things. 
We have the necessary filters in place to handle this, but we need to 
think about the best choice: a) strip all accents before searching; or 
b) NFC both the search string and the text before searching.
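
     The two search options can be sketched as follows.  This is only an 
illustration with Python's standard unicodedata module, not our actual 
filters; the function names (strip_accents, find_stripped, find_nfc) are 
made up for the example, and the sample text is the opening of John 1:1.

```python
import unicodedata


def strip_accents(text):
    """Option (a): decompose, then drop every combining mark."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def nfc(text):
    """Option (b): bring text to the precomposed canonical form."""
    return unicodedata.normalize("NFC", text)


def find_stripped(needle, haystack):
    # Accents are ignored on both sides, so matching is loose
    return strip_accents(needle) in strip_accents(haystack)


def find_nfc(needle, haystack):
    # Accents are kept, but both sides use the same composition
    return nfc(needle) in nfc(haystack)


# Page text stored precomposed, query typed decomposed
text = unicodedata.normalize("NFC", "ἐν ἀρχῇ ἦν ὁ λόγος")
query = unicodedata.normalize("NFD", "ἀρχῇ")

print(query in text)                  # naive comparison of raw strings
print(find_nfc(query, text))          # option (b)
print(find_stripped("αρχη", text))    # option (a), unaccented query
print(find_nfc("λογος", text))        # option (b), unaccented query
```

     The trade-off: option (a) lets a user who types no accents at all 
still find the word, but it also treats words that differ only by accent 
or breathing as the same.  Option (b) keeps those distinctions but 
requires the user to enter the accents correctly.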

     I've spent some time making 3 Bibles available on our site: 1) 
unaccented; 2) accented precomposed; 3) accented decomposed.

     Here is a link which should show all 3 in parallel (you can click 
on words for definitions if you'd like :)   ).

http://crosswire.org/study/parallelstudy.jsp?add=WHNU&add=WHAC&add=WHACD

     We've specified in the HTML that the encoding is UTF-8 so all 
browsers have a fighting chance :)

     If you have a chance, could you please spend some time trying this 
link with your browser and report your results and configuration AND 
ANYTHING YOU DO (with fonts or otherwise) that improves your viewing of 
the accented Bibles.

	Thanks to everyone who has contributed.  I'm excited about this new 
work!

		-Troy.



