[sword-devel] usfm2osis.py

Mon Aug 6 00:31:52 MST 2012

On 8/6/2012 12:01 AM, David Haslam wrote:
> Further to my last reply, I think we can safely assume that we are more
> likely to process *Chinese* text
> than any of the scripts that require characters from the *Supplementary
> Multilingual Plane*.
>
> Range	Block	Code Points
> 10000..1007F	Linear B Syllabary	128
> 10080..100FF	Linear B Ideograms	128

Some fancy Greek dictionaries could certainly include Linear B.

> 10300..1032F	Old Italic	48

No biblical material here, but fancy Latin dictionaries could certainly 
include Old Italic characters. And there are some interesting and 
extensive texts in Oscan & Umbrian, though they are not necessarily 
worth our effort to encode. (My only contribution of new codepoints to 
Unicode thus far 
has been in this block.)

> 10380..1039F	Ugaritic	32

I could definitely see us encoding some Ugaritic texts. It's an 
important language for comparison of early Judaism to its local context 
and even for better understanding some elements of OT Hebrew.

> 10400..1044F	Deseret	80

Eh... Book of Mormon made less legible?

> 10840..1085F	Imperial Aramaic	32
> 10900..1091F	Phoenician	32

Hebrew dictionaries, etc.

> 10B00..10B3F	Avestan	64

I would totally encode the Avesta if I could find a good PD source and 
figure out how to un-transliterate it.

> 12000..123FF	Cuneiform	1,024
> 12400..1247F	Cuneiform Numbers and Punctuation	128

There are lots of worthwhile Sumerian, Akkadian, & Hittite documents in 
this script. And Hebrew dictionaries could definitely incorporate 
Akkadian, and possibly Sumerian and Hittite.

> 16F00..16F9F	Miao	160

This script was actually invented for Bibles.
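
For anyone who wants to check whether a given text actually uses any of the blocks listed above, here is a minimal sketch. It is not part of usfm2osis.py. The function name smp_blocks_used and the SMP_BLOCKS table are hypothetical, the ranges are copied from the quoted list, and it assumes Python 3, where each supplementary character is a single code point in a str.

```python
# Block ranges (inclusive) copied from the list quoted in this thread.
# SMP_BLOCKS and smp_blocks_used are illustrative names, not usfm2osis.py API.
SMP_BLOCKS = [
    (0x10000, 0x1007F, "Linear B Syllabary"),
    (0x10080, 0x100FF, "Linear B Ideograms"),
    (0x10300, 0x1032F, "Old Italic"),
    (0x10380, 0x1039F, "Ugaritic"),
    (0x10400, 0x1044F, "Deseret"),
    (0x10840, 0x1085F, "Imperial Aramaic"),
    (0x10900, 0x1091F, "Phoenician"),
    (0x10B00, 0x10B3F, "Avestan"),
    (0x12000, 0x123FF, "Cuneiform"),
    (0x12400, 0x1247F, "Cuneiform Numbers and Punctuation"),
    (0x16F00, 0x16F9F, "Miao"),
]


def smp_blocks_used(text):
    """Count characters in text that fall in each listed block.

    Returns a dict mapping block name to character count; blocks with
    no characters in the text are omitted.
    """
    counts = {}
    for ch in text:
        cp = ord(ch)  # assumes Python 3: one str element per code point
        for lo, hi, name in SMP_BLOCKS:
            if lo <= cp <= hi:
                counts[name] = counts.get(name, 0) + 1
                break
    return counts
```

Running smp_blocks_used over the contents of a source file would show at a glance whether any of these scripts occur in it.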

--Chris