[sword-svn] r243 - trunk/locales

chrislit at crosswire.org
Tue Nov 10 02:48:08 MST 2009


Author: chrislit
Date: 2009-11-10 02:48:08 -0700 (Tue, 10 Nov 2009)
New Revision: 243

Added:
   trunk/locales/readme.txt
   trunk/locales/updateFiles.pl
Removed:
   trunk/locales/iso15924-utf8-20090601.txt
Log:
Added an updater script (updateFiles.pl) and a readme giving the order of precedence for the various standards that BCP 47 cites.

Deleted: trunk/locales/iso15924-utf8-20090601.txt
===================================================================
--- trunk/locales/iso15924-utf8-20090601.txt	2009-11-10 08:34:41 UTC (rev 242)
+++ trunk/locales/iso15924-utf8-20090601.txt	2009-11-10 09:48:08 UTC (rev 243)
@@ -1,143 +0,0 @@
-#
-# ISO 15924 - Codes for the representation of names of scripts
-#             Codes pour la représentation des noms d’écritures
-# Format: 
-#             Code;N°;English Name;Nom français;PVA;Date
-#
-
-Arab;160;Arabic;arabe;Arabic;2004-05-01
-Armi;124;Imperial Aramaic;araméen impérial;Imperial_Aramaic;2009-06-01
-Armn;230;Armenian;arménien;Armenian;2004-05-01
-Avst;134;Avestan;avestique;Avestan;2009-06-01
-Bali;360;Balinese;balinais;Balinese;2006-10-10
-Bamu;435;Bamum;bamoum;Bamum;2009-06-01
-Batk;365;Batak;batak;;2004-05-01
-Beng;325;Bengali;bengalî;Bengali;2004-05-01
-Blis;550;Blissymbols;symboles Bliss;;2004-05-01
-Bopo;285;Bopomofo;bopomofo;Bopomofo;2004-05-01
-Brah;300;Brahmi;brâhmî;;2004-05-01
-Brai;570;Braille;braille;Braille;2004-05-01
-Bugi;367;Buginese;bouguis;Buginese;2006-06-21
-Buhd;372;Buhid;bouhide;Buhid;2004-05-01
-Cakm;349;Chakma;chakma;;2007-11-26
-Cans;440;Unified Canadian Aboriginal Syllabics;syllabaire autochtone canadien unifié;Canadian_Aboriginal;2004-05-29
-Cari;201;Carian;carien;Carian;2007-07-02
-Cham;358;Cham;cham (čam, tcham);;2004-05-01
-Cher;445;Cherokee;tchérokî;Cherokee;2004-05-01
-Cirt;291;Cirth;cirth;;2004-05-01
-Copt;204;Coptic;copte;Coptic;2006-06-21
-Cprt;403;Cypriot;syllabaire chypriote;Cypriot;2004-05-01
-Cyrl;220;Cyrillic;cyrillique;Cyrillic;2004-05-01
-Cyrs;221;Cyrillic (Old Church Slavonic variant);cyrillique (variante slavonne);;2004-05-01
-Deva;315;Devanagari (Nagari);dévanâgarî;Devanagari;2004-05-01
-Dsrt;250;Deseret (Mormon);déseret (mormon);Deseret;2004-05-01
-Egyd;070;Egyptian demotic;démotique égyptien;;2004-05-01
-Egyh;060;Egyptian hieratic;hiératique égyptien;;2004-05-01
-Egyp;050;Egyptian hieroglyphs;hiéroglyphes égyptiens;Egyptian_Hierogyphs;2009-06-01
-Ethi;430;Ethiopic (Geʻez);éthiopien (geʻez, guèze);Ethiopic;2004-10-25
-Geor;240;Georgian (Mkhedruli);géorgien (mkhédrouli);Georgian;2004-05-29
-Geok;241;Khutsuri (Asomtavruli and Nuskhuri);khoutsouri (assomtavrouli et nouskhouri);;2006-12-11
-Glag;225;Glagolitic;glagolitique;Glagolitic;2006-06-21
-Goth;206;Gothic;gotique;Gothic;2004-05-01
-Grek;200;Greek;grec;Greek;2004-05-01
-Gujr;320;Gujarati;goudjarâtî (gujrâtî);Gujarati;2004-05-01
-Guru;310;Gurmukhi;gourmoukhî;Gurmukhi;2004-05-01
-Hang;286;Hangul (Hangŭl, Hangeul);hangûl (hangŭl, hangeul);Hangul;2004-05-29
-Hani;500;Han (Hanzi, Kanji, Hanja);idéogrammes han (sinogrammes);Han;2009-02-23
-Hano;371;Hanunoo (Hanunóo);hanounóo;Hanunoo;2004-05-29
-Hans;501;Han (Simplified variant);idéogrammes han (variante simplifiée);;2004-05-29
-Hant;502;Han (Traditional variant);idéogrammes han (variante traditionnelle);;2004-05-29
-Hebr;125;Hebrew;hébreu;Hebrew;2004-05-01
-Hira;410;Hiragana;hiragana;Hiragana;2004-05-01
-Hmng;450;Pahawh Hmong;pahawh hmong;;2004-05-01
-Hrkt;412;(alias for Hiragana + Katakana);(alias pour hiragana + katakana);Katakana_Or_Hiragana;2004-05-01
-Hung;176;Old Hungarian;ancien hongrois;;2004-05-01
-Inds;610;Indus (Harappan);indus;;2004-05-01
-Ital;210;Old Italic (Etruscan, Oscan, etc.);ancien italique (étrusque, osque, etc.);Old_Italic;2004-05-29
-Java;361;Javanese;javanais;Javanese;2009-06-01
-Jpan;413;Japanese (alias for Han + Hiragana + Katakana);japonais (alias pour han + hiragana + katakana);;2006-06-21
-Kali;357;Kayah Li;kayah li;Kayah_Li;2007-07-02
-Kana;411;Katakana;katakana;Katakana;2004-05-01
-Khar;305;Kharoshthi;kharochthî;Kharoshthi;2006-06-21
-Khmr;355;Khmer;khmer;Khmer;2004-05-29
-Knda;345;Kannada;kannara (canara);Kannada;2004-05-29
-Kore;287;Korean (alias for Hangul + Han);coréen (alias pour hangûl + han);;2007-06-13
-Kthi;317;Kaithi;kaithî;Kaithi;2009-06-01
-Lana;351;Tai Tham (Lanna);taï tham (lanna);Tai_Tham;2009-06-01
-Laoo;356;Lao;laotien;Lao;2004-05-01
-Latf;217;Latin (Fraktur variant);latin (variante brisée);;2004-05-01
-Latg;216;Latin (Gaelic variant);latin (variante gaélique);;2004-05-01
-Latn;215;Latin;latin;Latin;2004-05-01
-Lepc;335;Lepcha (Róng);lepcha (róng);Lepcha;2007-07-02
-Limb;336;Limbu;limbou;Limbu;2004-05-29
-Lina;400;Linear A;linéaire A;;2004-05-01
-Linb;401;Linear B;linéaire B;Linear_B;2004-05-29
-Lisu;399;Lisu (Fraser);lisu (Fraser);Lisu;2009-06-01
-Lyci;202;Lycian;lycien;Lycian;2007-07-02
-Lydi;116;Lydian;lydien;Lydian;2007-07-02
-Mand;140;Mandaic, Mandaean;mandéen;;2007-07-15
-Mani;139;Manichaean;manichéen;;2007-07-15
-Maya;090;Mayan hieroglyphs;hiéroglyphes mayas;;2004-05-01
-Mero;100;Meroitic;méroïtique;;2004-05-01
-Mlym;347;Malayalam;malayâlam;Malayalam;2004-05-01
-Moon;218;Moon (Moon code, Moon script, Moon type);écriture Moon;;2006-12-11
-Mong;145;Mongolian;mongol;Mongolian;2004-05-01
-Mtei;337;Meitei Mayek (Meithei, Meetei);meitei mayek;Meetei_Mayek;2009-06-01
-Mymr;350;Myanmar (Burmese);birman;Myanmar;2004-05-01
-Nkgb;420;Nakhi Geba ('Na-'Khi ²Ggŏ-¹baw, Naxi Geba);nakhi géba;;2009-02-23
-Nkoo;165;N’Ko;n’ko;Nko;2006-10-10
-Ogam;212;Ogham;ogam;Ogham;2004-05-01
-Olck;261;Ol Chiki (Ol Cemet’, Ol, Santali);ol tchiki;Ol_Chiki;2007-07-02
-Orkh;175;Old Turkic, Orkhon Runic;orkhon;Old_Turkic;2009-06-01
-Orya;327;Oriya;oriyâ;Oriya;2004-05-01
-Osma;260;Osmanya;osmanais;Osmanya;2004-05-01
-Perm;227;Old Permic;ancien permien;;2004-05-01
-Phag;331;Phags-pa;’phags pa;Phags_Pa;2006-10-10
-Phli;131;Inscriptional Pahlavi;pehlevi des inscriptions;Inscriptional_Pahlavi;2009-06-01
-Phlp;132;Psalter Pahlavi;pehlevi des psautiers;;2007-11-26
-Phlv;133;Book Pahlavi;pehlevi des livres;;2007-07-15
-Phnx;115;Phoenician;phénicien;Phoenician;2006-10-10
-Plrd;282;Miao (Pollard);miao (Pollard);;2009-02-23
-Prti;130;Inscriptional Parthian;parthe des inscriptions;Inscriptional_Parthian;2009-06-01
-Qaaa;900;Reserved for private use (start);réservé à l’usage privé (début);;2004-05-29
-Qabx;949;Reserved for private use (end);réservé à l’usage privé (fin);;2004-05-29
-Rjng;363;Rejang (Redjang, Kaganga);redjang (kaganga);Rejang;2009-02-23
-Roro;620;Rongorongo;rongorongo;;2004-05-01
-Runr;211;Runic;runique;Runic;2004-05-01
-Samr;123;Samaritan;samaritain;Samaritan;2009-06-01
-Sara;292;Sarati;sarati;;2004-05-29
-Sarb;105;Old South Arabian;sud-arabique, himyarite;Old_South_Arabian;2009-06-01
-Saur;344;Saurashtra;saurachtra; Saurashtra;2007-07-02
-Sgnw;095;SignWriting;SignÉcriture, SignWriting;;2006-10-10
-Shaw;281;Shavian (Shaw);shavien (Shaw);Shavian;2004-05-01
-Sinh;348;Sinhala;singhalais;Sinhala;2004-05-01
-Sund;362;Sundanese;sundanais;Sundanese;2007-07-02
-Sylo;316;Syloti Nagri;sylotî nâgrî;Syloti_Nagri;2006-06-21
-Syrc;135;Syriac;syriaque;Syriac;2004-05-01
-Syre;138;Syriac (Estrangelo variant);syriaque (variante estranghélo);;2004-05-01
-Syrj;137;Syriac (Western variant);syriaque (variante occidentale);;2004-05-01
-Syrn;136;Syriac (Eastern variant);syriaque (variante orientale);;2004-05-01
-Tagb;373;Tagbanwa;tagbanoua;Tagbanwa;2004-05-01
-Tale;353;Tai Le;taï-le;Tai_Le;2004-10-25
-Talu;354;New Tai Lue;nouveau taï-lue;New_Tai_Lue;2006-06-21
-Taml;346;Tamil;tamoul;Tamil;2004-05-01
-Tavt;359;Tai Viet;taï viêt;Tai_Viet;2009-06-01
-Telu;340;Telugu;télougou;Telugu;2004-05-01
-Teng;290;Tengwar;tengwar;;2004-05-01
-Tfng;120;Tifinagh (Berber);tifinagh (berbère);Tifinagh;2006-06-21
-Tglg;370;Tagalog (Baybayin, Alibata);tagal (baybayin, alibata);Tagalog;2009-02-23
-Thaa;170;Thaana;thâna;Thaana;2004-05-01
-Thai;352;Thai;thaï;Thai;2004-05-01
-Tibt;330;Tibetan;tibétain;Tibetan;2004-05-01
-Ugar;040;Ugaritic;ougaritique;Ugaritic;2004-05-01
-Vaii;470;Vai;vaï;Vai;2007-07-02
-Visp;280;Visible Speech;parole visible;;2004-05-01
-Xpeo;030;Old Persian;cunéiforme persépolitain;Old_Persian;2006-06-21
-Xsux;020;Cuneiform, Sumero-Akkadian;cunéiforme suméro-akkadien;Cuneiform;2006-10-10
-Yiii;460;Yi;yi;Yi;2004-05-01
-Zinh;994;Code for inherited script;codet pour écriture héritée;Inherited;2009-02-23
-Zmth;995;Mathematical notation;notation mathématique;;2007-11-26
-Zsym;996;Symbols;symboles;;2007-11-26
-Zxxx;997;Code for unwritten documents;codet pour les documents non écrites;;2007-06-13
-Zyyy;998;Code for undetermined script;codet pour écriture indéterminée;Common;2004-05-29
-Zzzz;999;Code for uncoded script;codet pour écriture non codée;Unknown;2006-10-10
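
Note on the record format of the file removed above: its header gives the layout as Code;N°;English Name;Nom français;PVA;Date, with `#` comment lines and empty PVA fields allowed. A minimal sketch of reading one such record in Perl follows. The sub name `parse_iso15924_line` and the hash keys are illustrative assumptions, not part of this commit, and the sketch expects the raw file rather than the `-`-prefixed diff lines.

```perl
use strict;
use warnings;

# Hypothetical helper (not in this commit): parse one line of the
# ISO 15924 text file into a hash reference. Returns undef in scalar
# context for comment lines (starting with '#') and blank lines.
sub parse_iso15924_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;                  # drop the line ending
    return if $line =~ /^\s*(?:#|$)/;      # skip comments and blanks
    # A limit of -1 keeps empty fields, e.g. a missing PVA.
    my ($code, $number, $english, $french, $pva, $date)
        = split /;/, $line, -1;
    return {
        code    => $code,
        number  => $number,
        english => $english,
        french  => $french,
        pva     => $pva,
        date    => $date,
    };
}

my $rec = parse_iso15924_line("Latn;215;Latin;latin;Latin;2004-05-01\n");
print "$rec->{code} = $rec->{english} (no. $rec->{number})\n";
```

The number is kept as a string so that leading zeros (as in `Ugar;040`) survive.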

Added: trunk/locales/readme.txt
===================================================================
--- trunk/locales/readme.txt	                        (rev 0)
+++ trunk/locales/readme.txt	2009-11-10 09:48:08 UTC (rev 243)
@@ -0,0 +1,18 @@
+Order of precedence for standards/files:
+
+Overrides, grandfathered tags:
+1) language-subtag-registry.txt (from IANA: http://www.iana.org/assignments/language-subtag-registry)
+
+Language subtag:
+2) ISO-639-2_utf-8.txt (ISO 639-1, from LoC: http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt)
+3) ISO-639-2_utf-8.txt (ISO 639-2/T, from LoC: http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt)
+4) iso-639-3.tab (ISO 639-3, from SIL: http://www.sil.org/iso639-3/download.asp)
+5) iso-639-3_Retirements.tab (ISO 639-3 deprecated, from SIL: http://www.sil.org/iso639-3/download.asp)
+6) iso639-5.pipe.txt (ISO 639-5, from LoC: http://www.loc.gov/standards/iso639-5/iso639-5.pipe.txt)
+
+Script subtag:
+7) iso15924-utf8.txt (ISO 15924, from Unicode: http://unicode.org/iso15924/iso15924.txt.zip)
+
+Region subtag:
+8) iso3166_en_code_lists.txt (from ISO: http://www.iso.org/iso/iso3166_en_code_lists.txt)
+   iso3166_fr_code_lists.txt (from ISO: http://www.iso.org/iso/iso3166_fr_code_lists.txt)
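
Note on the readme above: it does not say how the order of precedence is meant to be applied. One plausible reading is "consult the sources in the numbered order and take the first one that defines the subtag". A minimal sketch under that assumption follows. The sub name `lookup_by_precedence`, the in-memory tables, and the subtags `xx` and `yy` are made up for illustration; real code would fill the tables from the files listed in the readme.

```perl
use strict;
use warnings;

# Hypothetical helper (not in this commit): return (file, value) from the
# first source that defines $subtag, or an empty list if none does.
# Each source is [ file name, hash ref of subtag => value ].
sub lookup_by_precedence {
    my ($subtag, @sources) = @_;
    for my $source (@sources) {
        my ($file, $table) = @$source;
        return ($file, $table->{$subtag}) if exists $table->{$subtag};
    }
    return;    # not found in any source
}

# Placeholder data only; array order encodes precedence (first wins).
my @sources = (
    [ 'language-subtag-registry.txt', { xx => 'name from the registry' } ],
    [ 'ISO-639-2_utf-8.txt', { xx => 'name from ISO 639',
                               yy => 'only in ISO 639' } ],
);

my ($file, $name) = lookup_by_precedence('xx', @sources);
print "$name ($file)\n";
```

Keeping the precedence in the array order means a later change to the readme's ordering would only require reordering the list.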

Added: trunk/locales/updateFiles.pl
===================================================================
--- trunk/locales/updateFiles.pl	                        (rev 0)
+++ trunk/locales/updateFiles.pl	2009-11-10 09:48:08 UTC (rev 243)
@@ -0,0 +1,34 @@
+#!/usr/bin/perl
+
+# This script calls wget, unzip, mv, and rm, so those binaries must be installed.
+
+`wget -N http://www.iana.org/assignments/language-subtag-registry`;
+`mv language-subtag-registry language-subtag-registry.txt`;
+`wget -N http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt`;
+`wget -N http://www.loc.gov/standards/iso639-5/iso639-5.pipe.txt`;
+`wget -N http://unicode.org/iso15924/iso15924.txt.zip`;
+`wget -N http://www.iso.org/iso/iso3166_en_code_lists.txt`;
+`wget -N http://www.iso.org/iso/iso3166_fr_code_lists.txt`;
+
+$ret = `unzip -o iso15924.txt.zip`;
+$ret =~ /(iso15924-utf.+)/ or die "No iso15924-utf* file found in unzip output\n";
+`mv -f \"$1\" \"iso15924-utf8.txt\"`;
+`rm iso15924.txt.zip`;
+
+`wget -N http://www.sil.org/iso639-3/download.asp`;
+open(DL, '<', 'download.asp') or die "Cannot open download.asp: $!\n";
+while (<DL>) {$downloadasp .= $_;}
+close (DL);
+`rm download.asp`;
+
+$downloadasp =~ /Download ISO 639-3 code set <a HREF=\"([^\"]+)\">UTF-8/ or die "Code set link not found in download.asp\n";
+`wget -N "http://www.sil.org/iso639-3/$1"`;
+`mv -f \"$1\" \"iso-639-3.tab\"`;
+
+$downloadasp =~ /Download ISO 639-3 Language Names Index <a HREF=\"([^\"]+)\">UTF-8/ or die "Language Names Index link not found in download.asp\n";
+`wget -N "http://www.sil.org/iso639-3/$1"`;
+`mv -f \"$1\" \"iso-639-3_Name_Index.tab\"`;
+
+$downloadasp =~ /Download <a HREF=\"([^\"]+)\">ISO 639-3 code retirement mappings/ or die "Retirement mappings link not found in download.asp\n";
+`wget -N "http://www.sil.org/iso639-3/$1"`;
+`mv -f \"$1\" \"iso-639-3_Retirements.tab\"`;
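
Note on the link scraping above: Perl leaves `$1` unchanged when a match fails, so an unchecked match can silently reuse the previous capture. Also, wget by default saves a download under the last path component of the URL, so the `mv` source would differ from `$1` if the HREF contained a directory part. A minimal sketch of a checked extraction follows. The sub name `extract_href` and the sample page fragment are made up for illustration; the real download.asp markup may differ.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical helper (not in this commit): return the first capture
# group of $pattern in $html, or die naming what was being looked for.
sub extract_href {
    my ($html, $pattern, $what) = @_;
    my ($href) = $html =~ $pattern
        or die "Could not find link for $what\n";
    return $href;
}

# Made-up page fragment and file name, for illustration only.
my $page = 'Download ISO 639-3 code set '
         . '<a HREF="example/iso-639-3_EXAMPLE.tab">UTF-8</a>';

my $href = extract_href(
    $page,
    qr/Download ISO 639-3 code set <a HREF="([^"]+)">UTF-8/,
    'code set',
);

# Local name wget would use by default: the last path component.
my $local = basename($href);
print "$href -> $local\n";
```

Capturing into a lexical variable instead of relying on `$1` keeps each link independent of the matches that ran before it.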



