[sword-devel] OSIS validation surprises

Chris Little chrislit at crosswire.org
Mon Jul 7 11:17:18 MST 2008


Daniel Owens wrote:
> I am working on usfm2osis.pl still, and I am trying to validate the 
> output. The foreign element is giving me problems.
> 
> vbu26.out.xml:65535: element foreign: Schemas validity error : Element 
> '{http://www.bibletechnologies.net/2003/OSIS/namespace}foreign', 
> attribute '{http://www.w3.org/XML/1998/namespace}lang': 'arc' is not a 
> valid value of the local union type.
> 
> This is the text it hiccups on: <foreign xml:lang="arc">Ê‑li, Ê‑li, 
> lam-ma-sa-bách-ta-ni?</foreign>.
> 
> That's straight from the OSIS™ 2.1.1 User's Manual (draft). Am I missing 
> something?

Are you using libxml2 (e.g. xmllint) for validation? I've seen this 
error pop up for values that are very clearly correct when validating 
with xmllint. The same markup validates perfectly fine using Xerces.

There may be a minor complication, in that we're in something of a 
transition period when it comes to language tags. xml:lang is defined 
as employing RFC 3066 "or its successor". The best current practice that 
defines which RFC should be used for language tags is BCP 47, which 
currently points to RFCs 4646 & 4647. RFC 4646 still identifies ISO 
639-1 and -2/T as its authoritative sources for ISO 639 language codes, 
though it makes certain provisions for the future integration of ISO 
639-3 (and later). However, RFC 4646 also establishes a language subtag 
registry, to be maintained by IANA, which does incorporate ISO 639-3 codes.

Validation _should_ now refer to the IANA language subtag registry 
itself, which lives here: 
http://www.iana.org/assignments/language-subtag-registry
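
As a rough illustration of what such a registry check amounts to, here 
is a minimal Python sketch. It is not how xmllint or Xerces work 
internally. SAMPLE_REGISTRY is an abbreviated, hand-written sample in 
the registry's record layout (records separated by "%%" lines), not a 
copy of the live file, and the function names parse_registry and 
is_registered_language are made up for this example. It only looks at 
the primary language subtag and ignores ranges, grandfathered tags and 
private-use tags.

```python
# Abbreviated, hand-written sample in the registry's record layout.
# Assumption: real records carry more fields; only Type and Subtag
# matter for this sketch.
SAMPLE_REGISTRY = """\
Type: language
Subtag: arc
Description: Aramaic
%%
Type: language
Subtag: vi
Description: Vietnamese
%%
Type: script
Subtag: Latn
Description: Latin
"""


def parse_registry(text):
    """Return a dict mapping record type to a set of lowercase subtags."""
    subtags = {}
    fields = {}

    def flush():
        # Store the finished record if it names both a type and a subtag.
        if "Type" in fields and "Subtag" in fields:
            subtags.setdefault(fields["Type"], set()).add(
                fields["Subtag"].lower()
            )
        fields.clear()

    for line in text.splitlines():
        if line.strip() == "%%":
            flush()  # a "%%" line ends the current record
            continue
        # Skip blank lines and folded continuation lines (leading space).
        if not line or line[0].isspace() or ":" not in line:
            continue
        name, _, value = line.partition(":")
        # Keep the first value seen for a repeated field name.
        fields.setdefault(name.strip(), value.strip())
    flush()  # the last record has no trailing "%%"
    return subtags


def is_registered_language(tag, subtags):
    """Check only the primary language subtag of a tag like 'vi-VN'."""
    primary = tag.split("-")[0].lower()
    return primary in subtags.get("language", set())


registry = parse_registry(SAMPLE_REGISTRY)
print(is_registered_language("arc", registry))
```

To check against real data you would feed the downloaded registry text 
to parse_registry in place of the sample.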

However, given how many links in the chain have been updated in the last 
couple of years, it's quite possible that some validators haven't yet 
been updated to RFC 4646 or haven't picked up the latest registry data.

That said, there should still be no problem with "arc" because it was 
defined as part of ISO 639-2/T and was valid in RFC 3066.

In short: Ignore your validator. It's wrong.

--Chris



