[sword-devel] Case sensitivity in xml tags

DM Smith dmsmith at crosswire.org
Sat Feb 28 16:37:06 MST 2009


On Feb 28, 2009, at 4:23 AM, Eeli Kaikkonen wrote:

> Greg Hellings wrote:
>> I was speaking in IRC with Troy and others, and since all of those
>> comparisons were for XML tags or module names, they actually don't
>> have to be case insensitive, since XML is case sensitive for its tag
>> names, etc.  On the other hand, the module name shouldn't be done
>> with stricmp/strcasecmp anyway, since it needs to be properly UTF-8
>> aware (and, theoretically, so should the XML comparison).  He
>> suggested to use something like toUpper(SWBuf(str)) == "MYUTF8STRING"
>> which would get rid of the entire reliance on the system-specific
>> case-insensitive comparison function and could be used for either the
>> XML tags or the Module names.  But yes, using stricmp from the SWORD
>> headers would be a replacement that does exactly the same as what we
>> do at the moment.
>
> Is it guaranteed that the XML in the modules is always lower (upper)
> case, or does e.g. sword::XMLTag::getName() always return a lower
> (upper) case string? If not, we need a case-insensitive
> comparison. Unless I have misunderstood something.
>
> For example, we use "strcasecmp(tag.getName(), "foreign")". It works  
> with case sensitive comparison only if the tag name "foreign" is  
> always lower case in all modules.

For OSIS to be valid, tag names must be lower case.
For ThML to be valid against a strict XML Voyager DTD, tag names must
be lower case. It is possible to validate against a less strict Voyager
DTD. The additional ThML elements are always lower case.

That said, our importers do very little, if anything, to ensure that
any tag (except the ones they look for) is lower case.

I think that both Chris and I validate input against a strict XML DTD.
In the CrossWire modules, I don't remember seeing any that aren't
strict XML.
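
To make the trade-off concrete, here is a minimal sketch of the two kinds of comparison being discussed. It deliberately does not use the SWORD API: the helper names tagIs and tagIsIgnoreAsciiCase are made up for illustration, and the second one only folds ASCII letters, so it is not UTF-8 aware and would not address the module-name concern raised above.

```cpp
#include <cstring>

// Hypothetical helper: exact, case-sensitive match. This is enough when
// the markup is known to be valid lower-case XML (as valid OSIS is).
bool tagIs(const char *name, const char *expected) {
    return std::strcmp(name, expected) == 0;
}

// Fold a single ASCII upper-case letter to lower case; every other byte
// (including bytes of multi-byte UTF-8 sequences) is returned unchanged.
static char asciiLower(char c) {
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
}

// Hypothetical helper: ASCII-only case-insensitive match, a portable
// stand-in for stricmp/strcasecmp. It is NOT UTF-8 aware.
bool tagIsIgnoreAsciiCase(const char *name, const char *expected) {
    for (; *name && *expected; ++name, ++expected) {
        if (asciiLower(*name) != asciiLower(*expected)) {
            return false;
        }
    }
    // The loop stops at the first terminator; the strings match only if
    // both ended at the same position.
    return *name == *expected;
}
```

With these, tagIs("Foreign", "foreign") is false while tagIsIgnoreAsciiCase("Foreign", "foreign") is true, so the case-sensitive version is only safe if every module really does use lower-case tag names.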

Hope that helps,
	DM


