xml:lang="" - empty or optional language IDs

In the after-hours discussion on 25 October, DanC indicated a preference for
"An untyped literal is a string optionally combined with a language identifier"
over my "... a string combined with a (possibly empty) language identifier".

His rationale is a desire for semantic interoperability between xsd:string and
untyped literals.

i.e.

<rdf:Description>
  <eg:prop>foo</eg:prop>
</rdf:Description>

entails, and is entailed by

<rdf:Description>
  <eg:prop rdf:datatype="&xsd;string">foo</eg:prop>
</rdf:Description>

I have rejected this proposal, subject to WG review, for the following 
reasons.

+ <a href="http://www.w3.org/XML/xml-V10-2e-errata#E41">XML erratum 41</a> 
permits xml:lang="" and clearly indicates that this is equivalent to having no 
xml:lang at all. Hence at some point we need to merge these into a single 
case. I find it more elegant for that single case to be "" rather than a 
missing value, for reasons of uniformity.

+ When doing language-specific processing on language tags, the general 
algorithm indicated by RFC 3066 is that one should try to match prefixes, 
e.g. if you are looking for en-US and you find en-AU, then that is a better 
match than finding fr. The empty string fits with this algorithm, whereas an 
optional identifier does not.

+ I think DanC's proposal makes adding an xml:lang to the rdf:RDF element, or 
to an enclosing XML document (not an uncommon practice), change the meaning 
too severely. I suspect we may get bug reports along the lines of "this RDF 
works here, but it doesn't work there" as people copy and paste RDF between 
different enclosing XML documents.
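
As a rough illustration of the prefix-matching point above, here is a minimal
Python sketch. The helper names (subtags, shared_prefix_len, best_match) are
hypothetical, and the scoring is a simplification of the matching that RFC 3066
describes, not a transcription of it. It treats the empty language identifier
as a tag with no subtags, so it needs no special case.

```python
from typing import List, Sequence


def subtags(tag: str) -> List[str]:
    """Split a language tag into lowercase subtags (assumed helper)."""
    # "".split("-") gives [""], which the filter reduces to an empty list,
    # so the empty language identifier simply has zero subtags.
    return [part for part in tag.lower().split("-") if part]


def shared_prefix_len(wanted: str, found: str) -> int:
    """Count the leading subtags that the two tags have in common."""
    count = 0
    for a, b in zip(subtags(wanted), subtags(found)):
        if a != b:
            break
        count += 1
    return count


def best_match(wanted: str, candidates: Sequence[str]) -> str:
    """Pick the candidate sharing the longest subtag prefix with `wanted`."""
    # max() returns the first candidate that reaches the highest score.
    return max(candidates, key=lambda tag: shared_prefix_len(wanted, tag))
```

With these definitions, best_match("en-US", ["fr", "en-AU", ""]) picks "en-AU":
it shares one subtag with en-US, while "fr" and "" both share none. If the
language identifier were optional instead, each helper would need an explicit
check for the missing value before it could split the tag.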

Jeremy

Received on Saturday, 26 October 2002 00:21:21 UTC