Re: language tags in typed RDF literals

From: pat hayes <phayes@ai.uwf.edu>
Date: Fri, 14 Feb 2003 10:22:11 -0600
Message-Id: <p05111b21ba72c38a735a@[]>
To: Brian McBride <bwm@hplb.hpl.hp.com>
Cc: w3c-rdfcore-wg@w3.org

>As I recall:
>   - the WG has considered this.  At the time it made its decision, 
>Pat advocated accepting that lang tags be allowed in the syntax, 
>that it was harmless

Yes, I did say that, didn't I? Sigh.

>and seemed at the time to be important to Patrick who stated that 
>Nokia felt strongly about the issue.
>   - later Patrick reported, after further thought, that whilst 
>Nokia still preferred the lang tag to be allowed in a datatyped 
>literal, they could live without it.

I had forgotten about the Nokia request. But read on.

>Before last call, this is perhaps something we could have 
>considered.  Now it's a change.  Might anyone change their review 
>based on it?  Well, yes, it seemed important to Nokia.

Well, OK, but how about the other option I gave? Leave the lang tags 
alone, don't change the syntax, but just allow as a possibility that 
a datatype might be tag-sensitive. Of course the XSD datatypes 
aren't, and we can say so explicitly, but some datatypes might be. 
Our home-grown one is, for example, so why not allow Nokia (to choose 
an example at random) the freedom to define, say, German decimals, or 
whatever, and then have the same kind of freedom to move tags between 
RDF and the innards of their typed literal strings that we have given 
ourselves for XML? This is technically a change, I admit, but the 
only document it seriously impinges on, I think, is semantics; any 
changes to other documents will simply be deletions of any prose that 
says that rdf:XMLLiteral is 'exceptional' in having lang-tagged pairs 
in its lexical space. And the changes to semantics will mostly be 
deletions of the silly inference rules rdfD0a and b (section 4.3, 
http://www.w3.org/TR/rdf-mt/#dtype_entail ) that I really think we 
would be better off not having.
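
To make the option concrete, here is a minimal sketch in Python of what 
"a datatype might be tag-sensitive" could mean for comparing typed 
literals. It is an illustration only, not anything in the drafts: the 
TAG_SENSITIVE table, the TypedLiteral class, and the "German decimal" 
datatype URI are all hypothetical names, and language tags are compared 
as plain strings without any case normalization.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"
RDF_XMLLITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral"
# Hypothetical datatype, invented purely for illustration.
EX_GERMAN_DECIMAL = "http://example.org/datatypes#germanDecimal"

# Assumption: each datatype declares whether its lexical space consists of
# plain strings (False) or of (string, lang tag) pairs (True).
TAG_SENSITIVE = {
    XSD_DECIMAL: False,
    RDF_XMLLITERAL: True,
    EX_GERMAN_DECIMAL: True,
}


@dataclass(frozen=True)
class TypedLiteral:
    lexical: str
    lang: Optional[str]
    datatype: str


def lexical_key(lit: TypedLiteral) -> Union[str, Tuple[str, Optional[str]]]:
    """Return the part of the literal that the datatype actually looks at."""
    if TAG_SENSITIVE[lit.datatype]:
        return (lit.lexical, lit.lang)  # the tag is part of the lexical form
    return lit.lexical                  # the tag is ignored


def same_literal(a: TypedLiteral, b: TypedLiteral) -> bool:
    """Two typed literals match if datatype and lexical key both match."""
    return a.datatype == b.datatype and lexical_key(a) == lexical_key(b)
```

With this table, an xsd:decimal "1.5" tagged "de" matches an untagged 
"1.5", while two hypothetical German decimals that differ only in their 
tag do not.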

>How important is this, Pat?  Is it worth a second last call?  If it's 
>just a case of extra work in writing the semantics - then I think 
>you've made your bed.  If it has undesirable externally visible 
>consequences, that might be different.

Well, nobody else has complained, admittedly, but I think that 
rdfD0a/b is an undesirable externally visible consequence.  When the 
WG was discussing this stuff I was maybe remiss in not rubbing our 
collective noses in the fact that this would be a consequence of what 
we were deciding. As I recall, we were SOO tired of datatyping that 
nobody wanted any more controversy at the time :-)

Would this change really require a second last call? It would make no 
difference to any current datatyping, it would only be a 
simplification which would make the overall design more coherent. The 
rdfD0 rules would still apply to XSD-typed literals, and we can say 
that and state the rules explicitly, just like we do now, but they 
would not be logically required for all datatypes.
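
As a rough sketch of what "still apply to XSD-typed literals" might look 
like in an engine, the Python below applies a tag-dropping rule only to 
literals whose datatype lies in the XSD namespace. It paraphrases the 
rule as described in this thread (a lang tag in such a literal may be 
removed), not the exact wording of rdfD0a/b; the Literal and Triple 
classes and the function name are hypothetical, and replacing a tag by 
an arbitrary other tag is left out because it would generate unboundedly 
many triples.

```python
from typing import NamedTuple, Optional, Set

XSD_NS = "http://www.w3.org/2001/XMLSchema#"


class Literal(NamedTuple):
    lexical: str
    lang: Optional[str]
    datatype: str


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: object  # either a URI string or a Literal


def drop_xsd_lang_tags(graph: Set[Triple]) -> Set[Triple]:
    """Add, for each lang-tagged XSD-typed literal, the untagged variant.

    Literals typed with any non-XSD datatype are left untouched, so a
    datatype defined elsewhere remains free to treat the tag as significant.
    """
    inferred = set(graph)
    for t in graph:
        o = t.obj
        if (isinstance(o, Literal)
                and o.lang is not None
                and o.datatype.startswith(XSD_NS)):
            inferred.add(t._replace(obj=o._replace(lang=None)))
    return inferred
```

The point of the sketch is only that the rule becomes a property of one 
family of datatypes rather than something every datatype must satisfy.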

Right now it feels to me that we are kind of holding a gun to the 
head of all future datatype definers: they aren't *allowed* to take 
the lang tags into account. And yet we *insist* on having them in the 
syntax. What do we say, if they come back and ask us why?

The issue is not keeping the semantics simpler. I can make the bed 
either way. But I think it has a pea under the mattress right now 
which others will eventually notice.


>At 18:01 11/02/2003 -0600, pat hayes wrote:
>>The current design of RDF literals is needlessly complicated and 
>>kind of silly.  The syntax allows language tags to occur in typed 
>>literals, but in all cases other than rdf:XMLLiteral, these tags 
>>are required to have no meaning, so the semantics is obliged to 
>>provide a valid inference rule which allows any language tag in any 
>>such typed literal to be removed or replaced by any other.  This 
>>considerably complicates the statement of the semantics, adds a 
>>burden to any implementation, nullifies the implicit design 
>>principle that literals can be compared for identity using simple 
>>lexical matching (since an engine is required to strip out all such 
>>lang tags while performing inferences or checking for identity), 
>>and provides no useful expressive function.
>>A related point is that the requirement in the semantics that 
>>datatypes other than rdf:XMLLiteral *must* ignore language tags 
>>seems to restrict possible future datatyping proposals needlessly.
>>I suggest therefore that
>>(1) lang tags be forbidden by the RDF syntax from appearing in 
>>non-XML typed literals.
>>(2) the notion of the lexical space of a datatype be generalized to 
>>allow (not require)  lang tags to be taken into consideration by a 
>>datatype, so that the lexical space may be a set of strings or 
>>pairs of strings, i.e. a set of simple literals. This would have 
>>the effect that it would no longer be valid to make arbitrary 
>>changes to a lang tag in any literal, typed or not. It would also 
>>bring the treatment of all RDF datatypes into alignment so that 
>>rdf:XMLLiteral need not be considered a special case.
>>Either of these changes will simplify the semantics and make it 
>>more coherent, but in slightly different ways.
>>Either change will produce fewer inference rules and lead to less 
>>processing in a reasoning engine.
>>Pat Hayes
>>IHMC                                    (850)434 8903 or (650)494 3973   home
>>40 South Alcaniz St.                    (850)202 4416   office
>>Pensacola                                       (850)202 4440   fax
>>FL 32501                                        (850)291 0667    cell
>>phayes@ai.uwf.edu                 http://www.coginst.uwf.edu/~phayes
>>s.pam@ai.uwf.edu   for spam

Received on Friday, 14 February 2003 11:22:14 UTC
