RE: URIS for Literals (was: Re: referendum on httpRange-14 (was RE: "information resource"))

> -----Original Message-----
> From: www-tag-request@w3.org 
> [mailto:www-tag-request@w3.org]On Behalf Of
> ext Chris Lilley
> Sent: 29 October, 2004 23:32
> To: noah_mendelsohn@us.ibm.com
> Cc: Graham Klyne; Joshua Allen; Stuart Williams; Tim Berners-Lee;
> www-tag@w3.org
> Subject: Re: URIS for Literals (was: Re: referendum on 
> httpRange-14 (was
> RE: "information resource"))
> 
> 
> 
> On Friday, October 29, 2004, 6:41:02 PM, noah wrote:
> 
> nuic> Chris Lilley writes:
> 
> >> With the proviso that I would prefer
> >> 
> nuic> data:text/plain;charset="utf-8",some%20percent%20escaped%20literal%20value
> 
> nuic> I think the above is a plausible way of carrying a literal which is a
> nuic> sequence of unicode chars.
> 
> Yes - that is exactly what I thought it was good for, string literals.
> 
> 
> nuic>  I wonder whether there is any need to have a
> nuic> URI that represents the member of the type xsd:integer that has the
> nuic> numeric value 10, for example?
> 
> It might be useful (and I agree that the above form would not be
> suitable)
> 
> 
> 
> nuic>   http://www.w3.org/2004/SchemaSimpleTypes/Integer/value/12
> 
> nuic>   http://www.w3.org/2004/SchemaSimpleTypes/Integer/lexical/012
> 
> nuic>   http://www.w3.org/2004/SchemaSimpleTypes/Integer/12
> 
> I agree those forms are much preferable, 

Why would they be preferable to any other form of URI? Despite the fact
that humans might recognize that they seem to pertain to literal values,
the principle of URI opacity would preclude any agent (or human) from
concluding that they actually do identify literal values, since the http:
URI scheme says nothing about such interpretations.

> I think it is. Further, other types can be created that were not in
> W3C XML Schema, a benefit of using URIs for them.

Exactly.

What is needed is a dedicated URI scheme which provides for reliable
interpretation of URIs as identifying particular literal values (plain,
language tagged, or typed). I proposed such a URI scheme quite some
time ago, but it was eclipsed at that time by the work on typed literals
in RDF. 

E.g.

--

val: - Literal Value URI Scheme

VAL_URI          = "val:" PLAIN_LITERAL
VAL_URI          = "val:" LANGTAG_LITERAL
VAL_URI          = "val:" TYPED_LITERAL
PLAIN_LITERAL    = LEXICAL_FORM
LANGTAG_LITERAL  = LEXICAL_FORM "(" LANGTAG ")"
TYPED_LITERAL    = "(" DATATYPE ")" LEXICAL_FORM
LEXICAL_FORM     = {URL encoded lexical form}
DATATYPE         = {URL encoded datatype URI}
LANGTAG          = {a valid xml:lang value}

--

This allows literals to be expressed unambiguously as URIs, e.g.:

   val:some%20plain%20literal
   val:some%20language%20tagged%20literal(en)
   val:(http%3A%2F%2Fwww%2Ew3%2Eorg%2F2001%2FXMLSchema%23integer)12
   val:(http%3A%2F%2Fwww%2Ew3%2Eorg%2F2001%2FXMLSchema%23boolean)true
   val:(http%3A%2F%2Fwww%2Ew3%2Eorg%2F2001%2FXMLSchema%23lang)fi
   val:(http%3A%2F%2Fwww%2Ew3%2Eorg%2F2001%2FXMLSchema%23string)yada%20yada%20yada
   val:(http%3A%2F%2Fexample%2Ecom%2Fblargh)booga

which correspond to the following RDF literals (using N-Triples notation):

   "some plain literal"
   "some language tagged literal"@en
   "12"^^<http://www.w3.org/2001/XMLSchema#integer>
   "true"^^<http://www.w3.org/2001/XMLSchema#boolean>
   "fi"^^<http://www.w3.org/2001/XMLSchema#lang>
   "yada yada yada"^^<http://www.w3.org/2001/XMLSchema#string>
   "booga"^^<http://example.com/blargh>
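
The correspondence above can be sketched in a few lines of Python. This is
only an illustration under stated assumptions, not a reference implementation:
the function names (parse_val_uri, to_ntriples, make_val_uri) are hypothetical,
and the sketch assumes that any "(" or ")" occurring inside a lexical form or
datatype URI is percent-encoded, so that unescaped parentheses only ever
delimit the datatype or language tag. Note that Python's quote() leaves "."
unescaped, whereas the examples above write it as %2E; both decode to the
same string.

```python
from urllib.parse import quote, unquote


def parse_val_uri(uri):
    """Split a val: URI into (lexical_form, datatype, langtag)."""
    if not uri.startswith("val:"):
        raise ValueError("not a val: URI")
    body = uri[len("val:"):]
    datatype = langtag = None
    if body.startswith("("):
        # TYPED_LITERAL = "(" DATATYPE ")" LEXICAL_FORM
        end = body.index(")")
        datatype = unquote(body[1:end])
        lexical = body[end + 1:]
    elif body.endswith(")") and "(" in body:
        # LANGTAG_LITERAL = LEXICAL_FORM "(" LANGTAG ")"
        start = body.rindex("(")
        langtag = body[start + 1:-1]
        lexical = body[:start]
    else:
        # PLAIN_LITERAL = LEXICAL_FORM
        lexical = body
    return unquote(lexical), datatype, langtag


def to_ntriples(lexical, datatype=None, langtag=None):
    """Render a literal in N-Triples notation (escapes only \\ and ")."""
    escaped = lexical.replace("\\", "\\\\").replace('"', '\\"')
    if datatype is not None:
        return f'"{escaped}"^^<{datatype}>'
    if langtag is not None:
        return f'"{escaped}"@{langtag}'
    return f'"{escaped}"'


def make_val_uri(lexical, datatype=None, langtag=None):
    """Build a val: URI; safe="" forces "/", ":", "(" etc. to be encoded."""
    enc = quote(lexical, safe="")
    if datatype is not None:
        return f"val:({quote(datatype, safe='')}){enc}"
    if langtag is not None:
        return f"val:{enc}({langtag})"
    return f"val:{enc}"
```

For instance, to_ntriples(*parse_val_uri("val:some%20language%20tagged%20literal(en)"))
yields the second N-Triples literal listed above.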

The val: URI scheme could give explicit support to the pre-defined XML Schema
datatypes, allowing the datatype component to correspond solely to the localname
portion of the XML Schema datatype, e.g. adding the following to the above grammar:

DATATYPE         = {XML Schema pre-defined datatype localname}

resulting in support for abbreviated forms such as

   val:(integer)12
   val:(boolean)true
   val:(lang)fi
   etc...

yet any arbitrary datatype still remains fully supported.
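
One way an agent might tell the abbreviated form from a full datatype URI,
sketched in Python. The rule used here is my assumption rather than part of
the grammar above: an absolute URI always contains a ":" after its scheme
name, so a decoded datatype component with no ":" is taken to be an XML
Schema localname. The function name resolve_datatype is hypothetical.

```python
from urllib.parse import unquote

XSD_NS = "http://www.w3.org/2001/XMLSchema#"


def resolve_datatype(component):
    """Map the DATATYPE component of a val: URI to a full datatype URI."""
    decoded = unquote(component)
    if ":" not in decoded:
        # No scheme separator: treat it as an XML Schema localname.
        return XSD_NS + decoded
    # Otherwise it is already a full (URL encoded) datatype URI.
    return decoded
```

With this rule, val:(integer)12 and the fully spelled-out integer example
given earlier resolve to the same datatype URI.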

Because the interpretation of all URIs based on the val: URI scheme would be
defined by the URI scheme itself, agents (and humans) are then free to conclude
which literal values those particular URIs identify, provided that the
datatype in question is recognized (if you don't know
what datatype <http://example.com/blargh> is, you can't know which value
the lexical form "booga" corresponds to in any case).
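
To make the point about recognized datatypes concrete, here is a rough Python
sketch of an agent that maps a lexical form to a value only when it knows the
datatype. The table KNOWN_DATATYPES and the function literal_value are
hypothetical names, and the parsers are simplifications (Python's int() is
more lenient than the xsd:integer lexical space, for example).

```python
XSD_NS = "http://www.w3.org/2001/XMLSchema#"


def parse_boolean(lexical):
    """Lexical-to-value mapping for xsd:boolean."""
    if lexical in ("true", "1"):
        return True
    if lexical in ("false", "0"):
        return False
    raise ValueError(f"not a boolean lexical form: {lexical!r}")


# Datatypes this (hypothetical) agent recognizes, with their parsers.
KNOWN_DATATYPES = {
    XSD_NS + "integer": int,
    XSD_NS + "boolean": parse_boolean,
    XSD_NS + "string": str,
}


def literal_value(lexical, datatype):
    """Return the value for a typed literal, or None if the datatype is unknown."""
    parser = KNOWN_DATATYPES.get(datatype)
    if parser is None:
        # Unrecognized datatype: the lexical form cannot be mapped to a value.
        return None
    return parser(lexical)
```

Here the lexical forms "12" and "012" both map to the integer twelve, while
"booga" typed as <http://example.com/blargh> yields no value at all.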

Ideally, literals could be used as subjects in RDF statements, but until
such usage is legal, a dedicated literal value URI scheme such as the one
defined above could provide a great deal of utility, without violating the
principle of URI opacity.

Cheers,

Patrick

--

Patrick Stickler
Nokia, Finland
patrick.stickler@nokia.com
 

Received on Monday, 1 November 2004 07:20:59 UTC