Re: N-Triples changes for datatype values, (possible) N3 alignment from Patrick Stickler on 2002-10-25 (w3c-rdfcore-wg@w3.org from October 2002)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Fri, 25 Oct 2002 10:13:03 +0300
To: "ext Dave Beckett" <dave.beckett@bristol.ac.uk>
Cc: "w3c-rdfcore-wg" <w3c-rdfcore-wg@w3.org>
Message-ID: <003701c27bf5$f569f0c0$6a80720a@NOE.Nokia.com>
[Patrick Stickler, Nokia/Finland, (+358 40) 801 9690, patrick.stickler@nokia.com]


----- Original Message ----- 
From: "ext Dave Beckett" <dave.beckett@bristol.ac.uk>
To: "Patrick Stickler" <patrick.stickler@nokia.com>
Cc: "w3c-rdfcore-wg" <w3c-rdfcore-wg@w3.org>
Sent: 24 October, 2002 16:27
Subject: Re: N-Triples changes for datatype values, (possible) N3 alignment 


> >>>Patrick Stickler said:
> > > >    [ <http://www.w3.org/2001/XMLSchema#int> "10" ]
> > 
> > Are there any plans for it to generate triples? (I would expect
> > not, and would hope there would be language somewhere to the
> > effect that it would be disallowed in some fashion)
> 
> I don't know; I'd ask the N3 developers.

Dan? Any comments?


> 
> > Also, I am very curious as to the need for any of the "extra" 
> > delimiting characters.
> > 
> > Is not 
> > 
> >     "10"en-US<http://www.w3.org/2001/XMLSchema#int>
> > 
> > sufficiently explicit for parsing? The string final " and
> > the URI initial < seem to be sufficient to unambiguously
> > mark the boundaries of the components. Why are @ and ^^ 
> > actually needed?
> > 
> > KISS would seem to call for their omission, which would 
> > further serve to prevent their being given special significance
> > by alternate serializations which would result in a typed
> > literal node to triples interpretation.
> > 
> 
> Sufficient but too terse.
>
> We've implemented, used this 

I can appreciate that, but now is the time to get it right.

> and it is nice to have delimiters for
> parsing, 

Is it really that much more effort to look for [a-zA-Z] or
even [^<] rather than '@', and for '<' rather than '^^'?
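
To illustrate the parsing question, here is a minimal Python sketch that splits a literal in both the delimited and the delimiter-free form. It is not taken from any N-Triples or N3 implementation: the two regular expressions, the names DELIMITED and BARE, and the function parse_literal are assumptions made for illustration, and escape sequences inside strings and URIs are ignored.

```python
import re

# Hypothetical patterns, for illustration only (no escape handling).
# Delimited form:  "10"@en-US^^<http://www.w3.org/2001/XMLSchema#int>
DELIMITED = re.compile(
    r'^"([^"]*)"(?:@([A-Za-z]+(?:-[A-Za-z0-9]+)*))?(?:\^\^<([^>]*)>)?$'
)
# Delimiter-free form:  "10"en-US<http://www.w3.org/2001/XMLSchema#int>
BARE = re.compile(
    r'^"([^"]*)"([A-Za-z]+(?:-[A-Za-z0-9]+)*)?(?:<([^>]*)>)?$'
)


def parse_literal(text, pattern):
    """Return (string, lang, datatype); absent parts come back as None."""
    match = pattern.match(text)
    if match is None:
        raise ValueError("not a literal: %r" % text)
    return match.groups()
```

Under these assumptions both patterns recover the same three components, so the sketch only shows that either form is mechanically parseable; it says nothing about readability or about N3 alignment.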

I can't help but suspect that there are specific plans for
assigning structural meaning to these delimiters in a similar
fashion to how '^' is now defined for N3.

Again, KISS would say to leave them out. They are unnecessary.

> a bit of readability and in this case, 

I guess that's a matter of taste. I find the extra delimiters 
distracting and annoying to type since they are unnecessary
as the boundaries are already explicitly denoted by the 
delimiters for the string and URI components and more than
clear enough.

> to align a bit more
> with what might change in N3.

It's this latter part that worries me. If ^^ is subsequently defined
in N3 as a syntactic shorthand for a triples representation, then
I would have a big problem with that, since it means that N3 and
N-Triples are divergent in their syntax to graph mapping and users
who presume that N-Triples is a subset of N3 would quickly become
confused.

Omitting the ^^ delimiter prevents (or at least discourages) such
interpretations.

Here is a question to test this: would putting the language tag last
simply be a matter of taste? E.g.

   "foo"^^<bar://abc.com/blargh>@en

If the answer is yes, that it could be ordered that way, then I would
feel better that there are not expectations that ^^ will be used as
a syntactic shorthand. I suspect, though, that some folks will not
want the lang tag at the end precisely because it gets in the way
of such interpretations and that there are expectations that it will
be used in such a fashion. E.g. that

   "foo"@en^^<bar://abc.com/blargh>

will be interpreted as something akin to

   [ <bar://abc.com/blargh> "foo"; xml:lang "en" ]

Again, if one wants to infer additional triples such as that
based on typed literal nodes, fine, but the typed literal node
should occur in the graph as a single typed literal node.
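
The two readings can be contrasted with a small Python sketch. Everything in it is hypothetical: the Literal tuple, the function names, the subject and predicate strings, and the blank node label "_:b1" are assumptions for illustration and do not correspond to any agreed N3 or N-Triples mapping.

```python
from collections import namedtuple

# Hypothetical representation of a typed literal as one graph node.
Literal = namedtuple("Literal", "lexical lang datatype")


def as_single_node(subject, predicate, lit):
    """One triple whose object is the typed literal node itself."""
    return [(subject, predicate, lit)]


def as_expanded_triples(subject, predicate, lit, bnode="_:b1"):
    """The blank-node reading: the literal is unpacked into extra triples."""
    triples = [
        (subject, predicate, bnode),
        (bnode, lit.datatype, lit.lexical),
    ]
    if lit.lang is not None:
        triples.append((bnode, "xml:lang", lit.lang))
    return triples
```

With the first function the graph contains one triple and one literal node; with the second, the same input yields three triples and no typed literal node at all, which is the divergence in syntax-to-graph mapping described above.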

Perhaps DanC would like to comment on this, since the ^^ notation
was (as I understand it) his suggestion?


> <snip/>
> 
> >  "10"^^<http://www.wapforum.org/profiles/UAPROF/ccppschema-20010430#Number>
> > 
> >  "foo"^^<http://www.wapforum.org/profiles/UAPROF/ccppschema-20010430#Literal>
> > 
> >  "Yes"^^<http://www.wapforum.org/profiles/UAPROF/ccppschema-20010430#Boolean>
> > 
> >  "200x100"^^<http://www.wapforum.org/profiles/UAPROF/ccppschema-20010430#Dimension>
> 
> Thanks for these datatype examples.  The Boolean one seems useful.
> 
> Are there any language-based ones; I guess my colo(u)r one was the
> kind of thing that was being expected?

Well, the kind of stuff that we are doing, that is relevant to the
typed literal with lang tag is

   "English"@en^^<voc://nokia.com/MARS-3.0/Token>
   "Englanti"@fi^^<voc://nokia.com/MARS-3.0/Token>
   "Ingles"@sp^^<voc://nokia.com/MARS-3.0/Token>

Is that what you meant?

Note that all of the above three typed literals denote different
values in the Token datatype's value space. 

Perhaps you were thinking along the lines of

   "English"@en^^<foo://abc.com/Language>
   "Englanti"@fi^^<foo://abc.com/Language>
   "Ingles"@sp^^<foo://abc.com/Language>

where all three typed literals denote the same value, the English
language. I.e. the L2V mapping is

   "English"   ->    ENGLISH
   "Englanti"  ->    ENGLISH
   "Ingles"    ->    ENGLISH

???

In any case, as a matter of architecture, I presume that the
language tag is excluded from being significant for any datatype's
L2V map. I.e., we wouldn't have

   "English"@en   ->    ENGLISH
   "Englanti"@fi  ->    ENGLISH
   "Ingles"@sp    ->    ENGLISH

as only the string component and datatype URI component of 
the typed literal are relevant to the L2V mapping. The lang
tag is there as a form of scoping mechanism that is available
to query engines which wish to filter values based on language.

This emphasizes the fact that datatype qualification and language
qualification are orthogonal. One may have either or both. But
the lang tag is ignored insofar as the datatyping interpretation
is concerned.
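
A minimal Python sketch of this separation follows. The TypedLiteral class, the table L2V_MAPS, and the functions value_of and filter_by_lang are hypothetical names introduced for illustration; the mapping entries are simply the foo://abc.com/Language examples from above.

```python
from typing import NamedTuple, Optional


class TypedLiteral(NamedTuple):
    lexical: str
    datatype: str
    lang: Optional[str] = None


# Hypothetical L2V maps, keyed by datatype URI, then by lexical form.
L2V_MAPS = {
    "foo://abc.com/Language": {
        "English": "ENGLISH",
        "Englanti": "ENGLISH",
        "Ingles": "ENGLISH",
    },
}


def value_of(lit):
    """Apply the datatype's L2V map; the lang tag is never consulted."""
    return L2V_MAPS[lit.datatype][lit.lexical]


def filter_by_lang(literals, lang):
    """Query-side scoping: select literals by lang tag, independent of value."""
    return [lit for lit in literals if lit.lang == lang]
```

In this sketch, value_of returns the same value for "Englanti" whether or not the literal carries the fi tag, while filter_by_lang narrows a result set by language without touching the datatype interpretation.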

In this way, we remain fully compatible with XML Schema datatyping
where it is only the lexical space that is relevant to the L2V
mapping and xml:lang is excluded from participating in datatyping
interpretations.

Patrick
Received on Friday, 25 October 2002 03:13:06 UTC