RE: QName URI Scheme Re-Visited, Revised, and Revealing from Patrick.Stickler@nokia.com on 2001-08-23 (uri@w3.org from August 2001)

From: <Patrick.Stickler@nokia.com>
Date: Thu, 23 Aug 2001 10:48:12 +0300
To: sean@mysterylights.com
Cc: www-rdf-interest@w3.org, uri@w3.org, www-rdf-comments@w3.org
Message-ID: <2BF0AD29BC31FE46B78877321144043114BFA8@trebe003.NOE.Nokia.com>
> -----Original Message-----
> From: ext Sean B. Palmer [mailto:sean@mysterylights.com]
> Sent: 22 August, 2001 16:37
> To: Stickler Patrick (NRC/Tampere)
> Cc: www-rdf-interest@w3.org; uri@w3.org
> Subject: Re: QName URI Scheme Re-Visited, Revised, and Revealing
> 
> 
> > Hmmmm.... do we need then to use DAML equivalence
> > constructs for basic RDF?! E.g.
> >
> >    <qn:{http://description.org/schema/}Creator>
> >       daml:equivalentTo
> >          <qn:{http://description.org/schema/}@Creator> .
> 
> No, because those QNames are in RDF, and RDF processors handle them by
> concatenating them and forming a URI. The XML Namespace specification
> defines a difference between the QNames, but it doesn't tell 
> you how to
> handle them. The RDF specification comes in and tells people 
> what to do
> with the QNames. It tells one to concatenate the QNames togther minus
> partition information, to form a URI. 

Uhhh... so even if the NS spec says X and Y are different, RDF can do
whatever 
it likes with them, including saying X = Y even if the creator(s) of the
data
in which X and Y occur say that X != Y?!

Sorry, I don't read the NS spec that way, and I seriously doubt very many
folks
in the XML community at large read it that way -- i.e. that QName
distinctions
can be discarded by RDF and RDF still remain conformant to the NS spec.

And what about QName partition distinctions, as pointed out by Sean, where
in fact not only do two different QNames constitute different lexical
identity
(and hence can represent different semantics) but also a QName used for an
element name constitutes a different lexical identity from the same QName
used
for a global attribute name!

So there are now two distinctions defined by the NS spec that RDF discards.

That can't be good.

> There's no problem with 
> that, except
> there's no way to identify a QName as a first class object 
> without some
> extra properties, 

If potentially four lexically distinct QNames (i.e. two QNames which collide
on
direct concatenation used both for elements and global attributes) are
merged by
RDF into one single URI derived by the present RDF function, then how can
you possibly
make statements about them to differentiate them -- since their only
identity
referencable in a statement in RDF space is ambiguous to start with?

Or are we back to using sub-graphs to define their actual identity rather
than self-contained, stand-alone URIs, which is the IMO proper way to define
resource identity in RDF?

> or a new URI scheme. 

I agree there. A QName URI scheme would have to capture the complete
identity
of all QNames maintaining the full scope of lexical distinction present in
the serialized XML. I.e. something like my revised qn URI scheme, or
anything
similar that does the job.

But then we get that nasty problem of element and global attribute QNames
having identitical semantics according to RDF's condensed serialization
syntax -- in which case we *have* to explicitly define the equivalences
between the different QName URI's (as above).

> So there's no 
> particular need to map
> every QName in XML RDF to your URI scheme - to do so would be 
> silly - but
> it doesn't stop people using it.

Nothing stops people from custom programming any functionality
whatsoever into their own systems, no.  But this is an issue
of global standardization, not what can be done locally in a single
isolated system.

> In other words, you've accomplished all that you want to 
> accomplish simply
> by inventing this URI scheme. 

No, that's not so. If I can't rely on *every* RDF engine used
by every SW agent to interpret my data exacly as I have defined it,
then the SW has no data integrity. Again, its about global consistency
of data.

> If you force people to adopt it 
> *instead* of
> the concatenation mechanism in XML RDF, then you'll form an 
> utter mess,

I agree (and have stated several times) that the QName URI solution
is far less attractive than other solutions because it is not
backwards compatible. However, since people are already thinking
in terms of QNames when defining their Web based ontologies, I hardly
expect that RDF officially mapping those QNames to an equivalent
URI representation is going to cause much uproar (in fact, I expect
it will be very positively recieved as it serves to clarify what
is now a murky relationship between QNames and RDF resource URIs).

Folks using XML *think* in terms of QNames insofar as their data
models, vocabularies, and ontologies are concerned. In fact, it's
tough to get folks to think in terms even of (namespace)name rather
than ns:name! People are already using QNames as universal 
identifiers. The fact that there is *not* an official, standard
URI representation for QNames is what is surprising...

I don't see a QName URI scheme as a hard sell, nor do I see it
causing a mess, rather the opposite, of cleaning up the present
mess in RDF QName to URI mapping.

> because the Semantic Web deals with URIs, and not just 
> QNames. 

Firstly, I'm not suggesting that folks to use "QNames" for all
resources. In fact, one would expect that QName URIs would be used
only for abstract concepts such as properties, classes, etc. Noone
is prevened from using other URI schemes for other types of resources,
and in fact, we *need* to be able to use other URI schemes to do
alot of the neat stuff that will be possible on the SW, and to
ground the abstract concepts and relationships into the "reality"
of the web.

Secondly, RDF itself is "URI Scheme blind". It does not itself
care what URI scheme you use. All that matters is that your universal
identifiers (which coincidentally happen to be URIs) are unique, and
consistently identify the same resource. Whether some *other* application
outside the scope of RDF-space can do something with those identifiers,
such as interpret them as URIs and perform some task with them, is
irrelevant insofar as the RDF model is concerned.

> Using the
> concatenation mechanism is an excellent and quick way to form 
> those URIs
> out of QNames.

Quick? Maybe. Excellent? No! 

It was a very clever hack that works with HTTP URLs using HTML fragment
syntax, but as with most hacks it does not generalize or scale sufficiently
to meet a broader range of needs -- namely those of the SW in all its forms.

It also does not maintain lexical distinctions defined by the NS spec, and
for that reason alone, its validity is suspect.

It has IMO been sufficiently demonstrated that the current method used by 
RDF for deriving URIs is unsafe, discriminates against arbitrary URI
schemes,
and does not maintain data integrity. I don't intend to go over all that
again. 

> ... and now that you've 
> come up with
> representing an XML QName (i.e. with all the details 
> preserved as in the
> XML namespace specification) as a URI, that's fine.

But it's true utility and value to RDF would be as the official URI
representation within triples -- making QNames a first class identifier
for abstract resources.

If it's not standardized and mandated for all RDF applications, it's 
not a solution to the present problem(s).

> Now, there are only two errors with your URI scheme. The 
> first is that the
> characters "{" and "}" are disallowed in URIs per section 2.4.3 of RFC
> 2396. This can be easily gotten round by using "(" and ")" 
> instead, e.g.:-
> 
>      qn:(http://www.w3.org/1999/xhtml)title
> 
> ...
>
> Because the "#" character is also disallowed by section 2.4.3 
> of RFC 2396.
> Perhaps you can just %HH encode it?
> 
>      qn:(http://infomesh.net/2001/08/example%23)myElement
>

Thanks for pointing these out. I just copied James Clark's syntax (which
was not specifically intended for use as URI and missed those encoding
problems. I was focusing more on the idea of a single, standardized URI
scheme for QNames, not specifically what that URI scheme would be, etc.

Parentheses and escaping should do just fine (though the latter makes
the URI scheme harder for humans to process -- but that's a problem of
any URI scheme that includes embedded URI refs).

Fortunately, if RDF also adopts support for QNames as URI attribute values,
then even in RDF Schemas folks won't have to "touch" the QName URIs
directly too often, but can simply define prefixes and use them in
their minimal, abbreviated forms. The only place where one would see
the full qn URIs would be just in the actual triples store.
 
>    qn = 'qn:' '(' absoluteURIexc [ '%23' fragment ] ')' ( 
> elem | glob |
> per )
>    elem = NCName
>    glob = '@' NCName
>    per = NCName '@' NCname

Thanks for the revised grammar. If such an approach is adopted by the
RDF core working group, then we can roll up our sleaves and put some
spit and polish on the URI scheme itself.
 
Cheers,

Patrick

--
Patrick Stickler                      Phone:  +358 3 356 0209
Senior Research Scientist             Mobile: +358 50 483 9453
Software Technology Laboratory        Fax:    +358 7180 35409
Nokia Research Center                 Video:  +358 3 356 0209 / 4227
Visiokatu 1, 33720 Tampere, Finland   Email:  patrick.stickler@nokia.com
Received on Thursday, 23 August 2001 03:48:25 UTC