RE: Summary of the QName to URI Mapping Problem from Patrick.Stickler@nokia.com on 2001-08-16 (www-rdf-interest@w3.org from August 2001)

From: <Patrick.Stickler@nokia.com>
Date: Thu, 16 Aug 2001 21:41:52 +0300
To: connolly@w3.org
Cc: drew.mcdermott@yale.edu, www-rdf-logic@w3.org, www-rdf-interest@w3.org
Message-ID: <2BF0AD29BC31FE46B78877321144043114BF92@trebe003.NOE.Nokia.com>
> -----Original Message-----
> From: ext Dan Connolly [mailto:connolly@w3.org]
> Sent: 16 August, 2001 16:43
> To: Stickler Patrick (NRC/Tampere)
> Cc: drew.mcdermott@yale.edu; www-rdf-logic@w3.org;
> www-rdf-interest@w3.org
> Subject: Re: Summary of the QName to URI Mapping Problem
> 
> 
> [This will be my last message in this thread unless/until
> I see some new information. We're covering well-trodden
> ground here. cf
> http://www.w3.org/2000/03/rdf-tracking/#rdfms-uri-substructure ]

Well trodden but hidden ground? Sorry, I don't see how your reference
does anything but reiterate that there is an issue here. It also seems
to address only the current practice of straight concatenation, and
not any of the issues relating to potential ambiguity or discrimination
against many types of URI scheme.

If you have any other pointers, I would be very happy to have them, and
will continue to do my homework regarding this matter.
 
> Patrick.Stickler@nokia.com wrote:
> > 
> > > ... Any qname whose prefix binds to a URI reference
> > > which, when concatenated with the QName's local name, yields
> > > the relevant URI is just fine.
> > 
> > Uhhh... sorry Dan, but just where do you get that?
> 
> I deduce it from the RDF spec:

I know that the spec specifies direct concatenation.

I meant where do you get the "is just fine" part.

> 
> --        Resource Description Framework (RDF) Model and Syntax
> Specification
> http://www.w3.org/TR/REC-rdf-syntax/
> Wed, 24 Feb 1999 14:45:07 GMT
> 
> I have also seen it in a number of implementations.

One would expect RDF implementations to follow the spec. Again,
I wasn't questioning the mapping function itself, just the claim that it
produces an acceptable result for all conceivable cases.
 
> > One might be able to bend the XML NS
> > spec wording to seem compatible with that interpretation,
> 
> The XML namespaces spec is silent on this matter; no bending
> is necessary.

I agree that the XML NS spec does not define a mapping function, though
some have taken the note in the final paragraph of section 3 as suggesting
that straight concatenation is how namespaces and names should be
combined (though I don't see it reading that way myself).

> > but the
> > XML NS spec defines no such QName to URI mapping function.
> > 
> > > >  But there's no rule for where
> > > > to break it, and, as Patrick pointed out, if you do it the wrong way
> > > > you get ambiguities.
> > >
> > > What ambiguities? concat(nsname, localname) is completely
> > > unambiguous.
> > 
> > Well, I've provided examples several times to the list demonstrating
> > that such straight concatenation is potentially collisive, including
> > an example in the original posting of this thread.
> 
> Yes, but why does it matter that there are collisions? Why
> does it matter that the mapping is not 1-1?

Eh? I must not be understanding your question if you are asking
why ambiguity is undesirable. Certainly you're not suggesting that
a function that is known to produce ambiguity is OK because probably
most applications won't encounter such problems in practice...

(or are you playing Devil's advocate here ;-)
 
> If by "unambiguous" you meant 1-1, then I'm sorry, I misunderstood.

Yes, by unambiguous I meant 1-1 -- or rather, that no two QNames will ever
be mapped to the same URI.
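
To make the collision concrete, here is a minimal sketch of the
concatenation mapping. The two namespace names and local names below are
hypothetical, chosen only for illustration (they are not the example from
the original posting), and the function name qname_to_uri is likewise
my own label for uri(qname) = concat(nsname, localname).

```python
def qname_to_uri(ns_name: str, local_name: str) -> str:
    """Map a QName to a URI by straight concatenation, as RDF M&S specifies."""
    return ns_name + local_name


# Two distinct (namespace name, local name) pairs; both are hypothetical.
first = ("http://example.org/vocab/foo", "bar")
second = ("http://example.org/vocab/foob", "ar")

uri_first = qname_to_uri(*first)
uri_second = qname_to_uri(*second)

# The pairs differ, yet the resulting URIs are identical.
print(first != second, uri_first == uri_second)
```

Both QNames land on http://example.org/vocab/foobar, so the mapping is
not 1-1 in the sense I mean.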

> The mapping is indeed not 1-1. For many URIs, there are lots
> of ways to split it into a namespace name and a local name.

But in a way that is consistent with some defined XML DTD or XML
Schema?
 
> > Yet these are clearly
> >      separate resources per their disjunct QName identities
> 
> "clearly"? That's not clear at all.

OK, my mistake. It may very well be that two different QNames might identify
the same resource, but such synonymy should be intentional, not
accidental due to a shortcoming of the QName to URI mapping function.

Let me thus contextualize the given example with the claim that I am
the owner of both namespaces, and that I assert that the two QNames
do not, in fact, correspond to the same resource. Therefore, the present
RDF mapping function damages the integrity of my data and possibly
makes my application fail to provide the expected and otherwise
attainable results.

> > (the fact that the above example is contrived in no way lessens the
> >  seriousness of this problem)
> 
> What problem? What are the undesirable consequences of
> this state of affairs? 

As an information service provider, or as a content producer, I think
that the potential (and unnecessary) loss of the integrity of my well 
defined and valid data is a pretty significant issue. No?

> ... (aside from threads like this one ;-)

Careful, Dan ;-)

> > > > This really is a glaring hole in RDF that needs to be filled.
> > >
> > > Well, it's a discussion that keeps happening.

Discussions that keep happening usually are indicative of either
unresolved issues or unspecified or poorly specified resolutions.

If this truly is a non-issue, and there is a clear resolution that
has already been identified, then please, educate me -- but I am
still of the opinion that this issue won't die because the hole
is real and most folks have just been lucky not falling into it.

> > > But I have yet to see any coherent argument that there's an
> > > actual technical hole.
> > 
> > If the above example doesn't do it for you, I don't know what
> > will.
> 
> It doesn't convince me there is a hole, no.

That's a pity. Please be sure to watch where you step... (and 
carry a ladder with you ;-)

> > > The RDF spec includes an unambiguous QName -> URI mapping:
> > >       uri(qname) = concat(nsname(qname), localname(qname))
> > 
> > Which is broken...
> 
> No, it's not.

Uh, yes it is. (Geez, I feel like a 3rd grader... nyah, nyah ;-)

> > and does not address XML literal to resource URI
> > mapping...
> 
> What "XML literal to resource URI mapping" is that?

Read my proposal...

But in a nutshell, it's stuff like being able to say that literal values 
such as "en" in serialized XML instances map to explicit resource
URIs such as http://www.iso.ch/3166-1/en, etc. rather than just
remaining "dumb" RDF literals.

But that is a secondary and less than critical sub-topic of this
issue, and one that I could very well live without, if I had to,
so let's not go there just yet...  It's too hard already achieving
clarity and understanding about the truly critical aspects of
this issue.
 
> > > It's a W3C recommendation; that's as official as we get around here.
> > 
> > Just because it's in a standard doesn't mean it's correct...
> 
> I certainly agree that endorsement is orthogonal to correctness.
> 
> But you asked for something official.

I asked for something official that works.  ;-)
 
> > > It's not possible to define a URI->Qname mapping, since not
> > > all URIs end in XML name characters. (This is a limitation
> > > that designers of RDF vocabularies should be made aware
> > > of by some NOTE in the spec or something.)
> > 
> > Exactly! Great! A glimmer of light! At least one problem recognized.
> > 
> > So in order to re-serialize RDF encoded knowledge for a particular
> > XML DTD or schema,
> 
> What does "RDF encoded knowledge for a particular XML DTD or schema"
> mean?

It means being able to export your triples to XML using QNames compatible
with a particular XML DTD or XML Schema. Obviously, the namespace prefixes
don't matter, but if you can't get from a URI back to the namespace + name
pair that an XML application will recognize, then we have a one way
street.
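
A rough sketch of why the reverse direction is a one way street, under
some loud assumptions: the function candidate_splits and the list of
"known" namespace names are hypothetical, and the NCName pattern is a
simplified ASCII-only approximation of the XML name rules. Even when the
exporter knows which namespaces the target DTD or schema uses, a URI can
admit more than one valid split.

```python
import re

# Simplified, ASCII-only stand-in for the XML NCName production (assumption).
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def candidate_splits(uri, known_namespaces):
    """Return every (namespace name, local name) pair that concatenates to uri."""
    splits = []
    for ns in known_namespaces:
        local = uri[len(ns):]
        # The remainder must be a non-empty, syntactically valid local name.
        if uri.startswith(ns) and NCNAME.fullmatch(local):
            splits.append((ns, local))
    return splits


# Hypothetical namespace names, both assumed to be declared by the schema.
splits = candidate_splits("mid:xyz@fooble", ["mid:xyz@foo", "mid:xyz@foob"])
print(splits)
```

With both namespaces in play, the URI alone does not say which QName it
came from; some additional mapping information is needed to pick one.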

Furthermore, being able to define relations between resources to
achieve ontological equivalence doesn't help much if you can't get
that knowledge back to applications in a vocabulary they understand
and in a serialization they can eat. Don't think that all applications
that will make the SW run will be RDF based. Many will be regular
good old XML applications or systems with XML import/export capability.

> > one must write *custom* code for each transformation
> > rather than be able to utilize standardized generic tools with
> > standardized mapping schemas!
> 
> Subject to the limitation that some URIs aren't usable as
> RDF 1.0 property names, there are simple, generic approaches.
> The simplest one I can think of is:
> 
> 	1. scanning from the end
> 		if you find an XML name start character, go to step 2
> 		if you find something that's not an XML name character,
> 			stop and throw an exception (you've run
> 			into a limitation in RDF 1.0 syntax).
> 		keep scanning
> 	2. split the string into a namespace name and a localname.
> 
> So
> 	mid:xyz@fooble
> becomes
> 	<e xmlns="mid:xyz@foobl">...</e>
> 
> and
> 	mid:xyz@fooble1
> becomes
> 	<e1 xmlns="mid:xyz@foobl">...</e1>

Uhhh, but if the actual QNames needed, according to the actual
ontology, are (mid:xyz@foo)(ble) and (mid:xyz@foo)(ble1), and
you have applications, style sheets, etc. looking for the
real QNames, just what do you expect them to do with the above
guesses?

Just because you can arrive at a serialization that a well formed
XML parser will be happy with does *not* mean you have produced a
meaningful and correct serialization of the knowledge in question
insofar as other applications are concerned.
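
For reference, the scan-from-the-end procedure quoted above can be
sketched as follows. This is my reading of the two steps, not anyone's
actual implementation: the name split_uri is made up, and the character
classes are a simplified ASCII-only approximation of the XML name start
and name character rules.

```python
import string

# Simplified ASCII-only character classes (assumption, not the full XML rules).
NAME_START = set(string.ascii_letters + "_")
NAME_CHAR = NAME_START | set(string.digits + ".-")


def split_uri(uri):
    """Scan from the end and split at the first name start character found."""
    for i in range(len(uri) - 1, -1, -1):
        ch = uri[i]
        if ch in NAME_START:
            # Step 2: everything before i is the namespace name.
            return uri[:i], uri[i:]
        if ch not in NAME_CHAR:
            # Not usable as an RDF 1.0 property name.
            raise ValueError("URI does not end in an XML name: " + uri)
    raise ValueError("no XML name start character found in: " + uri)


print(split_uri("mid:xyz@fooble"))
print(split_uri("mid:xyz@fooble1"))
```

The output reproduces the splits quoted above, (mid:xyz@foobl)(e) and
(mid:xyz@foobl)(e1), which are syntactically valid but are not the
(mid:xyz@foo)(ble) and (mid:xyz@foo)(ble1) pairs the ontology defines.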
 
> 
> > If such custom hacking has to happen for any cases where folks aren't
> > using a particular type of URI scheme with particular fragment syntax,
> > which are not explicitly defined by the standards, then we have a
> > hurdle for global acceptance.
> 
> No custom hacking has to happen.

Yes, if the mapping from QName to URI and from URI to QName is not
generically achievable in a consistent manner based on generic
mechanisms/functions, then custom hacking has to happen (that is,
if we are to use anything other than HTTP URLs ending in '#' for
namespaces...)

Regards,

Patrick

--
Patrick Stickler                      Phone:  +358 3 356 0209
Senior Research Scientist             Mobile: +358 50 483 9453
Software Technology Laboratory        Fax:    +358 7180 35409
Nokia Research Center                 Video:  +358 3 356 0209 / 4227
Visiokatu 1, 33720 Tampere, Finland   Email:  patrick.stickler@nokia.com
Received on Thursday, 16 August 2001 14:42:23 UTC