RE: A proposed solution to the RDF syntactic/semantic mapping problem (long) from Patrick.Stickler@nokia.com on 2001-06-12 (www-rdf-interest@w3.org from June 2001)

From: <Patrick.Stickler@nokia.com>
Date: Tue, 12 Jun 2001 16:18:01 +0300
To: jborden@mediaone.net, Patrick.Stickler@nokia.com, www-rdf-interest@w3.org
Message-ID: <6D1A8E7871B9D211B3B00008C7490AA507958775@treis03nok>
> <URIreference,fragmentid> -> URIreference

There *is* no such reliable mapping. One can have 

  <URIreference,fragmentid> -> String

but not necessarily either

  <URIreference,fragmentid> -> *valid* URIreference

or

  <URIreference,fragmentid> -> *unique* String

e.g.

taking XML Schema '#' concatenation

  <"http://foo.com/foo.html#bar", "bas"> ->
"http://foo.com/foo.html#bar#bas"

(this surely violates the HTML MIME content type fragment syntax, eh?)

taking RDF simple concatenation

  <"name:foo:aja", "varovasti"> -> "name:foo:ajavarovasti"
  <"name:foo:ajava", "rovasti"> -> "name:foo:ajavarovasti"

(oops! now which is which?)

There *is* no such automatic, reliable, consistent, and standardized
method of deriving an RDF resource URI reference from a namespace
and name pair.
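
To make the collision concrete, here is a minimal Python sketch of simple
concatenation applied to the two pairs above. The function name concat is
only an illustrative label, not anything defined by the RDF spec.

```python
def concat(namespace: str, name: str) -> str:
    """Simple concatenation of a namespace and a local name."""
    return namespace + name

# The two distinct (namespace, name) pairs from the example above.
pairs = [("name:foo:aja", "varovasti"), ("name:foo:ajava", "rovasti")]
results = {pair: concat(*pair) for pair in pairs}

# Distinct pairs collapse to a single string, so the mapping has no inverse:
# given only the string, the original pair cannot be recovered.
distinct_outputs = set(results.values())
print(len(pairs), "pairs ->", len(distinct_outputs), "distinct string(s)")
```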

It may help to clarify this point in these discussions if everyone
would use my contrived "Parenthesized Taxonomy" URNs both for all 
namespace URIs and for all resource URIs in examples. ;-) 

The simple syntax is 

URN  :=  'urn:partax:' <name>
name := '(' [a-zA-Z][-.a-zA-Z0-9]* <name>? ')'

E.g. 

   urn:partax:(foo)
   urn:partax:(foo(bar))
   urn:partax:(foo(bar(bas)))
   urn:partax:(foo(bar(bas(boo))))
   ...
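
For anyone who wants to experiment, a small recursive-descent checker for
this syntax might look like the Python sketch below. It assumes the nested
name is optional (which the examples imply), and the function names
parse_name and parse_partax are made up for illustration.

```python
import re

LABEL = re.compile(r"[a-zA-Z][-.a-zA-Z0-9]*")
PREFIX = "urn:partax:"

def parse_name(text: str, pos: int = 0):
    """Parse one <name> starting at pos; return (labels, next_pos)."""
    if pos >= len(text) or text[pos] != "(":
        raise ValueError(f"expected '(' at offset {pos}")
    match = LABEL.match(text, pos + 1)
    if match is None:
        raise ValueError(f"expected a label at offset {pos + 1}")
    labels, pos = [match.group()], match.end()
    if pos < len(text) and text[pos] == "(":   # optional nested <name>
        nested, pos = parse_name(text, pos)
        labels += nested
    if pos >= len(text) or text[pos] != ")":
        raise ValueError(f"expected ')' at offset {pos}")
    return labels, pos + 1

def parse_partax(urn: str):
    """Return the labels of a partax URN, outermost label first."""
    if not urn.startswith(PREFIX):
        raise ValueError("not a partax URN")
    body = urn[len(PREFIX):]
    labels, end = parse_name(body)
    if end != len(body):
        raise ValueError("trailing characters after the closing ')'")
    return labels

print(parse_partax("urn:partax:(foo(bar(bas)))"))
```

Note that a string such as "urn:partax:(foo(bar))#bas" is rejected by this
checker, because nothing in the syntax above allows for a fragment.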

Using such a (valid) URI scheme for all references quickly (IMO)
illustrates just how broken the current thinking is, both in terms of
namespace + name concatenation and in the presumption that any RDF
resource identity can be automatically and reliably derived from the
namespace and name pair. Just because it works for HTTP URLs and
HTML/XML fragments doesn't mean it is a valid or sufficient mechanism.

Let's please stop pretending that it is. Once we do that, we've taken
a huge step towards finding a more suitable and sufficient mechanism
to define such mappings.

> >
> > > >
> > > > Claim 2: A name within a given namespace does not equate to a URI
> > > > reference of that name within any content dereferencable from the
> > > > namespace URI reference.
> > > >
> > > > I.e. "namespace" + "name" != "namespace#name".
> > >
> > > I suppose it depends on what you expect the "name" to reference.
> > >
> > > I consider this a bug not a feature.
> >
> > Eh? A fragment in a URI reference is specific to the MIME content
> > type of the data that is accessible from the URI.
> 
> I strongly consider _that_ a bug. There very much needs to be a MIME type
> _independent_ fragment identifier syntax (e.g.
> http://www.openhealth.org/RDDL/fragment-syntax)

But that doesn't solve a thing! You *cannot* impose any specific
fragment syntax on any method of creating URI references from namespace
and name pairs because that just further confuses the matter by
suddenly having URI references that are not valid URI references
according to the URI scheme or the presumed MIME content type 
yet have strangeness in them.

E.g. if that unified "namespace" fragment syntax would parenthesize
names, one could get e.g. "http://foo.com/foo.html#bar(boo)" from
"http://foo.com/foo.html#bar" and "boo". How is that any better
than any other arbitrary, imposed syntax?

I think (now) it is a great mistake to try to devise *any* scheme to
create a URI reference from the composition of the namespace
URI reference and name (even though I've been guilty of proposing
just such a scheme myself recently ;-)

> > That means that any ontology defined using signs which are URI
> > references constructed by the combination of namespace URI and name
> > with intervening # are bound to the syntax of a given MIME content type.
>
> ugh, ugh, ugh (that's gorilla speak for "i really don't like that!")

Nor should any of us. It's very icky! (for lack of a technical term ;-)

> >
> > Furthermore, just how do you handle clearly broken URI refs such as
> > the following:
> >
> > "http://foo.com/bar.html#boo" + "bas" -> "http://foo.com/bar.html#boo#bas"
> >
> > Eh?
> >
> > I again assert: "namespace" + "name" != "namespace#name"
> 
> assuming XML and fragment ids identify IDs within the document, the
> composition works as such:
> 
> "http://foo.com/bar.html#boo" + "bas" -> "http://foo.com/bar.html#bas"

What! How? My namespace is "http://foo.com/bar.html#boo", *NOT*
"http://foo.com/bar.html"!!! Just how do you determine that suddenly
the RDF identity of the resource is a fragment of some URI fragment
which is completely unspecified by the namespace URI fragment! 
Namespace URI fragments are 100% opaque. Sorry, you can't use logic
to try to extract a *recognizable* prefix from one with which to try
to come up with a logical URI reference for a resource. Thirty lashes
with a wet noodle for you!
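
A rough Python sketch of what that fragment-replacing composition does in
practice: it throws away part of the namespace, so different namespaces end
up at the same resource identity. The helper name replace_fragment and the
second namespace ending in "#zap" are hypothetical, added only to show the
loss of information.

```python
from urllib.parse import urldefrag

def replace_fragment(namespace: str, name: str) -> str:
    """Drop any fragment on the namespace, then attach name as the fragment."""
    base, _old_fragment = urldefrag(namespace)
    return base + "#" + name

a = replace_fragment("http://foo.com/bar.html#boo", "bas")
b = replace_fragment("http://foo.com/bar.html#zap", "bas")  # hypothetical namespace

print(a)
print("distinct namespaces, same result:", a == b)
```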

Try doing that with the following example:

   urn:partax:(foo(bar)) + "bas" -> "urn:partax:(foo(bar))#bas" ???

Nope! Why? Because partax names do not resolve to MIME streams. They
are totally and entirely abstract (and if someone maps them to e.g.
one or more URIs that *do* resolve to MIME streams, that is unspecified
by the partax spec and thus unknowable by any application). So, the
URI reference "urn:partax:(foo(bar))#bas" has no valid definition.

It is *not* enough that RDF achieve a unique string. Until and unless
the spec is changed, it must be a *valid* URI reference, and validity
requires both that there be an explicit and knowable definition, and
that the URI reference conform to that definition.

> > > This is a mess.
> >
> > And the mess is because, due to the fact that most folks equate URI to
> > URL and URL to HTTP URL and furthermore sincerely wanting and needing
> > that namespace URIs actually dereference to something recognizable and
> > concrete, they assumed that "namespace" + "name" == "namespace#name"
> > and that "namespace" is a URL and *not* a URL reference.
> >
> > And to make RDF work, added the hack "{URL}#" suffixing the '#' on the
> > end so that the concatenation would create (presumably but unreliably)
> > a URL reference that might be dereferencable.
> 
> but concat _still_ doesn't work see above.

It does if you include the hash '#' at the end of your
namespace URI and your namespace URI is an HTTP URL.

For all (or at least most) other cases, no, it does not work.
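
As a rough illustration, the Python sketch below applies plain concatenation
to a namespace with the trailing '#' and to one that already carries a
fragment. The check is only a necessary condition (RFC 2396 does not allow
'#' inside a fragment), not a test of validity, and the function names are
illustrative.

```python
def concat(namespace: str, name: str) -> str:
    """Plain concatenation of a namespace and a local name."""
    return namespace + name

def has_at_most_one_hash(uri_reference: str) -> bool:
    """Necessary (not sufficient) syntax check: '#' may appear at most once."""
    return uri_reference.count("#") <= 1

# Namespace is an HTTP URL ending in '#': the result has a single fragment.
ok = concat("http://foo.com/foo.html#", "bar")

# Namespace already carries a fragment and is joined with '#': two hashes.
broken = concat("http://foo.com/foo.html#bar" + "#", "bas")

for ref in (ok, broken):
    print(ref, "->", has_at_most_one_hash(ref))
```

A partax namespace with '#' appended would pass this character-level check
too, which is why the check says nothing about whether the result has any
defined meaning.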

> >
> > Yes. The real situation is a mess -- but only because the presumed
> > automatic mapping of namespace and name to some combined URI does
> > not in fact work for arbitrary namespace URI references and arbitrary
> > URI scheme and MIME content type fragment syntaxes.
> >
> > We just need to add the explicit mapping mechanism that *does* work.
> 
> I completely agree.
> 
> >
> > I don't think abandonment of URIs for RDF resource identity
> > would be a good thing (I actually think it would be catastrophic).
> 
> which "resource" are we discussing? the one identified by a URI, or the one
> identified by a URI reference ... the same URI reference whose syntax is
> MIME type dependent ... but the resource is not identified by dereferencing.
> so where is it written down exactly _what is_ a "resource" as defined and
> used by RDF ... certainly not the same resource as defined by RFC 2396 ...
> what is this 'thing' we have placed on a pedestal?

The one identified by the URI reference. The entity, referent,
pumpkin.

> > >
> > > Why not "daml:equivalentTo" or "rdfs:isDefinedBy"?
> >
> > Firstly, the syntactic to semantic mapping (i.e. serialization
> > to triples) is IMO the domain of RDF, not RDF Schema or DAML
> > and therefore should be fundamental to the RDF spec and the
> > solution embodied in every compliant RDF parser.
> 
> then the RDF spec needs a lot of work, beyond simple mapping.

No. I don't think so. The only problem is that folks have
been equating (namespace(name)) to either namespace#name
or namespacename in order to get URI references to identify
abstract resources. 

If we ignore the issue of syntactic to semantic mapping, we
can see that the rest of RDF is doing pretty well. E.g. if one
has serialized RDF metadata that does not employ QNames for
any statements, resorting only to fully specified resource
URI references or literals, then everything is fine.

We just need a new mechanism that provides the mapping from XML
serializations to abstract resource identities which avoids
the pitfalls surrounding the mistaken presumption that such
identity can be derived by the syntactic construct
(namespace(name)).

> >
> > One needs a construct such as the proposed rdf:Map element
> > that binds the multiple syntactic components to a single
> > resource identity. Until that is done, RDF Schema and
> > DAML (or any other valid RDF ontology) are useless. Eh?
> 
> if you mean to say that a function rdf:Map(qname) is provided which gives us
> a URI reference from a QName, then I agree, and as you suggest it is a very
> nice idea to provide a case by case override of the default behavior.

Great. But I am further arguing that far more than simply being
neat or convenient, such a mapping function is essential and also is
*not* currently provided by RDF at all in a way that will facilitate
the envisioned interchange of knowledge on the SW.
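
To sketch the idea (and only the idea), an explicit mapping can be pictured
as a lookup table consulted before any default rule. The Python below is a
hypothetical illustration: make_resolver, the table contents, and the
choice of concatenation as the default are all assumptions, and none of it
reflects the actual syntax of the proposed rdf:Map element.

```python
from typing import Callable, Dict, Tuple

QName = Tuple[str, str]  # (namespace URI reference, local name)

def make_resolver(explicit: Dict[QName, str],
                  default: Callable[[str, str], str]) -> Callable[[str, str], str]:
    """Build a resolver that prefers explicit mappings over the default rule."""
    def resolve(namespace: str, name: str) -> str:
        key = (namespace, name)
        if key in explicit:
            return explicit[key]          # explicitly declared identity
        return default(namespace, name)   # fall back to the default rule
    return resolve

# Hypothetical explicit mapping for a partax namespace.
explicit_map = {
    ("urn:partax:(foo(bar))", "bas"): "urn:partax:(foo(bar(bas)))",
}

resolve = make_resolver(explicit_map, default=lambda ns, name: ns + name)

print(resolve("urn:partax:(foo(bar))", "bas"))
print(resolve("http://foo.com/foo.html#", "bar"))
```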

Cheers,

Patrick

--
Patrick Stickler                      Phone:  +358 3 356 0209
Senior Research Scientist             Mobile: +358 50 483 9453
Software Technology Laboratory        Fax:    +358 7180 35409
Nokia Research Center                 Video:  +358 3 356 0209 / 4227
Visiokatu 1, 33720 Tampere, Finland   Email:  patrick.stickler@nokia.com
Received on Tuesday, 12 June 2001 09:19:03 UTC