- From: <Patrick.Stickler@nokia.com>
- Date: Tue, 12 Jun 2001 14:44:48 +0300
- To: jborden@mediaone.net, Patrick.Stickler@nokia.com, www-rdf-interest@w3.org
> -----Original Message-----
> From: ext Jonathan Borden [mailto:jborden@mediaone.net]
> Sent: 11 June, 2001 18:36
> To: Patrick.Stickler@nokia.com; www-rdf-interest@w3.org
> Subject: Re: A proposed solution to the RDF syntactic/semantic mapping
> problem (long)
>
> Patrick:
>
> > === Claims ===
> >
> > Claim 1: A namespace and name pair does not constitute any kind of
> > universal semantic identity, only a unique syntactic form which
> > can be associated with some semantic identity.
>
> Since I have no idea what a "universal semantic identity" is, nor do I know
> if one exists (I strongly suspect that there does not exist 'universal'
> agreement on any of these issues) this statement is probably true.

Yeah. Point taken. Sorry for the imperfect choice of terms. I was trying
to achieve a somewhat discipline-neutral definition.

By "universal semantic identity" I mean the "concept" for which the
namespace + name pair serves as potentially one of many possible
identifying "signs".

It is unlikely that we wish to restrict the set of URIs which can act as
signs of concepts to only those URIs which can be constructed by the
concatenation of namespace URI and name -- simply because we use XML as
our serialization mechanism.

Furthermore, any given serialization model (DTD/Schema) may represent a
localized or custom syntactic representation for an agreed intersection
of semantics shared by other differing syntactic representations.

> >
> > Although names within namespaces do serve to differentiate content
> > which is attributed meaning, and that meaning is typically (though
> > not necessarily) suggested by the linguistic properties of that name,
> > the syntactic form selected for any particular serialization is
> > local to that serialization and many syntactic forms may map to the
> > same common semantics.
>
> Yet XML Schema, for example, uses QNames not URIs to denote types, so really
> depending on the application, either a QName or a URI may be the primary
> means of identifying some 'thing'.

I don't see how you interpret the use of QNames as not being the use of
"namespace#name" forms, as that is the interpretation imposed upon QNames
by XML Schema.

Furthermore, one can argue that XML Schema has an implicit assumption
that any QName referenced in a schema is in fact a URI reference into the
same or some other XML Schema instance, and *not* into any arbitrary
resource dereferencable from some combination of namespace URI and name.

Furthermore, QName prefixes only have meaning within a single instance
(or within the scope of a single element if defined for that element) and
therefore cannot serve as identifiers beyond such syntactic boundaries.

> > The syntactic form provides a mechanism
> > by which we may define a mapping to that universal meaning, but it
> > does not serve itself as the universal identifier of that meaning.
>
> Again, not sure how a URI universally identifies a "meaning". It identifies
> a resource but isn't it the point of an ontology/schema etc to define a
> "meaning"?

Sorry that this argument isn't so clear. I'll try again...

There are vocabularies and then there are vocabularies. The set of
vocabularies which can be encoded using arbitrary URI references is a
superset of the vocabularies that can be encoded using namespace plus
name pairs -- *if* those are to be concatenated into a single URI
(reference). That is because there are URI schemes which do not lend
themselves to direct concatenation, and direct concatenation is not
guaranteed to produce a valid URI according to the URI scheme syntax or
a possible MIME content type fragment syntax.

Thus, a large consortium of persons/organizations may wish to use a URI
scheme (e.g. a URN scheme) that is not compatible with namespace plus
name concatenation in order to define a common vocabulary (ontology) of
abstract concepts (semantics) to serve as a point of intersection between
a disparate set of serialization vocabularies, for the purpose of
knowledge interchange and interoperability.

Thus any given namespace plus name pair in any given serialization does
not constitute the common meaning that that syntactic form serves to
represent, but must be mapped to that common "sign" associated with the
abstract concept.

> >
> > Claim 2: A name within a given namespace does not equate to a URI
> > reference of that name within any content dereferencable from the
> > namespace URI reference.
> >
> > I.e. "namespace" + "name" != "namespace#name".
>
> I suppose it depends on what you expect the "name" to reference.
>
> I consider this a bug not a feature.

Eh? A fragment in a URI reference is specific to the MIME content type of
the data that is accessible from the URI. That means that any ontology
defined using signs which are URI references constructed by the
combination of namespace URI and name with an intervening '#' is bound to
the syntax of a given MIME content type.

Furthermore, just how do you handle clearly broken URI refs such as the
following:

   "http://foo.com/bar.html#boo" + "bas" -> "http://foo.com/bar.html#boo#bas"

Eh?

I again assert: "namespace" + "name" != "namespace#name"

> > ... Furthermore, as a
> > given namespace may have serializations defined in various schema
> > formalisms, each potentially having different MIME content types
> > with potentially different fragment schemes, yet all defining
> > the same namespace URI and name, there is then potentially a many to
> > one mapping from namespace and name pair to URI reference into each
> > of those schema instances.
>
> This is a mess.
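
To make the broken case concrete, here is a minimal Python sketch of the
"namespace#name" concatenation applied to the foo.com example. The helper
name naive_concat is hypothetical and used only for illustration; the
splitting uses the standard library's urllib.parse.urldefrag. It is a
sketch of the problem under those assumptions, not of how any particular
RDF parser works.

```python
from urllib.parse import urldefrag


def naive_concat(namespace: str, name: str) -> str:
    """Join namespace and name with '#' (the assumed automatic mapping)."""
    return f"{namespace}#{name}"


# The namespace here is already a URI reference with its own fragment.
namespace = "http://foo.com/bar.html#boo"
combined = naive_concat(namespace, "bas")
print(combined)

# A URI reference carries one fragment, which starts at the first '#'.
base, fragment = urldefrag(combined)
print(base)
print(fragment)

# Recovering the original pair is ambiguous: split at the first '#' or the last?
split_first = tuple(combined.split("#", 1))
split_last = tuple(combined.rsplit("#", 1))
print(split_first)
print(split_last)
```

The leftover fragment still contains a '#', which the generic URI syntax
does not permit inside a fragment, and the two splits give different
(namespace, name) pairs, so the boundary between the two is lost.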
And the mess arises because most folks equate URI with URL and URL with
HTTP URL and, sincerely wanting and needing namespace URIs to actually
dereference to something recognizable and concrete, assumed that
"namespace" + "name" == "namespace#name" and that "namespace" is a URL
and *not* a URL reference. And to make RDF work, they added the "{URL}#"
hack, suffixing a '#' on the end so that the concatenation would create
(presumably but unreliably) a URL reference that might be dereferencable.

Yes. The real situation is a mess -- but only because the presumed
automatic mapping of namespace and name to some combined URI does not in
fact work for arbitrary namespace URI references and arbitrary URI scheme
and MIME content type fragment syntaxes.

We just need to add the explicit mapping mechanism that *does* work.

> >
> > Claim 3: We cannot use concatenation, suffixation, insertion or
> > any other method of combining a name with a namespace URI reference
> > to obtain a compound URI reference without violating the sanctity of
> > either the URI scheme and/or some MIME content type fragment syntax
> > space.
>
> What sanctity? We need to define practical and interoperable ways of dealing
> with QNames and URIs. The _goal_ is to create systems that work, not to
> maintain URIs and RFC 2396 on a pedestal, even when that pedestal is sitting
> right in the middle of the Santa Monica freeway -- or I-93.

As I said, if RDF just wants unique strings and doesn't demand valid
URIs, OK, no problem with "invalid" URIs *BUT* that means that RDF must
provide some *other* means to ensure unique strings! *AND* no RDF/SW
application can presume that those strings are anything but opaque, and
should not expect them to be dereferencable or meaningful to any web
application or protocol that knows about certain URI schemes.
If you can get the RDF spec changed thus, more power to ya ;-)

I don't think abandonment of URIs for RDF resource identity would be a
good thing (I actually think it would be catastrophic). I also don't
think that maintaining URIs in RDF is blind dogmatism. And the
freeway/interstate isn't built yet, so let's not tear down the pedestal
if we can build the road around it, eh?

> > Claim 4: The current methodology employed by RDF to attempt to create
> > a semantic resource identity by direct concatenation of namespace
> > and name does not ensure the preservation of the uniqueness of namespace
> > qualified names.
>
> agreed.
>
> >
> > This example, along with the discussion in claim 2 about unclear
> > re-partitioning of combined URI references, demonstrates the fact that
> > the uniqueness of a namespace and name pair has three elements:
> > (1) the unique namespace,
> > (2) the unique name within that namespace,
> > and
> > (3) a distinct boundary between the two.
>
> agreed.
>
> >
> > Step 2: Provide for explicit mapping between syntactic forms and
> > semantic resources. I.e. for mapping rdf:ID values to rdf:about values.
> >
> > This is achieved by the following two methods:
> >
> > Mapping method 1: RDF
>
> Why not "daml:equivalentTo" or "rdfs:isDefinedBy"?

Firstly, the syntactic to semantic mapping (i.e. serialization to
triples) is IMO the domain of RDF, not RDF Schema or DAML, and therefore
should be fundamental to the RDF spec and the solution embodied in every
compliant RDF parser.

Secondly, we must define a mapping from two distinct (possibly three,
given literal to resource mapping) syntactic components to a single
semantic resource:

   1. namespace
   2. name
   3. PCDATA

Since the whole problem is that there *isn't* yet a single resource
identifying the "sign" comprised of the above three components, just how
do you use daml:equivalentTo or rdfs:isDefinedBy?!
One needs a construct such as the proposed rdf:Map element that binds the
multiple syntactic components to a single resource identity. Until that
is done, RDF Schema and DAML (or any other valid RDF ontology) are
useless.

Eh? RDF Schema (with the exception of convenience overlap with the
proposed rdf:Map construct) and DAML are firmly and completely within the
domain of triples -- not serializations. If you don't know the URI
reference of the resource in question, you can't say things about it with
RDF Schema, DAML or any other ontology.

My proposal addresses the mapping of complex, multi-component serialized
syntactic forms (signs) for concepts to single, monolithic (and likely
standardized) forms (signs) for those same concepts within an RDF
knowledge base of triples. This should happen well before RDF Schema and
DAML come into play.

> Isn't that the role of an ontology?

An ontology is a vocabulary and (optionally) relations between members of
that vocabulary, right?

Ontologies within the realm of RDF Schema, DAML, etc. require single,
monolithic, valid URI references acting as the members of the vocabulary
of that ontology, right?

But the logical, compound construct (namespace(name)) or
(namespace(name(PCDATA))) as provided by XML serializations utilizing XML
Namespaces are *not* single, monolithic, valid URI references.

Therefore, they *cannot* serve as members of any vocabulary for any
ontology that would be suitable for RDF, RDF Schema, DAML, etc. etc.

What is needed is an explicit, consistent, standardized mechanism for
mapping serialized vocabulary constructs to URI references for the
resources they represent. Whether or not that mechanism resembles what I
proposed, or is something totally different, it presently does not exist
and absolutely must exist, and soon.

> >
> > Mapping method 2: RDF Schema
>
> [snip]
>
> I agree that this mapping is needed. I would prefer to see such a mechanism
> within RDFS/DAML.
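
As a rough illustration of why an explicit mapping preserves what direct
concatenation loses (Claim 4 above), here is a small Python sketch. It is
only a stand-in for the idea, not the proposed rdf:Map syntax: the names
concat, explicit_map and resolve, and all of the example.org and
urn:example URIs, are made up for this example.

```python
from typing import Dict, Tuple

# A (namespace URI reference, local name) pair with the boundary kept intact.
QName = Tuple[str, str]


def concat(namespace: str, name: str) -> str:
    """Direct concatenation of namespace and name into one string."""
    return namespace + name


# Two distinct pairs that differ only in where the boundary falls.
pair_a: QName = ("http://example.org/ab", "c")
pair_b: QName = ("http://example.org/a", "bc")

# Explicit mapping from each pair to a single resource identifier.
# The target URNs are hypothetical.
explicit_map: Dict[QName, str] = {
    pair_a: "urn:example:concept-1",
    pair_b: "urn:example:concept-2",
}


def resolve(namespace: str, name: str) -> str:
    """Look up the resource identifier for a namespace and name pair."""
    return explicit_map[(namespace, name)]


# Concatenation collapses the two pairs into the same string ...
print(concat(*pair_a) == concat(*pair_b))
# ... while the explicit mapping keeps them distinct.
print(resolve(*pair_a) == resolve(*pair_b))
```

Keying on the pair rather than on a combined string keeps all three
elements of uniqueness: the namespace, the name, and the boundary between
them. The target can then be any URI, including one from a scheme that
does not allow concatenation at all.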
But, as I argue above, the mapping from serialized forms to triples is
the exclusive domain of RDF and by addressing it within a higher layer,
we further confuse the issue, and also complicate those higher layers
unnecessarily.

> [snip]
>
> >
> > === Regular expression constraints on syntactic literals ===
>
> this seems an extension of RDF aboutEachPrefix... an interesting idea and I
> can certainly see how it might be useful but given the problems that
> aboutEachPrefix has had in gaining traction, it would be hard getting this
> accepted.

Yes. I think you may be right on that point. The specification of a
PCDATA literal in a mapping is clearly necessary as part of the
syntax/semantics interface; but regular expressions are either simply a
syntactic convenience to avoid having to enumerate multiple literal
mappings or constraints similar to aboutEachPrefix such as for the
purposes of data checking. This is something that needs more
thought/discussion....

Cheers,

Patrick

--
Patrick Stickler                      Phone:  +358 3 356 0209
Senior Research Scientist             Mobile: +358 50 483 9453
Software Technology Laboratory        Fax:    +358 7180 35409
Nokia Research Center                 Video:  +358 3 356 0209 / 4227
Visiokatu 1, 33720 Tampere, Finland   Email:  patrick.stickler@nokia.com
Received on Tuesday, 12 June 2001 07:45:08 UTC