- From: Lee Jonas <lee@oakglen.netkonect.co.uk>
- Date: Mon, 7 Aug 2000 11:03:48 +0100
- To: <www-rdf-interest@w3.org>
I have only recently joined this list. I have scanned most of the archive but please forgive me if I am rehashing anything.

I don't doubt that a log syntax which closely resembles the triples is useful to RDF developers. However, it side-steps the issues I was raising, which are far more fundamental to RDF Syntax.

Dan Brickley <mailto:danbri@w3.org> wrote:
>Here are some things that could happen. We could/should bring the
>errata document for RDF M&S up to date with the experience of RDF
>implementors, and make available answers to FAQs where these
>seem clear, and writeup summaries for topics (eg. the xmlns
>prefix pairing business) that are perhaps not so clear. We could
>explore possibility of new work on a 'better' rdf syntax, either
>as a W3C Working Group as an informal effort amongst RDF
>implementors on this list, with the intention of publishing
>either a new 'better syntax' REC (which would be a substantial piece
>of work) or an informational W3C Note outlining an alternative
>XML syntax for RDF models. ('we' being the RDF implementor
>community, ie. RDF IG)

I for one think Dan's suggested course of action is not only reasonable but necessary. However, if I am correct, my concerns over the misuse of namespaces are significant enough to warrant further activity to define a 'better' RDF Syntax (or some general web data graph syntax), which will also have a knock-on effect on the RDF Schema spec.

W3C could let RDF Model & Syntax 1.0 remain a recommendation and issue a Note with an alternative syntax. Although this would be a pragmatic work-around, I believe that RDF Syntax is in error and hence am not convinced it is sufficient.

If the problems are deemed serious enough, the W3C could begin a working draft that will supersede it. Only that way could the qname-to-URI mapping be fully removed from the recommendation. If the latter is necessary, then I believe RDF Schema should not become a recommendation until the syntax issues are resolved.
Dan Brickley <mailto:danbri@w3.org> wrote:
>I'm not personally convinced that a new alternate RDF syntax is a
>priority right now, though I'd like to hear arguments to the contrary.

I realise that there are some early implementations and at this stage in the recommendation process there would be an understandable reluctance to alter the existing code base. However, RDF is not yet widely adopted. Any delay now could mean far more dependencies on the current (erroneous?) RDF Syntax.

In addition, I believe the added complexity and confusing nature of current RDF Syntax could be a barrier to its widespread adoption by the general web community. The sooner an alternative syntax is developed, the sooner RDF can realise its full potential.

So, Dan, perhaps you could review the concerns at the end of this message for inclusion in the 'developer issues' list.

In summary, I think that:

1) qname-to-URI mapping is a perversion of XML Namespaces and has subtle, yet fundamental, negative implications. It should be withdrawn from the current RDF M & S recommendation.

2) RDF Syntax must be a well-defined, finite set of element types and attribute names encapsulated within the rdf namespace.

3) RDF Syntax must consist of one and only one clear way of serialising RDF Model, with no alternative 'abbreviation' syntax forms. This is not to say that RDF Syntax is the only way of serialising RDF Model. Other general web data graph syntaxes could be investigated for this purpose (though not XLink).

4) RDF Schema must not reflect any aspects of syntactic validity.

5) RDF Schema documents must not be locatable implicitly from any namespace URI.

Dan Brickley <mailto:danbri@w3.org> wrote:
>we should step back and ask for characterisations of what we
>want from an XML syntax for RDF. What are the must-haves? What would the goals
>be for any effort to provide a 'better' syntax? ie. what would make it
>better...?
If people agree that further activity is necessary, I will translate my thoughts into what I believe should be included in the list of goals, must-haves, etc.

Anyway, here are my main concerns:

1) Mapping a QName to an RDF Identifier
=======================================

RDF specifies that all resources should be identifiable by URIs. This is a good aspect of RDF as it allows decentralised (and even, in XLink parlance, 'third-party') resource descriptions and resource-type hierarchies (i.e. using rdfs subClassOf & subPropertyOf).

However, IMHO RDF perverts the intended use of namespaces to achieve this. Using a qname to map a localpart (an element type or attribute name), together with its namespace's unique URI, into an RDF identifier causes major problems. I believe this to be the case whether that qname is an element type, an attribute name, or even an attribute value or text node. Specifically:

a) Mixing RDF with other markup vocabularies
--------------------------------------------

From Namespaces in XML (http://www.w3.org/TR/1999/REC-xml-names-19990114):

'We envision applications of Extensible Markup Language (XML) where a single XML document may contain elements and attributes (here referred to as a "markup vocabulary") that are defined for and used by multiple software modules. One motivation for this is modularity; if such a markup vocabulary exists which is well-understood and for which there is useful software available, it is better to re-use this markup rather than re-invent it.

Such documents, containing multiple markup vocabularies, pose problems of recognition and collision. Software modules need to be able to recognize the tags and attributes which they are designed to process, even in the face of "collisions" occurring when markup intended for some other software package uses the same element type or attribute name.'
Although mappings between XLink & RDF have been proposed (using XLink as an alternative web data graph syntax in order to serialise RDF Model), I would like to use a hybrid of the two that combines both metadata and hyperlinks - hence, IMHO, XLink should *not* be used as an alternative web data graph syntax for serialising RDF Model.

Consider the following:

  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:xlink="http://www.w3.org/1999/xlink"
           xmlns:my="http://mydomain.com/mylinks#">
    <my:link xlink:type="extended">
      <my:locator xlink:type="locator" xlink:href="#A"/>
      <my:locator xlink:type="locator" xlink:href="#B"/>
    </my:link>
  </rdf:RDF>

In the true nature of 'Namespaces in XML', with small changes to RDF Syntax (i.e. not using the qname-to-URI mapping mechanism), this could become a *very* straightforward way of combining RDF and XLink - RDF describes resources in terms of metadata (note that the link and locator elements should themselves be treatable as resources by RDF) and XLink interprets this as a 'third-party' link between (RDF metadata) resources.

However, my understanding (though I could be wrong?) is that an RDF processor would assume 'xlink:type', 'xlink:href', etc. to be RDF metadata whose definitions are identified by the URIs 'http://www.w3.org/1999/xlinktype', 'http://www.w3.org/1999/xlinkhref', etc. respectively. This is clearly wrong as the XLink namespace has nothing to do with RDF.

This problem stems from the fact that RDF uses namespaces to specify RDF identifiers instead of for their intended use - demarcating markup vocabularies. Any counter-arguments specific to XLink w/ RDF aside, the example above highlights a general problem with mixing RDF and any other markup vocabulary.
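To make the concatenation concrete, here is a minimal sketch in Python (standard library only). It is not an RDF parser: the helper names qname_to_uri and derived_uris are my own, and the sketch simply applies the 'namespace URI + localpart' rule to every element type and attribute name in the example above. Whether a conforming RDF processor really treats the xlink attributes this way is the assumption stated in the text, not something this code verifies.

```python
import xml.etree.ElementTree as ET

# The hybrid RDF/XLink example from the message, unchanged.
DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         xmlns:my="http://mydomain.com/mylinks#">
  <my:link xlink:type="extended">
    <my:locator xlink:type="locator" xlink:href="#A"/>
    <my:locator xlink:type="locator" xlink:href="#B"/>
  </my:link>
</rdf:RDF>"""


def qname_to_uri(clark_name):
    """Join namespace URI and localpart into one identifier (hypothetical helper)."""
    # ElementTree reports qualified names as '{namespace-uri}localpart'.
    if clark_name.startswith("{"):
        namespace, local = clark_name[1:].split("}", 1)
        return namespace + local
    # Names without a namespace are returned untouched.
    return clark_name


def derived_uris(xml_text):
    """Collect the identifier derived from every element type and attribute name."""
    root = ET.fromstring(xml_text)
    uris = set()
    for elem in root.iter():
        uris.add(qname_to_uri(elem.tag))
        for attr_name in elem.attrib:
            uris.add(qname_to_uri(attr_name))
    return sorted(uris)


if __name__ == "__main__":
    for uri in derived_uris(DOC):
        print(uri)
```

Note that the 'my' namespace yields sensible-looking identifiers only because its URI happens to end in '#'; the XLink namespace URI has no such trailing delimiter, which is how 'http://www.w3.org/1999/xlinktype' arises.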
b) Fuzzy segregation between RDF Model and RDF Syntax
-----------------------------------------------------

It seems to be a contentious issue that Model & Syntax aspects are so closely intertwined, even to the extent of combining them into the same specification (rightly so, IMHO).

The qname-to-URI mapping mechanism allows serialisation of resources from the abstract RDF model into a multitude of different markup lexemes (i.e. element types and attribute names). I strongly suspect that this direct encoding of abstract entities into (an open-ended collection of) syntactic constructs is the reason why there is such a degree of 'tangling' between RDF Model & RDF Syntax.

If a well-defined, finite RDF markup vocabulary were used, RDF Model would become totally distinct from RDF Syntax.

c) XML Schema vs RDF Schema
---------------------------

XML Schema also utilises namespaces to imply the location of schema documents. Does the resulting ambiguity over which schema document a namespace locates then make XML Schema and RDF Schema incompatible? My guess would be 'yes, unless you invent yet another overly complex way to untangle the mess.'

AFAIK, XML Schema is an alternative to DTDs for describing the validity of an XML document (markup and data). As such I view it as a 'syntax schema', and its association with namespaces (which after all distinguish different syntaxes) is justified. Also, as RDF Syntax should be the mere serialisation of RDF Model into an XML document, it seems reasonable to want to make assertions about the validity of those documents using XML Schema.

Due to the fuzziness between RDF Model & Syntax outlined above, RDF Schema is forced to include RDF Syntax validity assertions, hence a potential clash with XML Schema. However, I believe it would serve a far better purpose to describe the validity of RDF models at an abstract level without regard to their encoding in XML.
Note that this would be invaluable to facilitate encoding RDF with some general web data graph syntax in common with other XML technologies (if feasible) e.g. SOAP.

Totally separating RDF Model from RDF Syntax would allow RDF Schema to become a pure 'model schema'. It would still be specifiable as any other RDF Model and hence serialised into RDF Syntax the same as RDF instance documents. Indeed, there is no reason why you couldn't have RDF validity assertions internally within instance documents.

It follows that, as a 'model schema', there should be no association between namespaces and RDF schema documents - an alternative mechanism should be used, e.g. either implicitly from the URI specified by the 'rdf:type' property, or by the 'isDefinedIn' property if the type is identified by a URN (I retract my suggestion in a previous message that resource identifiers should only be URLs).

The logical view then becomes:

  ++++++++++++++++++++++++++++++++++++
  Application Layer
  ++++++++++++++++++++++++++++++++++++
  RDF Model  | Abstract, validated by RDF Schema
  ++++++++++++++++++++++++++++++++++++
  RDF Syntax | XML doc, validated by DTD / XML Schema
  ++++++++++++++++++++++++++++++++++++

d) What real purpose does it serve?
-----------------------------------

I can't think why the current qname-to-URI mapping scheme is in place, other than for the sake of brevity.
Note that with resource URIs specified as attribute values and / or text nodes, there are two ways to abbreviate them that I can think of right now:

General entities:

  <!DOCTYPE rdf:RDF [
    <!ENTITY my 'http://mydomain.com/myschema/'>
  ]>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="&my;#SomeResource"/>
  </rdf:RDF>

or XML Base:

  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xml:base="http://mydomain.com/myschema/">
    <rdf:Description rdf:about="./#SomeResource"/>
  </rdf:RDF>

2) XPath
========

Dan wrote:
<< The XSLT / Semantic Web Screenscraping threads on this list have shown how we can extract RDF models from all manner of well managed XML data >>

Let me clarify what I meant. My concern is about going in the other direction:

1) XPath and XSLT - Suppose I want to visualise RDF in a page by turning resource descriptions into html tables containing name-value pairs of all their properties.

2) XPath and XPointer - Suppose I want to mix RDF and XLink vocabularies and add hyperlinks, using XPointer, between resource descriptions.

With the various abbreviation forms available, there are at least two or three ways of saying the same thing with current RDF syntax. This means that unless I know the style used in advance, I have to specify the union of different nodesets, one for each syntax variant.

In addition, with an unlimited, arbitrary set of element tags I cannot write generic XPath selections that use specific element types as axes. Instead, the best I can do is rely on position and levels of nesting within the document, if known. Indeed there was an earlier post that contained a presentation proposing just this. However, it relied on the RDF being 'canonicalised' so that *all* resource descriptions had to be children of the document element, and properties referred to other resources by reference only. This seems far too restrictive to me.

Alternatively, I would have to select all nodes (i.e.
an axis of '//node()'), then filter down to the ones I want by using attribute and element tag names as predicates.

Any way you look at it, my XPath string gets unnecessarily complex. I am not sure what the impact is on performance, but I would guess that more processing would make it slower.

Regards

Lee
Received on Monday, 7 August 2000 06:02:36 UTC