- From: Dan Brickley <danbri@w3.org>
- Date: Sat, 20 Jul 2002 00:04:55 +0000
- To: RDF Core <w3c-rdfcore-wg@w3.org>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

FYI, I just sent in the following as a last call comment on SOAP 1.2. I'll carry on with the implementation report anyway, as part of my SemWeb work, though I'm disappointed that I didn't get to do a more explicitly implementation-led LC report. Hope it's useful anyway...

Dan

- -------- Original Message --------
Subject: LC comments on SOAP 1.2 adjuncts: re SOAP Encoding (re URIs, test suite, SOAP Data Model Schemas?)
Date: Fri, 19 Jul 2002 23:24:28 +0000
From: Dan Brickley <danbri@w3.org>
To: xmlp-comments@w3.org
CC: danbri@w3.org

Hi

Here are my last call review comments on the SOAP Encoding and Data Model portion of the 1.2 spec, in document order. I should prefix by saying that while I jump straight in with the nit-picking, this document is a huge improvement on the previous version. Nice work! :)

Summary:

Reading back, my comments are variations on a theme: the SOAP Encoding and its Data Model would benefit from a more explicit account of the mechanisms by which node and edge types for use in SOAP graphs might be defined. There are a few places where the use of URIs might make it easier for successor specs to flesh out such details (eg. URIs for kinds of edge label, for nodes, for node types).

URIs:

I specifically request one change to the spec in the light of implementation experience with SOAP: please specify a mechanism for identifying SOAP graph edge labels with URI/URIref names. Identifying nodes and their types would be useful, but identifying and describing SOAP edge labels is (in my database-oriented implementation) critical. We have a number of other activities at W3C that could support the richer description of SOAP graph edge labels: the Web Service Description effort, RDF and RDF Schema, as well as the Web Ontology work.
Providing URI/URIref names for graph edges is a very minimalistic hook that will make integration with such efforts cheaper and simpler, without adding to the implementation burden for SOAP implementations.

Test cases / machine checkable test suite:

I don't comment on the fine-grained detail of the SOAP Encoding itself, except to say the following: please seriously consider creating a machine-usable test suite for this work. Defining a graph encoding syntax in XML is a slippery task, and it is easy to make mistakes, both in the specification and in implementations. Apparent interop amongst deployed SOAP toolkits may reflect shared understanding in that part of the Web community as much as it reflects precision in the formal specification of the encoding rules. As this work moves into the wider Web community, we'll likely see more unexpected corner cases.

Historical note: this happened with RDF and the RDF/XML graph encoding syntax. We had to clean up RDF's graph encoding rules post-REC. As a result of that experience, we have reworked the RDF specs to use a more mathematically precise account of the abstract graph model, accompanied by a machine-processable set of test cases. See http://www.w3.org/RDF/ for details. I urge the XMLP WG to gather similar test cases before proposing SOAP 1.2 goes to REC. It's very easy to get bugs in XML graph encoding rules. Having a test suite offers very useful protection against this.

Comments follow below, interspersed with excerpts from the spec. These specific requests, queries and comments aside, this is really promising, interesting and worthwhile work, and the spec is much clearer than the version I reviewed last year. The effort that XMLP have put into this is really showing and is much appreciated by those of us at the receiving end of the specs.

cheers,

Dan
(RDF Core WG co-chair; RDF IG chair)

http://www.w3.org/TR/2002/WD-soap12-part2-20020626
- -> http://www.w3.org/TR/soap12-part2/#datamodel

[[ 2.
SOAP Data Model

The SOAP Data Model represents application-defined data structures and values as a directed edge-labeled graph of nodes. Components of this graph are described in the following sections.
]]

The concept of 'application-defined' is somewhat unclear: are these data structures defined by the producers of the data, or by consumers? What form do these definitions take? Might we expect to read a schema definition (W3C XML Schema? RELAX-NG?) that makes such definitions explicit, or are the definitions expected to be implicit?

For example, if I deploy a Java-based SOAP service, my application-defined data structures might be described in terms of Java, yet exposed to the world through the SOAP Encoding Data Model. Readers of 1.2p2 could reasonably ask: 'what technology can I use to expose my application-defined data structuring conventions?' The SOAP 1.2p2 Encoding explains how to expose instance data, but gives little account of the underlying principles that tell us whether or not a particular SOAP graph meets the 'application-defined data structures' for a given service.

Is there an expectation that technology will evolve to fill this gap? (A SOAP Encoding Schema Language has been mentioned in some discussions on xml-dist-app and www-ws-desc.) If so, please make this expectation clearer in the specification (there is an aside later in the spec, but it isn't very detailed). If not, please note that SOAP 1.2 does _not_ specify any mechanisms by which applications which use SOAP Encoding can describe the SOAP Encoded data structures they understand.

[[
The purpose of the SOAP Data Model is to provide a mapping of non-XML based data to some wire representation. It is important to note that use of the SOAP Data Model, the accompanying SOAP Encoding (see 3. SOAP Encoding), and/or the SOAP RPC Representation (see 4. SOAP RPC Representation) is OPTIONAL.
Applications which already model data in XML, for example using W3C XML Schema [4],[5], may not need to use the SOAP Data Model.
]]

As an introduction to the role of the SOAP Data Model, this could be clearer. It explains that one might take the Schema-based approach, or that one might take the SOAP Encoding approach, but offers little to motivate either decision. For a fresh application with no existing commitment to a Schema-based approach, the specification currently offers little advice to help SOAP adopters choose which path to take.

Are there identifiable benefits to using SOAP Encoding over a Schema approach? Perhaps (for example) that Web services can be deployed faster using object-to-XML encodings than through hand-crafting an XML Schema? The current text doesn't really sell us on the utility of SOAP Encoding; on the contrary, it has a somewhat wary, cautious tone, yet doesn't provide technical details on the tradeoffs. If you could add 2-3 bullet points to help SOAP adopters make an informed decision here, that might help.

[[
2.1 Graph Edges

An edge MAY originate and terminate at the same graph node.
]]

additional clarification / test case: May a graph contain more than one edge with the same originating and terminating node? (And can such a thing be serialised? In the current Encoding rules? In other hypothetical encodings?)

[[
The outbound edges of a given graph node MAY be distinguished by label or by position, or both. Position is a total order on such edges; thus any outbound edge MAY be identified by position.
]]

This is a bit confusing. Whose freedom does the 'MAY' refer to? Consumers of the data? Or definers of SOAP Data Model-based application data formats? (see above re Schema languages).

The notion of 'position' is introduced with reference to 'such edges'. But which ones? All of them, since 'any outbound edge MAY be identified by position'? Are there edge types for which position is irrelevant?
(Does 'position' relate to 'document order' in the concrete XML Encoding of the data model?) I'm not sure I understand this paragraph enough to comment sensibly.

[[
2.1.1 Edge labels

An edge label is an XML Schema Qualified Name (see XML Schema Part 2: Datatypes)
]]

Spec change request: Please specify an algorithm (for example, simple concatenation) by which SOAP graph edge types (labels) can be named using URI/URIref syntax. This will make it much easier for out-of-band metadata, including but not limited to RDF/XML metadata, to provide further information about the kinds of edges deployed in SOAP Encoding applications.

For example, I have a SOAP encoding application (using SOAP 1.1, being upgraded...) in which the serialised objects represent software packages. It uses edge labels such as 'ownerMailbox', 'homepage' etc. If these had URIs, we could write external RDF/XML descriptions about those edge labels, for example mapping to other SOAP Data Model constructs from similar applications created elsewhere, or specifying mathematical characteristics of the graph (eg. that certain edge labels have an 'at most one' semantic, a characteristic that can support graph merging algorithms and hence Web Service aggregation).

(More details of this on request... I want to get these comments in before last call closes, or I would provide examples from the implementation.)

question: What is the relationship between node types and edge label types in SOAP encoding? Can they be mixed freely? Can I use node types defined (somehow...) by one application, with instances of that node using edge labels drawn from multiple other schemas? Are there any rules constraining the sensible combinations of node and edge types? Specifically, does the type of a node determine the edges that can be attached to it? Does each kind of edge label have node types that it can point to and from?

Implementor feedback: I am storing and merging de-serialised SOAP Encoding messages in a database system.
To implement, I had to assume an answer to these questions. I assumed that the SOAP Data Model allowed namespace mixing amongst types and edges, and that node types do not dictate the edge types for a node.

[[
2.2 Graph Nodes
[...]
Both types of graph node have an optional unique identifier of type ID in the namespace named "http://www.w3.org/2001/XMLSchema".
]]

Since the SOAP Data Model is defined in the abstract, separate from any specific XML (or non-XML) syntactic encoding, it isn't clear why XML's notion of ID is being used here. The spec says the node has a "unique identifier", but does not define the scope of this uniqueness. XML IDs are unique within some document. Is this an implicit constraint on all SOAP Data Model XML encodings, ie. that we have the rule of one Data Model graph per XML document? (to avoid unique ID clashes). Is the ID unique within the scope of one graph, or one encoding as an entire XML document of such a graph?

request: please allow nodes to be identified by URI/URIref (same goes for node types btw; I won't recycle this comment for that part of the spec). So, please allow nodes, and their types, to be identified by URI/URIref.

The use of a global unique identifier here (ie. URI/URIref) would remove the question of identifier scope, since in a Web (Service) context, URI identifiers won't accidentally clash. This might help decouple the abstract Data Model from the specifics of its XML encoding. It would also support data merging between SOAP graphs that shared node identifiers, but that's an added bonus.

[[
2.3 Values

If the labels of a non-terminal graph node's outbound edges are not unique (i.e. they can be duplicated), the non-terminal graph node is known as a "generic"
]]

Seems odd. How do we know such things about edge labels? No mechanism has been described whereby we could acquire such metadata.

[[
Outbound edges of a generic MAY be distinguished by label and/or position, according to the needs of the application.
]]

Which application? This is even more confusing, unless I'm missing something. The impression I'm left with is that the meaning of a SOAP Data Model Graph is rather fluid, and open to competing, rival interpretations (eg. multiple consumer apps, or creators of namespaces used in the encoding, vs creators of services that use those namespaces).

If there was a SOAP Data Model schema language, it would presumably address constraints such as those described in 2.3. In its absence, there appears to be no authoritative account of the rules governing each kind of SOAP Data Model edge label. Section 2.3 should either be removed or augmented with a description of how (possibly out of band) metadata might provide such information in a machine-readable format. Without an account of this, word of mouth seems to be the only way to acquire such information.

[[
3.1 Rules for Encoding Graphs in XML
]]

This bit of the spec is much improved from the previous WD; thanks!

[[
3.1.4 Computing the Type Name property

Note: These rules define how the type name property of a graph node in a graph is computed from a serialized encoding. This specification does not mandate validation using any particular schema language or type system. [...] However, nothing prohibits development of additional specifications to describe the use of SOAP with particular schema languages or type systems.
]]

This aside partly addresses some of my questions above. Perhaps it should have more prominence in the spec, since it (?) relates to edge types as well as node types, and to the general issue of extensibility and further development of the Web service model.

One clarification request: where it says 'the use of SOAP with particular schema languages', does this mean 'the use of the SOAP Encoding Data Model with particular schema languages'? ie. are you leaving open the possibility that Web Services may be able to provide additional metadata about their use of the Encoding and associated Data Model?
(and relating to edge labels and their characteristics, as well as node types).

[[
... Such additional specifications MAY mandate validation using particular schema language, and MAY specify faults to be generated if validation fails. Such additional specifications MAY specify augmentations to the deserialized graph based on information determined from such a validation.
]]

This seems rather challenging from an extensibility and future-proofing point of view. If I implement a SOAP 1.2 toolkit now, including SOAP Encoding support, how would such running code know when it had encountered use of such an 'additional specification'? Is this a scenario where the SOAP 'mustUnderstand' mechanism should be used? If deployed 1.2 clients will be ignorant of 1.2++ services that use such mechanisms, this could cause problems.

Misc other comments:

I understand SOAP services can now be deployed with a GET binding. This means we can expect to see things like HTML documents hyperlinking into SOAP services which return SOAP Encoding data graphs.

 - can these be styled with XSLT? eg. a stockticker might return XML for SOAP clients, but be XSLT'd into XHTML for humans. (ie. is it legal to include stylesheet PIs?)

 - can protocol-oriented header information be omitted? For simple lookups, we often might want nothing more than the graph data itself. Would this be legal? Could we use the SOAP MIME type?

 - SOAP Encoding is a useful syntax for dumping programmatic objects into XML. Please consider making it easier for non-protocol uses to be made of it. I could easily drop object serialisations onto an FTP site, for example. Or deploy them on a normal HTTP server using normal HTTP content negotiation, so humans got an HTML version of a document, and SOAP clients got the graph encoding. The current spec doesn't seem to anticipate such re-use.
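
For illustration, here is a minimal sketch in Python of the 'simple concatenation' idea from the 2.1.1 request above: an edge label QName is turned into a URIref by appending the local part to the namespace name. The QName class, the edge_label_uri function and the http://example.org/packages# namespace are hypothetical names chosen for this sketch; only the label names 'ownerMailbox' and 'homepage' come from the example application, and nothing here is defined by SOAP 1.2.

```python
from typing import NamedTuple


class QName(NamedTuple):
    """A qualified name: a namespace name plus a local part."""
    namespace: str
    local: str


def edge_label_uri(label: QName) -> str:
    """Name an edge label by concatenating namespace name and local part."""
    return label.namespace + label.local


# Hypothetical namespace for the software-package example.
PKG_NS = "http://example.org/packages#"

labels = [QName(PKG_NS, "ownerMailbox"), QName(PKG_NS, "homepage")]

# Each edge label now has a global name that out-of-band metadata can refer to.
for label in labels:
    print(label.local, "->", edge_label_uri(label))
```

Concatenation only gives unambiguous names when the namespace name ends in a separator such as '#' or '/'. This is the same convention RDF/XML uses to turn element and attribute names into URI references.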
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQE9OKkuPhXvL3Mij+QRAt+NAKCPP42CBcC7d1aUz7tS9HeB0/4sAgCglWyK AWvFpHBCxnMqSL0utaWfNW4= =jjPM -----END PGP SIGNATURE-----
Received on Friday, 19 July 2002 19:44:46 UTC