XML Schema WG comments on Web Architecture draft of 9-Dec-2003

** General

The architectural story is told from the point of view of a human browser. This
is certainly an important use case, but what of machine-to-machine
interactions?  Some of the "best practices" may not be appropriate for certain
kinds of machine-to-machine interactions and may in some cases constitute worst
practices.  At the very least the first paragraph of 3.7 should be brought
forward and highlighted: what is actually described here is not the
architecture of the Web but is the architecture of the human-browser Web, and
at that somewhere between a Web best-practices document and an architecture for
a well-designed Web resource, in most cases implicitly using HTTP. We suggest
that you consider more examples of a different kind, for example, a spider
(such as the Google spider) and an application-to-application interaction that
does not involve content aimed at human beings. For example, consider how a
travel agency can interact with airline systems without human interaction in
the exchange to locate and book a high volume of tickets.

Beyond this, however, a Web architecture should give at least guidance and
preferably strict rules for the range of ways in which you can change or
augment designs and implementations while still supporting the overall
architecture.  The purpose of an architecture (as opposed to a design) is to allow
for controlled evolution of the design in the face of change. We invite the TAG
to keep in view the kinds of changes that have occurred over the last 25 years
and attempt to define an architecture that will be robust against similar
changes so that it may stand for 15 to 25 years rather than for 5 to 10 years. 

In the case of the Web one might ask, for example: "what are the rules for
designing new URI schemes and the protocols that support them"?  If such
protocols offer representations, must those be in the form of octet streams?
Section 3.2 implies "yes", but we hope the answer is no.  Surely it ought to be
possible, without changing the overall architecture of the Web, to define a
scheme that offers representations that consist of two octet streams (audio and
video?) or a stream of nybbles (decimal data?) or 5-bit Baudot code
(http://home.austin.rr.com/kinghome/signpage/baudot.html).  If octet streams
are a requirement, must media-types be used for typing, or can other means be
used?  Presumably media-types are preferred but others allowed in
principle. That is, we assume that file: URLs are intended to be encompassed 
by the Web architecture.  Yet sections like 2.4 say things like:

  "If the motivation behind registering a new scheme is to allow a software
   agent to launch a particular application when retrieving a representation,
   such dispatching can be accomplished at lower expense via Internet Media
   Types."  

Should that not say: "via a representation typing system suitable to the
resource and scheme (e.g. Internet Media Types in the case of HTTP)"?

Another example is in 1.2.1 which says a URI can be assigned without knowing 
what representations are available.  What if a scheme turns out not to 
support octet streams and media types, or what if one chooses a URI and it 
turns out one's representation is a pair of video and audio streams?

The above are just examples: altogether, we think the document would be much
improved by being less vague on layering and the degree to which things like
octet streams, HTTP, and Internet Media Types are architecturally baked in, or
are just current best or common practice.  An architecture document should at
least give guidance on such questions, should preferably be rigorous,
prescriptive, and proscriptive, and should be written from a consistent
perspective as the details are discussed.

One concrete suggestion would be, wherever practical, to show examples of each
point using a range of schemes including not just HTTP but others, for example,
uuid:, file:, mailto:, and from time to time one or two mythical ones that
could be used to illustrate the limits of the architecture.  We think this
would highlight the ways in which the Web architecture and abstractions like
"representation" and "safe retrieval" are to be applied in contexts other than
HTTP.  Section 3.1 is an example of a section that should be illustrated with
multiple schemes.  How much of 3.1 applies to mailto:, for example?

Another suggestion is to be concrete and provide non-examples, so it is clear
whether certain scenarios are explicitly disallowed or put outside the domain
of the Web, or have simply been overlooked or passed over in silence.

** Technical

[1.2.3] "Silent recovery from error is harmful."

The validity of this principle depends very much on what level one is talking
about. Is silent recovery from packet collisions harmful? With ECC memory,
should every correctable memory problem be reported?  Must an application that
normalizes input data so that out-of-range values are normalized to the valid
extreme of the range report every bad data item? To whom?  (Suppose the data
stream represents instrument measurements streamed to a Web display.) We
suggest that the rule be:

    "Silent recovery from errors may hinder problem diagnosis. Furthermore,
     silent recovery of errors resulting from erroneous input may
     inappropriately promote use of non-compliant data formats."

(We recognize that this turns a terse six-word rule into a less punchy
twenty-five word rule.  But the six-word rule is really not plausible or
believable as drafted.)
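
To make the instrument-data question concrete, here is a minimal Python sketch
of such a normalizing application. The function name, the logger name, the
sample readings, and the 0 to 100 range are all assumptions chosen for
illustration; nothing here is taken from the architecture document.

```python
import logging

logger = logging.getLogger("instrument")  # hypothetical logger name

def clamp(value, low, high, report=False):
    """Normalize value into [low, high]; optionally report each correction."""
    clamped = min(max(value, low), high)
    if report and clamped != value:
        # This is the "non-silent" variant: every corrected item is logged.
        logger.warning("out-of-range value %r clamped to %r", value, clamped)
    return clamped

# Illustrative readings streamed from an instrument (made-up values).
readings = [12.5, -3.0, 101.7, 48.0]

# Silent recovery: out-of-range items are corrected and nobody is told.
normalized = [clamp(r, 0.0, 100.0) for r in readings]
```

Whether report should default to True, and who would read the resulting log,
is exactly the question the six-word rule leaves unanswered.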

We also note that there is a tension between this principle and the notion of
"must-ignore". For many applications, "what you don't understand" is equivalent
to "an error". So one principle says you should ignore (presumably silently)
this data, and the other says you should not. 

[Section 2] assumes that identification and retrievability are the same thing.
Given the extensive use of URIs to identify non-retrievable and abstract
entities, starting with namespaces and continuing with the identification of
XSLT and XQuery functions and so on, this conflation is problematic at best.

[2.3 (on URI ambiguity) and 3.6.2 (persistence and inconsistent 
representations)]
The architecture document needs to do a better job of explaining what a
resource is in this context.  For example, we all accept that
http://www.w3.org/TR/webarch/ will serve different versions of the  
architecture document over time, and is in that sense ambiguous.  Conversely,
that same http://www.w3.org/TR/webarch/ URI and 
http://www.w3.org/TR/2003/WD-webarch-20031209/ are at some points in time 
two URIs for the same underlying document.  Please make clear the 
appropriate architectural interpretation and terminology: at this point 
in time, are there (a) two resources with the same representation, or 
(b) one resource that for now has two names (URIs)?  At the very 
least, make clear that inventing resources like 
http://www.w3.org/TR/webarch/, http://www.w3.org/, http://www.cnn.com/, etc. 
is encouraged, and in no way violates the spirit of the rules in these 
sections.

In general, we think the document makes things too easy for itself by declining
to define the term "resource". Technical architecture is not theology. Saying
"it's a mystery" is not really an acceptable strategy.  You must provide a
definition.

There is, to be sure, one possible fallback. If definition is infeasible,
"resource" could be taken as an undefined primitive notion.  But just as a book
on the elements of geometry specifies the relations which must hold between
points and lines even while leaving "point" and "line" as undefined primitives,
so also the Web Architecture document would need to specify the relations which
must hold between a "resource" and other basic notions ("owner", "URI",
"sub-resource", and so on).  Please also provide examples (if any exist) of
non-resources, to make the concept clear; or say explicitly that no such
examples exist because everything is always a resource.

Some questions one might hope to have some light shed on by either a
definition or by a non-defining description of resource as a primitive
notion:
  How many resources are there, or how many could there be?
  Can a set of resources be a resource?
  Can a part of a resource be a resource?
  Do all users of the Web operate with the same set of resources,
    or is it possible for one user to identify three resources
    where another identifies only two?
  Who determines the identity of a resource?  
  If the question arises whether two URIs designate the same 
    resource, can there be an authoritative answer to the question, 
    or is it a judgement question like the question 'Is "love" an 
    adequate English rendering for the Greek word "agape"?', on which 
    every thoughtful observer may form an independent opinion?


[3.3 Good practice: Fragment Identifier Consistency]

   "Good practice: Fragment identifier consistency

   A resource owner who creates a URI with a fragment identifier and who uses 
   content negotiation to serve multiple representations of the identified
   resource SHOULD NOT serve representations with inconsistent fragment
   identifier semantics."

If the term "consistent" is here used in a technical sense, please explain what
it means and how inconsistencies are to be detected.  If it is used in a
non-technical sense, please explain what it means.

We note that if fragment identifiers must be usable in more than one MIME
type, the result will be that the only fragment identifiers effectively allowed
will be bare names (or other fragment identifier syntaxes incapable of knowing
about or exploiting any of the structure of the data); it seems undesirable to
impoverish the URI identifier space in this way. 

In general, content negotiation (like server-side browser sniffing) does not
seem to us to be an obviously and universally good thing: it leads to
unpredictable context-dependent results in ways that are actively hostile to
some machine-driven applications, and it interacts in this pernicious way with
fragment identifiers. On the other hand, if content negotiation is indeed
important to make things work, perhaps some advice on whether newly invented
schemes should support the equivalent of content negotiation is in order.

This is not a viable best practice recommendation, except as a bandaid, as it
tightly couples URIs to representations, and constrains representation
evolvability in untenable ways. This appears to highlight a weakness in the Web
architecture that should be explicitly addressed.


[3.3.1] says: 
   "Per [URI], in order to know the authoritative interpretation 
   of a fragment identifier, one must dereference the URI containing the 
   fragment identifier. The Internet Media Type of the retrieved 
   representation specifies the authoritative interpretation of the fragment 
   identifier. Thus, in the case of Dirk and Nadia, the authoritative 
   interpretation depends on the SVG specification, not the XHTML 
   specification (i.e., the context where the URI appears)."

But this seems to contradict the referenced URI specification, which says:

   "The semantics of a fragment identifier are defined by the set of 
   representations that might result from a retrieval action on the primary 
   resource. The fragment's format and resolution is therefore dependent on 
   the media type [RFC2046] of the retrieved representation, even though such 
   a retrieval is only performed if the URI is dereferenced." 

The latter says clearly you need not dereference.  On the contrary, you 
must know the range of representations that you might get _if_ you tried 
to dereference.
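
The reading we take from the URI specification can be sketched in Python as
follows. The table of media types, the function name, and the example.org URI
are hypothetical; the point is only that the set of possible interpretations
is computed from the media types that might be served, without any retrieval.

```python
from urllib.parse import urldefrag

# Hypothetical table: which specification governs fragments for each media type.
FRAGMENT_AUTHORITY = {
    "application/xhtml+xml": "XHTML specification",
    "image/svg+xml": "SVG specification",
}

def possible_interpretations(uri, possible_media_types):
    """Map each media type that might be served to the spec defining the fragment."""
    primary, fragment = urldefrag(uri)  # split "U#F" into U and F; no network access
    return primary, {
        media_type: f"'{fragment}' as defined by the {FRAGMENT_AUTHORITY[media_type]}"
        for media_type in possible_media_types
    }

primary, readings = possible_interpretations(
    "http://example.org/map#zurich",  # made-up URI
    ["application/xhtml+xml", "image/svg+xml"],
)
```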

[3.3.1] says
   "Given a URI "U#F", and a representation retrieved by dereferencing 
   URI "U", the (secondary) resource identified by "U#F" is determined by 
   interpreting "F" according to the specification associated with the 
   Internet Media Type of the representation." 

What if the scheme is not HTTP and media types are not used (e.g. because the
URI uses the file: scheme or for some other reason)?  Do fragment identifiers
work only with media-typed representations?  We hope not.

[3.4.1] Says that user agents must not silently ignore server metadata.
Metadata covers a lot of ground: what is its scope? May a user agent ignore a
server-specified DTD or Schema and choose to apply a local variant
(e.g. because the user so specifies in a local configuration file or a
launch-time option)? Why not?  

If the sender is not a trusted authority, it would be foolish for the recipient
to rely on the principle of sender-makes-right.  A well written production
server runs an unacceptable risk if it accepts at face value everything
an untrusted client tells it.  Must it inform the client each time it follows
its own instructions by ignoring client information? 

(We also note in passing that focusing on the interactions between
"user-agents" and "servers" is fundamentally limiting in the sense mentioned
in our opening comment. Are not peer-to-peer interactions covered by this
architecture?)

[3.5] says that an interaction is safe if the agent does not incur any
obligation beyond the interaction.  This seems too broad; the TAG has been
advised of other scenarios.  For example, if each access to a resource needs to
be authenticated at the application (not https) level, but no ongoing
obligation is established, this rule suggests that the retrieval is safe.  Is
that really true?  We wouldn't want the access cached, except perhaps by an
application-specific cache that knew our authorization rules.  Consider also
the case where the provider of the resource needs to log the access.  The issue
is an important one, and the summary given here comes close to being an
oversimplification.

[3.5.1] Says: 
   "There are mechanisms in HTTP, not widely deployed, to remedy this
   situation. HTTP servers can assign a URI to the results of a POST  
   transaction using the "Content-Location" header (described in section 
   14.14 of [RFC2616]), and allow authorized parties to retrieve a record of 
   the transaction thereafter via this URI (the value of URI persistence is 
   apparent in this case). User agents can provide an interface for managing 
   transactions where the user agent has incurred an obligation on behalf of 
   the user."

Yes, but is this saying specifically that content-location SHOULD be used? 
If so, say so.  If not, then make clearer what's intended.

[3.6.1] 
   Good practice: Available representation

   Publishers of a URI SHOULD provide representations of the identified
   resource.

We are concerned that this appears to privilege dereferenceable URIs over other
sectors of URI space; in particular to denigrate all uses of URIs as pure
(non-dereferenceable) identifiers, such as namespaces, QT functions, SOAP
extensions, SAX properties, etc. etc.  There are often pragmatic reasons for
declining to make URIs dereferencable (unwelcome load on servers, for example,
or identifiers that are intended purely for software systems and that humans
will never see or need to dereference to obtain useful information). It seems
to us that at least a coherent story should be told about how this
pure-identification use fits into the overall Web architecture.


[4.2]  In general, the section on versioning unduly and in too many ways
oversimplifies a complex, subtle, and as yet poorly understood problem.

For example, 4.2.3 says:
   "Language designers SHOULD provide mechanisms that allow any party to 
   create extensions that do not interfere with conformance to the original 
   specification." 

This oversimplifies a very tough tradeoff.  When you allow such extensions, you
promote reuse of the base language for new purposes, and that seems good.  You
also provide for a proliferation of potentially non-interoperable versions
depending on various extensions, as well as ensuring that some data will be
accepted by processors when it is in fact not conforming to a later or extended
definition of the language, but is simply erroneous and ought (if the processor
were only omniscient) to be rejected as such with a useful diagnostic.
That's bad.  

Pursuing the principle enunciated here, one might conclude that XML should
have let anyone who wanted to define new syntactic constructs such as
structured attributes.  Its designers did not, and interoperability is helped
rather than hurt by such strictness.

There is a strong tension between versioning and extensibility and silent
error handling, once you get away from human mediated interactions and
interactions that do not involve mission- or life-critical applications.
For computer-to-computer mission-critical applications, "fallback behaviour"
is semantically equivalent to "silently handling errors" and the Web
architecture document is thus self-contradictory.

In addition, versioning and extensibility are not solely a property of data
representations, but of protocols as well.

[4.2.3] The discussion of mustIgnore & mustUnderstand should clarify the
difference between marking the distinction in the document instance, in a
schema, or in prose documentation.  SOAP does it with an attribute in the
instance.  Schema content models do it in the schema.  Other systems provide
rules in the specifications.  These have different tradeoffs.
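
As an illustration of the first of these options (marking in the instance), the
following Python sketch picks out the SOAP 1.2 header blocks that carry the
mustUnderstand attribute. The message, the ex: extension namespace, and the
function name are invented for the example; only the SOAP 1.2 envelope
namespace and the attribute name come from the SOAP specification.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace

# Made-up message: one header block is flagged in the instance, one is not.
MESSAGE = f"""\
<env:Envelope xmlns:env="{SOAP_NS}" xmlns:ex="http://example.org/ext">
  <env:Header>
    <ex:priority env:mustUnderstand="true">high</ex:priority>
    <ex:trace>abc</ex:trace>
  </env:Header>
  <env:Body/>
</env:Envelope>"""

def mandatory_headers(xml_text):
    """Return the tags of header blocks marked mustUnderstand in the instance."""
    root = ET.fromstring(xml_text)
    header = root.find(f"{{{SOAP_NS}}}Header")
    if header is None:
        return []
    flag = f"{{{SOAP_NS}}}mustUnderstand"
    # mustUnderstand is an xs:boolean, so both "true" and "1" mean true.
    return [block.tag for block in header if block.get(flag) in ("true", "1")]

flagged = mandatory_headers(MESSAGE)
```

A schema-based or prose-based marking would leave the instance unchanged and
move this decision into the processor's configuration, which is the tradeoff
the discussion should bring out.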

[4.2.4] Says: 

   "In principle, a SOAP message can contain a JPEG image that 
   contains an RDF comment which refers to a vocabulary of terms for 
   describing the image." 

This is untrue: SOAP is XML, JPEG is not.  MTOM may do something to extend SOAP
to make this true, but as it stands the statement is false.  Perhaps "... can
contain an SVG image that contains ..." is what you meant to write.

[4.5.1] is on when to use XML-based formats. The analysis here seems
underdeveloped and may perhaps best be left out.  If it is kept, then
additional reasons for using XML documents include:

  * Desire for data easily parsed by both humans and machines
  * Desire for vocabularies that can be invented in a distributed manner 
    and combined flexibly in instance documents
  * Desire for a text-based format
  * Availability of a wide range of tools (not just choosing tools to be
    used in the future:  having flexibility in choosing the initial set of
    tools for processing is equally important)

[4.5.3] States:

   "Namespaces in XML" [XMLNS] provides a mechanism for establishing a globally
   unique name that can be understood in any context.

This is a false statement and should not continue to be repeated.  

[4.5.3] Says:
   "The type attribute from W3C XML Schema is an example of a global 
   attribute."  

This should indicate type in the Schema Instance namespace, preferably with a
suitable link to our spec.  Perhaps 

   "The type attribute from the W3C XML Schema Instance namespace is an example
   of a global attribute."

There are also type attributes in the schema document vocabulary, e.g. on
<xsd:element>, and those are not global.  Furthermore, we see above in 4.5.6
that a prefix is used to indicate xs:ID as a type.  So, why not use xsi:type
for this one:

   "The xsi:type attribute, provided by W3C XML Schema for use in XML 
   instance documents, is an example of a global attribute."
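
The difference between the two kinds of type attribute can be seen in a small
Python sketch. The order document and its unqualified type attribute are
invented for illustration; the namespace name is that of XML Schema instance
attributes.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Made-up instance: one element carries both a global and a local "type".
DOC = f"""\
<order xmlns:xsi="{XSI_NS}" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <quantity xsi:type="xs:int" type="bulk">12</quantity>
</order>"""

quantity = ET.fromstring(DOC).find("quantity")

# Global attribute: namespace-qualified, meaningful on any element.
global_type = quantity.get(f"{{{XSI_NS}}}type")

# Local attribute: unqualified, its meaning depends on the quantity element.
local_type = quantity.get("type")
```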

[4.5.3] Says: 

   "Attributes are always scoped by the element on which they appear. 
    An attribute that is "global," that is, one that might meaningfully 
    appear on different elements, including elements in other namespaces,
    should be explicitly placed in a namespace. Local attributes, ones
    associated with only a particular element, need not be included in a 
    namespace since their meaning will always be clear from the context 
    provided by that element."

This appears to mix the notion of element instance and what DTD-oriented minds
would call 'element type'.  Perhaps this should read

    "An attribute that is "global," that is, one that might meaningfully appear
    on elements of any type, including elements in other namespaces, should be
    explicitly placed in a namespace. Local attributes, ones associated with
    only a particular element type, need not be included in a namespace since
    their meaning will always be clear from the context provided by that
    element."

[4.5.6] Fails to carefully highlight the particular flavours of "ID" in play, and
that they are NOT the same thing. For example, consider the following three
statements: 
  * "Does the section element have the ID "foo"?"  
     (This needs to be something like 
      "Does the section element have what the XML Recommendation refers to as 
       the ID "foo"? ")
  * "Processing the document with a W3C XML Schema might reveal an element 
     declaration that identifies the name attribute as an xs:ID." 
     (This one is probably OK.)
  * "In practice, processing the document with another schema language, such 
     as RELAX NG [RELAXNG], might reveal the attributes of type ID."  
     (What is a "type ID" here?  If it's RELAX using the schema data types,
     then isn't it xs:ID in this case?)

In practice, applications may have independent means of specifying IDness as
provided for and specified in XPointer. XPointer carefully
discusses these options at
http://www.w3.org/TR/2003/REC-xptr-framework-20030325/#shorthand

** Editorial

[Section 1]
The initial part of section 1 is good, but section 1.1 is very jarring
following it. It doesn't flow well at all.

[1.1.2] Bullet list should start all lowercase or all capitalized; it is a mix.

[1.1.3] quotes Amdahl's law as "The speed of a system is determined by its 
slowest component." Actually, we believe that Amdahl's law was presented 
quantitatively as a formula, and it focuses not just on the slowest 
component but on all that remain after attempts at optimization of others. 
We think a more appropriate paraphrase for the purposes of the arch 
document would thus be: "The speed of a system is _limited_ by its slowest 
component."  or even more pedantically if still informally "The speed of 
a sequential processing system is limited by its slowest component."  Just 
to be clear about why we bring this up: the speed is "determined" by the sum 
of all components; it is "limited" by (among other things) the slowest.
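
For reference, a minimal Python sketch of the usual quantitative statement of
Amdahl's law, together with the determined/limited distinction for a
sequential system. The component names and timings are made up for the
example.

```python
def amdahl_speedup(p, s):
    """Overall speedup when a fraction p of the work is sped up by a factor s."""
    return 1.0 / ((1.0 - p) + p / s)

# Hypothetical sequential pipeline, times in seconds.
component_times = {"parse": 2.0, "transform": 5.0, "serialize": 1.0}

# The running time is determined by the sum of all components.
total = sum(component_times.values())

# However much the other components are optimized, the total cannot fall
# below the time of the slowest one: this is the sense of "limited".
floor = max(component_times.values())

# Fraction of the work outside the slowest component, sped up without bound.
p = (total - floor) / total
best_case = amdahl_speedup(p, float("inf"))
```

With these figures best_case multiplied by floor gives back total, up to
floating-point rounding, which is the limit the paraphrase should convey.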

[1.2.4] Replace
   "This leads to the well-known "view source" effect"
by
   "This leads to the well-known and desirable "view source" effect."

Otherwise, you might think it was to be avoided.

[3.5.1] We are surprised to not see a best practice recommendation here.

[4.5.3] (And elsewhere)
If namespace prefixes are used, there should be a table indicating their
bindings to URIs.

[4.5.5] Replace
   "However, general XML processors cannot recognize QNames as such when 
    they are used in attribute values and in element content; they are 
    indistinguishable from URIs."
by
   "However, general XML processors cannot reliably recognize QNames as such
    when they are used in attribute values and in element content; for example,
    the syntax of QNames overlaps with that of URIs."
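
The overlap is easy to demonstrate. The following Python sketch uses a
deliberately simplified, ASCII-only approximation of the QName syntax (an
assumption, not the full Namespaces production) and the standard library URI
splitter; the function name is ours.

```python
import re
from urllib.parse import urlsplit

# Simplified, ASCII-only approximation of prefix:localname (an assumption).
QNAME = re.compile(r"^[A-Za-z_][\w.-]*:[A-Za-z_][\w.-]*$", re.ASCII)

def classify(value):
    """Return (looks like a QName, looks like a URI with a scheme)."""
    looks_like_qname = bool(QNAME.match(value))
    looks_like_uri = bool(urlsplit(value).scheme)
    return looks_like_qname, looks_like_uri

ambiguous = classify("xs:string")              # matches both syntaxes
uri_only = classify("http://www.w3.org/")      # a URI, but not a QName
neither = classify("plainvalue")               # no colon at all
```

A value such as xs:string satisfies both tests, so syntax alone cannot tell a
general processor which one was intended.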

[4.5.6] and [4.5.8] highlight a lot of problems, but make no recommendations
about what to do about them.

[4.5.8] Replace
    "The XPointer Framework [XPTRFR] provides a interoperable starting point." 
by
    "The XPointer Framework [XPTRFR] provides an interoperable starting point."

(Typo.) 


On behalf of the XML Schema WG:

//Mary

Received on Thursday, 4 March 2004 08:50:04 UTC