Re: Possible changes for XML 2nd Edition

At 2:51 PM -0400 5/24/00, John Cowan wrote:
>Issue PE24:
>
>Currently, system identifiers may or may not contain fragment identifiers
>(the string beginning with "#" at the end of a URI reference).  The
>Recommendation says that if a fragment identifier is present, a processor
>"may signal an error".  This suggests that the legitimate actions for a
>parser, on finding a fragment identifier, are either to process it properly
>or to signal an error.  It is not clear whether the parser is allowed to
>simply ignore the fragment identifier.
>
>We are considering changing this language to say that "it is an error" to
>use a fragment identifier.  This would mean that a parser may respect the
>fragment identifier, signal an error, silently ignore the fragment identifier,
>or even cause demons to fly out of your nose when it finds one.  (:-)).
>
>Is this appropriate?  Are existing parsers ignoring fragment identifiers?
>Should we *require* that an error be signalled?

I do not see any reason to rule out fragment identifiers in system identifiers.
There are lots of potential uses for them. Consider:

* Grabbing a piece of an XML document to embed in another (a whole object
that is text/xml or application/xml cannot generally be referenced as a
non-NDATA entity, since it includes the DOCTYPE, at least if it's valid).

* A URI could point to a zip or tar archive, and the fragment identifier
could specify a particular XML file within the archive.

* A URI could point to a big XML document that serves only to collect a lot
of modular fragments for re-use, such as the tables of FAA-mandated warning
text used in aircraft manuals.

* System identifiers can be used for lots of other things, like DTDs (later
presumably schemas), and data in all kinds of notations.
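
To make the cases above concrete, here is a minimal sketch of how a
processor might separate a system identifier into the resource to fetch and
the fragment identifier to hand on. It uses Python's standard
urllib.parse.urldefrag; the function name split_system_identifier and the
example.org URIs are made up for illustration and come from no
specification or existing parser.

```python
from typing import Tuple
from urllib.parse import urldefrag


def split_system_identifier(system_id: str) -> Tuple[str, str]:
    """Split a system identifier into (resource URI, fragment identifier).

    The fragment is returned without the leading "#"; it is the empty
    string when the system identifier carries no fragment.
    """
    resource, fragment = urldefrag(system_id)
    return resource, fragment


# Hypothetical system identifiers, one per use case listed above.
examples = [
    "http://example.org/manuals/warnings.xml#engine-fire",  # modular fragment
    "http://example.org/bundle.zip#chapter1.xml",           # file in an archive
    "http://example.org/plain.dtd",                         # no fragment at all
]

for system_id in examples:
    resource, fragment = split_system_identifier(system_id)
    print(resource, repr(fragment))
```

Note that the split itself says nothing about what the fragment means; that
is left to whatever handles the retrieved resource.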

What motivation could there be for absolutely prohibiting fragment
identifiers? It seems to me it's none of XML's business what the syntax of
URI references is, or whether fragment identifiers are needed. What of a
media type which defines the fragment identifier (they are media type
specific, after all) in such a way that it ends up being *required* for
proper interpretation, or for any feasible use? There is no principled
reason this can't happen, so I think XML should stay completely out of the
issue.
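
As a rough illustration of "media type specific", the sketch below keeps the
XML processor out of fragment interpretation altogether: it strips the
fragment, fetches the resource, and passes the fragment to a handler chosen
by media type. Everything here is an assumption for the sake of example: the
resolve function, the handler registry, the caller-supplied fetch callable,
and the "line=N" fragment syntax for text/plain are all hypothetical.

```python
from typing import Callable, Dict
from urllib.parse import urldefrag

# A handler receives the fetched bytes and the fragment, and returns the
# selected portion. Each media type defines its own fragment semantics.
FragmentHandler = Callable[[bytes, str], bytes]


def nth_line(data: bytes, fragment: str) -> bytes:
    """Hypothetical text/plain handler: "line=N" selects the Nth line (1-based)."""
    index = int(fragment.split("=", 1)[1])
    return data.splitlines()[index - 1]


def resolve(
    system_id: str,
    media_type: str,
    fetch: Callable[[str], bytes],
    handlers: Dict[str, FragmentHandler],
) -> bytes:
    """Fetch the resource and apply the media type's fragment handler, if any."""
    resource, fragment = urldefrag(system_id)
    data = fetch(resource)
    if not fragment:
        return data  # nothing to interpret
    handler = handlers.get(media_type)
    if handler is None:
        # One possible policy; silently ignoring the fragment is another.
        raise ValueError(f"no fragment handler registered for {media_type}")
    return handler(data, fragment)


# Usage with a stand-in fetch function instead of real network access.
def fake_fetch(uri: str) -> bytes:
    return b"alpha\nbeta\ngamma\n"


selected = resolve(
    "http://example.org/notes.txt#line=2",
    "text/plain",
    fake_fetch,
    {"text/plain": nth_line},
)
print(selected)
```

The point of the design is that the policy for unknown fragments lives in
one place and can be changed without touching the XML layer.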

Steven_DeRose@Brown.edu; http://www.stg.brown.edu/~sjd

Received on Thursday, 25 May 2000 13:58:18 UTC